Note
Go to the end to download the full example code.
Simulation of panel modelΒΆ
Calculates each contribution to the log likelihood function using simulation. We also calculate the individual parameters.
Michel Bierlaire, EPFL Sat Jun 21 2025, 17:06:31
import sys
from IPython.core.display_functions import display
import biogeme.biogeme_logging as blog
from biogeme.biogeme import BIOGEME
from biogeme.calculator.single_formula import calculate_single_formula_from_expression
from biogeme.expressions import Beta, Draws, MonteCarlo, PanelLikelihoodTrajectory, log
from biogeme.models import logit
from biogeme.results_processing import EstimationResults
from biogeme.second_derivatives import SecondDerivativesMode
See the data processing script: Panel data preparation for Swissmetro.
from swissmetro_panel import (
CAR_AV_SP,
CAR_CO_SCALED,
CAR_TT_SCALED,
CHOICE,
SM_AV,
SM_COST_SCALED,
SM_TT_SCALED,
TRAIN_AV_SP,
TRAIN_COST_SCALED,
TRAIN_TT_SCALED,
database,
)
Define the number of draws to be used for Monte-Carlo integration.
NUMBER_OF_DRAWS = 100_000
logger = blog.get_screen_logger(level=blog.INFO)
logger.info('Example b13panel_simul.py')
Example b13panel_simul.py
Parameters to be estimated.
b_cost = Beta('b_cost', 0, None, None, 0)
Define a random parameter, normally distributed across individuals, designed to be used for Monte-Carlo simulation.
b_time = Beta('b_time', 0, None, None, 0)
It is advised not to use 0 as starting value for the following parameter.
b_time_s = Beta('b_time_s', 1, None, None, 0)
b_time_rnd = b_time + b_time_s * Draws('b_time_rnd', 'NORMAL_ANTI')
We do the same for the constants, to address serial correlation.
asc_car = Beta('asc_car', 0, None, None, 0)
asc_car_s = Beta('asc_car_s', 1, None, None, 0)
asc_car_rnd = asc_car + asc_car_s * Draws('asc_car_rnd', 'NORMAL_ANTI')
asc_train = Beta('asc_train', 0, None, None, 0)
asc_train_s = Beta('asc_train_s', 1, None, None, 0)
asc_train_rnd = asc_train + asc_train_s * Draws('asc_train_rnd', 'NORMAL_ANTI')
asc_sm = Beta('asc_sm', 0, None, None, 0)
asc_sm_s = Beta('asc_sm_s', 1, None, None, 0)
asc_sm_rnd = asc_sm + asc_sm_s * Draws('asc_sm_rnd', 'NORMAL_ANTI')
Definition of the utility functions.
v_train = asc_train_rnd + b_time_rnd * TRAIN_TT_SCALED + b_cost * TRAIN_COST_SCALED
v_swissmetro = asc_sm_rnd + b_time_rnd * SM_TT_SCALED + b_cost * SM_COST_SCALED
v_car = asc_car_rnd + b_time_rnd * CAR_TT_SCALED + b_cost * CAR_CO_SCALED
Associate utility functions with the numbering of alternatives.
v = {1: v_train, 2: v_swissmetro, 3: v_car}
Associate the availability conditions with the alternatives.
av = {1: TRAIN_AV_SP, 2: SM_AV, 3: CAR_AV_SP}
Conditional on the random parameters, the likelihood of one observation is given by the logit model (called the kernel).
choice_probability_one_observation = logit(v, av, CHOICE)
Conditional on the random parameters, the likelihood of all observations for one individual (the trajectory) is the product of the likelihood of each observation.
conditional_trajectory_probability = PanelLikelihoodTrajectory(
choice_probability_one_observation
)
We integrate over the random parameters using Monte-Carlo
log_probability = log(MonteCarlo(conditional_trajectory_probability))
We retrieve the parameters estimates.
try:
results = EstimationResults.from_yaml_file(filename='saved_results/b12panel.yaml')
except FileNotFoundError:
sys.exit(
'Run first the script b12panel.py '
'in order to generate the '
'file b12panel.yaml and move it to the directory saved_results.'
)
Simulate to recalculate the log likelihood directly from the formula, without the Biogeme object
simulated_loglike = calculate_single_formula_from_expression(
expression=log_probability,
database=database,
number_of_draws=NUMBER_OF_DRAWS,
the_betas=results.get_beta_values(),
numerically_safe=False,
second_derivatives_mode=SecondDerivativesMode.NEVER,
use_jit=True,
)
Flattening database [(6768, 38)].
Database flattened [(752, 362)]
Flattening database [(6768, 38)].
Database flattened [(752, 362)]
Flattening database [(6768, 38)].
Database flattened [(752, 362)]
Flattening database [(6768, 38)].
Database flattened [(752, 362)]
print(f'Simulated log likelihood: {simulated_loglike}')
Simulated log likelihood: -3574.7230792117853
We also calculate the individual parameters for the time coefficient.
numerator = MonteCarlo(b_time_rnd * conditional_trajectory_probability)
denominator = MonteCarlo(conditional_trajectory_probability)
simulate = {
'Numerator': numerator,
'Denominator': denominator,
}
Creation of the Biogeme object.
biosim = BIOGEME(database, simulate, number_of_draws=NUMBER_OF_DRAWS, seed=1223)
Biogeme parameters read from biogeme.toml.
Flattening database [(6768, 38)].
Database flattened [(752, 362)]
Flattening database [(6768, 38)].
Database flattened [(752, 362)]
Flattening database [(6768, 38)].
Database flattened [(752, 362)]
Flattening database [(6768, 38)].
Database flattened [(752, 362)]
Simulation.
sim = biosim.simulate(results.get_beta_values())
sim['Individual-level parameters'] = sim['Numerator'] / sim['Denominator']
print(f'{sim.shape=}')
sim.shape=(752, 3)
display(sim)
Numerator Denominator Individual-level parameters
0 -0.071506 0.010421 -6.861496
1 -5.400202 0.762721 -7.080176
2 -2.116223 0.269996 -7.837982
3 -3.576361 0.463302 -7.719288
4 -3.351181 0.476229 -7.036911
.. ... ... ...
747 -0.008062 0.001617 -4.987226
748 -0.000451 0.000129 -3.498607
749 -0.000016 0.000007 -2.303046
750 -5.102442 0.697186 -7.318624
751 -0.034959 0.024362 -1.434957
[752 rows x 3 columns]
Total running time of the script: (1 minutes 54.029 seconds)