Note
Go to the end to download the full example code.
Simulation of panel model
Calculates each contribution to the log likelihood function using simulation. We also calculate the individual parameters.
- author:
Michel Bierlaire, EPFL
- date:
Sun Apr 9 18:16:29 2023
import sys
import biogeme.biogeme as bio
import biogeme.biogeme_logging as blog
import biogeme.exceptions as excep
import biogeme.results as res
from biogeme import models
from biogeme.expressions import (
Beta,
bioDraws,
PanelLikelihoodTrajectory,
MonteCarlo,
log,
)
from biogeme.parameters import Parameters
See the data processing script: Panel data preparation for Swissmetro.
from swissmetro_panel import (
database,
CHOICE,
SM_AV,
CAR_AV_SP,
TRAIN_AV_SP,
TRAIN_TT_SCALED,
TRAIN_COST_SCALED,
SM_TT_SCALED,
SM_COST_SCALED,
CAR_TT_SCALED,
CAR_CO_SCALED,
)
print(f'Samples size = {database.get_sample_size()}')
Samples size = 752
We use a low number of draws, as the objective is to illustrate the syntax. In practice, this value is insufficient to have a good approximation of the integral.
NUMBER_OF_DRAWS = 50
logger = blog.get_screen_logger(level=blog.INFO)
logger.info('Example b13panel_simul.py')
Example b13panel_simul.py
Parameters.
B_COST = Beta('B_COST', 0, None, None, 0)
Define a random parameter, normally distributed across individuals, designed to be used for Monte-Carlo simulation.
B_TIME = Beta('B_TIME', 0, None, None, 0)
B_TIME_S = Beta('B_TIME_S', 1, None, None, 0)
B_TIME_RND = B_TIME + B_TIME_S * bioDraws('b_time_rnd', 'NORMAL_ANTI')
We do the same for the constants, to address serial correlation.
ASC_CAR = Beta('ASC_CAR', 0, None, None, 0)
ASC_CAR_S = Beta('ASC_CAR_S', 1, None, None, 0)
ASC_CAR_RND = ASC_CAR + ASC_CAR_S * bioDraws('ASC_CAR_RND', 'NORMAL_ANTI')
ASC_TRAIN = Beta('ASC_TRAIN', 0, None, None, 0)
ASC_TRAIN_S = Beta('ASC_TRAIN_S', 1, None, None, 0)
ASC_TRAIN_RND = ASC_TRAIN + ASC_TRAIN_S * bioDraws('ASC_TRAIN_RND', 'NORMAL_ANTI')
ASC_SM = Beta('ASC_SM', 0, None, None, 1)
ASC_SM_S = Beta('ASC_SM_S', 1, None, None, 0)
ASC_SM_RND = ASC_SM + ASC_SM_S * bioDraws('ASC_SM_RND', 'NORMAL_ANTI')
Definition of the utility functions.
V1 = ASC_TRAIN_RND + B_TIME_RND * TRAIN_TT_SCALED + B_COST * TRAIN_COST_SCALED
V2 = ASC_SM_RND + B_TIME_RND * SM_TT_SCALED + B_COST * SM_COST_SCALED
V3 = ASC_CAR_RND + B_TIME_RND * CAR_TT_SCALED + B_COST * CAR_CO_SCALED
Associate utility functions with the numbering of alternatives.
V = {1: V1, 2: V2, 3: V3}
Associate the availability conditions with the alternatives.
av = {1: TRAIN_AV_SP, 2: SM_AV, 3: CAR_AV_SP}
Conditional on the random parameters, the likelihood of one observation is given by the logit model (called the kernel).
obsprob = models.logit(V, av, CHOICE)
Conditional on the random parameters, the likelihood of all observations for one individual (the trajectory) is the product of the likelihood of each observation.
condprobIndiv = PanelLikelihoodTrajectory(obsprob)
We integrate over the random parameters using Monte-Carlo
logprob = log(MonteCarlo(condprobIndiv))
We retrieve the parameters estimates.
try:
results = res.bioResults(pickle_file='saved_results/b12panel.pickle')
except excep.BiogemeError:
sys.exit(
'Run first the script b12panel.py '
'in order to generate the '
'file b12panel.pickle.'
)
Simulate to recalculate the log likelihood directly from the formula, without the Biogeme object
simulated_loglike = logprob.get_value_c(
database=database,
betas=results.get_beta_values(),
number_of_draws=NUMBER_OF_DRAWS,
aggregation=True,
prepare_ids=True,
)
print(f'Simulated log likelihood: {simulated_loglike}')
Simulated log likelihood: -3746.058773770821
We also calculate the individual parameters for the time coefficient.
numerator = MonteCarlo(B_TIME_RND * condprobIndiv)
denominator = MonteCarlo(condprobIndiv)
simulate = {
'Numerator': numerator,
'Denominator': denominator,
}
Creation of the Biogeme object.
biosim = bio.BIOGEME(database, simulate, number_of_draws=NUMBER_OF_DRAWS, seed=1223)
Biogeme parameters read from biogeme.toml.
Suimulation.
sim = biosim.simulate(results.get_beta_values())
sim['Individual-level parameters'] = sim['Numerator'] / sim['Denominator']
print(f'{sim.shape=}')
sim.shape=(752, 3)
sim
Total running time of the script: (0 minutes 0.934 seconds)