Simulation of panel model

Calculates each contribution to the log likelihood function using simulation. We also calculate the individual parameters.

author:: Michel Bierlaire, EPFL
date:: Sun Apr 9 18:16:29 2023

import sys
import pickle
import biogeme.biogeme_logging as blog
import biogeme.biogeme as bio
from biogeme import models
import biogeme.exceptions as excep
import biogeme.results as res
from biogeme.expressions import (
    Beta,
    bioDraws,
    PanelLikelihoodTrajectory,
    MonteCarlo,
    log,
)

See the data processing script: Panel data preparation for Swissmetro.

from swissmetro_panel import (
    database,
    CHOICE,
    SM_AV,
    CAR_AV_SP,
    TRAIN_AV_SP,
    TRAIN_TT_SCALED,
    TRAIN_COST_SCALED,
    SM_TT_SCALED,
    SM_COST_SCALED,
    CAR_TT_SCALED,
    CAR_CO_SCALED,
)

print(f'Samples size = {database.getSampleSize()}')

Samples size = 752

We use a low number of draws, as the objective is to illustrate the syntax. In practice, this value is insufficient to have a good approximation of the integral.

NUMBER_OF_DRAWS = 50

logger = blog.get_screen_logger(level=blog.INFO)
logger.info('Example b13panel_simul.py')

Example b13panel_simul.py

Parameters.

B_COST = Beta('B_COST', 0, None, None, 0)

Define a random parameter, normally distributed across individuals, designed to be used for Monte-Carlo simulation.

B_TIME = Beta('B_TIME', 0, None, None, 0)
B_TIME_S = Beta('B_TIME_S', 1, None, None, 0)
B_TIME_RND = B_TIME + B_TIME_S * bioDraws('B_TIME_RND', 'NORMAL_ANTI')

We do the same for the constants, to address serial correlation.

ASC_CAR = Beta('ASC_CAR', 0, None, None, 0)
ASC_CAR_S = Beta('ASC_CAR_S', 1, None, None, 0)
ASC_CAR_RND = ASC_CAR + ASC_CAR_S * bioDraws('ASC_CAR_RND', 'NORMAL_ANTI')

ASC_TRAIN = Beta('ASC_TRAIN', 0, None, None, 0)
ASC_TRAIN_S = Beta('ASC_TRAIN_S', 1, None, None, 0)
ASC_TRAIN_RND = ASC_TRAIN + ASC_TRAIN_S * bioDraws('ASC_TRAIN_RND', 'NORMAL_ANTI')

ASC_SM = Beta('ASC_SM', 0, None, None, 1)
ASC_SM_S = Beta('ASC_SM_S', 1, None, None, 0)
ASC_SM_RND = ASC_SM + ASC_SM_S * bioDraws('ASC_SM_RND', 'NORMAL_ANTI')

Definition of the utility functions.

V1 = ASC_TRAIN_RND + B_TIME_RND * TRAIN_TT_SCALED + B_COST * TRAIN_COST_SCALED
V2 = ASC_SM_RND + B_TIME_RND * SM_TT_SCALED + B_COST * SM_COST_SCALED
V3 = ASC_CAR_RND + B_TIME_RND * CAR_TT_SCALED + B_COST * CAR_CO_SCALED

Associate utility functions with the numbering of alternatives.

V = {1: V1, 2: V2, 3: V3}

Associate the availability conditions with the alternatives.

av = {1: TRAIN_AV_SP, 2: SM_AV, 3: CAR_AV_SP}

Conditional on the random parameters, the likelihood of one observation is given by the logit model (called the kernel).

obsprob = models.logit(V, av, CHOICE)

Conditional on the random parameters, the likelihood of all observations for one individual (the trajectory) is the product of the likelihood of each observation.

condprobIndiv = PanelLikelihoodTrajectory(obsprob)

We integrate over the random parameters using Monte-Carlo

logprob = log(MonteCarlo(condprobIndiv))

We retrieve the parameters estimates.

try:
    results = res.bioResults(pickleFile='saved_results/b12panel.pickle')
except excep.BiogemeError:
    sys.exit(
        'Run first the script b12panel.py '
        'in order to generate the '
        'file b12panel.pickle.'
    )

Simulate to recalculate the log likelihood directly from the formula, without the Biogeme object

simulated_loglike = logprob.getValue_c(
    database=database,
    betas=results.getBetaValues(),
    numberOfDraws=NUMBER_OF_DRAWS,
    aggregation=True,
    prepareIds=True,
)

print(f'Simulated log likelihood: {simulated_loglike}')

Simulated log likelihood: -3746.058773770821

We also calculate the individual parameters for the time coefficient.

numerator = MonteCarlo(B_TIME_RND * condprobIndiv)
denominator = MonteCarlo(condprobIndiv)

simulate = {
    'Numerator': numerator,
    'Denominator': denominator,
}

Creation of the Biogeme object.

biosim = bio.BIOGEME(database, simulate, parameter_file='few_draws.toml')

File few_draws.toml has been parsed.

Suimulation.

sim = biosim.simulate(results.getBetaValues())

sim['Individual-level parameters'] = sim['Numerator'] / sim['Denominator']

print(f'{sim.shape=}')

sim.shape=(752, 3)

sim

	Numerator	Denominator	Individual-level parameters
1	-0.061842	0.010676	-5.792511
2	-5.461292	0.745673	-7.323982
3	-2.165100	0.284843	-7.601041
4	-3.931312	0.472809	-8.314807
5	-3.746690	0.444783	-8.423627
...	...	...	...
935	-0.010185	0.002201	-4.628451
936	-0.000394	0.000110	-3.596566
937	-0.000049	0.000016	-3.106041
938	-5.311541	0.684250	-7.762576
939	-0.000230	0.000540	-0.425370

752 rows × 3 columns

Total running time of the script: (0 minutes 1.431 seconds)

Gallery generated by Sphinx-Gallery