Note
Go to the end to download the full example code
Mixture of logit with panel data
- Example of a mixture of logit models, using Monte-Carlo integration.
The datafile is organized as panel data.
- author:
Michel Bierlaire, EPFL
- date:
Sun Apr 9 18:12:17 2023
import numpy as np
import biogeme.biogeme_logging as blog
import biogeme.biogeme as bio
from biogeme import models
from biogeme.expressions import (
Beta,
bioDraws,
PanelLikelihoodTrajectory,
MonteCarlo,
log,
)
See the data processing script: Panel data preparation for Swissmetro.
from swissmetro_panel import (
database,
CHOICE,
SM_AV,
CAR_AV_SP,
TRAIN_AV_SP,
TRAIN_TT_SCALED,
TRAIN_COST_SCALED,
SM_TT_SCALED,
SM_COST_SCALED,
CAR_TT_SCALED,
CAR_CO_SCALED,
)
logger = blog.get_screen_logger(level=blog.INFO)
logger.info('Example b12panel.py')
Example b12panel.py
We set the seed so that the results are reproducible. This is not necessary in general.
np.random.seed(seed=90267)
Parameters to be estimated.
B_COST = Beta('B_COST', 0, None, None, 0)
Define a random parameter, normally distributed across individuals, designed to be used for Monte-Carlo simulation.
B_TIME = Beta('B_TIME', 0, None, None, 0)
It is advised not to use 0 as starting value for the following parameter.
B_TIME_S = Beta('B_TIME_S', 1, None, None, 0)
B_TIME_RND = B_TIME + B_TIME_S * bioDraws('B_TIME_RND', 'NORMAL_ANTI')
We do the same for the constants, to address serial correlation.
ASC_CAR = Beta('ASC_CAR', 0, None, None, 0)
ASC_CAR_S = Beta('ASC_CAR_S', 1, None, None, 0)
ASC_CAR_RND = ASC_CAR + ASC_CAR_S * bioDraws('ASC_CAR_RND', 'NORMAL_ANTI')
ASC_TRAIN = Beta('ASC_TRAIN', 0, None, None, 0)
ASC_TRAIN_S = Beta('ASC_TRAIN_S', 1, None, None, 0)
ASC_TRAIN_RND = ASC_TRAIN + ASC_TRAIN_S * bioDraws('ASC_TRAIN_RND', 'NORMAL_ANTI')
ASC_SM = Beta('ASC_SM', 0, None, None, 1)
ASC_SM_S = Beta('ASC_SM_S', 1, None, None, 0)
ASC_SM_RND = ASC_SM + ASC_SM_S * bioDraws('ASC_SM_RND', 'NORMAL_ANTI')
Definition of the utility functions.
V1 = ASC_TRAIN_RND + B_TIME_RND * TRAIN_TT_SCALED + B_COST * TRAIN_COST_SCALED
V2 = ASC_SM_RND + B_TIME_RND * SM_TT_SCALED + B_COST * SM_COST_SCALED
V3 = ASC_CAR_RND + B_TIME_RND * CAR_TT_SCALED + B_COST * CAR_CO_SCALED
Associate utility functions with the numbering of alternatives.
V = {1: V1, 2: V2, 3: V3}
Associate the availability conditions with the alternatives.
av = {1: TRAIN_AV_SP, 2: SM_AV, 3: CAR_AV_SP}
Conditional on the random parameters, the likelihood of one observation is given by the logit model (called the kernel).
obsprob = models.logit(V, av, CHOICE)
Conditional on the random parameters, the likelihood of all observations for one individual (the trajectory) is the product of the likelihood of each observation.
condprobIndiv = PanelLikelihoodTrajectory(obsprob)
We integrate over the random parameters using Monte-Carlo
logprob = log(MonteCarlo(condprobIndiv))
Create the Biogeme object. As the objective is to illustrate the syntax, we calculate the Monte-Carlo approximation with a small number of draws. To achieve that, we provide a parameter file different from the default one: few_draws.toml
the_biogeme = bio.BIOGEME(database, logprob, parameter_file='few_draws.toml')
the_biogeme.modelName = 'b12panel'
File few_draws.toml has been parsed.
Estimate the parameters.
results = the_biogeme.estimate()
*** Initial values of the parameters are obtained from the file __b12panel.iter
Cannot read file __b12panel.iter. Statement is ignored.
Optimization algorithm: hybrid Newton/BFGS with simple bounds [simple_bounds]
** Optimization: Newton with trust region for simple bounds
Iter. ASC_CAR ASC_CAR_S ASC_SM_S ASC_TRAIN ASC_TRAIN_S B_COST B_TIME B_TIME_S Function Relgrad Radius Rho
0 0.2 1.4 0.91 -0.79 0.97 -0.98 -1 1.3 4.1e+03 0.047 10 1.2 ++
1 0.2 1.4 0.91 -0.79 0.97 -0.98 -1 1.3 4.1e+03 0.047 5 -0.4 -
2 -0.33 2.6 -3 -1.6 6 -0.78 -2.9 0.87 4e+03 0.041 5 0.26 +
3 -0.33 2.6 -3 -1.6 6 -0.78 -2.9 0.87 4e+03 0.041 2.5 -1.2 -
4 0.88 2.4 -1.5 -3.6 3.5 -3.3 -3.3 3.4 3.9e+03 0.061 2.5 0.23 +
5 0.49 1.9 -2.7 -1.1 3.9 -2.4 -3.3 3.9 3.8e+03 0.024 2.5 0.47 +
6 0.49 1.9 -2.7 -1.1 3.9 -2.4 -3.3 3.9 3.8e+03 0.024 1.2 -2.6 -
7 0.49 2.8 -1.8 -2.3 2.9 -2.8 -4.5 3.5 3.7e+03 0.03 1.2 0.72 +
8 -0.035 3.2 -2.1 -1.1 3.3 -2.7 -4.4 3.4 3.7e+03 0.015 1.2 0.83 +
9 -0.29 3.9 -0.8 -0.73 2.4 -2.6 -5 4 3.7e+03 0.014 1.2 0.48 +
10 -0.17 3.9 0.45 -0.037 2 -2.9 -5.4 4.3 3.6e+03 0.03 12 0.92 ++
11 -0.17 3.9 0.45 -0.037 2 -2.9 -5.4 4.3 3.6e+03 0.03 6.2 -10 -
12 -0.17 3.9 0.45 -0.037 2 -2.9 -5.4 4.3 3.6e+03 0.03 3.1 -3.8 -
13 -0.17 3.9 0.45 -0.037 2 -2.9 -5.4 4.3 3.6e+03 0.03 1.6 -1.7 -
14 -0.17 3.9 0.45 -0.037 2 -2.9 -5.4 4.3 3.6e+03 0.03 0.78 -0.58 -
15 0.069 3.8 1.2 -0.67 2.4 -3.1 -5.6 4.1 3.6e+03 0.022 0.78 0.37 +
16 0.069 3.8 1.2 -0.67 2.4 -3.1 -5.6 4.1 3.6e+03 0.022 0.39 -0.038 -
17 0.069 3.8 1.2 -0.67 2.4 -3.1 -5.6 4.1 3.6e+03 0.022 0.2 0.018 -
18 0.096 3.6 1 -0.47 2.2 -3.3 -5.8 3.9 3.6e+03 0.0095 0.2 0.16 +
19 0.23 3.7 0.84 -0.32 2.2 -3.1 -5.8 3.9 3.6e+03 0.0028 2 0.97 ++
20 0.25 3.7 0.85 -0.27 2.2 -3.2 -5.9 4 3.6e+03 4.9e-05 20 1 ++
21 0.25 3.7 0.85 -0.27 2.2 -3.2 -5.9 4 3.6e+03 2e-08 20 1 ++
Results saved in file b12panel.html
Results saved in file b12panel.pickle
print(results.short_summary())
Results for model b12panel
Nbr of parameters: 8
Sample size: 752
Observations: 6768
Excluded data: 3960
Final log likelihood: -3618.834
Akaike Information Criterion: 7253.667
Bayesian Information Criterion: 7290.649
pandas_results = results.getEstimatedParameters()
pandas_results
Total running time of the script: (0 minutes 29.940 seconds)