Note
Go to the end to download the full example code.
Discrete mixture with panel dataΒΆ
- Example of a discrete mixture of logit models, also called latent_old
class model. The datafile is organized as panel data. Compared to Discrete mixture with panel data, we integrate before the discrete mixture to show that it is equivalent.
Michel Bierlaire, EPFL Sat Jun 21 2025, 17:22:38
from IPython.core.display_functions import display
See the data processing script: Panel data preparation for Swissmetro.
from swissmetro_panel import (
CAR_AV_SP,
CAR_CO_SCALED,
CAR_TT_SCALED,
CHOICE,
SM_AV,
SM_COST_SCALED,
SM_TT_SCALED,
TRAIN_AV_SP,
TRAIN_COST_SCALED,
TRAIN_TT_SCALED,
database,
)
import biogeme.biogeme_logging as blog
from biogeme.biogeme import BIOGEME
from biogeme.expressions import (
Beta,
Draws,
ExpressionOrNumeric,
MonteCarlo,
PanelLikelihoodTrajectory,
log,
)
from biogeme.models import logit
from biogeme.results_processing import get_pandas_estimated_parameters
logger = blog.get_screen_logger(level=blog.INFO)
logger.info('Example b15panel_discrete_bis.py')
Example b15panel_discrete_bis.py
Parameters to be estimated. One version for each latent_old class.
NUMBER_OF_CLASSES = 2
b_cost = [Beta(f'b_cost_class{i}', 0, None, None, 0) for i in range(NUMBER_OF_CLASSES)]
Define a random parameter, normally distributed across individuals, designed to be used for Monte-Carlo simulation.
b_time = [Beta(f'b_time_class{i}', 0, None, None, 0) for i in range(NUMBER_OF_CLASSES)]
It is advised not to use 0 as starting value for the following parameter.
b_time_s = [
Beta(f'b_time_s_class{i}', 1, None, None, 0) for i in range(NUMBER_OF_CLASSES)
]
b_time_rnd: list[ExpressionOrNumeric] = [
b_time[i] + b_time_s[i] * Draws(f'b_time_rnd_class{i}', 'NORMAL_ANTI')
for i in range(NUMBER_OF_CLASSES)
]
We do the same for the constants, to address serial correlation.
asc_car = [
Beta(f'asc_car_class{i}', 0, None, None, 0) for i in range(NUMBER_OF_CLASSES)
]
asc_car_s = [
Beta(f'asc_car_s_class{i}', 1, None, None, 0) for i in range(NUMBER_OF_CLASSES)
]
asc_car_rnd = [
asc_car[i] + asc_car_s[i] * Draws(f'asc_car_rnd_class{i}', 'NORMAL_ANTI')
for i in range(NUMBER_OF_CLASSES)
]
asc_train = [
Beta(f'asc_train_class{i}', 0, None, None, 0) for i in range(NUMBER_OF_CLASSES)
]
asc_train_s = [
Beta(f'asc_train_s_class{i}', 1, None, None, 0) for i in range(NUMBER_OF_CLASSES)
]
ASC_TRAIN_RND = [
asc_train[i] + asc_train_s[i] * Draws(f'asc_train_rnd_class{i}', 'NORMAL_ANTI')
for i in range(NUMBER_OF_CLASSES)
]
asc_sm = [Beta(f'asc_sm_class{i}', 0, None, None, 1) for i in range(NUMBER_OF_CLASSES)]
asc_sm_s = [
Beta(f'asc_sm_s_class{i}', 1, None, None, 0) for i in range(NUMBER_OF_CLASSES)
]
asc_sm_rnd = [
asc_sm[i] + asc_sm_s[i] * Draws(f'asc_sm_rnd_class{i}', 'NORMAL_ANTI')
for i in range(NUMBER_OF_CLASSES)
]
Class membership probability.
score_class_0 = Beta('score_class_0', 0, None, None, 0)
probability_class_0 = logit({0: score_class_0, 1: 0}, None, 0)
probability_class_1 = logit({0: score_class_0, 1: 0}, None, 1)
In class 0, it is assumed that the time coefficient is zero.
b_time_rnd[0] = 0
Utility functions.
v_train_per_class = [
ASC_TRAIN_RND[i] + b_time_rnd[i] * TRAIN_TT_SCALED + b_cost[i] * TRAIN_COST_SCALED
for i in range(NUMBER_OF_CLASSES)
]
v_swissmetro_per_class = [
asc_sm_rnd[i] + b_time_rnd[i] * SM_TT_SCALED + b_cost[i] * SM_COST_SCALED
for i in range(NUMBER_OF_CLASSES)
]
v_car_per_class = [
asc_car_rnd[i] + b_time_rnd[i] * CAR_TT_SCALED + b_cost[i] * CAR_CO_SCALED
for i in range(NUMBER_OF_CLASSES)
]
v_per_class = [
{1: v_train_per_class[i], 2: v_swissmetro_per_class[i], 3: v_car_per_class[i]}
for i in range(NUMBER_OF_CLASSES)
]
Associate the availability conditions with the alternatives
av = {1: TRAIN_AV_SP, 2: SM_AV, 3: CAR_AV_SP}
The choice model is a discrete mixture of logit, with availability conditions We calculate the conditional probability for each class.
choice_probability_per_class = [
MonteCarlo(PanelLikelihoodTrajectory(logit(v_per_class[i], av, CHOICE)))
for i in range(NUMBER_OF_CLASSES)
]
Conditional to the random variables, likelihood for the individual.
choice_probability = (
probability_class_0 * choice_probability_per_class[0]
+ probability_class_1 * choice_probability_per_class[1]
)
We integrate over the random variables using Monte-Carlo.
log_probability = log(choice_probability)
The model is complex, and there are numerical issues when calculating the second derivatives. Therefore, we instruct Biogeme not to evaluate the second derivatives. As a consequence, the statistics reported after estimation are based on the BHHH matrix instead of the Rao-Cramer bound.
the_biogeme = BIOGEME(
database,
log_probability,
number_of_draws=10_000,
seed=1223,
calculating_second_derivatives='never',
)
the_biogeme.model_name = 'b15panel_discrete_bis'
Biogeme parameters read from biogeme.toml.
Flattening database [(6768, 38)].
Database flattened [(752, 362)]
Flattening database [(6768, 38)].
Database flattened [(752, 362)]
Flattening database [(6768, 38)].
Database flattened [(752, 362)]
Flattening database [(6768, 38)].
Database flattened [(752, 362)]
Estimate the parameters.
results = the_biogeme.estimate(recycle=True)
Estimation results read from b15panel_discrete_bis.yaml. There is no guarantee that they correspond to the specified model.
print(results.short_summary())
Results for model b15panel_discrete_bis
Nbr of parameters: 15
Sample size: 752
Observations: 6768
Excluded data: 0
Final log likelihood: -3526.528
Akaike Information Criterion: 7083.055
Bayesian Information Criterion: 7152.396
pandas_results = get_pandas_estimated_parameters(estimation_results=results)
display(pandas_results)
Name Value BHHH std err. BHHH t-stat. BHHH p-value
0 score_class_0 -1.701641 0.232348 -7.323685 2.413625e-13
1 asc_train_class0 -1.113282 0.646686 -1.721521 8.515638e-02
2 asc_train_s_class0 2.978551 1.103334 2.699590 6.942492e-03
3 b_cost_class0 -1.246642 0.559509 -2.228100 2.587386e-02
4 asc_sm_s_class0 -0.646807 4.749514 -0.136184 8.916760e-01
5 asc_car_class0 -4.809474 1.675431 -2.870589 4.097072e-03
6 asc_car_s_class0 5.860782 1.760272 3.329475 8.700985e-04
7 asc_train_class1 -0.233063 0.284641 -0.818797 4.129024e-01
8 asc_train_s_class1 1.586898 0.503922 3.149096 1.637762e-03
9 b_time_class1 -7.057735 0.384179 -18.370944 0.000000e+00
10 b_time_s_class1 3.309772 0.397917 8.317750 0.000000e+00
11 b_cost_class1 -4.805090 0.256154 -18.758612 0.000000e+00
12 asc_sm_s_class1 1.925231 0.318139 6.051532 1.434748e-09
13 asc_car_class1 1.080411 0.218734 4.939379 7.837166e-07
14 asc_car_s_class1 2.735809 0.265924 10.287922 0.000000e+00
Total running time of the script: (0 minutes 17.039 seconds)