15. Discrete mixture with panel data
Bayesian estimation of a discrete mixture of logit models, also called a latent class model. The data file is organized as panel data.
Michel Bierlaire, EPFL. Sat Nov 15 2025, 17:39:13
import biogeme.biogeme_logging as blog
from IPython.core.display_functions import display
from biogeme.bayesian_estimation import BayesianResults, get_pandas_estimated_parameters
from biogeme.biogeme import BIOGEME
from biogeme.expressions import (
Beta,
DistributedParameter,
Draws,
exp,
log,
)
from biogeme.models import logit
See the data processing script: Panel data preparation for Swissmetro.
from swissmetro_panel import (
CAR_AV_SP,
CAR_CO_SCALED,
CAR_TT_SCALED,
CHOICE,
SM_AV,
SM_COST_SCALED,
SM_TT_SCALED,
TRAIN_AV_SP,
TRAIN_COST_SCALED,
TRAIN_TT_SCALED,
database,
)
logger = blog.get_screen_logger(level=blog.INFO)
logger.info('Example b15_panel_discrete.py')
Example b15_panel_discrete.py
Parameters to be estimated. One version for each latent class.
NUMBER_OF_CLASSES = 2
b_cost = [Beta(f'b_cost_class{i}', 0, None, None, 0) for i in range(NUMBER_OF_CLASSES)]
Define a random parameter, normally distributed across individuals, to capture unobserved taste heterogeneity in the time coefficient.
b_time = [Beta(f'b_time_class{i}', 0, None, None, 0) for i in range(NUMBER_OF_CLASSES)]
It is advised not to use 0 as the starting value for the following scale parameter.
b_time_s = [
Beta(f'b_time_s_class{i}', 1, None, None, 0) for i in range(NUMBER_OF_CLASSES)
]
b_time_rnd = [
DistributedParameter(
f'b_time_rnd_class{i}',
b_time[i] + b_time_s[i] * Draws(f'b_time_eps_class{i}', 'NORMAL'),
)
for i in range(NUMBER_OF_CLASSES)
]
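Conditional on the class, the time coefficient of an individual is therefore
b_time_rnd_class{i} = b_time_class{i} + b_time_s_class{i} * eps, with eps ~ N(0, 1),
that is, normally distributed across individuals with mean b_time_class{i} and standard deviation |b_time_s_class{i}|.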
We do the same for the constants, to address serial correlation.
asc_car = [
Beta(f'asc_car_class{i}', 0, None, None, 0) for i in range(NUMBER_OF_CLASSES)
]
asc_car_s = [
Beta(f'asc_car_s_class{i}', 1, None, None, 0) for i in range(NUMBER_OF_CLASSES)
]
asc_car_rnd = [
DistributedParameter(
f'asc_car_rnd_class{i}',
asc_car[i] + asc_car_s[i] * Draws(f'asc_car_eps_class{i}', 'NORMAL'),
)
for i in range(NUMBER_OF_CLASSES)
]
asc_train = [
Beta(f'asc_train_class{i}', 0, None, None, 0) for i in range(NUMBER_OF_CLASSES)
]
asc_train_s = [
Beta(f'asc_train_s_class{i}', 1, None, None, 0) for i in range(NUMBER_OF_CLASSES)
]
asc_train_rnd = [
DistributedParameter(
f'asc_train_rnd_class{i}',
asc_train[i] + asc_train_s[i] * Draws(f'asc_train_eps_class{i}', 'NORMAL'),
)
for i in range(NUMBER_OF_CLASSES)
]
The constant of the Swissmetro alternative is normalized to 0 (its status is set to 1 in the Beta constructor, so it is not estimated); only its standard deviation is estimated.
asc_sm = [Beta(f'asc_sm_class{i}', 0, None, None, 1) for i in range(NUMBER_OF_CLASSES)]
asc_sm_s = [
Beta(f'asc_sm_s_class{i}', 1, None, None, 0) for i in range(NUMBER_OF_CLASSES)
]
asc_sm_rnd = [
DistributedParameter(
f'asc_sm_rnd_class{i}',
asc_sm[i] + asc_sm_s[i] * Draws(f'asc_sm_eps_class{i}', 'NORMAL'),
)
for i in range(NUMBER_OF_CLASSES)
]
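The same mean-plus-scaled-draw pattern is repeated for the time coefficient and the three constants. As an illustration only, a hypothetical helper (not part of the original script) could factor it out:
def make_random_parameter(
    name: str, class_index: int, fixed_mean: bool = False
) -> DistributedParameter:
    # Hypothetical helper, not part of the original script.
    # Builds mean + sigma * N(0, 1), distributed across individuals.
    # fixed_mean=True reproduces the normalization used for asc_sm.
    mean = Beta(f'{name}_class{class_index}', 0, None, None, 1 if fixed_mean else 0)
    sigma = Beta(f'{name}_s_class{class_index}', 1, None, None, 0)
    return DistributedParameter(
        f'{name}_rnd_class{class_index}',
        mean + sigma * Draws(f'{name}_eps_class{class_index}', 'NORMAL'),
    )
For instance, asc_car_rnd could equivalently be built as [make_random_parameter('asc_car', i) for i in range(NUMBER_OF_CLASSES)].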
Class membership probability. Note: for Bayesian estimation, this should be written explicitly and should not call the logit model.
score_class_0 = Beta('score_class_0', -1.7, None, None, 0)
probability_class_1 = 1 / (1 + exp(score_class_0))
probability_class_0 = 1 - probability_class_1
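By construction, the two membership probabilities sum to one. With the starting value score_class_0 = -1.7, the prior class shares are
probability_class_1 = 1 / (1 + exp(-1.7)) ≈ 0.85 and probability_class_0 ≈ 0.15.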
In class 0, it is assumed that the time coefficient is zero.
b_time_rnd[0] = 0
Utility functions.
v_train_per_class = [
asc_train_rnd[i] + b_time_rnd[i] * TRAIN_TT_SCALED + b_cost[i] * TRAIN_COST_SCALED
for i in range(NUMBER_OF_CLASSES)
]
v_swissmetro_per_class = [
asc_sm_rnd[i] + b_time_rnd[i] * SM_TT_SCALED + b_cost[i] * SM_COST_SCALED
for i in range(NUMBER_OF_CLASSES)
]
v_car_per_class = [
asc_car_rnd[i] + b_time_rnd[i] * CAR_TT_SCALED + b_cost[i] * CAR_CO_SCALED
for i in range(NUMBER_OF_CLASSES)
]
v_per_class = [
{1: v_train_per_class[i], 2: v_swissmetro_per_class[i], 3: v_car_per_class[i]}
for i in range(NUMBER_OF_CLASSES)
]
Associate the availability conditions with the alternatives.
av = {1: TRAIN_AV_SP, 2: SM_AV, 3: CAR_AV_SP}
The choice model is a discrete mixture of logit models, with availability conditions. We calculate the conditional probability for each class.
conditional_probability_per_class = [
logit(v_per_class[i], av, CHOICE) for i in range(NUMBER_OF_CLASSES)
]
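Here, logit returns the probability of the chosen alternative among the available ones:
P(CHOICE = i | class s) = av_i * exp(V_i) / Σ_j av_j * exp(V_j),
where the V_j are the class-s utilities defined above.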
Conditional on the random variables, the likelihood of the individual's choice is the mixture of the class-specific probabilities, weighted by the class membership probabilities.
conditional_choice_probability = (
probability_class_0 * conditional_probability_per_class[0]
+ probability_class_1 * conditional_probability_per_class[1]
)
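That is, for each observation,
P(i) = π0 * P(i | class 0) + π1 * P(i | class 1),
where π0 and π1 are the class membership probabilities defined above.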
We need the log probability per observation. Note that the log is taken after mixing over the classes, as the log of the sum does not decompose into a sum of logs.
conditional_log_probability = log(conditional_choice_probability)
the_biogeme = BIOGEME(
database,
conditional_log_probability,
warmup=4000,
bayesian_draws=4000,
chains=4,
)
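With chains=4 and bayesian_draws=4000, the sampler retains 4 × 4000 = 16000 posterior draws, after discarding the 4000 warmup iterations of each chain; this matches the totals reported in the summary below.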
the_biogeme.model_name = 'b15_panel_discrete'
Default values of the Biogeme parameters are used.
File biogeme.toml has been created
Estimate the parameters. If results from a previous run have been saved in NetCDF format, they are reloaded instead of re-estimating.
try:
results = BayesianResults.from_netcdf(
filename=f'saved_results/{the_biogeme.model_name}.nc'
)
except FileNotFoundError:
results = the_biogeme.bayesian_estimation()
Loaded NetCDF file size: 2.6 GB
load finished in 19110 ms (19.11 s)
print(results.short_summary())
posterior_predictive_loglike finished in 77 ms
/Users/bierlair/python_envs/venv313/lib/python3.13/site-packages/arviz/stats/stats.py:1667: UserWarning: For one or more samples the posterior variance of the log predictive densities exceeds 0.4. This could be indication of WAIC starting to fail.
See http://arxiv.org/abs/1507.04544 for details
warnings.warn(
waic_res finished in 136 ms
waic finished in 136 ms
/Users/bierlair/python_envs/venv313/lib/python3.13/site-packages/arviz/stats/stats.py:797: UserWarning: Estimated shape parameter of Pareto distribution is greater than 0.70 for one or more samples. You should consider using a more robust model, this is because importance sampling is less likely to work well if the marginal posterior and LOO posterior are very different. This is more likely to happen with a non-robust model and highly influential observations.
warnings.warn(
loo_res finished in 33527 ms (33.53 s)
loo finished in 33527 ms (33.53 s)
Sample size 6768
Sampler NUTS
Number of chains 4
Number of draws per chain 4000
Total number of draws 16000
Acceptance rate target 0.9
Run time 0:21:03.687596
Posterior predictive log-likelihood (sum of log mean p) -2156.92
Expected log-likelihood E[log L(Y|θ)] -2342.51
Best-draw log-likelihood (posterior upper bound) -2227.53
WAIC (Widely Applicable Information Criterion) -2724.85
WAIC Standard Error 73.52
Effective number of parameters (p_WAIC) 567.93
LOO (Leave-One-Out Cross-Validation) -3043.75
LOO Standard Error 78.29
Effective number of parameters (p_LOO) 886.83
pandas_results = get_pandas_estimated_parameters(estimation_results=results)
display(pandas_results)
Diagnostics computation took 417.0 seconds (cached).
Name Value (mean) ... ESS (bulk) ESS (tail)
0 score_class_0 -5.286779 ... 1285.328326 1331.441053
1 asc_train_class0 -2.573163 ... 4465.186224 9404.567724
2 asc_train_s_class0 0.779934 ... 5687.738337 10422.160967
3 b_cost_class0 2.754704 ... 2355.876493 1338.078191
4 asc_sm_s_class0 1.412036 ... 2210.169570 5602.051806
5 asc_car_class0 -0.832146 ... 3880.861502 8316.122106
6 asc_car_s_class0 0.965673 ... 4882.557850 10246.075254
7 asc_train_class1 -0.308288 ... 1416.240418 2993.273341
8 asc_train_s_class1 -1.060029 ... 7.289857 28.262154
9 b_time_class1 -6.523038 ... 1834.577879 2143.767214
10 b_time_s_class1 4.033149 ... 1815.166833 2039.288905
11 b_cost_class1 -4.217157 ... 2072.092115 2448.375181
12 asc_sm_s_class1 0.148560 ... 25.929931 422.769703
13 asc_car_class1 0.509266 ... 3064.176039 5047.972557
14 asc_car_s_class1 2.017802 ... 7.145398 26.944499
[15 rows x 12 columns]
Total running time of the script: (7 minutes 50.703 seconds)