Note
Go to the end to download the full example code.
11. Cross-nested logit¶
Bayesian estimation of a cross-nested logit model with two nests:
one with existing alternatives (car and train),
one with public transportation alternatives (train and Swissmetro)
Michel Bierlaire, EPFL Mon Nov 03 2025, 20:14:23
from pathlib import Path
from IPython.core.display_functions import display
See the data processing script: Data preparation for Swissmetro.
from swissmetro_data import (
CAR_AV_SP,
CAR_CO_SCALED,
CAR_TT_SCALED,
CHOICE,
GA,
SM_AV,
SM_COST_SCALED,
SM_HE,
SM_TT_SCALED,
TRAIN_AV_SP,
TRAIN_COST_SCALED,
TRAIN_HE,
TRAIN_TT_SCALED,
database,
)
import biogeme.biogeme_logging as blog
from biogeme.bayesian_estimation import (
BayesianResults,
BayesianResultsSummary,
get_pandas_estimated_parameters,
)
from biogeme.biogeme import BIOGEME
from biogeme.expressions import Beta
from biogeme.models import logcnl
from biogeme.nests import NestsForCrossNestedLogit, OneNestForCrossNestedLogit
logger = blog.get_screen_logger(level=blog.INFO)
logger.info('Example b11_cnl.py')
Example b11_cnl.py
Parameters to be estimated.
asc_car = Beta('asc_car', 0, None, None, 0)
asc_train = Beta('asc_train', 0, None, None, 0)
asc_sm = Beta('asc_sm', 0, None, None, 1)
b_time_swissmetro = Beta('b_time_swissmetro', 0, None, 0, 0)
b_time_train = Beta('b_time_train', 0, None, 0, 0)
b_time_car = Beta('b_time_car', 0, None, 0, 0)
b_cost = Beta('b_cost', 0, None, 0, 0)
b_headway_swissmetro = Beta('b_headway_swissmetro', 0, None, 0, 0)
b_headway_train = Beta('b_headway_train', 0, None, 0, 0)
ga_train = Beta('ga_train', 0, None, None, 0)
ga_swissmetro = Beta('ga_swissmetro', 0, None, None, 0)
existing_nest_parameter = Beta('existing_nest_parameter', 1.05, 1, 3, 0)
public_nest_parameter = Beta('public_nest_parameter', 1.05, 1, 3, 0)
Nest membership parameters.
alpha_existing = Beta('alpha_existing', 0.5, 0, 1, 0)
alpha_public = 1 - alpha_existing
Definition of the utility functions
v_train = (
asc_train
+ b_time_train * TRAIN_TT_SCALED
+ b_cost * TRAIN_COST_SCALED
+ b_headway_train * TRAIN_HE
+ ga_train * GA
)
v_swissmetro = (
asc_sm
+ b_time_swissmetro * SM_TT_SCALED
+ b_cost * SM_COST_SCALED
+ b_headway_swissmetro * SM_HE
+ ga_swissmetro * GA
)
v_car = asc_car + b_time_car * CAR_TT_SCALED + b_cost * CAR_CO_SCALED
Associate utility functions with the numbering of alternatives
v = {1: v_train, 2: v_swissmetro, 3: v_car}
Associate the availability conditions with the alternatives
av = {1: TRAIN_AV_SP, 2: SM_AV, 3: CAR_AV_SP}
Definition of nests.
nest_existing = OneNestForCrossNestedLogit(
nest_param=existing_nest_parameter,
dict_of_alpha={1: alpha_existing, 2: 0.0, 3: 1.0},
name='existing',
)
nest_public = OneNestForCrossNestedLogit(
nest_param=public_nest_parameter,
dict_of_alpha={1: alpha_public, 2: 1.0, 3: 0.0},
name='public',
)
nests = NestsForCrossNestedLogit(
choice_set=[1, 2, 3], tuple_of_nests=(nest_existing, nest_public)
)
The choice model is a cross-nested logit, with availability conditions.
log_probability = logcnl(v, av, nests, CHOICE)
Create the Biogeme object
the_biogeme = BIOGEME(
database,
log_probability,
chains=4,
bayesian_draws=40_000,
warmup=40_000,
calculate_loo=False,
)
the_biogeme.model_name = 'b11_cnl'
Biogeme parameters read from biogeme.toml.
Estimate the posterior distribution of the parameters, or read the results if already available.
yaml_file = Path('saved_results') / f'{the_biogeme.model_name}.yaml'
try:
summary_results = BayesianResultsSummary.from_yaml_file(filename=yaml_file)
except FileNotFoundError:
results: BayesianResults = the_biogeme.bayesian_estimation()
summary_results = results.to_summary()
print(summary_results.short_summary())
Sample size 6768
Sampler NUTS
Number of chains 4
Number of draws per chain 40000
Total number of draws 160000
Acceptance rate target 0.9
Run time 1:47:31.773810
Posterior predictive log-likelihood (sum of log mean p) -5476.77
Expected log-likelihood E[log L(Y|θ)] -57181.10
Best-draw log-likelihood (posterior upper bound) -4998.29
Present the parameter estimates in a pandas table.
pandas_results = get_pandas_estimated_parameters(
estimation_results=summary_results,
)
display(pandas_results)
Name Value (mean) ... ESS (bulk) ESS (tail)
0 asc_train -0.418083 ... 19.678847 6.307843
1 ga_train 1.003471 ... 7.080623 4.000800
2 ga_swissmetro -0.127364 ... 97.404512 100.741874
3 asc_car -0.574272 ... 77.398776 100.034620
4 b_time_train -1.036879 ... 7.070697 99.377635
5 b_cost -1.252157 ... 7.065671 4.000800
6 b_headway_train -0.637087 ... 7.066519 4.000800
7 alpha_existing 0.733735 ... 230.925443 99.474175
8 existing_nest_parameter 1.854212 ... 7.082086 103.581575
9 b_time_swissmetro -0.967080 ... 7.929068 100.354628
10 b_headway_swissmetro -0.185868 ... 7.062704 4.000800
11 b_time_car -0.931353 ... 7.520995 4.073666
12 public_nest_parameter 1.923020 ... 129.255682 100.563324
[13 rows x 12 columns]
Report the variables stored in the Bayesian estimation results.
display(summary_results.report_stored_variables())
group ... shape
0 constant_data ... [6768]
1 constant_data ... [6768]
2 constant_data ... [6768]
3 constant_data ... [6768]
4 constant_data ... [6768]
5 constant_data ... [6768]
6 constant_data ... [6768]
7 constant_data ... [6768]
8 constant_data ... [6768]
9 constant_data ... [6768]
10 constant_data ... [6768]
11 constant_data ... [6768]
12 constant_data ... [6768]
13 log_likelihood ... [4, 40000, 6768]
14 posterior ... [4, 40000]
15 posterior ... [4, 40000]
16 posterior ... [4, 40000]
17 posterior ... [4, 40000]
18 posterior ... [4, 40000]
19 posterior ... [4, 40000]
20 posterior ... [4, 40000]
21 posterior ... [4, 40000]
22 posterior ... [4, 40000]
23 posterior ... [4, 40000]
24 posterior ... [4, 40000]
25 posterior ... [4, 40000]
26 posterior ... [4, 40000, 6768]
27 posterior ... [4, 40000]
28 prior ... [1, 40000]
29 prior ... [1, 40000]
30 prior ... [1, 40000]
31 prior ... [1, 40000]
32 prior ... [1, 40000]
33 prior ... [1, 40000]
34 prior ... [1, 40000]
35 prior ... [1, 40000]
36 prior ... [1, 40000]
37 prior ... [1, 40000]
38 prior ... [1, 40000]
39 prior ... [1, 40000]
40 prior ... [1, 40000, 6768]
41 prior ... [1, 40000]
42 sample_stats ... [4, 40000]
43 sample_stats ... [4, 40000]
44 sample_stats ... [4, 40000]
45 sample_stats ... [4, 40000]
46 sample_stats ... [4, 40000]
47 sample_stats ... [4, 40000]
48 sample_stats ... [4, 40000]
[49 rows x 4 columns]
Total running time of the script: (0 minutes 1.152 seconds)