11. Cross-nested logit

Bayesian estimation of a cross-nested logit model with two nests:

  • one with existing alternatives (car and train),

  • one with public transportation alternatives (train and Swissmetro)

Michel Bierlaire, EPFL Mon Nov 03 2025, 20:14:23

from pathlib import Path

from IPython.core.display_functions import display

See the data processing script: Data preparation for Swissmetro.

from swissmetro_data import (
    CAR_AV_SP,
    CAR_CO_SCALED,
    CAR_TT_SCALED,
    CHOICE,
    GA,
    SM_AV,
    SM_COST_SCALED,
    SM_HE,
    SM_TT_SCALED,
    TRAIN_AV_SP,
    TRAIN_COST_SCALED,
    TRAIN_HE,
    TRAIN_TT_SCALED,
    database,
)

import biogeme.biogeme_logging as blog
from biogeme.bayesian_estimation import (
    BayesianResults,
    BayesianResultsSummary,
    get_pandas_estimated_parameters,
)
from biogeme.biogeme import BIOGEME
from biogeme.expressions import Beta
from biogeme.models import logcnl
from biogeme.nests import NestsForCrossNestedLogit, OneNestForCrossNestedLogit

logger = blog.get_screen_logger(level=blog.INFO)
logger.info('Example b11_cnl.py')
Example b11_cnl.py

Parameters to be estimated.

asc_car = Beta('asc_car', 0, None, None, 0)
asc_train = Beta('asc_train', 0, None, None, 0)
asc_sm = Beta('asc_sm', 0, None, None, 1)
b_time_swissmetro = Beta('b_time_swissmetro', 0, None, 0, 0)
b_time_train = Beta('b_time_train', 0, None, 0, 0)
b_time_car = Beta('b_time_car', 0, None, 0, 0)
b_cost = Beta('b_cost', 0, None, 0, 0)
b_headway_swissmetro = Beta('b_headway_swissmetro', 0, None, 0, 0)
b_headway_train = Beta('b_headway_train', 0, None, 0, 0)
ga_train = Beta('ga_train', 0, None, None, 0)
ga_swissmetro = Beta('ga_swissmetro', 0, None, None, 0)
existing_nest_parameter = Beta('existing_nest_parameter', 1.05, 1, 3, 0)
public_nest_parameter = Beta('public_nest_parameter', 1.05, 1, 3, 0)

Nest membership parameters.

alpha_existing = Beta('alpha_existing', 0.5, 0, 1, 0)
alpha_public = 1 - alpha_existing

Definition of the utility functions

v_train = (
    asc_train
    + b_time_train * TRAIN_TT_SCALED
    + b_cost * TRAIN_COST_SCALED
    + b_headway_train * TRAIN_HE
    + ga_train * GA
)
v_swissmetro = (
    asc_sm
    + b_time_swissmetro * SM_TT_SCALED
    + b_cost * SM_COST_SCALED
    + b_headway_swissmetro * SM_HE
    + ga_swissmetro * GA
)
v_car = asc_car + b_time_car * CAR_TT_SCALED + b_cost * CAR_CO_SCALED

Associate utility functions with the numbering of alternatives

v = {1: v_train, 2: v_swissmetro, 3: v_car}

Associate the availability conditions with the alternatives

av = {1: TRAIN_AV_SP, 2: SM_AV, 3: CAR_AV_SP}

Definition of nests.

nest_existing = OneNestForCrossNestedLogit(
    nest_param=existing_nest_parameter,
    dict_of_alpha={1: alpha_existing, 2: 0.0, 3: 1.0},
    name='existing',
)

nest_public = OneNestForCrossNestedLogit(
    nest_param=public_nest_parameter,
    dict_of_alpha={1: alpha_public, 2: 1.0, 3: 0.0},
    name='public',
)

nests = NestsForCrossNestedLogit(
    choice_set=[1, 2, 3], tuple_of_nests=(nest_existing, nest_public)
)

The choice model is a cross-nested logit, with availability conditions.

log_probability = logcnl(v, av, nests, CHOICE)

Create the Biogeme object

the_biogeme = BIOGEME(
    database,
    log_probability,
    chains=4,
    bayesian_draws=40_000,
    warmup=40_000,
    calculate_loo=False,
)
the_biogeme.model_name = 'b11_cnl'
Biogeme parameters read from biogeme.toml.

Estimate the posterior distribution of the parameters, or read the results if already available.

yaml_file = Path('saved_results') / f'{the_biogeme.model_name}.yaml'
try:
    summary_results = BayesianResultsSummary.from_yaml_file(filename=yaml_file)
except FileNotFoundError:
    results: BayesianResults = the_biogeme.bayesian_estimation()
    summary_results = results.to_summary()
print(summary_results.short_summary())
Sample size                                              6768
Sampler                                                  NUTS
Number of chains                                         4
Number of draws per chain                                40000
Total number of draws                                    160000
Acceptance rate target                                   0.9
Run time                                                 1:47:31.773810
Posterior predictive log-likelihood (sum of log mean p)  -5476.77
Expected log-likelihood E[log L(Y|θ)]                    -57181.10
Best-draw log-likelihood (posterior upper bound)         -4998.29

Present the parameter estimates in a pandas table.

pandas_results = get_pandas_estimated_parameters(
    estimation_results=summary_results,
)
display(pandas_results)
                       Name  Value (mean)  ...  ESS (bulk)  ESS (tail)
0                 asc_train     -0.418083  ...   19.678847    6.307843
1                  ga_train      1.003471  ...    7.080623    4.000800
2             ga_swissmetro     -0.127364  ...   97.404512  100.741874
3                   asc_car     -0.574272  ...   77.398776  100.034620
4              b_time_train     -1.036879  ...    7.070697   99.377635
5                    b_cost     -1.252157  ...    7.065671    4.000800
6           b_headway_train     -0.637087  ...    7.066519    4.000800
7            alpha_existing      0.733735  ...  230.925443   99.474175
8   existing_nest_parameter      1.854212  ...    7.082086  103.581575
9         b_time_swissmetro     -0.967080  ...    7.929068  100.354628
10     b_headway_swissmetro     -0.185868  ...    7.062704    4.000800
11               b_time_car     -0.931353  ...    7.520995    4.073666
12    public_nest_parameter      1.923020  ...  129.255682  100.563324

[13 rows x 12 columns]

Report the variables stored in the Bayesian estimation results.

display(summary_results.report_stored_variables())
             group  ...             shape
0    constant_data  ...            [6768]
1    constant_data  ...            [6768]
2    constant_data  ...            [6768]
3    constant_data  ...            [6768]
4    constant_data  ...            [6768]
5    constant_data  ...            [6768]
6    constant_data  ...            [6768]
7    constant_data  ...            [6768]
8    constant_data  ...            [6768]
9    constant_data  ...            [6768]
10   constant_data  ...            [6768]
11   constant_data  ...            [6768]
12   constant_data  ...            [6768]
13  log_likelihood  ...  [4, 40000, 6768]
14       posterior  ...        [4, 40000]
15       posterior  ...        [4, 40000]
16       posterior  ...        [4, 40000]
17       posterior  ...        [4, 40000]
18       posterior  ...        [4, 40000]
19       posterior  ...        [4, 40000]
20       posterior  ...        [4, 40000]
21       posterior  ...        [4, 40000]
22       posterior  ...        [4, 40000]
23       posterior  ...        [4, 40000]
24       posterior  ...        [4, 40000]
25       posterior  ...        [4, 40000]
26       posterior  ...  [4, 40000, 6768]
27       posterior  ...        [4, 40000]
28           prior  ...        [1, 40000]
29           prior  ...        [1, 40000]
30           prior  ...        [1, 40000]
31           prior  ...        [1, 40000]
32           prior  ...        [1, 40000]
33           prior  ...        [1, 40000]
34           prior  ...        [1, 40000]
35           prior  ...        [1, 40000]
36           prior  ...        [1, 40000]
37           prior  ...        [1, 40000]
38           prior  ...        [1, 40000]
39           prior  ...        [1, 40000]
40           prior  ...  [1, 40000, 6768]
41           prior  ...        [1, 40000]
42    sample_stats  ...        [4, 40000]
43    sample_stats  ...        [4, 40000]
44    sample_stats  ...        [4, 40000]
45    sample_stats  ...        [4, 40000]
46    sample_stats  ...        [4, 40000]
47    sample_stats  ...        [4, 40000]
48    sample_stats  ...        [4, 40000]

[49 rows x 4 columns]

Total running time of the script: (0 minutes 1.152 seconds)

Gallery generated by Sphinx-Gallery