11. Cross-nested logit¶

Bayesian estimation of a cross-nested logit model with two nests:

one with existing alternatives (car and train),

one with public transportation alternatives (train and Swissmetro)

Michel Bierlaire, EPFL Mon Nov 03 2025, 20:14:23

import biogeme.biogeme_logging as blog
from IPython.core.display_functions import display
from biogeme.bayesian_estimation import BayesianResults, get_pandas_estimated_parameters
from biogeme.biogeme import BIOGEME
from biogeme.expressions import Beta
from biogeme.models import logcnl
from biogeme.nests import NestsForCrossNestedLogit, OneNestForCrossNestedLogit

See the data processing script: Data preparation for Swissmetro.

from swissmetro_data import (
    CAR_AV_SP,
    CAR_CO_SCALED,
    CAR_TT_SCALED,
    CHOICE,
    GA,
    SM_AV,
    SM_COST_SCALED,
    SM_HE,
    SM_TT_SCALED,
    TRAIN_AV_SP,
    TRAIN_COST_SCALED,
    TRAIN_HE,
    TRAIN_TT_SCALED,
    database,
)

logger = blog.get_screen_logger(level=blog.INFO)
logger.info('Example b11_cnl.py')

Example b11_cnl.py

Parameters to be estimated.

asc_car = Beta('asc_car', 0, None, None, 0)
asc_train = Beta('asc_train', 0, None, None, 0)
asc_sm = Beta('asc_sm', 0, None, None, 1)
b_time_swissmetro = Beta('b_time_swissmetro', 0, None, 0, 0)
b_time_train = Beta('b_time_train', 0, None, 0, 0)
b_time_car = Beta('b_time_car', 0, None, 0, 0)
b_cost = Beta('b_cost', 0, None, 0, 0)
b_headway_swissmetro = Beta('b_headway_swissmetro', 0, None, 0, 0)
b_headway_train = Beta('b_headway_train', 0, None, 0, 0)
ga_train = Beta('ga_train', 0, None, None, 0)
ga_swissmetro = Beta('ga_swissmetro', 0, None, None, 0)

existing_nest_parameter = Beta('existing_nest_parameter', 1.05, 1, 3, 0)
public_nest_parameter = Beta('public_nest_parameter', 1.05, 1, 3, 0)

Nest membership parameters.

alpha_existing = Beta('alpha_existing', 0.5, 0, 1, 0)
alpha_public = 1 - alpha_existing

Definition of the utility functions

v_train = (
    asc_train
    + b_time_train * TRAIN_TT_SCALED
    + b_cost * TRAIN_COST_SCALED
    + b_headway_train * TRAIN_HE
    + ga_train * GA
)
v_swissmetro = (
    asc_sm
    + b_time_swissmetro * SM_TT_SCALED
    + b_cost * SM_COST_SCALED
    + b_headway_swissmetro * SM_HE
    + ga_swissmetro * GA
)
v_car = asc_car + b_time_car * CAR_TT_SCALED + b_cost * CAR_CO_SCALED

Associate utility functions with the numbering of alternatives

v = {1: v_train, 2: v_swissmetro, 3: v_car}

Associate the availability conditions with the alternatives

av = {1: TRAIN_AV_SP, 2: SM_AV, 3: CAR_AV_SP}

Definition of nests.

nest_existing = OneNestForCrossNestedLogit(
    nest_param=existing_nest_parameter,
    dict_of_alpha={1: alpha_existing, 2: 0.0, 3: 1.0},
    name='existing',
)

nest_public = OneNestForCrossNestedLogit(
    nest_param=public_nest_parameter,
    dict_of_alpha={1: alpha_public, 2: 1.0, 3: 0.0},
    name='public',
)

nests = NestsForCrossNestedLogit(
    choice_set=[1, 2, 3], tuple_of_nests=(nest_existing, nest_public)
)

The choice model is a cross-nested logit, with availability conditions.

log_probability = logcnl(v, av, nests, CHOICE)

Create the Biogeme object

the_biogeme = BIOGEME(
    database,
    log_probability,
    chains=4,
    bayesian_draws=40_000,
    warmup=40_000,
    calculate_loo=False,
)
the_biogeme.model_name = 'b11_cnl'

Biogeme parameters read from biogeme.toml.

Estimate the parameters.

try:
    results = BayesianResults.from_netcdf(
        filename=f'saved_results/{the_biogeme.model_name}.nc'
    )
except FileNotFoundError:
    results = the_biogeme.bayesian_estimation()

Loaded NetCDF file size: 9.1 GB
load finished in 52660 ms (52.66 s)

print(results.short_summary())

posterior_predictive_loglike finished in 11028 ms (11.03 s)
expected_log_likelihood finished in 423 ms
best_draw_log_likelihood finished in 209 ms
/Users/bierlair/python_envs/venv313/lib/python3.13/site-packages/arviz/stats/stats.py:1667: UserWarning: For one or more samples the posterior variance of the log predictive densities exceeds 0.4. This could be indication of WAIC starting to fail.
See http://arxiv.org/abs/1507.04544 for details
  warnings.warn(
waic_res finished in 23126 ms (23.13 s)
waic finished in 23126 ms (23.13 s)
/Users/bierlair/python_envs/venv313/lib/python3.13/site-packages/arviz/stats/stats.py:1057: RuntimeWarning: overflow encountered in exp
  weights = 1 / np.exp(len_scale - len_scale[:, None]).sum(axis=1)
/Users/bierlair/python_envs/venv313/lib/python3.13/site-packages/numpy/_core/_methods.py:52: RuntimeWarning: overflow encountered in reduce
  return umr_sum(a, axis, dtype, out, keepdims, initial, where)
/Users/bierlair/python_envs/venv313/lib/python3.13/site-packages/arviz/stats/stats.py:797: UserWarning: Estimated shape parameter of Pareto distribution is greater than 0.70 for one or more samples. You should consider using a more robust model, this is because importance sampling is less likely to work well if the marginal posterior and LOO posterior are very different. This is more likely to happen with a non-robust model and highly influential observations.
  warnings.warn(
loo_res finished in 597241 ms (9.95 min)
loo finished in 597242 ms (9.95 min)
Sample size                                              6768
Sampler                                                  NUTS
Number of chains                                         4
Number of draws per chain                                40000
Total number of draws                                    160000
Acceptance rate target                                   0.9
Run time                                                 2:32:21.137282
Posterior predictive log-likelihood (sum of log mean p)  -6715.03
Expected log-likelihood E[log L(Y|θ)]                    -109707.06
Best-draw log-likelihood (posterior upper bound)         -4998.66
WAIC (Widely Applicable Information Criterion)           -8656808.31
WAIC Standard Error                                      474795.01
Effective number of parameters (p_WAIC)                  8650093.28
LOO (Leave-One-Out Cross-Validation)                     -231017.83
LOO Standard Error                                       5573.76
Effective number of parameters (p_LOO)                   224302.79

pandas_results = get_pandas_estimated_parameters(estimation_results=results)
display(pandas_results)

Diagnostics computation took 1237.0 seconds (cached).
                       Name  Value (mean)  ...  ESS (bulk)  ESS (tail)
               asc_train     -0.215706  ...    4.384520    4.001150
                ga_train      0.763243  ...    5.009253    4.000800
           ga_swissmetro     -0.336791  ...    5.213472    4.001500
                 asc_car     -0.310076  ...    7.014123   81.991726
            b_time_train     -1.198929  ...    5.795722    4.084138
                  b_cost     -0.726218  ...    4.954078   78.640614
         b_headway_train     -0.898006  ...    4.955540    4.000800
          alpha_existing      0.662106  ...    6.837447    4.007692
 existing_nest_parameter      1.989690  ...    4.963419   80.614325
       b_time_swissmetro     -1.155630  ...    6.781623    4.000800
   b_headway_swissmetro     -0.597729  ...    4.959977    4.000800
             b_time_car     -0.879311  ...    4.383433    4.000800
  public_nest_parameter      1.991096  ...   14.658280   80.734784

[13 rows x 12 columns]

Total running time of the script: (32 minutes 5.512 seconds)

Gallery generated by Sphinx-Gallery