18a. Ordinal logit model

Bayesian estimation of an ordinal logit model. This is just to illustrate the syntax, as the data are not ordered. But the example assume, for the sake of it, that the alternatives are ordered as 1->2->3

Michel Bierlaire, EPFL Mon Nov 17 2025, 16:38:41

from pathlib import Path

from IPython.core.display_functions import display

See the data processing script: Data preparation for Swissmetro.

from swissmetro_data import CHOICE, TRAIN_COST_SCALED, TRAIN_TT_SCALED, database

import biogeme.biogeme_logging as blog
from biogeme.bayesian_estimation import (
    BayesianResults,
    BayesianResultsSummary,
    get_pandas_estimated_parameters,
)
from biogeme.biogeme import BIOGEME
from biogeme.expressions import Beta, OrderedLogLogit

logger = blog.get_screen_logger(level=blog.INFO)
logger.info('Example b18a_ordinal_logit.py')
Example b18a_ordinal_logit.py

We define a small but positive lower bound

POSITIVE_LOWER_BOUND = 1.0e-5

Parameters to be estimated

b_time = Beta('b_time', 0, None, None, 0)
b_cost = Beta('b_cost', 0, None, None, 0)

Threshold parameters for the ordered logit.

\(\tau_1 \leq 0\).

tau1 = Beta('tau1', -1, None, 0, 0)

\(\delta_2 \geq 0\).

delta2 = Beta('delta2', 2, POSITIVE_LOWER_BOUND, None, 0)

\(\tau_2 = \tau_1 + \delta_2\)

tau2 = tau1 + delta2

Utility.

utility = b_time * TRAIN_TT_SCALED + b_cost * TRAIN_COST_SCALED

Associate each discrete indicator with an interval.

  1. \(-\infty \to \tau_1\),

  2. \(\tau_1 \to \tau_2\),

  3. \(\tau_2 \to +\infty\).

log_probability = OrderedLogLogit(
    eta=utility,
    cutpoints=[tau1, tau2],
    y=CHOICE,
    categories=[1, 2, 3],
    neutral_labels=[],
)

Create the Biogeme object.

the_biogeme = BIOGEME(database, log_probability)
the_biogeme.model_name = 'b18a_ordinal_logit'
Biogeme parameters read from biogeme.toml.

Estimate the posterior distribution of the parameters, or read the results if already available.

yaml_file = Path('saved_results') / f'{the_biogeme.model_name}.yaml'
try:
    summary_results = BayesianResultsSummary.from_yaml_file(filename=yaml_file)
except FileNotFoundError:
    results: BayesianResults = the_biogeme.bayesian_estimation()
    summary_results = results.to_summary()
print(summary_results.short_summary())
Sample size                                              6768
Sampler                                                  NUTS
Number of chains                                         4
Number of draws per chain                                2000
Total number of draws                                    8000
Acceptance rate target                                   0.9
Run time                                                 0:00:43.994059
Posterior predictive log-likelihood (sum of log mean p)  -5789.11
Expected log-likelihood E[log L(Y|θ)]                    -5791.36
Best-draw log-likelihood (posterior upper bound)         -5789.33
LOO (Leave-One-Out Cross-Validation)                     -5793.62
LOO Standard Error                                       48.79
Effective number of parameters (p_LOO)                   4.50

Present the parameter estimates in a pandas table.

pandas_results = get_pandas_estimated_parameters(
    estimation_results=summary_results,
)
display(pandas_results)
     Name  Value (mean)  Value (median)  ...     R hat   ESS (bulk)   ESS (tail)
0  b_time     -0.021337       -0.021328  ...  1.001044  3111.670753  3911.224192
1  b_cost      1.263322        1.263048  ...  0.999984  4390.811468  4695.446722
2    tau1     -1.029423       -1.029357  ...  1.001380  3397.531042  3748.656866
3  delta2      3.193695        3.193325  ...  1.001228  4736.247040  4684.943365

[4 rows x 12 columns]

Report the variables stored in the Bayesian estimation results.

display(summary_results.report_stored_variables())
             group           variable                dims            shape
0    constant_data             CHOICE               [obs]           [6768]
1    constant_data  TRAIN_COST_SCALED               [obs]           [6768]
2    constant_data    TRAIN_TT_SCALED               [obs]           [6768]
3   log_likelihood            _choice  [chain, draw, obs]  [4, 2000, 6768]
4        posterior             b_cost       [chain, draw]        [4, 2000]
5        posterior             b_time       [chain, draw]        [4, 2000]
6        posterior             delta2       [chain, draw]        [4, 2000]
7        posterior           log_like  [chain, draw, obs]  [4, 2000, 6768]
8        posterior               tau1       [chain, draw]        [4, 2000]
9            prior             b_cost       [chain, draw]        [1, 2000]
10           prior             b_time       [chain, draw]        [1, 2000]
11           prior             delta2       [chain, draw]        [1, 2000]
12           prior           log_like  [chain, draw, obs]  [1, 2000, 6768]
13           prior               tau1       [chain, draw]        [1, 2000]
14    sample_stats    acceptance_rate       [chain, draw]        [4, 2000]
15    sample_stats          diverging       [chain, draw]        [4, 2000]
16    sample_stats             energy       [chain, draw]        [4, 2000]
17    sample_stats                 lp       [chain, draw]        [4, 2000]
18    sample_stats            n_steps       [chain, draw]        [4, 2000]
19    sample_stats          step_size       [chain, draw]        [4, 2000]
20    sample_stats         tree_depth       [chain, draw]        [4, 2000]

Total running time of the script: (0 minutes 1.126 seconds)

Gallery generated by Sphinx-Gallery