Note
Go to the end to download the full example code.
18a. Ordinal logit model¶
Bayesian estimation of an ordinal logit model. This is just to illustrate the syntax, as the data are not ordered. But the example assume, for the sake of it, that the alternatives are ordered as 1->2->3
Michel Bierlaire, EPFL Mon Nov 17 2025, 16:38:41
from pathlib import Path
from IPython.core.display_functions import display
See the data processing script: Data preparation for Swissmetro.
from swissmetro_data import CHOICE, TRAIN_COST_SCALED, TRAIN_TT_SCALED, database
import biogeme.biogeme_logging as blog
from biogeme.bayesian_estimation import (
BayesianResults,
BayesianResultsSummary,
get_pandas_estimated_parameters,
)
from biogeme.biogeme import BIOGEME
from biogeme.expressions import Beta, OrderedLogLogit
logger = blog.get_screen_logger(level=blog.INFO)
logger.info('Example b18a_ordinal_logit.py')
Example b18a_ordinal_logit.py
We define a small but positive lower bound
POSITIVE_LOWER_BOUND = 1.0e-5
Parameters to be estimated
b_time = Beta('b_time', 0, None, None, 0)
b_cost = Beta('b_cost', 0, None, None, 0)
Threshold parameters for the ordered logit.
\(\tau_1 \leq 0\).
tau1 = Beta('tau1', -1, None, 0, 0)
\(\delta_2 \geq 0\).
delta2 = Beta('delta2', 2, POSITIVE_LOWER_BOUND, None, 0)
\(\tau_2 = \tau_1 + \delta_2\)
tau2 = tau1 + delta2
Utility.
utility = b_time * TRAIN_TT_SCALED + b_cost * TRAIN_COST_SCALED
Associate each discrete indicator with an interval.
\(-\infty \to \tau_1\),
\(\tau_1 \to \tau_2\),
\(\tau_2 \to +\infty\).
log_probability = OrderedLogLogit(
eta=utility,
cutpoints=[tau1, tau2],
y=CHOICE,
categories=[1, 2, 3],
neutral_labels=[],
)
Create the Biogeme object.
the_biogeme = BIOGEME(database, log_probability)
the_biogeme.model_name = 'b18a_ordinal_logit'
Biogeme parameters read from biogeme.toml.
Estimate the posterior distribution of the parameters, or read the results if already available.
yaml_file = Path('saved_results') / f'{the_biogeme.model_name}.yaml'
try:
summary_results = BayesianResultsSummary.from_yaml_file(filename=yaml_file)
except FileNotFoundError:
results: BayesianResults = the_biogeme.bayesian_estimation()
summary_results = results.to_summary()
print(summary_results.short_summary())
Sample size 6768
Sampler NUTS
Number of chains 4
Number of draws per chain 2000
Total number of draws 8000
Acceptance rate target 0.9
Run time 0:00:43.994059
Posterior predictive log-likelihood (sum of log mean p) -5789.11
Expected log-likelihood E[log L(Y|θ)] -5791.36
Best-draw log-likelihood (posterior upper bound) -5789.33
LOO (Leave-One-Out Cross-Validation) -5793.62
LOO Standard Error 48.79
Effective number of parameters (p_LOO) 4.50
Present the parameter estimates in a pandas table.
pandas_results = get_pandas_estimated_parameters(
estimation_results=summary_results,
)
display(pandas_results)
Name Value (mean) Value (median) ... R hat ESS (bulk) ESS (tail)
0 b_time -0.021337 -0.021328 ... 1.001044 3111.670753 3911.224192
1 b_cost 1.263322 1.263048 ... 0.999984 4390.811468 4695.446722
2 tau1 -1.029423 -1.029357 ... 1.001380 3397.531042 3748.656866
3 delta2 3.193695 3.193325 ... 1.001228 4736.247040 4684.943365
[4 rows x 12 columns]
Report the variables stored in the Bayesian estimation results.
display(summary_results.report_stored_variables())
group variable dims shape
0 constant_data CHOICE [obs] [6768]
1 constant_data TRAIN_COST_SCALED [obs] [6768]
2 constant_data TRAIN_TT_SCALED [obs] [6768]
3 log_likelihood _choice [chain, draw, obs] [4, 2000, 6768]
4 posterior b_cost [chain, draw] [4, 2000]
5 posterior b_time [chain, draw] [4, 2000]
6 posterior delta2 [chain, draw] [4, 2000]
7 posterior log_like [chain, draw, obs] [4, 2000, 6768]
8 posterior tau1 [chain, draw] [4, 2000]
9 prior b_cost [chain, draw] [1, 2000]
10 prior b_time [chain, draw] [1, 2000]
11 prior delta2 [chain, draw] [1, 2000]
12 prior log_like [chain, draw, obs] [1, 2000, 6768]
13 prior tau1 [chain, draw] [1, 2000]
14 sample_stats acceptance_rate [chain, draw] [4, 2000]
15 sample_stats diverging [chain, draw] [4, 2000]
16 sample_stats energy [chain, draw] [4, 2000]
17 sample_stats lp [chain, draw] [4, 2000]
18 sample_stats n_steps [chain, draw] [4, 2000]
19 sample_stats step_size [chain, draw] [4, 2000]
20 sample_stats tree_depth [chain, draw] [4, 2000]
Total running time of the script: (0 minutes 1.126 seconds)