Note
Go to the end to download the full example code.
3. Moneymetric and heteroscedastic specification¶
Although normalizing the scale to 1 is a common practice in random utility models, it is sometimes preferable to normalize another parameter. For instance, normalizing the cost coefficient to -1 sets the units of the utility function as currency units (CHF here), and the estimated coefficients are easily interpreted as willingness to pay. In that case, the scale must be estimated.
We also illustrate here a heteroscedastic specification, where a different scale is associated with different segments of the sample.
This example illustrates how to estimate such a specification with Bayesian inference.
Michel Bierlaire, EPFL Thu Nov 20 2025, 11:10:03
from pathlib import Path
from IPython.core.display_functions import display
See the data processing script: Data preparation for Swissmetro.
from swissmetro_data import (
CAR_AV_SP,
CAR_CO_SCALED,
CAR_TT_SCALED,
CHOICE,
GROUP,
SM_AV,
SM_COST_SCALED,
SM_TT_SCALED,
TRAIN_AV_SP,
TRAIN_COST_SCALED,
TRAIN_TT_SCALED,
database,
)
from biogeme.bayesian_estimation import (
BayesianResults,
BayesianResultsSummary,
get_pandas_estimated_parameters,
)
from biogeme.biogeme import BIOGEME
from biogeme.expressions import Beta
from biogeme.models import loglogit
The scale parameters must stay away from zero. We define a small but positive lower bound
POSITIVE_LOWER_BOUND = 1.0e-5
Parameters to be estimated.
asc_car = Beta('asc_car', 0, None, None, 0)
asc_train = Beta('asc_train', 0, None, None, 0)
asc_sm = Beta('asc_sm', 0, None, None, 1)
b_time = Beta('b_time', 0, None, 0, 0)
b_cost = Beta('b_cost', -1, None, None, 1)
scale_not_group3 = Beta('scale_not_group3', 1, POSITIVE_LOWER_BOUND, None, 0)
scale_group3 = Beta('scale_group3', 1, POSITIVE_LOWER_BOUND, None, 0)
Definition of the utility functions.
v_train = asc_train + b_time * TRAIN_TT_SCALED + b_cost * TRAIN_COST_SCALED
v_swissmetro = asc_sm + b_time * SM_TT_SCALED + b_cost * SM_COST_SCALED
v_car = asc_car + b_time * CAR_TT_SCALED + b_cost * CAR_CO_SCALED
Scale associated with group 3 is estimated.
scale = (GROUP != 3) * scale_not_group3 + (GROUP == 3) * scale_group3
Scale the utility functions, and associate them with the numbering of alternatives.
v = {1: scale * v_train, 2: scale * v_swissmetro, 3: scale * v_car}
Associate the availability conditions with the alternatives.
av = {1: TRAIN_AV_SP, 2: SM_AV, 3: CAR_AV_SP}
Definition of the model. This is the contribution of each observation to the log likelihood function.
logprob = loglogit(v, av, CHOICE)
These notes will be included as such in the report file.
USER_NOTES = (
'Illustrates a moneymetric heteroscedastic specification. A different scale is'
' associated with different segments of the sample.'
)
Create the Biogeme object.
the_biogeme = BIOGEME(database, logprob, user_notes=USER_NOTES)
the_biogeme.model_name = 'b03_scale'
Estimate the posterior distribution of the parameters, or read the results if already available.
yaml_file = Path('saved_results') / f'{the_biogeme.model_name}.yaml'
try:
summary_results = BayesianResultsSummary.from_yaml_file(filename=yaml_file)
except FileNotFoundError:
results: BayesianResults = the_biogeme.bayesian_estimation()
summary_results = results.to_summary()
print(summary_results.short_summary())
Sample size 6768
Sampler NUTS
Number of chains 4
Number of draws per chain 2000
Total number of draws 8000
Acceptance rate target 0.9
Run time 0:00:38.540117
Posterior predictive log-likelihood (sum of log mean p) -4975.40
Expected log-likelihood E[log L(Y|θ)] -4979.20
Best-draw log-likelihood (posterior upper bound) -4976.74
LOO (Leave-One-Out Cross-Validation) -4983.02
LOO Standard Error 53.93
Effective number of parameters (p_LOO) 7.62
Present the parameter estimates in a pandas table.
pandas_results = get_pandas_estimated_parameters(
estimation_results=summary_results,
)
display(pandas_results)
Name Value (mean) ... ESS (bulk) ESS (tail)
0 asc_train -1.257455 ... 3245.645126 3719.043357
1 asc_car -0.042467 ... 3572.771739 4585.755392
2 scale_not_group3 0.356773 ... 3421.731782 4254.279504
3 scale_group3 1.489222 ... 3279.999182 3990.383100
4 b_time -1.051430 ... 3172.035280 3941.586272
[5 rows x 12 columns]
Report the variables stored in the Bayesian estimation results.
display(summary_results.report_stored_variables())
group variable dims shape
0 constant_data CAR_AV_SP [obs] [6768]
1 constant_data CAR_CO_SCALED [obs] [6768]
2 constant_data CAR_TT_SCALED [obs] [6768]
3 constant_data CHOICE [obs] [6768]
4 constant_data GROUP [obs] [6768]
5 constant_data SM_AV [obs] [6768]
6 constant_data SM_COST_SCALED [obs] [6768]
7 constant_data SM_TT_SCALED [obs] [6768]
8 constant_data TRAIN_AV_SP [obs] [6768]
9 constant_data TRAIN_COST_SCALED [obs] [6768]
10 constant_data TRAIN_TT_SCALED [obs] [6768]
11 log_likelihood _choice [chain, draw, obs] [4, 2000, 6768]
12 posterior asc_car [chain, draw] [4, 2000]
13 posterior asc_train [chain, draw] [4, 2000]
14 posterior b_time [chain, draw] [4, 2000]
15 posterior log_like [chain, draw, obs] [4, 2000, 6768]
16 posterior scale_group3 [chain, draw] [4, 2000]
17 posterior scale_not_group3 [chain, draw] [4, 2000]
18 prior asc_car [chain, draw] [1, 2000]
19 prior asc_train [chain, draw] [1, 2000]
20 prior b_time [chain, draw] [1, 2000]
21 prior log_like [chain, draw, obs] [1, 2000, 6768]
22 prior scale_group3 [chain, draw] [1, 2000]
23 prior scale_not_group3 [chain, draw] [1, 2000]
24 sample_stats acceptance_rate [chain, draw] [4, 2000]
25 sample_stats diverging [chain, draw] [4, 2000]
26 sample_stats energy [chain, draw] [4, 2000]
27 sample_stats lp [chain, draw] [4, 2000]
28 sample_stats n_steps [chain, draw] [4, 2000]
29 sample_stats step_size [chain, draw] [4, 2000]
30 sample_stats tree_depth [chain, draw] [4, 2000]
Total running time of the script: (0 minutes 1.205 seconds)