4. Out-of-sample validation
Example of the out-of-sample validation of a logit model in a Bayesian estimation context.
Michel Bierlaire, EPFL Thu Oct 30 2025, 16:40:12
from biogeme.bayesian_estimation import BayesianResults
from biogeme.biogeme import BIOGEME
from biogeme.expressions import Beta
from biogeme.models import loglogit
from biogeme.validation import ValidationResult
See the data processing script: Data preparation for Swissmetro.
from swissmetro_data import (
    CAR_AV_SP,
    CAR_CO_SCALED,
    CAR_TT_SCALED,
    CHOICE,
    SM_AV,
    SM_COST_SCALED,
    SM_TT_SCALED,
    TRAIN_AV_SP,
    TRAIN_COST_SCALED,
    TRAIN_TT_SCALED,
    database,
)
Parameters to be estimated.
asc_car = Beta('asc_car', 0, None, None, 0)
asc_train = Beta('asc_train', 0, None, None, 0)
asc_sm = Beta('asc_sm', 0, None, None, 1)
b_time = Beta('b_time', 0, None, None, 0)
b_cost = Beta('b_cost', 0, None, None, 0)
Definition of the utility functions.
v_train = asc_train + b_time * TRAIN_TT_SCALED + b_cost * TRAIN_COST_SCALED
v_swissmetro = asc_sm + b_time * SM_TT_SCALED + b_cost * SM_COST_SCALED
v_car = asc_car + b_time * CAR_TT_SCALED + b_cost * CAR_CO_SCALED
Associate utility functions with the numbering of alternatives.
v = {1: v_train, 2: v_swissmetro, 3: v_car}
Associate the availability conditions with the alternatives.
av = {1: TRAIN_AV_SP, 2: SM_AV, 3: CAR_AV_SP}
Definition of the model. This is the contribution of each observation to the log likelihood function.
logprob = loglogit(v, av, CHOICE)
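To make the quantity computed for each observation concrete, the following is a minimal plain-Python sketch of the log of a logit choice probability restricted to the available alternatives. It is not Biogeme's implementation: the function `log_logit`, the utility values and the availability patterns are hypothetical and chosen for illustration only, and raising an error when the chosen alternative is unavailable is an assumption of this sketch.

```python
import math


def log_logit(utilities, availability, chosen):
    """Log of the logit probability of the chosen alternative.

    utilities: dict mapping alternative id -> utility value
    availability: dict mapping alternative id -> 1 if available, 0 otherwise
    chosen: id of the chosen alternative
    """
    if not availability[chosen]:
        # Assumption of this sketch: an unavailable chosen alternative is an error.
        raise ValueError('The chosen alternative is not available.')
    available = [alt for alt in utilities if availability[alt]]
    # Subtract the largest utility before exponentiating, for numerical stability.
    v_max = max(utilities[alt] for alt in available)
    log_denominator = v_max + math.log(
        sum(math.exp(utilities[alt] - v_max) for alt in available)
    )
    return utilities[chosen] - log_denominator


# Hypothetical utilities for one observation (1: train, 2: Swissmetro, 3: car).
v_example = {1: -0.5, 2: 0.0, 3: -1.0}
av_all = {1: 1, 2: 1, 3: 1}
av_no_car = {1: 1, 2: 1, 3: 0}

logprob_all = log_logit(v_example, av_all, chosen=2)
logprob_no_car = log_logit(v_example, av_no_car, chosen=2)
```

Removing an alternative from the choice set can only increase the probability of the remaining ones, so `logprob_no_car` is larger than `logprob_all`.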
Create the Biogeme object.
the_biogeme = BIOGEME(database, logprob)
the_biogeme.model_name = 'b04validation'
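Estimate the parameters. If results from a previous run have been saved, they are loaded from file; otherwise, the Bayesian estimation is performed.
try:
    results = BayesianResults.from_netcdf(
        filename=f'saved_results/{the_biogeme.model_name}.nc'
    )
except FileNotFoundError:
    results = the_biogeme.bayesian_estimation()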
/Users/bierlair/MyFiles/github/biogeme/src/biogeme/biogeme.py:1002: UserWarning: The effect of Potentials on other parameters is ignored during prior predictive sampling. This is likely to lead to invalid or biased predictive samples.
pm.sample_prior_predictive(
posterior_predictive_loglike finished in 159 ms
loo_res finished in 9061 ms (9.06 s)
loo finished in 9061 ms (9.06 s)
The validation procedure randomly partitions the data into several slices of about the same size. Each slice serves in turn as a validation dataset: the model is re-estimated using all the data except that slice, and the estimated model is then applied to the slice. The value of the log likelihood for each observation in the validation set is reported in a dataframe. As this is repeated for each slice, the output is a list of ValidationResult objects, one per slice, whose simulated_values attribute holds the corresponding dataframe.
validation_results: list[ValidationResult] = the_biogeme.validate(results, slices=5)
for validation_slice in validation_results:
    print(
        f'Log likelihood for {validation_slice.simulated_values.shape[0]} validation data: '
        f'{validation_slice.simulated_values.iloc[:, 0].sum()}'
    )
Log likelihood for 1354 validation data: -1055.0188479016829
Log likelihood for 1354 validation data: -1079.9910821018343
Log likelihood for 1354 validation data: -1066.3253641170731
Log likelihood for 1353 validation data: -1083.5415568879703
Log likelihood for 1353 validation data: -1056.7560136588377
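The slicing logic described above can be sketched in plain Python. This is an illustration of the general technique only, not Biogeme's actual implementation: the helper names `split_into_slices` and `estimation_and_validation_sets` and the seed are hypothetical. The number of observations, 6768, is the sum of the five validation set sizes reported in the output above.

```python
import random


def split_into_slices(n_observations, n_slices, seed=0):
    """Randomly partition observation indices into slices of near-equal size."""
    indices = list(range(n_observations))
    random.Random(seed).shuffle(indices)
    base, extra = divmod(n_observations, n_slices)
    slices = []
    start = 0
    for k in range(n_slices):
        # The first `extra` slices receive one additional observation.
        size = base + (1 if k < extra else 0)
        slices.append(indices[start:start + size])
        start += size
    return slices


def estimation_and_validation_sets(slices):
    """Yield (estimation indices, validation indices) for each slice in turn."""
    for k, validation in enumerate(slices):
        estimation = [
            index
            for j, other in enumerate(slices)
            if j != k
            for index in other
        ]
        yield estimation, validation


slices = split_into_slices(n_observations=6768, n_slices=5)
slice_sizes = [len(one_slice) for one_slice in slices]
# slice_sizes is [1354, 1354, 1354, 1353, 1353]
```

In a full validation, the model would be re-estimated on each estimation set and its log likelihood evaluated on the matching validation set, which is the part that `validate` performs in the example above.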
Total running time of the script: (0 minutes 52.835 seconds)