Note
Go to the end to download the full example code.
Investigation of several choice modelsΒΆ
Investigate several choice models:
logit
nested logit with two nests: public and private transportation
nested logit with two nests existing and future modes
for a total of 3 specifications. See Bierlaire and Ortelli (2023).
Michel Bierlaire, EPFL Sun Apr 27 2025, 15:46:15
from IPython.core.display_functions import display
import biogeme.biogeme_logging as blog
from biogeme.biogeme import BIOGEME
from biogeme.catalog import Catalog
from biogeme.data.swissmetro import (
CAR_AV_SP,
CAR_CO_SCALED,
CAR_TT_SCALED,
CHOICE,
SM_AV,
SM_COST_SCALED,
SM_TT_SCALED,
TRAIN_AV_SP,
TRAIN_COST_SCALED,
TRAIN_TT_SCALED,
read_data,
)
from biogeme.expressions import Beta
from biogeme.models import loglogit, lognested
from biogeme.nests import NestsForNestedLogit, OneNestForNestedLogit
from biogeme.results_processing import compile_estimation_results, pareto_optimal
logger = blog.get_screen_logger(level=blog.INFO)
Parameters to be estimated
asc_car = Beta('asc_car', 0, None, None, 0)
asc_train = Beta('asc_train', 0, None, None, 0)
b_time = Beta('b_time', 0, None, None, 0)
b_cost = Beta('b_cost', 0, None, None, 0)
Definition of the utility functions
v_train = asc_train + b_time * TRAIN_TT_SCALED + b_cost * TRAIN_COST_SCALED
v_swissmetro = b_time * SM_TT_SCALED + b_cost * SM_COST_SCALED
v_car = asc_car + b_time * CAR_TT_SCALED + b_cost * CAR_CO_SCALED
Associate utility functions with the numbering of alternatives
v = {1: v_train, 2: v_swissmetro, 3: v_car}
Associate the availability conditions with the alternatives
av = {1: TRAIN_AV_SP, 2: SM_AV, 3: CAR_AV_SP}
Definition of the logit model. This is the contribution of each observation to the log likelihood function.
log_probability_logit = loglogit(v, av, CHOICE)
Nested logit model: nest with existing alternatives.
mu_existing = Beta('mu_existing', 1, 1, 10, 0)
existing = OneNestForNestedLogit(
nest_param=mu_existing, list_of_alternatives=[1, 3], name='Existing'
)
nests_existing = NestsForNestedLogit(choice_set=list(v), tuple_of_nests=(existing,))
log_probability_nested_existing = lognested(v, av, nests_existing, CHOICE)
The following elements do not appear in any nest and are assumed each to be alone in a separate nest: {2}. If it is not the intention, check the assignment of alternatives to nests.
Nested logit model: nest with public transportation alternatives.
mu_public = Beta('mu_public', 1, 1, 10, 0)
public = OneNestForNestedLogit(
nest_param=mu_public, list_of_alternatives=[1, 2], name='Public'
)
nests_public = NestsForNestedLogit(choice_set=list(v), tuple_of_nests=(public,))
log_probability_nested_public = lognested(v, av, nests_public, CHOICE)
The following elements do not appear in any nest and are assumed each to be alone in a separate nest: {3}. If it is not the intention, check the assignment of alternatives to nests.
Catalog.
model_catalog = Catalog.from_dict(
catalog_name='model_catalog',
dict_of_expressions={
'logit': log_probability_logit,
'nested existing': log_probability_nested_existing,
'nested public': log_probability_nested_public,
},
)
Read the data
database = read_data()
Create the Biogeme object.
the_biogeme = BIOGEME(database, model_catalog, generate_html=False, generate_yaml=False)
the_biogeme.model_name = 'b01model'
Biogeme parameters read from biogeme.toml.
Estimate the parameters.
dict_of_results = the_biogeme.estimate_catalog()
Estimating 3 models.
Biogeme parameters provided by the user.
*** Initial values of the parameters are obtained from the file __b01model_000000.iter
Cannot read file __b01model_000000.iter. Statement is ignored.
Starting values for the algorithm: {}
As the model is not too complex, we activate the calculation of second derivatives. To change this behavior, modify the algorithm to "simple_bounds" in the TOML file.
Optimization algorithm: hybrid Newton/BFGS with simple bounds [simple_bounds]
** Optimization: Newton with trust region for simple bounds
Iter. asc_train b_time b_cost mu_public asc_car Function Relgrad Radius Rho
0 -0.82 -1 -0.19 1.4 -0.028 9e+03 0.068 1 0.68 +
1 0.18 -1.4 -0.61 1.6 0.11 8.8e+03 0.058 1 0.47 +
2 -0.29 -0.94 -0.69 1.8 -0.23 8.7e+03 0.0096 1 0.81 +
3 -0.29 -0.94 -0.69 1.8 -0.23 8.7e+03 0.0096 0.41 -0.11 -
4 -0.23 -1.2 -0.71 1.4 -0.07 8.7e+03 0.011 0.41 0.87 +
5 -0.41 -1.2 -0.77 1.2 -0.045 8.7e+03 0.0057 4.1 1.2 ++
6 -0.51 -1.2 -0.78 1.1 -0.024 8.7e+03 0.0017 41 1.2 ++
7 -0.56 -1.3 -0.78 1.1 -0.012 8.7e+03 0.00034 4.1e+02 1.1 ++
8 -0.57 -1.3 -0.78 1.1 -0.0098 8.7e+03 1.2e-05 4.1e+03 1 ++
9 -0.57 -1.3 -0.78 1.1 -0.0098 8.7e+03 2.1e-08 4.1e+03 1 ++
Optimization algorithm has converged.
Relative gradient: 2.1217753971153207e-08
Cause of termination: Relative gradient = 2.1e-08 <= 6.1e-06
Number of function evaluations: 29
Number of gradient evaluations: 19
Number of hessian evaluations: 9
Algorithm: Newton with trust region for simple bound constraints
Number of iterations: 10
Proportion of Hessian calculation: 9/9 = 100.0%
Optimization time: 0:00:01.293310
Calculate second derivatives and BHHH
Biogeme parameters provided by the user.
*** Initial values of the parameters are obtained from the file __b01model_000001.iter
Cannot read file __b01model_000001.iter. Statement is ignored.
Starting values for the algorithm: {}
Optimization algorithm: hybrid Newton/BFGS with simple bounds [simple_bounds]
** Optimization: Newton with trust region for simple bounds
Iter. asc_train b_time b_cost asc_car Function Relgrad Radius Rho
0 -0.76 -0.77 -0.7 -0.29 8.8e+03 0.04 10 1.1 ++
1 -0.66 -1.2 -0.77 -0.0015 8.7e+03 0.0064 1e+02 1.1 ++
2 -0.65 -1.3 -0.79 0.016 8.7e+03 0.00012 1e+03 1 ++
3 -0.65 -1.3 -0.79 0.016 8.7e+03 4e-08 1e+03 1 ++
Optimization algorithm has converged.
Relative gradient: 3.954408093567457e-08
Cause of termination: Relative gradient = 4e-08 <= 6.1e-06
Number of function evaluations: 13
Number of gradient evaluations: 9
Number of hessian evaluations: 4
Algorithm: Newton with trust region for simple bound constraints
Number of iterations: 4
Proportion of Hessian calculation: 4/4 = 100.0%
Optimization time: 0:00:00.398264
Calculate second derivatives and BHHH
Biogeme parameters provided by the user.
*** Initial values of the parameters are obtained from the file __b01model_000002.iter
Cannot read file __b01model_000002.iter. Statement is ignored.
Starting values for the algorithm: {}
Optimization algorithm: hybrid Newton/BFGS with simple bounds [simple_bounds]
** Optimization: Newton with trust region for simple bounds
Iter. asc_train b_time b_cost mu_existing asc_car Function Relgrad Radius Rho
0 -0.7 -0.85 -1 1.6 0.34 8.9e+03 0.12 1 0.79 +
1 -0.7 -0.85 -1 1.6 0.34 8.9e+03 0.12 0.5 -0.8 -
2 -0.39 -0.99 -0.5 1.9 -0.088 8.6e+03 0.031 0.5 0.78 +
3 -0.37 -0.93 -0.62 2.1 -0.0038 8.5e+03 0.0021 5 0.96 ++
4 -0.37 -0.96 -0.63 2 -0.00094 8.5e+03 0.00013 50 0.97 ++
5 -0.37 -0.96 -0.63 2 -0.00094 8.5e+03 6.7e-07 50 1 ++
Optimization algorithm has converged.
Relative gradient: 6.680761256997596e-07
Cause of termination: Relative gradient = 6.7e-07 <= 6.1e-06
Number of function evaluations: 17
Number of gradient evaluations: 11
Number of hessian evaluations: 5
Algorithm: Newton with trust region for simple bound constraints
Number of iterations: 6
Proportion of Hessian calculation: 5/5 = 100.0%
Optimization time: 0:00:01.229005
Calculate second derivatives and BHHH
Number of estimated models.
print(f'A total of {len(dict_of_results)} models have been estimated')
A total of 3 models have been estimated
All estimation results
compiled_results, specs = compile_estimation_results(
dict_of_results, use_short_names=True
)
display('All estimated models')
display(compiled_results)
All estimated models
Model_000000 ... Model_000002
Number of estimated parameters 5 ... 5
Sample size 10719 ... 10719
Final log likelihood -8669.497 ... -8526.89
Akaike Information Criterion 17348.99 ... 17063.78
Bayesian Information Criterion 17385.39 ... 17100.18
asc_train (t-test) -0.567 (-6.87) ... -0.373 (-7.17)
b_time (t-test) -1.25 (-16.2) ... -0.958 (-14.7)
b_cost (t-test) -0.782 (-15) ... -0.629 (-14.8)
mu_public (t-test) 1.08 (12.5) ...
asc_car (t-test) -0.00975 (-0.195) ... -0.00131 (-0.0383)
mu_existing (t-test) ... 2.05 (15.8)
[11 rows x 3 columns]
Glossary
for short_name, spec in specs.items():
print(f'{short_name}\t{spec}')
Model_000000 model_catalog:nested public
Model_000001 model_catalog:logit
Model_000002 model_catalog:nested existing
Estimation results of the Pareto optimal models.
pareto_results = pareto_optimal(dict_of_results)
compiled_pareto_results, pareto_specs = compile_estimation_results(
pareto_results, use_short_names=True
)
No Pareto file has been provided
display('Non dominated models')
display(compiled_pareto_results)
Non dominated models
Model_000000 Model_000001
Number of estimated parameters 5 4
Sample size 10719 10719
Final log likelihood -8526.89 -8670.163
Akaike Information Criterion 17063.78 17348.33
Bayesian Information Criterion 17100.18 17377.45
asc_train (t-test) -0.373 (-7.17) -0.652 (-12)
b_time (t-test) -0.958 (-14.7) -1.28 (-19.5)
b_cost (t-test) -0.629 (-14.8) -0.79 (-15.5)
mu_existing (t-test) 2.05 (15.8)
asc_car (t-test) -0.00131 (-0.0383) 0.0162 (0.438)
Glossary.
for short_name, spec in pareto_specs.items():
print(f'{short_name}\t{spec}')
Model_000000 model_catalog:nested existing
Model_000001 model_catalog:logit
Total running time of the script: (0 minutes 5.560 seconds)