Note
Go to the end to download the full example code.
Investigation of several choice models
Investigate several choice models:
logit
nested logit with two nests: public and private transportation
nested logit with two nests existing and future modes
for a total of 3 specifications. See Bierlaire and Ortelli (2023).
- author:
Michel Bierlaire, EPFL
- date:
Fri Jul 14 09:47:21 2023
import biogeme.biogeme as bio
import biogeme.biogeme_logging as blog
from biogeme import models
from biogeme.expressions import Beta
from biogeme.catalog import Catalog
from biogeme.nests import OneNestForNestedLogit, NestsForNestedLogit
from biogeme.results import compile_estimation_results, pareto_optimal
from IPython.core.display_functions import display
from biogeme.data.swissmetro import (
read_data,
CHOICE,
SM_AV,
CAR_AV_SP,
TRAIN_AV_SP,
TRAIN_TT_SCALED,
TRAIN_COST_SCALED,
SM_TT_SCALED,
SM_COST_SCALED,
CAR_TT_SCALED,
CAR_CO_SCALED,
)
logger = blog.get_screen_logger(level=blog.INFO)
Parameters to be estimated
ASC_CAR = Beta('ASC_CAR', 0, None, None, 0)
ASC_TRAIN = Beta('ASC_TRAIN', 0, None, None, 0)
B_TIME = Beta('B_TIME', 0, None, None, 0)
B_COST = Beta('B_COST', 0, None, None, 0)
Definition of the utility functions
V1 = ASC_TRAIN + B_TIME * TRAIN_TT_SCALED + B_COST * TRAIN_COST_SCALED
V2 = B_TIME * SM_TT_SCALED + B_COST * SM_COST_SCALED
V3 = ASC_CAR + B_TIME * CAR_TT_SCALED + B_COST * CAR_CO_SCALED
Associate utility functions with the numbering of alternatives
V = {1: V1, 2: V2, 3: V3}
Associate the availability conditions with the alternatives
av = {1: TRAIN_AV_SP, 2: SM_AV, 3: CAR_AV_SP}
Definition of the logit model. This is the contribution of each observation to the log likelihood function.
logprob_logit = models.loglogit(V, av, CHOICE)
Nested logit model: nest with existing alternatives.
mu_existing = Beta('mu_existing', 1, 1, 10, 0)
existing = OneNestForNestedLogit(
nest_param=mu_existing, list_of_alternatives=[1, 3], name='Existing'
)
nests_existing = NestsForNestedLogit(choice_set=list(V), tuple_of_nests=(existing,))
logprob_nested_existing = models.lognested(V, av, nests_existing, CHOICE)
The following elements do not appear in any nest and are assumed each to be alone in a separate nest: {2}. If it is not the intention, check the assignment of alternatives to nests.
Nested logit model: nest with public transportation alternatives.
mu_public = Beta('mu_public', 1, 1, 10, 0)
public = OneNestForNestedLogit(
nest_param=mu_public, list_of_alternatives=[1, 2], name='Public'
)
nests_public = NestsForNestedLogit(choice_set=list(V), tuple_of_nests=(public,))
logprob_nested_public = models.lognested(V, av, nests_public, CHOICE)
The following elements do not appear in any nest and are assumed each to be alone in a separate nest: {3}. If it is not the intention, check the assignment of alternatives to nests.
Catalog.
model_catalog = Catalog.from_dict(
catalog_name='model_catalog',
dict_of_expressions={
'logit': logprob_logit,
'nested existing': logprob_nested_existing,
'nested public': logprob_nested_public,
},
)
Read the data
database = read_data()
Create the Biogeme object.
the_biogeme = bio.BIOGEME(database, model_catalog)
the_biogeme.modelName = 'b01model'
the_biogeme.generate_html = False
the_biogeme.generate_pickle = False
Biogeme parameters read from biogeme.toml.
Estimate the parameters.
dict_of_results = the_biogeme.estimate_catalog()
Estimating 3 models.
Biogeme parameters read from biogeme.toml.
As the model is not too complex, we activate the calculation of second derivatives. If you want to change it, change the name of the algorithm in the TOML file from "automatic" to "simple_bounds"
*** Initial values of the parameters are obtained from the file __b01model_000000.iter
Cannot read file __b01model_000000.iter. Statement is ignored.
As the model is not too complex, we activate the calculation of second derivatives. If you want to change it, change the name of the algorithm in the TOML file from "automatic" to "simple_bounds"
Optimization algorithm: hybrid Newton/BFGS with simple bounds [simple_bounds]
** Optimization: Newton with trust region for simple bounds
Iter. ASC_CAR ASC_TRAIN B_COST B_TIME Function Relgrad Radius Rho
0 -0.29 -0.76 -0.7 -0.77 8.8e+03 0.04 10 1.1 ++
1 -0.0015 -0.66 -0.77 -1.2 8.7e+03 0.0064 1e+02 1.1 ++
2 -0.0015 -0.66 -0.77 -1.2 8.7e+03 0.00012 1e+02 1 ++
Biogeme parameters read from biogeme.toml.
As the model is not too complex, we activate the calculation of second derivatives. If you want to change it, change the name of the algorithm in the TOML file from "automatic" to "simple_bounds"
*** Initial values of the parameters are obtained from the file __b01model_000001.iter
Cannot read file __b01model_000001.iter. Statement is ignored.
As the model is not too complex, we activate the calculation of second derivatives. If you want to change it, change the name of the algorithm in the TOML file from "automatic" to "simple_bounds"
Optimization algorithm: hybrid Newton/BFGS with simple bounds [simple_bounds]
** Optimization: Newton with trust region for simple bounds
Iter. ASC_CAR ASC_TRAIN B_COST B_TIME mu_existing Function Relgrad Radius Rho
0 0.078 -0.43 -0.72 -1.1 1.4 8.6e+03 0.01 10 1.2 ++
1 0.033 -0.4 -0.68 -1.1 1.7 8.5e+03 0.0034 1e+02 1.2 ++
2 0.0091 -0.38 -0.64 -0.99 1.9 8.5e+03 0.0018 1e+03 1.2 ++
3 -4.5e-05 -0.37 -0.63 -0.96 2 8.5e+03 0.00049 1e+04 1.1 ++
4 -4.5e-05 -0.37 -0.63 -0.96 2 8.5e+03 8.4e-06 1e+04 1 ++
Biogeme parameters read from biogeme.toml.
As the model is not too complex, we activate the calculation of second derivatives. If you want to change it, change the name of the algorithm in the TOML file from "automatic" to "simple_bounds"
*** Initial values of the parameters are obtained from the file __b01model_000002.iter
Cannot read file __b01model_000002.iter. Statement is ignored.
As the model is not too complex, we activate the calculation of second derivatives. If you want to change it, change the name of the algorithm in the TOML file from "automatic" to "simple_bounds"
Optimization algorithm: hybrid Newton/BFGS with simple bounds [simple_bounds]
** Optimization: Newton with trust region for simple bounds
Iter. ASC_CAR ASC_TRAIN B_COST B_TIME mu_public Function Relgrad Radius Rho
0 -0.13 -0.54 -0.76 -1.2 1.2 8.7e+03 0.013 1 0.87 +
1 0.0049 -0.61 -0.83 -1.3 1 8.7e+03 0.0049 1 0.65 +
2 -0.0017 -0.6 -0.78 -1.3 1.1 8.7e+03 0.00094 10 1 ++
3 -0.0095 -0.57 -0.78 -1.3 1.1 8.7e+03 0.00023 1e+02 1 ++
4 -0.0095 -0.57 -0.78 -1.3 1.1 8.7e+03 3.2e-07 1e+02 1 ++
Number of estimated models.
print(f'A total of {len(dict_of_results)} models have been estimated')
A total of 3 models have been estimated
All estimation results
compiled_results, specs = compile_estimation_results(
dict_of_results, use_short_names=True
)
display(compiled_results)
Model_000000 ... Model_000002
Number of estimated parameters 4 ... 5
Sample size 10719 ... 10719
Final log likelihood -8670.163646 ... -8669.496582
Akaike Information Criterion 17348.327292 ... 17348.993165
Bayesian Information Criterion 17377.446385 ... 17385.39203
ASC_CAR (t-test) 0.0159 (0.429) ... -0.00974 (-0.195)
ASC_TRAIN (t-test) -0.652 (-12) ... -0.567 (-6.87)
B_COST (t-test) -0.789 (-15.5) ... -0.782 (-15)
B_TIME (t-test) -1.28 (-19.5) ... -1.25 (-16.2)
mu_existing (t-test) ...
mu_public (t-test) ... 1.08 (12.5)
[11 rows x 3 columns]
Glossary
for short_name, spec in specs.items():
print(f'{short_name}\t{spec}')
Model_000000 model_catalog:logit
Model_000001 model_catalog:nested existing
Model_000002 model_catalog:nested public
Estimation results of the Pareto optimal models.
pareto_results = pareto_optimal(dict_of_results)
compiled_pareto_results, pareto_specs = compile_estimation_results(
pareto_results, use_short_names=True
)
No Pareto file has been provided
display(compiled_pareto_results)
Model_000000 Model_000001
Number of estimated parameters 5 4
Sample size 10719 10719
Final log likelihood -8526.8899 -8670.163646
Akaike Information Criterion 17063.7798 17348.327292
Bayesian Information Criterion 17100.178666 17377.446385
ASC_CAR (t-test) -0.00129 (-0.0378) 0.0159 (0.429)
ASC_TRAIN (t-test) -0.373 (-7.17) -0.652 (-12)
B_COST (t-test) -0.629 (-14.8) -0.789 (-15.5)
B_TIME (t-test) -0.958 (-14.7) -1.28 (-19.5)
mu_existing (t-test) 2.05 (15.8)
Glossary.
for short_name, spec in pareto_specs.items():
print(f'{short_name}\t{spec}')
Model_000000 model_catalog:nested existing
Model_000001 model_catalog:logit
Total running time of the script: (0 minutes 0.970 seconds)