Note

Go to the end to download the full example code.

Investigation of several choice models¶

Investigate several choice models:

logit

nested logit with two nests: public and private transportation

nested logit with two nests existing and future modes

for a total of 3 specifications. See Bierlaire and Ortelli (2023).

Michel Bierlaire, EPFL Sun Apr 27 2025, 15:46:15

from IPython.core.display_functions import display

import biogeme.biogeme_logging as blog
from biogeme.biogeme import BIOGEME
from biogeme.catalog import Catalog
from biogeme.data.swissmetro import (
    CAR_AV_SP,
    CAR_CO_SCALED,
    CAR_TT_SCALED,
    CHOICE,
    SM_AV,
    SM_COST_SCALED,
    SM_TT_SCALED,
    TRAIN_AV_SP,
    TRAIN_COST_SCALED,
    TRAIN_TT_SCALED,
    read_data,
)
from biogeme.expressions import Beta
from biogeme.models import loglogit, lognested
from biogeme.nests import NestsForNestedLogit, OneNestForNestedLogit
from biogeme.results_processing import compile_estimation_results, pareto_optimal

logger = blog.get_screen_logger(level=blog.INFO)

Parameters to be estimated

asc_car = Beta('asc_car', 0, None, None, 0)
asc_train = Beta('asc_train', 0, None, None, 0)
b_time = Beta('b_time', 0, None, None, 0)
b_cost = Beta('b_cost', 0, None, None, 0)

Definition of the utility functions

v_train = asc_train + b_time * TRAIN_TT_SCALED + b_cost * TRAIN_COST_SCALED
v_swissmetro = b_time * SM_TT_SCALED + b_cost * SM_COST_SCALED
v_car = asc_car + b_time * CAR_TT_SCALED + b_cost * CAR_CO_SCALED

Associate utility functions with the numbering of alternatives

v = {1: v_train, 2: v_swissmetro, 3: v_car}

Associate the availability conditions with the alternatives

av = {1: TRAIN_AV_SP, 2: SM_AV, 3: CAR_AV_SP}

Definition of the logit model. This is the contribution of each observation to the log likelihood function.

log_probability_logit = loglogit(v, av, CHOICE)

Nested logit model: nest with existing alternatives.

mu_existing = Beta('mu_existing', 1, 1, 10, 0)
existing = OneNestForNestedLogit(
    nest_param=mu_existing, list_of_alternatives=[1, 3], name='Existing'
)

nests_existing = NestsForNestedLogit(choice_set=list(v), tuple_of_nests=(existing,))
log_probability_nested_existing = lognested(v, av, nests_existing, CHOICE)

The following elements do not appear in any nest and are assumed each to be alone in a separate nest: {2}. If it is not the intention, check the assignment of alternatives to nests.

Nested logit model: nest with public transportation alternatives.

mu_public = Beta('mu_public', 1, 1, 10, 0)
public = OneNestForNestedLogit(
    nest_param=mu_public, list_of_alternatives=[1, 2], name='Public'
)

nests_public = NestsForNestedLogit(choice_set=list(v), tuple_of_nests=(public,))
log_probability_nested_public = lognested(v, av, nests_public, CHOICE)

The following elements do not appear in any nest and are assumed each to be alone in a separate nest: {3}. If it is not the intention, check the assignment of alternatives to nests.

Catalog.

model_catalog = Catalog.from_dict(
    catalog_name='model_catalog',
    dict_of_expressions={
        'logit': log_probability_logit,
        'nested existing': log_probability_nested_existing,
        'nested public': log_probability_nested_public,
    },
)

Read the data

database = read_data()

Create the Biogeme object.

the_biogeme = BIOGEME(database, model_catalog, generate_html=False, generate_yaml=False)
the_biogeme.model_name = 'b01model'

Biogeme parameters read from biogeme.toml.

Estimate the parameters.

dict_of_results = the_biogeme.estimate_catalog()

Estimating 3 models.
Biogeme parameters provided by the user.
*** Initial values of the parameters are obtained from the file __b01model_000000.iter
Cannot read file __b01model_000000.iter. Statement is ignored.
Starting values for the algorithm: {}
As the model is not too complex, we activate the calculation of second derivatives. To change this behavior, modify the algorithm to "simple_bounds" in the TOML file.
Optimization algorithm: hybrid Newton/BFGS with simple bounds [simple_bounds]
** Optimization: Newton with trust region for simple bounds
Iter.       asc_train          b_time          b_cost       mu_public         asc_car     Function    Relgrad   Radius      Rho
    0           -0.82              -1           -0.19             1.4          -0.028        9e+03      0.068        1     0.68    +
    1            0.18            -1.4           -0.61             1.6            0.11      8.8e+03      0.058        1     0.47    +
    2           -0.29           -0.94           -0.69             1.8           -0.23      8.7e+03     0.0096        1     0.81    +
    3           -0.29           -0.94           -0.69             1.8           -0.23      8.7e+03     0.0096     0.41    -0.11    -
    4           -0.23            -1.2           -0.71             1.4           -0.07      8.7e+03      0.011     0.41     0.87    +
    5           -0.41            -1.2           -0.77             1.2          -0.045      8.7e+03     0.0057      4.1      1.2   ++
    6           -0.51            -1.2           -0.78             1.1          -0.024      8.7e+03     0.0017       41      1.2   ++
    7           -0.56            -1.3           -0.78             1.1          -0.012      8.7e+03    0.00034  4.1e+02      1.1   ++
    8           -0.57            -1.3           -0.78             1.1         -0.0098      8.7e+03    1.2e-05  4.1e+03        1   ++
    9           -0.57            -1.3           -0.78             1.1         -0.0098      8.7e+03    2.1e-08  4.1e+03        1   ++
Optimization algorithm has converged.
Relative gradient: 2.1217753971153207e-08
Cause of termination: Relative gradient = 2.1e-08 <= 6.1e-06
Number of function evaluations: 29
Number of gradient evaluations: 19
Number of hessian evaluations: 9
Algorithm: Newton with trust region for simple bound constraints
Number of iterations: 10
Proportion of Hessian calculation: 9/9 = 100.0%
Optimization time: 0:00:01.293310
Calculate second derivatives and BHHH
Biogeme parameters provided by the user.
*** Initial values of the parameters are obtained from the file __b01model_000001.iter
Cannot read file __b01model_000001.iter. Statement is ignored.
Starting values for the algorithm: {}
Optimization algorithm: hybrid Newton/BFGS with simple bounds [simple_bounds]
** Optimization: Newton with trust region for simple bounds
Iter.       asc_train          b_time          b_cost         asc_car     Function    Relgrad   Radius      Rho
    0           -0.76           -0.77            -0.7           -0.29      8.8e+03       0.04       10      1.1   ++
    1           -0.66            -1.2           -0.77         -0.0015      8.7e+03     0.0064    1e+02      1.1   ++
    2           -0.65            -1.3           -0.79           0.016      8.7e+03    0.00012    1e+03        1   ++
    3           -0.65            -1.3           -0.79           0.016      8.7e+03      4e-08    1e+03        1   ++
Optimization algorithm has converged.
Relative gradient: 3.954408093567457e-08
Cause of termination: Relative gradient = 4e-08 <= 6.1e-06
Number of function evaluations: 13
Number of gradient evaluations: 9
Number of hessian evaluations: 4
Algorithm: Newton with trust region for simple bound constraints
Number of iterations: 4
Proportion of Hessian calculation: 4/4 = 100.0%
Optimization time: 0:00:00.398264
Calculate second derivatives and BHHH
Biogeme parameters provided by the user.
*** Initial values of the parameters are obtained from the file __b01model_000002.iter
Cannot read file __b01model_000002.iter. Statement is ignored.
Starting values for the algorithm: {}
Optimization algorithm: hybrid Newton/BFGS with simple bounds [simple_bounds]
** Optimization: Newton with trust region for simple bounds
Iter.       asc_train          b_time          b_cost     mu_existing         asc_car     Function    Relgrad   Radius      Rho
    0            -0.7           -0.85              -1             1.6            0.34      8.9e+03       0.12        1     0.79    +
    1            -0.7           -0.85              -1             1.6            0.34      8.9e+03       0.12      0.5     -0.8    -
    2           -0.39           -0.99            -0.5             1.9          -0.088      8.6e+03      0.031      0.5     0.78    +
    3           -0.37           -0.93           -0.62             2.1         -0.0038      8.5e+03     0.0021        5     0.96   ++
    4           -0.37           -0.96           -0.63               2        -0.00094      8.5e+03    0.00013       50     0.97   ++
    5           -0.37           -0.96           -0.63               2        -0.00094      8.5e+03    6.7e-07       50        1   ++
Optimization algorithm has converged.
Relative gradient: 6.680761256997596e-07
Cause of termination: Relative gradient = 6.7e-07 <= 6.1e-06
Number of function evaluations: 17
Number of gradient evaluations: 11
Number of hessian evaluations: 5
Algorithm: Newton with trust region for simple bound constraints
Number of iterations: 6
Proportion of Hessian calculation: 5/5 = 100.0%
Optimization time: 0:00:01.229005
Calculate second derivatives and BHHH

Number of estimated models.

print(f'A total of {len(dict_of_results)} models have been estimated')

A total of 3 models have been estimated

All estimation results

compiled_results, specs = compile_estimation_results(
    dict_of_results, use_short_names=True
)

display('All estimated models')
display(compiled_results)

All estimated models
                                      Model_000000  ...         Model_000002
Number of estimated parameters                   5  ...                    5
Sample size                                  10719  ...                10719
Final log likelihood                     -8669.497  ...             -8526.89
Akaike Information Criterion              17348.99  ...             17063.78
Bayesian Information Criterion            17385.39  ...             17100.18
asc_train (t-test)                 -0.567  (-6.87)  ...      -0.373  (-7.17)
b_time (t-test)                     -1.25  (-16.2)  ...      -0.958  (-14.7)
b_cost (t-test)                      -0.782  (-15)  ...      -0.629  (-14.8)
mu_public (t-test)                    1.08  (12.5)  ...
asc_car (t-test)                -0.00975  (-0.195)  ...  -0.00131  (-0.0383)
mu_existing (t-test)                                ...         2.05  (15.8)

[11 rows x 3 columns]

Glossary

for short_name, spec in specs.items():
    print(f'{short_name}\t{spec}')

Model_000000    model_catalog:nested public
Model_000001    model_catalog:logit
Model_000002    model_catalog:nested existing

Estimation results of the Pareto optimal models.

pareto_results = pareto_optimal(dict_of_results)
compiled_pareto_results, pareto_specs = compile_estimation_results(
    pareto_results, use_short_names=True
)

No Pareto file has been provided

display('Non dominated models')
display(compiled_pareto_results)

Non dominated models
                                       Model_000000     Model_000001
Number of estimated parameters                    5                4
Sample size                                   10719            10719
Final log likelihood                       -8526.89        -8670.163
Akaike Information Criterion               17063.78         17348.33
Bayesian Information Criterion             17100.18         17377.45
asc_train (t-test)                  -0.373  (-7.17)    -0.652  (-12)
b_time (t-test)                     -0.958  (-14.7)   -1.28  (-19.5)
b_cost (t-test)                     -0.629  (-14.8)   -0.79  (-15.5)
mu_existing (t-test)                   2.05  (15.8)
asc_car (t-test)                -0.00131  (-0.0383)  0.0162  (0.438)

Glossary.

for short_name, spec in pareto_specs.items():
    print(f'{short_name}\t{spec}')

Model_000000    model_catalog:nested existing
Model_000001    model_catalog:logit

Total running time of the script: (0 minutes 5.560 seconds)

Gallery generated by Sphinx-Gallery