Investigation of several choice models

Investigate several choice models:

  • logit

  • nested logit with two nests: public and private transportation

  • nested logit with two nests existing and future modes

for a total of 3 specifications. See Bierlaire and Ortelli (2023).

author:

Michel Bierlaire, EPFL

date:

Fri Jul 14 09:47:21 2023

import biogeme.biogeme as bio
import biogeme.biogeme_logging as blog
from biogeme import models
from biogeme.expressions import Beta
from biogeme.catalog import Catalog
from biogeme.nests import OneNestForNestedLogit, NestsForNestedLogit
from biogeme.results import compile_estimation_results, pareto_optimal
from IPython.core.display_functions import display

from biogeme.data.swissmetro import (
    read_data,
    CHOICE,
    SM_AV,
    CAR_AV_SP,
    TRAIN_AV_SP,
    TRAIN_TT_SCALED,
    TRAIN_COST_SCALED,
    SM_TT_SCALED,
    SM_COST_SCALED,
    CAR_TT_SCALED,
    CAR_CO_SCALED,
)

logger = blog.get_screen_logger(level=blog.INFO)

Parameters to be estimated

ASC_CAR = Beta('ASC_CAR', 0, None, None, 0)
ASC_TRAIN = Beta('ASC_TRAIN', 0, None, None, 0)
B_TIME = Beta('B_TIME', 0, None, None, 0)
B_COST = Beta('B_COST', 0, None, None, 0)

Definition of the utility functions

V1 = ASC_TRAIN + B_TIME * TRAIN_TT_SCALED + B_COST * TRAIN_COST_SCALED
V2 = B_TIME * SM_TT_SCALED + B_COST * SM_COST_SCALED
V3 = ASC_CAR + B_TIME * CAR_TT_SCALED + B_COST * CAR_CO_SCALED

Associate utility functions with the numbering of alternatives

V = {1: V1, 2: V2, 3: V3}

Associate the availability conditions with the alternatives

av = {1: TRAIN_AV_SP, 2: SM_AV, 3: CAR_AV_SP}

Definition of the logit model. This is the contribution of each observation to the log likelihood function.

logprob_logit = models.loglogit(V, av, CHOICE)

Nested logit model: nest with existing alternatives.

mu_existing = Beta('mu_existing', 1, 1, 10, 0)
existing = OneNestForNestedLogit(
    nest_param=mu_existing, list_of_alternatives=[1, 3], name='Existing'
)

nests_existing = NestsForNestedLogit(choice_set=list(V), tuple_of_nests=(existing,))
logprob_nested_existing = models.lognested(V, av, nests_existing, CHOICE)
The following elements do not appear in any nest and are assumed each to be alone in a separate nest: {2}. If it is not the intention, check the assignment of alternatives to nests.

Nested logit model: nest with public transportation alternatives.

mu_public = Beta('mu_public', 1, 1, 10, 0)
public = OneNestForNestedLogit(
    nest_param=mu_public, list_of_alternatives=[1, 2], name='Public'
)

nests_public = NestsForNestedLogit(choice_set=list(V), tuple_of_nests=(public,))
logprob_nested_public = models.lognested(V, av, nests_public, CHOICE)
The following elements do not appear in any nest and are assumed each to be alone in a separate nest: {3}. If it is not the intention, check the assignment of alternatives to nests.

Catalog.

model_catalog = Catalog.from_dict(
    catalog_name='model_catalog',
    dict_of_expressions={
        'logit': logprob_logit,
        'nested existing': logprob_nested_existing,
        'nested public': logprob_nested_public,
    },
)

Read the data

database = read_data()

Create the Biogeme object.

the_biogeme = bio.BIOGEME(database, model_catalog)
the_biogeme.modelName = 'b01model'
the_biogeme.generate_html = False
the_biogeme.generate_pickle = False
Biogeme parameters read from biogeme.toml.

Estimate the parameters.

dict_of_results = the_biogeme.estimate_catalog()
Estimating 3 models.
Biogeme parameters read from biogeme.toml.
As the model is not too complex, we activate the calculation of second derivatives. If you want to change it, change the name of the algorithm in the TOML file from "automatic" to "simple_bounds"
*** Initial values of the parameters are obtained from the file __b01model_000000.iter
Cannot read file __b01model_000000.iter. Statement is ignored.
As the model is not too complex, we activate the calculation of second derivatives. If you want to change it, change the name of the algorithm in the TOML file from "automatic" to "simple_bounds"
Optimization algorithm: hybrid Newton/BFGS with simple bounds [simple_bounds]
** Optimization: Newton with trust region for simple bounds
Iter.         ASC_CAR       ASC_TRAIN          B_COST          B_TIME     Function    Relgrad   Radius      Rho
    0           -0.29           -0.76            -0.7           -0.77      8.8e+03       0.04       10      1.1   ++
    1         -0.0015           -0.66           -0.77            -1.2      8.7e+03     0.0064    1e+02      1.1   ++
    2         -0.0015           -0.66           -0.77            -1.2      8.7e+03    0.00012    1e+02        1   ++
Biogeme parameters read from biogeme.toml.
As the model is not too complex, we activate the calculation of second derivatives. If you want to change it, change the name of the algorithm in the TOML file from "automatic" to "simple_bounds"
*** Initial values of the parameters are obtained from the file __b01model_000001.iter
Cannot read file __b01model_000001.iter. Statement is ignored.
As the model is not too complex, we activate the calculation of second derivatives. If you want to change it, change the name of the algorithm in the TOML file from "automatic" to "simple_bounds"
Optimization algorithm: hybrid Newton/BFGS with simple bounds [simple_bounds]
** Optimization: Newton with trust region for simple bounds
Iter.         ASC_CAR       ASC_TRAIN          B_COST          B_TIME     mu_existing     Function    Relgrad   Radius      Rho
    0           0.078           -0.43           -0.72            -1.1             1.4      8.6e+03       0.01       10      1.2   ++
    1           0.033            -0.4           -0.68            -1.1             1.7      8.5e+03     0.0034    1e+02      1.2   ++
    2          0.0091           -0.38           -0.64           -0.99             1.9      8.5e+03     0.0018    1e+03      1.2   ++
    3        -4.5e-05           -0.37           -0.63           -0.96               2      8.5e+03    0.00049    1e+04      1.1   ++
    4        -4.5e-05           -0.37           -0.63           -0.96               2      8.5e+03    8.4e-06    1e+04        1   ++
Biogeme parameters read from biogeme.toml.
As the model is not too complex, we activate the calculation of second derivatives. If you want to change it, change the name of the algorithm in the TOML file from "automatic" to "simple_bounds"
*** Initial values of the parameters are obtained from the file __b01model_000002.iter
Cannot read file __b01model_000002.iter. Statement is ignored.
As the model is not too complex, we activate the calculation of second derivatives. If you want to change it, change the name of the algorithm in the TOML file from "automatic" to "simple_bounds"
Optimization algorithm: hybrid Newton/BFGS with simple bounds [simple_bounds]
** Optimization: Newton with trust region for simple bounds
Iter.         ASC_CAR       ASC_TRAIN          B_COST          B_TIME       mu_public     Function    Relgrad   Radius      Rho
    0           -0.13           -0.54           -0.76            -1.2             1.2      8.7e+03      0.013        1     0.87    +
    1          0.0049           -0.61           -0.83            -1.3               1      8.7e+03     0.0049        1     0.65    +
    2         -0.0017            -0.6           -0.78            -1.3             1.1      8.7e+03    0.00094       10        1   ++
    3         -0.0095           -0.57           -0.78            -1.3             1.1      8.7e+03    0.00023    1e+02        1   ++
    4         -0.0095           -0.57           -0.78            -1.3             1.1      8.7e+03    3.2e-07    1e+02        1   ++

Number of estimated models.

print(f'A total of {len(dict_of_results)} models have been estimated')
A total of 3 models have been estimated

All estimation results

compiled_results, specs = compile_estimation_results(
    dict_of_results, use_short_names=True
)
display(compiled_results)
                                   Model_000000  ...        Model_000002
Number of estimated parameters                4  ...                   5
Sample size                               10719  ...               10719
Final log likelihood               -8670.163646  ...        -8669.496582
Akaike Information Criterion       17348.327292  ...        17348.993165
Bayesian Information Criterion     17377.446385  ...         17385.39203
ASC_CAR (t-test)                0.0159  (0.429)  ...  -0.00974  (-0.195)
ASC_TRAIN (t-test)                -0.652  (-12)  ...     -0.567  (-6.87)
B_COST (t-test)                 -0.789  (-15.5)  ...       -0.782  (-15)
B_TIME (t-test)                  -1.28  (-19.5)  ...      -1.25  (-16.2)
mu_existing (t-test)                             ...
mu_public (t-test)                               ...        1.08  (12.5)

[11 rows x 3 columns]

Glossary

for short_name, spec in specs.items():
    print(f'{short_name}\t{spec}')
Model_000000    model_catalog:logit
Model_000001    model_catalog:nested existing
Model_000002    model_catalog:nested public

Estimation results of the Pareto optimal models.

pareto_results = pareto_optimal(dict_of_results)
compiled_pareto_results, pareto_specs = compile_estimation_results(
    pareto_results, use_short_names=True
)
No Pareto file has been provided
display(compiled_pareto_results)
                                       Model_000000     Model_000001
Number of estimated parameters                    5                4
Sample size                                   10719            10719
Final log likelihood                     -8526.8899     -8670.163646
Akaike Information Criterion             17063.7798     17348.327292
Bayesian Information Criterion         17100.178666     17377.446385
ASC_CAR (t-test)                -0.00129  (-0.0378)  0.0159  (0.429)
ASC_TRAIN (t-test)                  -0.373  (-7.17)    -0.652  (-12)
B_COST (t-test)                     -0.629  (-14.8)  -0.789  (-15.5)
B_TIME (t-test)                     -0.958  (-14.7)   -1.28  (-19.5)
mu_existing (t-test)                   2.05  (15.8)

Glossary.

for short_name, spec in pareto_specs.items():
    print(f'{short_name}\t{spec}')
Model_000000    model_catalog:nested existing
Model_000001    model_catalog:logit

Total running time of the script: (0 minutes 0.970 seconds)

Gallery generated by Sphinx-Gallery