.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/swissmetro/plot_b14nested_endogenous_sampling.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_swissmetro_plot_b14nested_endogenous_sampling.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_swissmetro_plot_b14nested_endogenous_sampling.py:


Nested logit with corrections for endogenous sampling
======================================================

The sample is said to be endogenous if the probability for an
individual to be in the sample depends on the choice that has been
made. In that case, the exogenous sample maximum likelihood (ESML)
estimator is no longer appropriate, and corrections need to be made.
See Bierlaire, Bolduc, McFadden (2008). This example illustrates
these corrections.

:author: Michel Bierlaire, EPFL
:date: Sun Apr 9 18:25:03 2023

.. GENERATED FROM PYTHON SOURCE LINES 18-26

.. code-block:: default


    import numpy as np
    import biogeme.biogeme_logging as blog
    import biogeme.biogeme as bio
    from biogeme import models
    from biogeme.expressions import Beta
    from biogeme.nests import OneNestForNestedLogit, NestsForNestedLogit


.. GENERATED FROM PYTHON SOURCE LINES 27-28

See the data processing script: :ref:`swissmetro_data`.

.. GENERATED FROM PYTHON SOURCE LINES 28-45

.. code-block:: default

    from swissmetro_data import (
        database,
        CHOICE,
        SM_AV,
        CAR_AV_SP,
        TRAIN_AV_SP,
        TRAIN_TT_SCALED,
        TRAIN_COST_SCALED,
        SM_TT_SCALED,
        SM_COST_SCALED,
        CAR_TT_SCALED,
        CAR_CO_SCALED,
    )

    logger = blog.get_screen_logger(level=blog.INFO)
    logger.info('Example b14nested_endogenous_sampling.py')


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Example b14nested_endogenous_sampling.py


.. GENERATED FROM PYTHON SOURCE LINES 46-47

Parameters to be estimated. ``ASC_SM`` is fixed to zero for
normalization (its status flag is 1), and ``MU`` is bounded between
1 and 10.

.. GENERATED FROM PYTHON SOURCE LINES 47-54

.. code-block:: default

    ASC_CAR = Beta('ASC_CAR', 0, None, None, 0)
    ASC_TRAIN = Beta('ASC_TRAIN', 0, None, None, 0)
    ASC_SM = Beta('ASC_SM', 0, None, None, 1)
    B_TIME = Beta('B_TIME', 0, None, None, 0)
    B_COST = Beta('B_COST', 0, None, None, 0)
    MU = Beta('MU', 1, 1, 10, 0)


.. GENERATED FROM PYTHON SOURCE LINES 55-58

In this example, we assume that the three modes exist, and that the
sampling protocol is choice-based. The probability that a respondent
who chose alternative i belongs to the sample is R_i.

.. GENERATED FROM PYTHON SOURCE LINES 58-62

.. code-block:: default

    R_TRAIN = 4.42e-2
    R_SM = 3.36e-3
    R_CAR = 7.5e-3


.. GENERATED FROM PYTHON SOURCE LINES 63-64

The correction terms are the logarithms of these quantities.

.. GENERATED FROM PYTHON SOURCE LINES 64-66

.. code-block:: default

    correction = {1: np.log(R_TRAIN), 2: np.log(R_SM), 3: np.log(R_CAR)}


.. GENERATED FROM PYTHON SOURCE LINES 67-68

Definition of the utility functions.

.. GENERATED FROM PYTHON SOURCE LINES 68-72

.. code-block:: default

    V1 = ASC_TRAIN + B_TIME * TRAIN_TT_SCALED + B_COST * TRAIN_COST_SCALED
    V2 = ASC_SM + B_TIME * SM_TT_SCALED + B_COST * SM_COST_SCALED
    V3 = ASC_CAR + B_TIME * CAR_TT_SCALED + B_COST * CAR_CO_SCALED


.. GENERATED FROM PYTHON SOURCE LINES 73-74

Associate utility functions with the numbering of alternatives.

.. GENERATED FROM PYTHON SOURCE LINES 74-76

.. code-block:: default

    V = {1: V1, 2: V2, 3: V3}


.. GENERATED FROM PYTHON SOURCE LINES 77-78

Associate the availability conditions with the alternatives.

.. GENERATED FROM PYTHON SOURCE LINES 78-80

.. code-block:: default

    av = {1: TRAIN_AV_SP, 2: SM_AV, 3: CAR_AV_SP}


.. GENERATED FROM PYTHON SOURCE LINES 81-85

Definition of nests. Only the nontrivial nests must be defined. A
trivial nest is a nest containing exactly one alternative. In this
example, we create a nest for the existing modes, that is, train (1)
and car (3).

.. GENERATED FROM PYTHON SOURCE LINES 85-92

.. code-block:: default

    existing = OneNestForNestedLogit(
        nest_param=MU, list_of_alternatives=[1, 3], name='existing'
    )

    nests = NestsForNestedLogit(choice_set=list(V), tuple_of_nests=(existing,))


.. GENERATED FROM PYTHON SOURCE LINES 93-95

The choice model is a nested logit, with corrections for endogenous
sampling. We first obtain the expression of the Gi function for
nested logit.

.. GENERATED FROM PYTHON SOURCE LINES 95-97

.. code-block:: default

    Gi = models.getMevForNested(V, av, nests)


.. GENERATED FROM PYTHON SOURCE LINES 98-99

Then we calculate the MEV log probability, accounting for the correction.

.. GENERATED FROM PYTHON SOURCE LINES 99-101

.. code-block:: default

    logprob = models.logmev_endogenousSampling(V, Gi, av, correction, CHOICE)


.. GENERATED FROM PYTHON SOURCE LINES 102-103

Create the Biogeme object.

.. GENERATED FROM PYTHON SOURCE LINES 103-106

.. code-block:: default

    the_biogeme = bio.BIOGEME(database, logprob)
    the_biogeme.modelName = 'b14nested_endogenous_eampling'


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    File biogeme.toml has been parsed.


.. GENERATED FROM PYTHON SOURCE LINES 107-108

Estimate the parameters.

.. GENERATED FROM PYTHON SOURCE LINES 108-110

.. code-block:: default

    results = the_biogeme.estimate()


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    *** Initial values of the parameters are obtained from the file __b14nested_endogenous_eampling.iter
    Cannot read file __b14nested_endogenous_eampling.iter. Statement is ignored.
    Optimization algorithm: hybrid Newton/BFGS with simple bounds [simple_bounds]
    ** Optimization: Newton with trust region for simple bounds
    Iter.   ASC_CAR  ASC_TRAIN   B_COST   B_TIME       MU  Function  Relgrad  Radius   Rho
        0         1         -1     0.47       -1        2   8.1e+03     0.15       1  0.56   +
        1         0       -1.1   0.0036       -2      1.6   6.1e+03    0.068      10   1.1  ++
        2      -1.5         -3    -0.95    -0.12      2.2   5.4e+03    0.036      10  0.71   +
        3      -1.4       -2.9    -0.91    -0.46      1.9   5.3e+03    0.015   1e+02   1.3  ++
        4      -1.2       -2.8    -0.98    -0.86      1.7   5.2e+03   0.0048   1e+03   1.1  ++
        5      -1.1       -2.8       -1    -0.97      1.6   5.2e+03  0.00035   1e+04     1  ++
        6      -1.1       -2.8       -1    -0.97      1.6   5.2e+03    2e-06   1e+04     1  ++
    Results saved in file b14nested_endogenous_eampling.html
    Results saved in file b14nested_endogenous_eampling.pickle


.. GENERATED FROM PYTHON SOURCE LINES 111-113

.. code-block:: default

    print(results.short_summary())


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Results for model b14nested_endogenous_eampling
    Nbr of parameters:              5
    Sample size:                    6768
    Excluded data:                  3960
    Final log likelihood:           -5202.916
    Akaike Information Criterion:   10415.83
    Bayesian Information Criterion: 10449.93


.. GENERATED FROM PYTHON SOURCE LINES 114-116

.. code-block:: default

    pandas_results = results.getEstimatedParameters()
    pandas_results


.. raw:: html

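
    <table border="1" class="dataframe">
      <thead>
        <tr><th></th><th>Value</th><th>Rob. Std err</th><th>Rob. t-test</th><th>Rob. p-value</th></tr>
      </thead>
      <tbody>
        <tr><th>ASC_CAR</th><td>-1.127438</td><td>0.060630</td><td>-18.595448</td><td>0.0</td></tr>
        <tr><th>ASC_TRAIN</th><td>-2.768608</td><td>0.080535</td><td>-34.377505</td><td>0.0</td></tr>
        <tr><th>B_COST</th><td>-0.999353</td><td>0.064141</td><td>-15.580617</td><td>0.0</td></tr>
        <tr><th>B_TIME</th><td>-0.974561</td><td>0.110774</td><td>-8.797759</td><td>0.0</td></tr>
        <tr><th>MU</th><td>1.630982</td><td>0.062661</td><td>26.028460</td><td>0.0</td></tr>
      </tbody>
    </table>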

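The robust t-tests in the table above are computed against zero. For the nest parameter ``MU``, the more informative reference value is 1, because ``MU = 1`` reduces the nested logit to a logit model. The following is a minimal sketch of that check, using the values copied from the table. The variable names are illustrative and are not part of the Biogeme API.

```python
# Values copied from the estimation table above.
mu_hat = 1.630982
mu_rob_stderr = 0.062661

# Robust t-statistic for the null hypothesis MU = 1 (logit model).
t_test_against_one = (mu_hat - 1.0) / mu_rob_stderr
print(f'Robust t-test of MU against 1: {t_test_against_one:.2f}')
```

A statistic far above the usual critical values suggests that, for this specification, the logit model would be rejected in favor of the nested structure.
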

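To make the role of the correction terms concrete, the sketch below evaluates the corrected MEV probability, proportional to ``exp(V_i + ln G_i + ln R_i)``, for a single observation with plain NumPy. It is an illustration under stated assumptions, not the Biogeme implementation: the helper functions and the utility values are hypothetical, all alternatives are assumed available, and the nest parameter is set to a value close to the estimate above. Only the sampling probabilities are taken from the example.

```python
import numpy as np


def nested_logit_log_gi(v, nests):
    """Return ln(dG/dy_i) at y = exp(v) for a nested logit.

    `nests` is a list of (mu, alternatives) pairs covering every
    alternative exactly once; a trivial nest uses mu = 1.
    """
    log_gi = {}
    for mu, alternatives in nests:
        # Sum over the nest: sum_j exp(mu * V_j)
        nest_sum = sum(np.exp(mu * v[j]) for j in alternatives)
        for i in alternatives:
            log_gi[i] = (mu - 1.0) * v[i] + (1.0 / mu - 1.0) * np.log(nest_sum)
    return log_gi


def corrected_probabilities(v, log_gi, log_r):
    """Probabilities proportional to exp(V_i + ln G_i + ln R_i)."""
    terms = {i: v[i] + log_gi[i] + log_r[i] for i in v}
    shift = max(terms.values())  # subtract the max for numerical stability
    weights = {i: np.exp(t - shift) for i, t in terms.items()}
    total = sum(weights.values())
    return {i: w / total for i, w in weights.items()}


# Hypothetical utilities for one observation (not estimated values).
v = {1: -0.5, 2: 0.0, 3: -0.3}
# Train (1) and car (3) share a nest; Swissmetro (2) is a trivial nest.
nests = [(1.63, [1, 3]), (1.0, [2])]
# Sampling probabilities R_i taken from the example above.
log_r = {1: np.log(4.42e-2), 2: np.log(3.36e-3), 3: np.log(7.5e-3)}
no_correction = {i: 0.0 for i in v}

log_gi = nested_logit_log_gi(v, nests)
p_plain = corrected_probabilities(v, log_gi, no_correction)
p_corrected = corrected_probabilities(v, log_gi, log_r)
print('Without correction:', p_plain)
print('With correction:   ', p_corrected)
```

Because train has the largest sampling probability, its share in the corrected probabilities is larger than in the uncorrected ones: the correction describes choices as they appear in the choice-based sample.
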
.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.394 seconds)


.. _sphx_glr_download_auto_examples_swissmetro_plot_b14nested_endogenous_sampling.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_b14nested_endogenous_sampling.py <plot_b14nested_endogenous_sampling.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_b14nested_endogenous_sampling.ipynb <plot_b14nested_endogenous_sampling.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_