Triangular mixture with panel data

Example of a mixture of logit models, using Monte-Carlo integration. The mixing distribution is user-defined (triangular, here). The datafile is organized as panel data.

author: Michel Bierlaire, EPFL

date: Tue Dec 6 18:30:44 2022

import numpy as np
import biogeme.biogeme as bio
from biogeme import models
import biogeme.biogeme_logging as blog
from biogeme.expressions import (
    Beta,
    bioDraws,
    MonteCarlo,
    PanelLikelihoodTrajectory,
    log,
)

See the data processing script: Panel data preparation for Swissmetro.

from swissmetro_panel import (
    database,
    CHOICE,
    CAR_AV_SP,
    TRAIN_AV_SP,
    TRAIN_TT_SCALED,
    TRAIN_COST_SCALED,
    SM_TT_SCALED,
    SM_COST_SCALED,
    CAR_TT_SCALED,
    CAR_CO_SCALED,
    SM_AV,
)

logger = blog.get_screen_logger(level=blog.INFO)
logger.info('Example b26triangular_panel_mixture.py')
Example b26triangular_panel_mixture.py

Function generating the draws.

def the_triangular_generator(sample_size: int, number_of_draws: int) -> np.ndarray:
    """
    Provide my own random number generator to the database.
    See the numpy.random documentation to obtain a list of other distributions.
    """
    return np.random.triangular(-1, 0, 1, (sample_size, number_of_draws))
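
As a hedged illustration, here is a seeded variant of the generator with the same signature. The name `seeded_triangular_generator` and the seed value are assumptions made for this sketch and are not part of the original script. It returns an array with one row per sample entry and one column per draw, as the function above does.

```python
import numpy as np

# Hypothetical seeded generator (not part of the original script).
_rng = np.random.default_rng(seed=42)  # arbitrary seed, chosen for illustration


def seeded_triangular_generator(sample_size: int, number_of_draws: int) -> np.ndarray:
    """Return draws from a triangular distribution on [-1, 1] with mode 0."""
    return _rng.triangular(-1, 0, 1, (sample_size, number_of_draws))


draws = seeded_triangular_generator(5, 1000)
print(draws.shape)  # (rows, columns) = (sample_size, number_of_draws)
print(draws.min() >= -1, draws.max() <= 1)  # all draws lie in the support
```

Using a dedicated `Generator` object instead of the global `np.random` state keeps the draws reproducible without affecting other code that relies on NumPy's global random state.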

Associate the function with a name.

myRandomNumberGenerators = {
    'TRIANGULAR': (
        the_triangular_generator,
        'Draws from a triangular distribution',
    )
}

Submit the generator to the database.

database.setRandomNumberGenerators(myRandomNumberGenerators)

Parameters to be estimated.

B_COST = Beta('B_COST', 0, None, None, 0)

Define a random parameter, following a triangular distribution across individuals, designed to be used for Monte-Carlo simulation.

Mean of the distribution.

B_TIME = Beta('B_TIME', 0, None, None, 0)

Scale of the distribution. It is advised not to use 0 as a starting value for the following parameter.

B_TIME_S = Beta('B_TIME_S', 1, None, None, 0)
B_TIME_RND = B_TIME + B_TIME_S * bioDraws('B_TIME_RND', 'TRIANGULAR')
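
To make the role of the two parameters concrete: if xi is a draw from the triangular distribution on [-1, 1] with mode 0, then B_TIME + B_TIME_S * xi is triangular on [B_TIME - |B_TIME_S|, B_TIME + |B_TIME_S|], with mean B_TIME and standard deviation |B_TIME_S| / sqrt(6), because the variance of xi is 1/6. The sketch below checks this by simulation using only the standard library. The values of `mean` and `scale` are arbitrary assumptions for illustration, not estimates.

```python
import math
import random
import statistics

random.seed(0)  # arbitrary seed, for reproducibility of the illustration

mean, scale = -1.0, 2.0  # hypothetical values standing in for B_TIME and B_TIME_S

# Note: random.triangular takes (low, high, mode),
# whereas numpy.random.triangular takes (left, mode, right).
xi = [random.triangular(-1.0, 1.0, 0.0) for _ in range(100_000)]
beta = [mean + scale * x for x in xi]

print(statistics.fmean(beta))      # close to mean
print(statistics.pstdev(beta))     # close to abs(scale) / sqrt(6)
print(abs(scale) / math.sqrt(6))   # theoretical standard deviation
```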

We do the same for the constants, to address serial correlation.

ASC_CAR = Beta('ASC_CAR', 0, None, None, 0)
ASC_CAR_S = Beta('ASC_CAR_S', 1, None, None, 0)
ASC_CAR_RND = ASC_CAR + ASC_CAR_S * bioDraws('ASC_CAR_RND', 'TRIANGULAR')

ASC_TRAIN = Beta('ASC_TRAIN', 0, None, None, 0)
ASC_TRAIN_S = Beta('ASC_TRAIN_S', 1, None, None, 0)
ASC_TRAIN_RND = ASC_TRAIN + ASC_TRAIN_S * bioDraws('ASC_TRAIN_RND', 'TRIANGULAR')

ASC_SM = Beta('ASC_SM', 0, None, None, 1)
ASC_SM_S = Beta('ASC_SM_S', 1, None, None, 0)
ASC_SM_RND = ASC_SM + ASC_SM_S * bioDraws('ASC_SM_RND', 'TRIANGULAR')

Definition of the utility functions.

V1 = ASC_TRAIN_RND + B_TIME_RND * TRAIN_TT_SCALED + B_COST * TRAIN_COST_SCALED
V2 = ASC_SM_RND + B_TIME_RND * SM_TT_SCALED + B_COST * SM_COST_SCALED
V3 = ASC_CAR_RND + B_TIME_RND * CAR_TT_SCALED + B_COST * CAR_CO_SCALED

Associate utility functions with the numbering of alternatives.

V = {1: V1, 2: V2, 3: V3}

Associate the availability conditions with the alternatives.

av = {1: TRAIN_AV_SP, 2: SM_AV, 3: CAR_AV_SP}

Conditional on the random parameters, the likelihood of one observation is given by the logit model (called the kernel).

obsprob = models.logit(V, av, CHOICE)
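
A minimal sketch of what the kernel computes for a single observation, assuming the standard logit formula restricted to the available alternatives. The function `logit_probability` and the numerical inputs are hypothetical and written in plain Python. This is not Biogeme's implementation.

```python
import math


def logit_probability(utilities: dict, availability: dict, chosen: int) -> float:
    """Logit probability of the chosen alternative among the available ones.

    Assumes that the chosen alternative is itself available.
    """
    # Subtract the largest available utility for numerical stability.
    v_max = max(v for alt, v in utilities.items() if availability[alt])
    denominator = sum(
        math.exp(v - v_max) for alt, v in utilities.items() if availability[alt]
    )
    return math.exp(utilities[chosen] - v_max) / denominator


# Hypothetical utilities for train (1), Swissmetro (2) and car (3); car unavailable.
V_example = {1: -1.0, 2: -0.5, 3: 0.2}
av_example = {1: 1, 2: 1, 3: 0}
p = logit_probability(V_example, av_example, chosen=2)
print(p)
```

Because car is unavailable in this example, its utility does not enter the denominator, and the probabilities of train and Swissmetro sum to one.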

Conditional on the random parameters, the likelihood of all observations for one individual (the trajectory) is the product of the likelihoods of the individual observations.

condprobIndiv = PanelLikelihoodTrajectory(obsprob)

We integrate over the random parameters using Monte-Carlo integration.

logprob = log(MonteCarlo(condprobIndiv))
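
For one individual, the expression above approximates the log of the average, over R draws, of the product over the individual's observations of the conditional choice probabilities. The sketch below illustrates this under simplifying assumptions: a binary logit with a single random coefficient and a made-up trajectory of three observations. All names and values are hypothetical, and the code does not reproduce Biogeme's internals.

```python
import math
import random

random.seed(1)  # arbitrary seed for the illustration

# Hypothetical trajectory of one individual: (attribute difference, chosen flag).
# chosen = 1 means alternative A was chosen, 0 means alternative B.
trajectory = [(0.5, 1), (-0.2, 0), (1.0, 1)]
b_mean, b_scale = -1.0, 0.5  # hypothetical mean and scale of the random coefficient
R = 2000  # number of draws


def observation_probability(beta: float, x: float, chosen: int) -> float:
    """Binary logit probability of the observed choice, given beta."""
    p_a = 1.0 / (1.0 + math.exp(-beta * x))
    return p_a if chosen == 1 else 1.0 - p_a


total = 0.0
for _ in range(R):
    # One draw per individual, shared by all observations of the trajectory.
    beta = b_mean + b_scale * random.triangular(-1.0, 1.0, 0.0)
    # Conditional likelihood of the trajectory: product over observations.
    total += math.prod(observation_probability(beta, x, y) for x, y in trajectory)

log_likelihood = math.log(total / R)  # contribution of this individual
print(log_likelihood)
```

The key point is the order of operations: the product over observations is taken inside the average over draws, so the same draw applies to the whole trajectory. This is what captures the correlation between the choices of one individual.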

Create the Biogeme object. As the objective is to illustrate the syntax, we calculate the Monte-Carlo approximation with a small number of draws. To achieve that, we provide a parameter file different from the default one.

the_biogeme = bio.BIOGEME(database, logprob, parameter_file='few_draws.toml')
the_biogeme.modelName = 'b26triangular_panel_mixture'
File few_draws.toml has been parsed.

Estimate the parameters.

results = the_biogeme.estimate()
*** Initial values of the parameters are obtained from the file __b26triangular_panel_mixture.iter
Cannot read file __b26triangular_panel_mixture.iter. Statement is ignored.
Optimization algorithm: hybrid Newton/BFGS with simple bounds [simple_bounds]
** Optimization: Newton with trust region for simple bounds
Iter.         ASC_CAR       ASC_CAR_S        ASC_SM_S       ASC_TRAIN     ASC_TRAIN_S          B_COST          B_TIME        B_TIME_S     Function    Relgrad   Radius      Rho
    0           -0.11             1.4             1.4           -0.55             1.1           -0.79              -1             1.1      4.6e+03      0.047       10      1.2   ++
    1           -0.11             1.4             1.4           -0.55             1.1           -0.79              -1             1.1      4.6e+03      0.047        5    -0.15    -
    2            0.65             3.6          -0.046            0.22             6.1            0.56            -2.3               4      4.2e+03      0.056        5     0.42    +
    3            0.65             3.6          -0.046            0.22             6.1            0.56            -2.3               4      4.2e+03      0.056      2.5    -0.79    -
    4            0.65             3.6          -0.046            0.22             6.1            0.56            -2.3               4      4.2e+03      0.056      1.2    -0.13    -
    5            0.41             4.6          -0.046              -1               6           -0.69            -3.6             3.9      3.8e+03      0.059      1.2     0.71    +
    6           0.098             4.8            0.06           -0.99             6.2            -1.9            -3.4             4.5      3.7e+03      0.027      1.2     0.84    +
    7           0.098             4.8            0.06           -0.99             6.2            -1.9            -3.4             4.5      3.7e+03      0.027     0.62     -0.2    -
    8           0.098             4.8            0.06           -0.99             6.2            -1.9            -3.4             4.5      3.7e+03      0.027     0.31     -0.2    -
    9           0.098             4.8            0.06           -0.99             6.2            -1.9            -3.4             4.5      3.7e+03      0.027     0.16   -0.097    -
   10           0.098             4.8            0.06           -0.99             6.2            -1.9            -3.4             4.5      3.7e+03      0.027    0.078    0.093    -
   11            0.02             4.8          -0.018              -1             6.2            -1.9            -3.5             4.5      3.7e+03      0.026    0.078     0.45    +
   12           0.098             4.9          -0.096           -0.94             6.3            -1.8            -3.5             4.5      3.7e+03      0.023     0.78        1   ++
   13            0.14             5.7           -0.16           -0.94             6.5            -1.9            -3.8             4.9      3.7e+03      0.021      7.8        1   ++
   14            0.14             5.7           -0.16           -0.94             6.5            -1.9            -3.8             4.9      3.7e+03      0.021      3.9    0.054    -
   15            0.37             7.9             3.7           0.089             4.9            -2.6            -5.9               8      3.7e+03      0.039      3.9     0.69    +
   16            0.37             7.9             3.7           0.089             4.9            -2.6            -5.9               8      3.7e+03      0.039        2    -0.15    -
   17            0.37             7.9             3.7           0.089             4.9            -2.6            -5.9               8      3.7e+03      0.039     0.98    -0.15    -
   18            0.37             7.9             3.7           0.089             4.9            -2.6            -5.9               8      3.7e+03      0.039     0.49    -0.15    -
   19            0.37             7.9             3.7           0.089             4.9            -2.6            -5.9               8      3.7e+03      0.039     0.24   -0.029    -
   20            0.37             7.9             3.7           0.089             4.9            -2.6            -5.9               8      3.7e+03      0.039     0.12    0.018    -
   21            0.48               8             3.7          -0.029             4.9            -2.7            -6.1             7.9      3.7e+03      0.012     0.12     0.37    +
   22            0.58             8.1             3.8           0.017             4.8            -2.7              -6             7.8      3.7e+03      0.012      1.2     0.99   ++
   23            0.47             9.3             4.2          -0.016             3.9            -2.9            -5.5             7.3      3.7e+03     0.0087      1.2     0.74    +
   24             0.9             9.1             5.3             0.4             2.7            -2.7            -6.2             8.1      3.7e+03     0.0068      1.2     0.56    +
   25            0.46             9.5             4.7            0.34             1.5            -2.9            -5.9             7.7      3.6e+03     0.0027      1.2     0.83    +
   26            0.65             9.4             5.1            0.42            0.71              -3            -6.1               8      3.6e+03    0.00054       12     0.95   ++
   27            0.65             9.5             5.1            0.42            0.69              -3            -6.1             8.1      3.6e+03    8.5e-06  1.2e+02        1   ++
   28            0.65             9.5             5.1            0.42            0.69              -3            -6.1             8.1      3.6e+03    2.6e-09  1.2e+02        1   ++
Results saved in file b26triangular_panel_mixture.html
Results saved in file b26triangular_panel_mixture.pickle
print(results.short_summary())
Results for model b26triangular_panel_mixture
Nbr of parameters:              8
Sample size:                    752
Observations:                   6768
Excluded data:                  3960
Final log likelihood:           -3645.824
Akaike Information Criterion:   7307.648
Bayesian Information Criterion: 7344.63
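
The two information criteria can be recomputed from the reported log likelihood. The reported values are consistent with AIC = 2K - 2L and BIC = K ln(N) - 2L, where L is the final log likelihood, K = 8 is the number of parameters, and N = 752 is the reported sample size (rather than the 6768 observations). This choice of N is inferred from the numbers above, not taken from the Biogeme source.

```python
import math

K = 8                       # number of estimated parameters
N = 752                     # reported sample size
final_loglike = -3645.824   # reported final log likelihood

aic = 2 * K - 2 * final_loglike
bic = K * math.log(N) - 2 * final_loglike
print(round(aic, 3), round(bic, 2))
```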
pandas_results = results.getEstimatedParameters()
pandas_results
                 Value  Rob. Std err  Rob. t-test  Rob. p-value
ASC_CAR       0.646809      0.271284     2.384250  1.711398e-02
ASC_CAR_S     9.474742      0.846701    11.190188  0.000000e+00
ASC_SM_S      5.066061      0.426483    11.878705  0.000000e+00
ASC_TRAIN     0.421241      0.234063     1.799693  7.190920e-02
ASC_TRAIN_S   0.688281      1.558410     0.441656  6.587382e-01
B_COST       -2.964488      0.517913    -5.723913  1.040981e-08
B_TIME       -6.143703      0.345251   -17.794911  0.000000e+00
B_TIME_S      8.055966      0.437295    18.422268  0.000000e+00
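
As an illustration of how these estimates can be read: under the specification above, the time coefficient is B_TIME + B_TIME_S * xi, with xi triangular on [-1, 1] and mode 0. The sketch below derives the implied support, the standard deviation, and the share of the population with a positive time coefficient, using the closed-form survival function of the triangular distribution. The helper `triangular_survival` is hypothetical and written for this example only.

```python
import math

b_time, b_time_s = -6.143703, 8.055966  # estimates reported in the table above

# Support and standard deviation of the random time coefficient.
lower, upper = b_time - abs(b_time_s), b_time + abs(b_time_s)
std_dev = abs(b_time_s) / math.sqrt(6)  # variance of xi on [-1, 1] is 1/6


def triangular_survival(x: float) -> float:
    """P(xi > x) for xi triangular on [-1, 1] with mode 0."""
    if x <= -1:
        return 1.0
    if x >= 1:
        return 0.0
    if x >= 0:
        return (1 - x) ** 2 / 2
    return 1 - (1 + x) ** 2 / 2


# The coefficient is positive when xi > -b_time / b_time_s (the scale is positive here).
share_positive = triangular_survival(-b_time / b_time_s)
print(lower, upper, std_dev, share_positive)
```

With these estimates, the upper end of the support is positive, so the model assigns a small but nonzero probability to a counterintuitive positive time coefficient. Keep in mind that the example was estimated with few draws, so the numbers illustrate the computation rather than a reliable finding.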


Total running time of the script: (0 minutes 34.863 seconds)
