Triangular mixture of logit

Example of a mixture of logit models, estimated using Monte-Carlo integration. The mixing distribution is specified by the user: here, a triangular distribution.

author: Michel Bierlaire, EPFL

date: Wed Apr 12 18:24:18 2023

import numpy as np
import biogeme.biogeme_logging as blog
import biogeme.biogeme as bio
from biogeme import models
from biogeme.expressions import Beta, bioDraws, log, MonteCarlo

See the data processing script: Data preparation for Swissmetro.

from swissmetro_data import (
    database,
    CHOICE,
    CAR_AV_SP,
    TRAIN_AV_SP,
    TRAIN_TT_SCALED,
    TRAIN_COST_SCALED,
    SM_TT_SCALED,
    SM_COST_SCALED,
    CAR_TT_SCALED,
    CAR_CO_SCALED,
    SM_AV,
)

logger = blog.get_screen_logger(level=blog.INFO)
logger.info('Example b25triangular_mixture.py')
Example b25triangular_mixture.py

Parameters to be estimated.

ASC_CAR = Beta('ASC_CAR', 0, None, None, 0)
ASC_TRAIN = Beta('ASC_TRAIN', 0, None, None, 0)
ASC_SM = Beta('ASC_SM', 0, None, None, 1)
B_COST = Beta('B_COST', 0, None, None, 0)

Define a random parameter with a triangular distribution, designed to be used for Monte-Carlo simulation. The triangular distribution is not directly available from Biogeme. The draws have to be generated by a function provided by the user.

Mean of the distribution.

B_TIME = Beta('B_TIME', 0, None, None, 0)

Scale of the distribution. It is advised not to use 0 as starting value for the following parameter.

B_TIME_S = Beta('B_TIME_S', 1, None, None, 0)

Function generating the draws.

def the_triangular_generator(sample_size: int, number_of_draws: int) -> np.ndarray:
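    """
    User-defined random number generator, to be submitted to the database.

    Returns an array of shape (sample_size, number_of_draws) with draws from
    the triangular distribution on [-1, 1] with mode 0.
    See the numpy.random documentation for a list of other distributions.
    """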
    return np.random.triangular(-1, 0, 1, (sample_size, number_of_draws))
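
As a quick sanity check of the generator, the sketch below draws a sample with the same np.random.triangular(-1, 0, 1, ...) call and compares its moments with the theoretical values for a symmetric triangular distribution on [-1, 1]: mean 0 and variance 1/6. The sample dimensions and the seed are arbitrary choices made for illustration, not values used by Biogeme.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only to make this check reproducible

# Illustrative sizes (assumption): Biogeme supplies its own values.
sample_size, number_of_draws = 1000, 200
draws = np.random.triangular(-1, 0, 1, (sample_size, number_of_draws))

# One row per observation, one column per draw.
print('Shape:', draws.shape)

# Empirical moments next to the theoretical ones.
print('Mean:', draws.mean(), 'expected:', 0.0)
print('Variance:', draws.var(), 'expected:', 1 / 6)
```

With this distribution, the random coefficient B_TIME + B_TIME_S * draw has mean B_TIME and standard deviation |B_TIME_S| / sqrt(6).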

Associate the function with a name.

myRandomNumberGenerators = {
    'TRIANGULAR': (
        the_triangular_generator,
        'Draws from a triangular distribution',
    )
}

Submit the generator to the database.

database.setRandomNumberGenerators(myRandomNumberGenerators)

Combine the mean, the scale, and the triangular draws into the random parameter used in the utility functions.

B_TIME_RND = B_TIME + B_TIME_S * bioDraws('B_TIME_RND', 'TRIANGULAR')

Definition of the utility functions.

V1 = ASC_TRAIN + B_TIME_RND * TRAIN_TT_SCALED + B_COST * TRAIN_COST_SCALED
V2 = ASC_SM + B_TIME_RND * SM_TT_SCALED + B_COST * SM_COST_SCALED
V3 = ASC_CAR + B_TIME_RND * CAR_TT_SCALED + B_COST * CAR_CO_SCALED

Associate the utility functions with the numbering of the alternatives.

V = {1: V1, 2: V2, 3: V3}

Associate the availability conditions with the alternatives.

av = {1: TRAIN_AV_SP, 2: SM_AV, 3: CAR_AV_SP}

Conditional on B_TIME_RND, we have a logit model (called the kernel).

prob = models.logit(V, av, CHOICE)

We integrate over B_TIME_RND using Monte-Carlo integration.

logprob = log(MonteCarlo(prob))
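
To make the simulation step concrete, here is a minimal NumPy sketch of what log(MonteCarlo(prob)) approximates for a single observation: the logit probability of the chosen alternative is evaluated for each draw of the random coefficient, the values are averaged, and the logarithm is taken. The attribute values, parameter values, and number of draws are made up for illustration, and availability conditions are ignored. This is not Biogeme's internal implementation.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed

# Hypothetical scaled attributes for one observation: train, Swissmetro, car.
travel_time = np.array([1.2, 0.6, 1.0])
cost = np.array([0.5, 0.6, 0.4])
chosen = 1  # position of the chosen alternative (here, Swissmetro)

# Hypothetical parameter values.
asc = np.array([-0.4, 0.0, 0.1])
b_cost, b_time, b_time_s = -1.3, -2.2, 3.9

# Draws of the random coefficient: B_TIME + B_TIME_S * triangular(-1, 0, 1).
number_of_draws = 10_000
b_time_rnd = b_time + b_time_s * rng.triangular(-1, 0, 1, number_of_draws)

# Utilities for each draw (rows) and each alternative (columns).
utilities = asc + b_time_rnd[:, None] * travel_time + b_cost * cost

# Logit kernel: probability of the chosen alternative for each draw.
utilities -= utilities.max(axis=1, keepdims=True)  # numerical stability
exp_u = np.exp(utilities)
kernel = exp_u[:, chosen] / exp_u.sum(axis=1)

# Monte-Carlo approximation of the integral, then the logarithm.
simulated_logprob = np.log(kernel.mean())
print(simulated_logprob)
```

Summing this quantity over all observations gives the simulated log likelihood that is maximized during estimation.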

Create the Biogeme object. As the objective is to illustrate the syntax, we calculate the Monte-Carlo approximation with a small number of draws. To achieve that, we provide a parameter file different from the default one.

the_biogeme = bio.BIOGEME(database, logprob, parameter_file='few_draws.toml')
the_biogeme.modelName = 'b25triangular_mixture'
File few_draws.toml has been parsed.

Estimate the parameters.

results = the_biogeme.estimate()
*** Initial values of the parameters are obtained from the file __b25triangular_mixture.iter
Cannot read file __b25triangular_mixture.iter. Statement is ignored.
Optimization algorithm: hybrid Newton/BFGS with simple bounds [simple_bounds]
** Optimization: Newton with trust region for simple bounds
Iter.         ASC_CAR       ASC_TRAIN          B_COST          B_TIME        B_TIME_S     Function    Relgrad   Radius      Rho
    0           -0.24           -0.64           -0.43              -1             0.9      5.4e+03      0.042       10        1   ++
    1           0.033            -0.5              -1            -1.7             3.1      5.2e+03      0.029    1e+02        1   ++
    2           0.095           -0.43            -1.2            -2.1             3.5      5.2e+03     0.0039    1e+03      1.1   ++
    3            0.13            -0.4            -1.3            -2.2             3.8      5.2e+03    0.00049    1e+04      1.1   ++
    4            0.13            -0.4            -1.3            -2.2             3.9      5.2e+03    6.4e-06    1e+05        1   ++
    5            0.13            -0.4            -1.3            -2.2             3.9      5.2e+03    1.1e-09    1e+05        1   ++
Results saved in file b25triangular_mixture.html
Results saved in file b25triangular_mixture.pickle
print(results.short_summary())
Results for model b25triangular_mixture
Nbr of parameters:              5
Sample size:                    6768
Excluded data:                  3960
Final log likelihood:           -5215.848
Akaike Information Criterion:   10441.7
Bayesian Information Criterion: 10475.8
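
The two information criteria can be recomputed from the figures reported above, using the standard definitions AIC = 2K - 2LL and BIC = K ln(N) - 2LL, where K is the number of parameters, N the sample size, and LL the final log likelihood. The sketch below is only an arithmetic check of the reported summary and does not call Biogeme.

```python
import math

# Values copied from the summary above.
n_parameters = 5
sample_size = 6768
final_loglike = -5215.848

# Standard definitions of the two criteria.
aic = 2 * n_parameters - 2 * final_loglike
bic = n_parameters * math.log(sample_size) - 2 * final_loglike

print(f'AIC: {aic:.1f}')
print(f'BIC: {bic:.1f}')
```

Both values agree with the reported 10441.7 and 10475.8 up to rounding.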
pandas_results = results.getEstimatedParameters()
pandas_results
              Value  Rob. Std err  Rob. t-test  Rob. p-value
ASC_CAR    0.131377      0.052072     2.522989  1.163620e-02
ASC_TRAIN -0.400569      0.065834    -6.084497  1.168578e-09
B_COST    -1.277153      0.085682   -14.905718  0.000000e+00
B_TIME    -2.244201      0.117954   -19.026152  0.000000e+00
B_TIME_S   3.893612      0.292471    13.312801  0.000000e+00
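
Since the draws are triangular on [-1, 1] with mode 0, the estimates imply that the time coefficient is distributed on the interval [B_TIME - |B_TIME_S|, B_TIME + |B_TIME_S|], with mode B_TIME and standard deviation |B_TIME_S| / sqrt(6). The sketch below derives these quantities from the reported values, together with the implied share of the population with a positive time coefficient, using the closed-form cumulative distribution function of the symmetric triangular distribution. It is an illustrative post-processing step, not part of the original script, and it ignores the sampling and simulation error in the estimates.

```python
import math

# Estimates copied from the table above.
b_time = -2.244201
b_time_s = 3.893612

# The distribution is symmetric, so only the magnitude of the scale matters.
scale = abs(b_time_s)
lower, upper = b_time - scale, b_time + scale
std_dev = scale / math.sqrt(6)


def triangular_cdf(x: float) -> float:
    """CDF of the triangular distribution on [-1, 1] with mode 0."""
    if x <= -1:
        return 0.0
    if x <= 0:
        return (x + 1) ** 2 / 2
    if x < 1:
        return 1 - (1 - x) ** 2 / 2
    return 1.0


# B_TIME + scale * draw > 0 is equivalent to draw > -B_TIME / scale.
share_positive = 1 - triangular_cdf(-b_time / scale)

print(f'Support: [{lower:.3f}, {upper:.3f}]')
print(f'Standard deviation: {std_dev:.3f}')
print(f'Share with a positive time coefficient: {share_positive:.3f}')
```

A nonzero share of positive time coefficients is a consequence of the symmetric, unbounded-sign specification; a one-sided mixing distribution would be needed to rule it out.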


Total running time of the script: (0 minutes 8.600 seconds)
