Mixture with lognormal distribution

Example of a mixture of logit models, estimated using Monte-Carlo integration. The mixing distribution is lognormal.

Author: Michel Bierlaire, EPFL
Date: Mon Apr 10 12:11:53 2023

import biogeme.biogeme_logging as blog
import biogeme.biogeme as bio
from biogeme import models
from biogeme.expressions import (
    Beta,
    exp,
    log,
    MonteCarlo,
    bioDraws,
)

See the data processing script: Data preparation for Swissmetro.

from swissmetro_data import (
    database,
    CHOICE,
    SM_AV,
    CAR_AV_SP,
    TRAIN_AV_SP,
    TRAIN_TT_SCALED,
    TRAIN_COST_SCALED,
    SM_TT_SCALED,
    SM_COST_SCALED,
    CAR_TT_SCALED,
    CAR_CO_SCALED,
)

logger = blog.get_screen_logger(level=blog.INFO)
logger.info('Example b17lognormal_mixture.py')

Example b17lognormal_mixture.py

Parameters to be estimated.

ASC_CAR = Beta('ASC_CAR', 0, None, None, 0)
ASC_TRAIN = Beta('ASC_TRAIN', 0, None, None, 0)
ASC_SM = Beta('ASC_SM', 0, None, None, 1)
B_COST = Beta('B_COST', 0, None, None, 0)

Define the location parameter of the random time coefficient. B_TIME is the mean of the underlying normal distribution, that is, the mean of the logarithm of the negated coefficient.

B_TIME = Beta('B_TIME', 0, None, None, 0)

Define the scale parameter, that is, the standard deviation of the underlying normal distribution. It is advised not to use 0 as its starting value.

B_TIME_S = Beta('B_TIME_S', 1, -2, 2, 0)

Define a random parameter, designed to be used for Monte-Carlo simulation. Its negative is lognormally distributed: the leading minus sign makes the time coefficient strictly negative for every draw.

B_TIME_RND = -exp(B_TIME + B_TIME_S * bioDraws('B_TIME_RND', 'NORMAL'))
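The effect of this specification can be checked outside Biogeme. The following is a minimal standard-library sketch, not Biogeme code: the function `draw_time_coefficient` and the values of `mu` and `sigma` are illustrative assumptions. It draws from -exp(mu + sigma * xi) with xi standard normal, checks that every draw is negative, and compares the sample mean with the lognormal mean formula -exp(mu + sigma^2 / 2).

```python
import math
import random


def draw_time_coefficient(mu, sigma, rng):
    """One draw of -exp(mu + sigma * xi), with xi ~ N(0, 1)."""
    xi = rng.gauss(0.0, 1.0)
    return -math.exp(mu + sigma * xi)


rng = random.Random(42)
mu, sigma = 0.5, 1.0  # arbitrary illustrative values, not estimates
draws = [draw_time_coefficient(mu, sigma, rng) for _ in range(20000)]

# Every draw is strictly negative because exp() is strictly positive.
all_negative = all(d < 0 for d in draws)

# The sample mean should be close to the closed-form lognormal mean.
sample_mean = sum(draws) / len(draws)
theoretical_mean = -math.exp(mu + sigma**2 / 2)
print(all_negative, sample_mean, theoretical_mean)
```

The sample mean only approximates the closed-form value; the gap shrinks as the number of draws grows.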

Definition of the utility functions.

V1 = ASC_TRAIN + B_TIME_RND * TRAIN_TT_SCALED + B_COST * TRAIN_COST_SCALED
V2 = ASC_SM + B_TIME_RND * SM_TT_SCALED + B_COST * SM_COST_SCALED
V3 = ASC_CAR + B_TIME_RND * CAR_TT_SCALED + B_COST * CAR_CO_SCALED

Associate utility functions with the numbering of alternatives.

V = {1: V1, 2: V2, 3: V3}

Associate the availability conditions with the alternatives.

av = {1: TRAIN_AV_SP, 2: SM_AV, 3: CAR_AV_SP}

Conditional on B_TIME_RND, we have a logit model (called the kernel).

prob = models.logit(V, av, CHOICE)

We integrate over B_TIME_RND using Monte-Carlo.

logprob = log(MonteCarlo(prob))
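Conceptually, the Monte-Carlo approximation averages the kernel probability over draws of the random coefficient and then takes the logarithm. The sketch below illustrates that idea for a single observation using only the standard library. It is not how Biogeme implements `MonteCarlo`; the helper names, the observation `obs` and the parameter values in `params` are hypothetical.

```python
import math
import random


def logit_probability(utilities, availability, chosen):
    """Logit probability of `chosen`; assumes `chosen` is available."""
    denominator = sum(
        math.exp(v) for alt, v in utilities.items() if availability[alt]
    )
    return math.exp(utilities[chosen]) / denominator


def simulated_log_probability(obs, params, n_draws, rng):
    """Log of the average kernel probability over `n_draws` draws."""
    total = 0.0
    for _ in range(n_draws):
        xi = rng.gauss(0.0, 1.0)
        b_time_rnd = -math.exp(params['B_TIME'] + params['B_TIME_S'] * xi)
        utilities = {
            alt: params['ASC'][alt]
            + b_time_rnd * obs['time'][alt]
            + params['B_COST'] * obs['cost'][alt]
            for alt in obs['time']
        }
        total += logit_probability(utilities, obs['av'], obs['choice'])
    return math.log(total / n_draws)


# Hypothetical observation and parameter values, for illustration only.
obs = {
    'time': {1: 1.1, 2: 0.6, 3: 1.2},
    'cost': {1: 0.5, 2: 0.6, 3: 0.7},
    'av': {1: 1, 2: 1, 3: 1},
    'choice': 2,
}
params = {
    'ASC': {1: -0.4, 2: 0.0, 3: 0.1},
    'B_COST': -1.0,
    'B_TIME': 0.5,
    'B_TIME_S': 1.0,
}
log_p = simulated_log_probability(obs, params, 1000, random.Random(0))
print(log_p)
```

Note that the logarithm is applied after averaging, matching `log(MonteCarlo(prob))`: averaging the log of the kernel instead would estimate a different quantity.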

# Create the Biogeme object. As the objective is to illustrate the
# syntax, we calculate the Monte-Carlo approximation with a small
# number of draws. To achieve that, we provide a parameter file
# different from the default one.
the_biogeme = bio.BIOGEME(database, logprob, parameter_file='few_draws.toml')
the_biogeme.modelName = '17lognormal_mixture'

File few_draws.toml has been parsed.

Estimate the parameters.

results = the_biogeme.estimate()

*** Initial values of the parameters are obtained from the file __17lognormal_mixture.iter
Cannot read file __17lognormal_mixture.iter. Statement is ignored.
Optimization algorithm: hybrid Newton/BFGS with simple bounds [simple_bounds]
** Optimization: Newton with trust region for simple bounds
Iter.         ASC_CAR       ASC_TRAIN          B_COST          B_TIME        B_TIME_S     Function    Relgrad   Radius      Rho
    0            0.18            -0.4              -1            0.36            0.98      5.3e+03      0.018       10        1   ++
    1            0.16           -0.36            -1.3            0.55             1.1      5.2e+03     0.0026    1e+02      1.1   ++
    2            0.14           -0.37            -1.4            0.54             1.2      5.2e+03    8.8e-05    1e+03        1   ++
    3            0.14           -0.37            -1.4            0.54             1.2      5.2e+03    3.3e-07    1e+03        1   ++
Results saved in file 17lognormal_mixture.html
Results saved in file 17lognormal_mixture.pickle

print(results.short_summary())

Results for model 17lognormal_mixture
Nbr of parameters:              5
Sample size:                    6768
Excluded data:                  3960
Final log likelihood:           -5239.842
Akaike Information Criterion:   10489.68
Bayesian Information Criterion: 10523.78
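As a sanity check, the two information criteria in the summary can be recomputed from the number of parameters, the sample size and the final log likelihood, assuming the usual definitions AIC = 2K - 2LL and BIC = K ln(N) - 2LL. The variable names below are illustrative.

```python
import math

# Values copied from the summary above.
n_parameters = 5
sample_size = 6768
final_loglikelihood = -5239.842

# Usual definitions of the two criteria (assumed here).
aic = 2 * n_parameters - 2 * final_loglikelihood
bic = n_parameters * math.log(sample_size) - 2 * final_loglikelihood

print(f'AIC = {aic:.2f}, BIC = {bic:.2f}')
```

Both values agree with the reported ones to two decimal places.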

pandas_results = results.getEstimatedParameters()
pandas_results

Parameter      Value  Rob. Std err  Rob. t-test  Rob. p-value
ASC_CAR     0.144575      0.058545     2.469481  1.353091e-02
ASC_TRAIN  -0.373762      0.071593    -5.220664  1.782824e-07
B_COST     -1.357323      0.092802   -14.626080  0.000000e+00
B_TIME      0.543241      0.068780     7.898244  2.886580e-15
B_TIME_S    1.161331      0.100738    11.528253  0.000000e+00
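B_TIME and B_TIME_S are parameters of the underlying normal distribution, so they are not directly the mean and standard deviation of the time coefficient. A short sketch of the conversion follows, using the standard lognormal formulas and the point estimates from the table above. The variable names are illustrative, and the results are expressed in the units of the scaled travel time variables.

```python
import math

# Point estimates copied from the table above.
b_time = 0.543241
b_time_s = 1.161331

# B_TIME_RND = -exp(b_time + b_time_s * xi), with xi ~ N(0, 1).
# Median: xi = 0 at the median of the normal draw.
median_coefficient = -math.exp(b_time)
# Mean of a lognormal variable: exp(mu + sigma^2 / 2), negated here.
mean_coefficient = -math.exp(b_time + b_time_s**2 / 2)

print(f'median = {median_coefficient:.3f}, mean = {mean_coefficient:.3f}')
```

The median comes out at roughly -1.72 and the mean at roughly -3.38. The mean is larger in magnitude than the median because the lognormal distribution has a long tail.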

Total running time of the script: (0 minutes 6.844 seconds)
