Triangular mixture of logit
Example of a mixture of logit models estimated using Monte-Carlo integration. The mixing distribution is specified by the user; here, it is a triangular distribution.
- author: Michel Bierlaire, EPFL
- date: Wed Apr 12 18:24:18 2023
import numpy as np
import biogeme.biogeme_logging as blog
import biogeme.biogeme as bio
from biogeme import models
from biogeme.expressions import Beta, bioDraws, log, MonteCarlo
See the data processing script: Data preparation for Swissmetro.
from swissmetro_data import (
    database,
    CHOICE,
    CAR_AV_SP,
    TRAIN_AV_SP,
    TRAIN_TT_SCALED,
    TRAIN_COST_SCALED,
    SM_TT_SCALED,
    SM_COST_SCALED,
    CAR_TT_SCALED,
    CAR_CO_SCALED,
    SM_AV,
)
logger = blog.get_screen_logger(level=blog.INFO)
logger.info('Example b25triangular_mixture.py')
Example b25triangular_mixture.py
Parameters to be estimated.
ASC_CAR = Beta('ASC_CAR', 0, None, None, 0)
ASC_TRAIN = Beta('ASC_TRAIN', 0, None, None, 0)
ASC_SM = Beta('ASC_SM', 0, None, None, 1)
B_COST = Beta('B_COST', 0, None, None, 0)
Define a random parameter with a triangular distribution, designed to be used for Monte-Carlo simulation. The triangular distribution is not directly available from Biogeme. The draws have to be generated by a function provided by the user.
Mean of the distribution.
B_TIME = Beta('B_TIME', 0, None, None, 0)
Scale of the distribution. It is advised not to use 0 as the starting value for this parameter.
B_TIME_S = Beta('B_TIME_S', 1, None, None, 0)
Function generating the draws.
def the_triangular_generator(sample_size: int, number_of_draws: int) -> np.ndarray:
    """
    User-defined random number generator to be submitted to the database.
    Returns an array of shape (sample_size, number_of_draws) containing draws
    from a triangular distribution on [-1, 1] with mode 0.
    See the numpy.random documentation to obtain a list of other distributions.
    """
    return np.random.triangular(-1, 0, 1, (sample_size, number_of_draws))
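As a quick sanity check of what such a generator returns, the sketch below calls a local copy of the function outside Biogeme and compares the sample moments with the theoretical ones. The sample sizes and the fixed seed are arbitrary choices made for this illustration; they are not part of the original example.

```python
import numpy as np


def triangular_generator_copy(sample_size: int, number_of_draws: int) -> np.ndarray:
    """Local copy of the generator above, used only for this check."""
    return np.random.triangular(-1, 0, 1, (sample_size, number_of_draws))


np.random.seed(0)  # fixed seed, for a reproducible illustration only
draws = triangular_generator_copy(sample_size=1000, number_of_draws=200)

# One row per observation, one column per draw.
draws_shape = draws.shape

# For a triangular distribution with bounds a, b and mode c, the variance is
# (a^2 + b^2 + c^2 - ab - ac - bc) / 18, which gives 1/6 for a=-1, b=1, c=0.
theoretical_variance = 1.0 / 6.0
sample_mean = draws.mean()
sample_variance = draws.var()

print(draws_shape, sample_mean, sample_variance, theoretical_variance)
```

The draws are centered at 0 with a standard deviation of 1/sqrt(6), so a parameter multiplying them plays the role of a scale rather than a standard deviation.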
Associate the function with a name.
myRandomNumberGenerators = {
    'TRIANGULAR': (
        the_triangular_generator,
        'Draws from a triangular distribution',
    )
}
Submit the generator to the database.
database.setRandomNumberGenerators(myRandomNumberGenerators)
Define the random parameter as the mean plus the scale multiplied by draws from the user-defined triangular generator.
B_TIME_RND = B_TIME + B_TIME_S * bioDraws('B_TIME_RND', 'TRIANGULAR')
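With this specification, B_TIME_RND follows a symmetric triangular distribution on [B_TIME - |B_TIME_S|, B_TIME + |B_TIME_S|], with standard deviation |B_TIME_S| / sqrt(6). The sketch below is not part of the original script: the helper `triangular_share_above` is a hypothetical name, and the numerical values are the rounded estimates read from the iteration log of this example, so the result is only indicative.

```python
import math


def triangular_share_above(threshold: float, mean: float, scale: float) -> float:
    """Probability that mean + scale * T exceeds threshold, where T is
    symmetric triangular on [-1, 1] with mode 0 (hypothetical helper)."""
    half_width = abs(scale)  # the sign of the scale is irrelevant by symmetry
    z = (threshold - mean) / half_width
    if z <= -1:
        return 1.0
    if z >= 1:
        return 0.0
    if z <= 0:
        # CDF of T on [-1, 0] is (1 + z)^2 / 2
        return 1.0 - 0.5 * (1.0 + z) ** 2
    # Survival function of T on [0, 1] is (1 - z)^2 / 2
    return 0.5 * (1.0 - z) ** 2


# Rounded values taken from the iteration log (assumption: final estimates).
b_time, b_time_s = -2.2, 3.9

support = (b_time - abs(b_time_s), b_time + abs(b_time_s))
std_dev = abs(b_time_s) / math.sqrt(6)
share_positive = triangular_share_above(0.0, b_time, b_time_s)

print(support, std_dev, share_positive)
```

With these rounded values, roughly 9.5% of the mass of the time coefficient lies above zero, which is the kind of diagnostic a bounded mixing distribution makes easy to compute in closed form.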
Definition of the utility functions.
V1 = ASC_TRAIN + B_TIME_RND * TRAIN_TT_SCALED + B_COST * TRAIN_COST_SCALED
V2 = ASC_SM + B_TIME_RND * SM_TT_SCALED + B_COST * SM_COST_SCALED
V3 = ASC_CAR + B_TIME_RND * CAR_TT_SCALED + B_COST * CAR_CO_SCALED
Associate the utility functions with the numbering of the alternatives.
V = {1: V1, 2: V2, 3: V3}
Associate the availability conditions with the alternatives.
av = {1: TRAIN_AV_SP, 2: SM_AV, 3: CAR_AV_SP}
Conditional on B_TIME_RND, we have a logit model (called the kernel).
prob = models.logit(V, av, CHOICE)
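To make the role of the availability conditions concrete, here is a minimal NumPy sketch of a logit choice probability in which unavailable alternatives are excluded from the choice set. It only illustrates the formula; it is not Biogeme's implementation, and the function name `logit_probability` and the numerical values are assumptions made for this illustration.

```python
import numpy as np


def logit_probability(utilities: dict, availability: dict, chosen: int) -> float:
    """Logit probability of the chosen alternative, restricted to the
    available alternatives (illustrative helper, not the Biogeme API)."""
    ids = list(utilities)
    v = np.array([utilities[i] for i in ids], dtype=float)
    av = np.array([availability[i] for i in ids], dtype=float)
    # Subtract the largest available utility for numerical stability.
    v_max = v[av > 0].max()
    # Unavailable alternatives get utility -inf, so that exp(.) is 0.
    shifted = np.where(av > 0, v - v_max, -np.inf)
    expv = np.exp(shifted)
    return float(expv[ids.index(chosen)] / expv.sum())


# Hypothetical utilities for train (1), Swissmetro (2) and car (3);
# the car is not available to this traveler.
example_v = {1: -0.5, 2: 0.2, 3: 0.4}
example_av = {1: 1, 2: 1, 3: 0}
p_sm = logit_probability(example_v, example_av, chosen=2)
p_train = logit_probability(example_v, example_av, chosen=1)
print(p_sm, p_train)
```

The probabilities of the available alternatives sum to one, and the unavailable alternative does not affect them, whatever its utility.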
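We integrate over B_TIME_RND using Monte-Carlo integration, and take the logarithm of the result to obtain the contribution of each observation to the log likelihood.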
logprob = log(MonteCarlo(prob))
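Conceptually, the Monte-Carlo approximation replaces the integral of the kernel over the distribution of B_TIME_RND by an average over draws. The sketch below reproduces that calculation for a single observation with plain NumPy. It is only an illustration under stated assumptions: the attribute values are hypothetical, the parameter values are the rounded figures from the iteration log, all alternatives are assumed available, and the code does not reflect how Biogeme organizes the computation internally.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed, for illustration only
n_draws = 1000

# Rounded parameter values from the iteration log (assumption).
b_time, b_time_s, b_cost = -2.2, 3.9, -1.3
asc = np.array([-0.4, 0.0, 0.13])  # train, Swissmetro, car

# Hypothetical scaled attributes of one observation.
travel_time = np.array([1.1, 0.6, 1.2])
cost = np.array([0.5, 0.6, 0.8])
chosen_index = 1  # Swissmetro

# One value of the random coefficient per draw.
xi = rng.triangular(-1, 0, 1, size=n_draws)
b_time_rnd = b_time + b_time_s * xi

# Utilities for each draw: array of shape (n_draws, 3).
v = asc + b_time_rnd[:, None] * travel_time + b_cost * cost
v = v - v.max(axis=1, keepdims=True)  # numerical stability

# Logit kernel evaluated at each draw.
expv = np.exp(v)
kernel = expv[:, chosen_index] / expv.sum(axis=1)

# Monte-Carlo approximation of the integral, then the logarithm.
simulated_probability = kernel.mean()
log_probability = np.log(simulated_probability)
print(simulated_probability, log_probability)
```

Note that the logarithm is taken after averaging over the draws: the log of the mean is not the mean of the logs.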
Create the Biogeme object. As the objective is to illustrate the syntax, we calculate the Monte-Carlo approximation with a small number of draws. To achieve that, we provide a parameter file different from the default one.
the_biogeme = bio.BIOGEME(database, logprob, parameter_file='few_draws.toml')
the_biogeme.modelName = 'b25triangular_mixture'
File few_draws.toml has been parsed.
Estimate the parameters.
results = the_biogeme.estimate()
*** Initial values of the parameters are obtained from the file __b25triangular_mixture.iter
Cannot read file __b25triangular_mixture.iter. Statement is ignored.
Optimization algorithm: hybrid Newton/BFGS with simple bounds [simple_bounds]
** Optimization: Newton with trust region for simple bounds
Iter.   ASC_CAR ASC_TRAIN    B_COST    B_TIME  B_TIME_S  Function   Relgrad    Radius       Rho
    0     -0.24     -0.64     -0.43        -1       0.9   5.4e+03     0.042        10         1 ++
    1     0.033      -0.5        -1      -1.7       3.1   5.2e+03     0.029     1e+02         1 ++
    2     0.095     -0.43      -1.2      -2.1       3.5   5.2e+03    0.0039     1e+03       1.1 ++
    3      0.13      -0.4      -1.3      -2.2       3.8   5.2e+03   0.00049     1e+04       1.1 ++
    4      0.13      -0.4      -1.3      -2.2       3.9   5.2e+03   6.4e-06     1e+05         1 ++
    5      0.13      -0.4      -1.3      -2.2       3.9   5.2e+03   1.1e-09     1e+05         1 ++
Results saved in file b25triangular_mixture.html
Results saved in file b25triangular_mixture.pickle
print(results.short_summary())
Results for model b25triangular_mixture
Nbr of parameters: 5
Sample size: 6768
Excluded data: 3960
Final log likelihood: -5215.848
Akaike Information Criterion: 10441.7
Bayesian Information Criterion: 10475.8
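The two information criteria in this summary can be recomputed from the final log likelihood, the number of parameters and the sample size, using the standard definitions AIC = 2k - 2LL and BIC = k ln(n) - 2LL. The short check below uses only the figures printed above; the variable names are chosen for this illustration.

```python
import math

# Figures reported in the summary above.
final_log_likelihood = -5215.848
number_of_parameters = 5
sample_size = 6768

# Standard definitions of the information criteria.
aic = 2 * number_of_parameters - 2 * final_log_likelihood
bic = number_of_parameters * math.log(sample_size) - 2 * final_log_likelihood

print(f'AIC: {aic:.1f}')
print(f'BIC: {bic:.1f}')
```

Both values agree with the reported ones after rounding to one decimal, which confirms that the criteria are based on the 6768 observations retained after exclusion.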
pandas_results = results.getEstimatedParameters()
pandas_results
Total running time of the script: (0 minutes 8.600 seconds)