Note
Go to the end to download the full example code.
Simulation of a cross-nested logit model
Illustration of the application of an estimated model.
- author:
Michel Bierlaire, EPFL
- date:
Sun Apr 9 18:08:29 2023
import sys
import biogeme.biogeme as bio
import biogeme.exceptions as excep
import biogeme.results as res
from biogeme import models
from biogeme.expressions import Beta, Derive
from biogeme.nests import OneNestForCrossNestedLogit, NestsForCrossNestedLogit
See the data processing script: Data preparation for Swissmetro.
from swissmetro_data import (
database,
SM_AV,
CAR_AV_SP,
TRAIN_AV_SP,
TRAIN_TT,
TRAIN_COST_SCALED,
SM_TT,
SM_COST_SCALED,
CAR_TT,
CAR_CO_SCALED,
)
Simulation must be done with estimated value of the parameters. We set them to zero here , and read the values from the file created after estimation.
ASC_CAR = Beta('ASC_CAR', 0, None, None, 0)
ASC_TRAIN = Beta('ASC_TRAIN', 0, None, None, 0)
ASC_SM = Beta('ASC_SM', 0, None, None, 1)
B_TIME = Beta('B_TIME', 0, None, None, 0)
B_COST = Beta('B_COST', 0, None, None, 0)
Nest parameters.
MU_EXISTING = Beta('MU_EXISTING', 1, 1, None, 0)
MU_PUBLIC = Beta('MU_PUBLIC', 1, 1, None, 0)
Nest membership parameters.
ALPHA_EXISTING = Beta('ALPHA_EXISTING', 0.5, 0, 1, 0)
ALPHA_PUBLIC = 1 - ALPHA_EXISTING
Definition of the utility functions. Note that we need to have explicitly the unscaled variables if we want to calculate the elasticities.
V1 = ASC_TRAIN + B_TIME * TRAIN_TT / 100 + B_COST * TRAIN_COST_SCALED
V2 = ASC_SM + B_TIME * SM_TT / 100 + B_COST * SM_COST_SCALED
V3 = ASC_CAR + B_TIME * CAR_TT / 100 + B_COST * CAR_CO_SCALED
Associate utility functions with the numbering of alternatives.
V = {1: V1, 2: V2, 3: V3}
Associate the availability conditions with the alternatives.
av = {1: TRAIN_AV_SP, 2: SM_AV, 3: CAR_AV_SP}
Definition of nests.
nest_existing = OneNestForCrossNestedLogit(
nest_param=MU_EXISTING,
dict_of_alpha={1: ALPHA_EXISTING, 2: 0.0, 3: 1.0},
name='existing',
)
nest_public = OneNestForCrossNestedLogit(
nest_param=MU_PUBLIC, dict_of_alpha={1: ALPHA_PUBLIC, 2: 1.0, 3: 0.0}, name='public'
)
nests = NestsForCrossNestedLogit(
choice_set=[1, 2, 3], tuple_of_nests=(nest_existing, nest_public)
)
Read the estimation results from the pickle file.
try:
results = res.bioResults(pickle_file='saved_results/b11cnl.pickle')
except excep.BiogemeError:
print('Run first the script b11cnl.py in order to generate the file 11cnl.pickle.')
sys.exit()
print('Estimation results: ', results.get_estimated_parameters())
Estimation results: Value Rob. Std err Rob. t-test Rob. p-value
ALPHA_EXISTING 0.495072 0.034752 14.246000 0.000000e+00
ASC_CAR -0.240459 0.053450 -4.498764 6.834958e-06
ASC_TRAIN 0.098277 0.069978 1.404403 1.601989e-01
B_COST -0.818885 0.058971 -13.886130 0.000000e+00
B_TIME -0.776846 0.102380 -7.587854 3.241851e-14
MU_EXISTING 2.514876 0.248325 10.127363 0.000000e+00
MU_PUBLIC 4.113595 0.496722 8.281479 2.220446e-16
print('Calculating correlation matrix. It may generate numerical warnings from scipy.')
corr = nests.correlation(
parameters=results.get_beta_values(),
alternatives_names={1: 'Train', 2: 'Swissmetro', 3: 'Car'},
)
corr
Calculating correlation matrix. It may generate numerical warnings from scipy.
/Users/bierlair/venv312/lib/python3.12/site-packages/scipy/integrate/_quadpack_py.py:1260: IntegrationWarning: The algorithm does not converge. Roundoff error is detected
in the extrapolation table. It is assumed that the requested tolerance
cannot be achieved, and that the returned result (if full_output = 1) is
the best which can be obtained.
quad_r = quad(f, low, high, args=args, full_output=self.full_output,
/Users/bierlair/venv312/lib/python3.12/site-packages/scipy/integrate/_quadpack_py.py:1260: IntegrationWarning: The integral is probably divergent, or slowly convergent.
quad_r = quad(f, low, high, args=args, full_output=self.full_output,
/Users/bierlair/venv312/lib/python3.12/site-packages/scipy/integrate/_quadpack_py.py:1260: IntegrationWarning: The maximum number of subdivisions (50) has been achieved.
If increasing the limit yields no improvement it is advised to analyze
the integrand in order to determine the difficulties. If the position of a
local difficulty can be determined (singularity, discontinuity) one will
probably gain from splitting up the interval and calling the integrator
on the subranges. Perhaps a special-purpose integrator should be used.
quad_r = quad(f, low, high, args=args, full_output=self.full_output,
The choice model is a cross-nested logit, with availability conditions.
prob1 = models.cnl(V, av, nests, 1)
prob2 = models.cnl(V, av, nests, 2)
prob3 = models.cnl(V, av, nests, 3)
We calculate elasticities. It is important that the variables explicitly appear as such in the model. If not, the derivative will be zero, as well as the elasticities.
genelas1 = Derive(prob1, 'TRAIN_TT') * TRAIN_TT / prob1
genelas2 = Derive(prob2, 'SM_TT') * SM_TT / prob2
genelas3 = Derive(prob3, 'CAR_TT') * CAR_TT / prob3
We report the probability of each alternative and the elasticities.
simulate = {
'Prob. train': prob1,
'Prob. Swissmetro': prob2,
'Prob. car': prob3,
'Elas. 1': genelas1,
'Elas. 2': genelas2,
'Elas. 3': genelas3,
}
Create the Biogeme object.
biosim = bio.BIOGEME(database, simulate)
biosim.modelName = 'b11cnl_simul'
Perform the simulation.
simresults = biosim.simulate(results.get_beta_values())
print('Simulation results')
simresults
Simulation results
print(f'Aggregate share of train: {100*simresults["Prob. train"].mean():.1f}%')
Aggregate share of train: 13.1%
Total running time of the script: (3 minutes 18.801 seconds)