Simulation of a cross-nested logit model

Illustration of the application of an estimated model.

author:: Michel Bierlaire, EPFL
date:: Sun Apr 9 18:08:29 2023

import sys

import biogeme.biogeme as bio
import biogeme.exceptions as excep
import biogeme.results as res
from biogeme import models
from biogeme.expressions import Beta, Derive
from biogeme.nests import OneNestForCrossNestedLogit, NestsForCrossNestedLogit

See the data processing script: Data preparation for Swissmetro.

from swissmetro_data import (
    database,
    SM_AV,
    CAR_AV_SP,
    TRAIN_AV_SP,
    TRAIN_TT,
    TRAIN_COST_SCALED,
    SM_TT,
    SM_COST_SCALED,
    CAR_TT,
    CAR_CO_SCALED,
)

Simulation must be done with estimated value of the parameters. We set them to zero here , and read the values from the file created after estimation.

ASC_CAR = Beta('ASC_CAR', 0, None, None, 0)
ASC_TRAIN = Beta('ASC_TRAIN', 0, None, None, 0)
ASC_SM = Beta('ASC_SM', 0, None, None, 1)
B_TIME = Beta('B_TIME', 0, None, None, 0)
B_COST = Beta('B_COST', 0, None, None, 0)

Nest parameters.

MU_EXISTING = Beta('MU_EXISTING', 1, 1, None, 0)
MU_PUBLIC = Beta('MU_PUBLIC', 1, 1, None, 0)

Nest membership parameters.

ALPHA_EXISTING = Beta('ALPHA_EXISTING', 0.5, 0, 1, 0)
ALPHA_PUBLIC = 1 - ALPHA_EXISTING

Definition of the utility functions. Note that we need to have explicitly the unscaled variables if we want to calculate the elasticities.

V1 = ASC_TRAIN + B_TIME * TRAIN_TT / 100 + B_COST * TRAIN_COST_SCALED
V2 = ASC_SM + B_TIME * SM_TT / 100 + B_COST * SM_COST_SCALED
V3 = ASC_CAR + B_TIME * CAR_TT / 100 + B_COST * CAR_CO_SCALED

Associate utility functions with the numbering of alternatives.

V = {1: V1, 2: V2, 3: V3}

Associate the availability conditions with the alternatives.

av = {1: TRAIN_AV_SP, 2: SM_AV, 3: CAR_AV_SP}

Definition of nests.

nest_existing = OneNestForCrossNestedLogit(
    nest_param=MU_EXISTING,
    dict_of_alpha={1: ALPHA_EXISTING, 2: 0.0, 3: 1.0},
    name='existing',
)

nest_public = OneNestForCrossNestedLogit(
    nest_param=MU_PUBLIC, dict_of_alpha={1: ALPHA_PUBLIC, 2: 1.0, 3: 0.0}, name='public'
)

nests = NestsForCrossNestedLogit(
    choice_set=[1, 2, 3], tuple_of_nests=(nest_existing, nest_public)
)

Read the estimation results from the pickle file.

try:
    results = res.bioResults(pickle_file='saved_results/b11cnl.pickle')
except excep.BiogemeError:
    print('Run first the script b11cnl.py in order to generate the file 11cnl.pickle.')
    sys.exit()

print('Estimation results: ', results.get_estimated_parameters())

Estimation results:                     Value  Rob. Std err  Rob. t-test  Rob. p-value
ALPHA_EXISTING  0.495072      0.034752    14.246000  0.000000e+00
ASC_CAR        -0.240459      0.053450    -4.498764  6.834958e-06
ASC_TRAIN       0.098277      0.069978     1.404403  1.601989e-01
B_COST         -0.818885      0.058971   -13.886130  0.000000e+00
B_TIME         -0.776846      0.102380    -7.587854  3.241851e-14
MU_EXISTING     2.514876      0.248325    10.127363  0.000000e+00
MU_PUBLIC       4.113595      0.496722     8.281479  2.220446e-16

print('Calculating correlation matrix. It may generate numerical warnings from scipy.')
corr = nests.correlation(
    parameters=results.get_beta_values(),
    alternatives_names={1: 'Train', 2: 'Swissmetro', 3: 'Car'},
)
corr

Calculating correlation matrix. It may generate numerical warnings from scipy.
/Users/bierlair/venv312/lib/python3.12/site-packages/scipy/integrate/_quadpack_py.py:1260: IntegrationWarning: The algorithm does not converge.  Roundoff error is detected
  in the extrapolation table.  It is assumed that the requested tolerance
  cannot be achieved, and that the returned result (if full_output = 1) is
  the best which can be obtained.
  quad_r = quad(f, low, high, args=args, full_output=self.full_output,
/Users/bierlair/venv312/lib/python3.12/site-packages/scipy/integrate/_quadpack_py.py:1260: IntegrationWarning: The integral is probably divergent, or slowly convergent.
  quad_r = quad(f, low, high, args=args, full_output=self.full_output,
/Users/bierlair/venv312/lib/python3.12/site-packages/scipy/integrate/_quadpack_py.py:1260: IntegrationWarning: The maximum number of subdivisions (50) has been achieved.
  If increasing the limit yields no improvement it is advised to analyze
  the integrand in order to determine the difficulties.  If the position of a
  local difficulty can be determined (singularity, discontinuity) one will
  probably gain from splitting up the interval and calling the integrator
  on the subranges.  Perhaps a special-purpose integrator should be used.
  quad_r = quad(f, low, high, args=args, full_output=self.full_output,

	Train	Swissmetro	Car
Train	1.000000	6.178004e-01	5.527228e-01
Swissmetro	0.617800	1.000000e+00	8.281820e-12
Car	0.552723	8.281820e-12	1.000000e+00

The choice model is a cross-nested logit, with availability conditions.

prob1 = models.cnl(V, av, nests, 1)
prob2 = models.cnl(V, av, nests, 2)
prob3 = models.cnl(V, av, nests, 3)

We calculate elasticities. It is important that the variables explicitly appear as such in the model. If not, the derivative will be zero, as well as the elasticities.

genelas1 = Derive(prob1, 'TRAIN_TT') * TRAIN_TT / prob1
genelas2 = Derive(prob2, 'SM_TT') * SM_TT / prob2
genelas3 = Derive(prob3, 'CAR_TT') * CAR_TT / prob3

We report the probability of each alternative and the elasticities.

simulate = {
    'Prob. train': prob1,
    'Prob. Swissmetro': prob2,
    'Prob. car': prob3,
    'Elas. 1': genelas1,
    'Elas. 2': genelas2,
    'Elas. 3': genelas3,
}

Create the Biogeme object.

biosim = bio.BIOGEME(database, simulate)
biosim.modelName = 'b11cnl_simul'

Perform the simulation.

simresults = biosim.simulate(results.get_beta_values())

print('Simulation results')
simresults

Simulation results

	Prob. train	Prob. Swissmetro	Prob. car	Elas. 1	Elas. 2	Elas. 3
0	0.151845	0.627166	0.220989	-1.712384	-0.214602	-1.238134
1	0.191734	0.648405	0.159861	-1.371095	-0.197321	-1.485995
2	0.107228	0.604797	0.287975	-2.231103	-0.232560	-0.994183
3	0.117302	0.534994	0.347704	-1.902932	-0.282801	-0.549944
4	0.122367	0.645963	0.231670	-2.053924	-0.192771	-0.886473
...	...	...	...	...	...	...
8446	0.202324	0.688887	0.108788	-1.246225	-0.140072	-1.875101
8447	0.170793	0.659829	0.169378	-1.482446	-0.162456	-0.972154
8448	0.138390	0.653981	0.207629	-1.626039	-0.151241	-0.853487
8449	0.147761	0.704414	0.147825	-1.709362	-0.132484	-0.990522
8450	0.198138	0.666110	0.135751	-1.349649	-0.163036	-1.349502

6768 rows × 6 columns

print(f'Aggregate share of train: {100*simresults["Prob. train"].mean():.1f}%')

Aggregate share of train: 13.1%

Total running time of the script: (3 minutes 18.801 seconds)

Gallery generated by Sphinx-Gallery