Segmentations and alternative specific specification
We consider 4 specifications for the constants:
Not segmented
Segmented by GA (yearly subscription to public transport)
Segmented by luggage
Segmented both by GA and luggage
We consider 6 specifications for the time coefficients:
Generic and not segmented
Generic and segmented with first class
Generic and segmented with trip purpose
Alternative specific and not segmented
Alternative specific and segmented with first class
Alternative specific and segmented with trip purpose
We consider 2 specifications for the cost coefficients:
Generic
Alternative specific
We obtain a total of 48 specifications. See Bierlaire and Ortelli (2023).
- author: Michel Bierlaire, EPFL
- date: Thu Jul 13 16:18:10 2023
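The count of 48 follows from combining every choice for the constants with every choice for the time and cost coefficients (4 x 6 x 2). The short sketch below only checks that arithmetic. It does not use Biogeme, and the labels are illustrative names chosen here to mirror the lists above.

```python
from itertools import product

# Illustrative labels for the choices listed above (not Biogeme objects).
constants = ['no_seg', 'GA', 'LUGGAGE', 'GA-LUGGAGE']
time_coefficients = [
    (form, segmentation)
    for form in ('generic', 'altspec')
    for segmentation in ('no_seg', 'FIRST', 'COMMUTERS')
]
cost_coefficients = ['generic', 'altspec']

# Every specification picks one option from each of the three lists.
specifications = list(product(constants, time_coefficients, cost_coefficients))
print(len(constants), len(time_coefficients), len(cost_coefficients))
print(len(specifications))
```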
import numpy as np
from IPython.core.display_functions import display
import biogeme.biogeme as bio
from biogeme import models
from biogeme.expressions import Beta
from biogeme.catalog import segmentation_catalogs, generic_alt_specific_catalogs
from biogeme.results import compile_estimation_results, pareto_optimal
from biogeme.data.swissmetro import (
    read_data,
    CHOICE,
    SM_AV,
    CAR_AV_SP,
    TRAIN_AV_SP,
    TRAIN_TT_SCALED,
    TRAIN_COST_SCALED,
    SM_TT_SCALED,
    SM_COST_SCALED,
    CAR_TT_SCALED,
    CAR_CO_SCALED,
)
Read the data
database = read_data()
Definition of the segmentations.
segmentation_ga = database.generate_segmentation(
    variable='GA', mapping={0: 'noGA', 1: 'GA'}
)
segmentation_luggage = database.generate_segmentation(
    variable='LUGGAGE', mapping={0: 'no_lugg', 1: 'one_lugg', 3: 'several_lugg'}
)
segmentation_first = database.generate_segmentation(
    variable='FIRST', mapping={0: '2nd_class', 1: '1st_class'}
)
We consider two trip purposes: commuters and all others. The segmentation needs a binary variable, so we define it first.
database.data['COMMUTERS'] = np.where(database.data['PURPOSE'] == 1, 1, 0)
segmentation_purpose = database.generate_segmentation(
    variable='COMMUTERS', mapping={0: 'non_commuters', 1: 'commuters'}
)
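To make the effect of a segmentation concrete: the estimation output further down reports a base parameter such as B_TIME together with a shift such as B_TIME_1st_class. This suggests that one category (here 2nd_class) acts as the reference and the other categories receive an additive shift. The sketch below illustrates that reading with a hypothetical helper, `segmented_value`, and made-up numbers. It is not Biogeme's implementation.

```python
def segmented_value(base, shifts, category):
    """Coefficient applying to an observation in `category`.

    `base` is the value for the reference category; `shifts` maps each
    non-reference category to its additive shift (hypothetical layout).
    """
    return base + shifts.get(category, 0.0)


# Illustrative values only, not estimates from the model.
b_time = -0.9
b_time_shifts = {'1st_class': -0.7}

second_class = segmented_value(b_time, b_time_shifts, '2nd_class')
first_class = segmented_value(b_time, b_time_shifts, '1st_class')
print(second_class, first_class)
```

Under this reading, the t-test of a shift parameter indicates whether the segment differs from the reference category, not whether its total coefficient differs from zero.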
Parameters to be estimated.
ASC_CAR = Beta('ASC_CAR', 0, None, None, 0)
ASC_TRAIN = Beta('ASC_TRAIN', 0, None, None, 0)
B_TIME = Beta('B_TIME', 0, None, None, 0)
B_COST = Beta('B_COST', 0, None, None, 0)
Catalogs for the alternative specific constants.
ASC_TRAIN_catalog, ASC_CAR_catalog = segmentation_catalogs(
    generic_name='ASC',
    beta_parameters=[ASC_TRAIN, ASC_CAR],
    potential_segmentations=(
        segmentation_ga,
        segmentation_luggage,
    ),
    maximum_number=2,
)
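The role of maximum_number can be illustrated without Biogeme. Assuming it caps how many of the potential segmentations may be applied simultaneously, two candidates with a cap of 2 give the four options listed for the constants, and two candidates with a cap of 1 (as in the travel time catalog defined next) give three. The helper `segmentation_options` is hypothetical, and its labels merely imitate those printed in the glossary.

```python
from itertools import combinations


def segmentation_options(names, maximum_number):
    """List the ways of combining at most `maximum_number` segmentations."""
    options = ['no_seg']
    for size in range(1, maximum_number + 1):
        for combo in combinations(names, size):
            options.append('-'.join(combo))
    return options


asc_options = segmentation_options(['GA', 'LUGGAGE'], maximum_number=2)
time_options = segmentation_options(['FIRST', 'COMMUTERS'], maximum_number=1)
print(asc_options)
print(time_options)
```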
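Catalog for the travel time coefficient. Note that the function returns a list with one entry per parameter in beta_parameters, and each entry is a dictionary of catalogs indexed by alternative. Here, the list contains only one entry. This is why the left-hand side is written “(B_TIME_catalog_dict,)”, with a trailing comma.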
(B_TIME_catalog_dict,) = generic_alt_specific_catalogs(
    generic_name='B_TIME',
    beta_parameters=[B_TIME],
    alternatives=('TRAIN', 'SM', 'CAR'),
    potential_segmentations=(
        segmentation_first,
        segmentation_purpose,
    ),
    maximum_number=1,
)
Catalog for the travel cost coefficient.
(B_COST_catalog_dict,) = generic_alt_specific_catalogs(
    generic_name='B_COST',
    beta_parameters=[B_COST],
    alternatives=('TRAIN', 'SM', 'CAR'),
)
Definition of the utility functions.
V1 = (
    ASC_TRAIN_catalog
    + B_TIME_catalog_dict['TRAIN'] * TRAIN_TT_SCALED
    + B_COST_catalog_dict['TRAIN'] * TRAIN_COST_SCALED
)
V2 = (
    B_TIME_catalog_dict['SM'] * SM_TT_SCALED
    + B_COST_catalog_dict['SM'] * SM_COST_SCALED
)
V3 = (
    ASC_CAR_catalog
    + B_TIME_catalog_dict['CAR'] * CAR_TT_SCALED
    + B_COST_catalog_dict['CAR'] * CAR_CO_SCALED
)
Associate utility functions with the numbering of alternatives.
V = {1: V1, 2: V2, 3: V3}
Associate the availability conditions with the alternatives.
av = {1: TRAIN_AV_SP, 2: SM_AV, 3: CAR_AV_SP}
Definition of the model. This is the contribution of each observation to the log likelihood function.
logprob = models.loglogit(V, av, CHOICE)
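For readers who want to see what this contribution amounts to numerically, here is a minimal plain-Python sketch of the logit log probability with availability conditions: the utility of the chosen alternative minus the log of the sum of exponentiated utilities over the available alternatives. The function `log_logit` and the utility values are illustrative assumptions. This is not how Biogeme evaluates the expression internally.

```python
import math


def log_logit(utilities, availability, chosen):
    """Log probability of `chosen` under a logit model.

    `utilities` and `availability` are dicts keyed by alternative id;
    only alternatives with a nonzero availability enter the denominator.
    """
    available = [alt for alt, is_av in availability.items() if is_av]
    if chosen not in available:
        raise ValueError('the chosen alternative must be available')
    # Subtract the maximum utility before exponentiating, for stability.
    v_max = max(utilities[alt] for alt in available)
    log_denominator = v_max + math.log(
        sum(math.exp(utilities[alt] - v_max) for alt in available)
    )
    return utilities[chosen] - log_denominator


# Illustrative utilities for train (1), Swissmetro (2) and car (3).
equal_utilities = {1: 0.0, 2: 0.0, 3: 0.0}
all_available = log_logit(equal_utilities, {1: 1, 2: 1, 3: 1}, chosen=2)
no_car = log_logit(equal_utilities, {1: 1, 2: 1, 3: 0}, chosen=2)
print(all_available, no_car)
```

With equal utilities, the probability is one over the number of available alternatives, which makes the effect of the availability conditions easy to check.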
Create the Biogeme object.
the_biogeme = bio.BIOGEME(database, logprob)
the_biogeme.modelName = 'b05alt_spec_segmentation'
the_biogeme.generate_html = False
the_biogeme.generate_pickle = False
Estimate the parameters.
dict_of_results = the_biogeme.estimate_catalog()
Number of estimated models.
print(f'A total of {len(dict_of_results)} models have been estimated')
A total of 48 models have been estimated
All estimation results
compiled_results, specs = compile_estimation_results(
    dict_of_results, use_short_names=True
)
display(compiled_results)
                                 Model_000000    ...  Model_000047
Number of estimated parameters   8               ...  10
Sample size                      10719           ...  10719
Final log likelihood             -8279.12343     ...  -8281.995886
Akaike Information Criterion     16574.24686     ...  16583.991773
Bayesian Information Criterion   16632.485046    ...  16656.789504
ASC_CAR (t-test)                 -0.328 (-5.42)  ...  0.0273 (0.551)
ASC_CAR_GA (t-test)              -1.11 (-7.11)   ...  -1.23 (-7.85)
ASC_TRAIN (t-test)               -0.85 (-9.83)   ...  -1.5 (-18)
ASC_TRAIN_GA (t-test)            1.29 (13.5)     ...  1.37 (19)
B_COST_CAR (t-test)              -0.354 (-4.62)  ...
B_COST_SM (t-test)               -0.757 (-13.9)  ...
B_COST_TRAIN (t-test)            -1.04 (-9.99)   ...
B_TIME (t-test)                  -1.28 (-17.1)   ...  -1.19 (-18.2)
B_TIME_1st_class (t-test)                        ...
B_COST (t-test)                                  ...  -0.704 (-13.5)
B_TIME_CAR (t-test)                              ...
B_TIME_SM (t-test)                               ...
B_TIME_TRAIN (t-test)                            ...
ASC_CAR_one_lugg (t-test)                        ...  -0.0298 (-0.592)
ASC_CAR_several_lugg (t-test)                    ...  -0.455 (-2.09)
ASC_TRAIN_one_lugg (t-test)                      ...  0.562 (7.05)
ASC_TRAIN_several_lugg (t-test)                  ...  0.649 (3.75)
B_TIME_CAR_commuters (t-test)                    ...
B_TIME_SM_commuters (t-test)                     ...
B_TIME_TRAIN_commuters (t-test)                  ...
B_TIME_commuters (t-test)                        ...
B_TIME_CAR_1st_class (t-test)                    ...
B_TIME_SM_1st_class (t-test)                     ...
B_TIME_TRAIN_1st_class (t-test)                  ...
[29 rows x 48 columns]
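The two information criteria in the table can be recomputed from the other rows using the standard definitions, AIC = 2k - 2LL and BIC = k ln(n) - 2LL, where k is the number of estimated parameters, n the sample size and LL the final log likelihood. The sketch below applies them to the figures reported for the first model. The function names `aic` and `bic` are chosen here for illustration.

```python
import math


def aic(loglike, n_params):
    """Akaike Information Criterion: 2k - 2LL."""
    return 2 * n_params - 2 * loglike


def bic(loglike, n_params, sample_size):
    """Bayesian Information Criterion: k ln(n) - 2LL."""
    return n_params * math.log(sample_size) - 2 * loglike


# Values reported above for the first model in the table.
aic_first = aic(loglike=-8279.12343, n_params=8)
bic_first = bic(loglike=-8279.12343, n_params=8, sample_size=10719)
print(aic_first, bic_first)
```

Both criteria penalize additional parameters, and BIC does so more heavily here because ln(10719) is well above 2.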
Glossary: the specification behind each short model name.
for short_name, spec in specs.items():
    print(f'{short_name}\t{spec}')
Model_000000 ASC:GA;B_COST_gen_altspec:altspec;B_TIME:no_seg;B_TIME_gen_altspec:generic
Model_000001 ASC:no_seg;B_COST_gen_altspec:altspec;B_TIME:FIRST;B_TIME_gen_altspec:generic
Model_000002 ASC:no_seg;B_COST_gen_altspec:generic;B_TIME:no_seg;B_TIME_gen_altspec:altspec
Model_000003 ASC:no_seg;B_COST_gen_altspec:altspec;B_TIME:no_seg;B_TIME_gen_altspec:altspec
Model_000004 ASC:LUGGAGE;B_COST_gen_altspec:generic;B_TIME:COMMUTERS;B_TIME_gen_altspec:altspec
Model_000005 ASC:LUGGAGE;B_COST_gen_altspec:generic;B_TIME:no_seg;B_TIME_gen_altspec:generic
Model_000006 ASC:no_seg;B_COST_gen_altspec:altspec;B_TIME:COMMUTERS;B_TIME_gen_altspec:altspec
Model_000007 ASC:LUGGAGE;B_COST_gen_altspec:generic;B_TIME:COMMUTERS;B_TIME_gen_altspec:generic
Model_000008 ASC:GA;B_COST_gen_altspec:generic;B_TIME:COMMUTERS;B_TIME_gen_altspec:generic
Model_000009 ASC:GA;B_COST_gen_altspec:generic;B_TIME:FIRST;B_TIME_gen_altspec:altspec
Model_000010 ASC:LUGGAGE;B_COST_gen_altspec:generic;B_TIME:FIRST;B_TIME_gen_altspec:altspec
Model_000011 ASC:GA-LUGGAGE;B_COST_gen_altspec:generic;B_TIME:COMMUTERS;B_TIME_gen_altspec:altspec
Model_000012 ASC:GA-LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:no_seg;B_TIME_gen_altspec:generic
Model_000013 ASC:GA;B_COST_gen_altspec:altspec;B_TIME:FIRST;B_TIME_gen_altspec:altspec
Model_000014 ASC:GA-LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:COMMUTERS;B_TIME_gen_altspec:generic
Model_000015 ASC:LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:FIRST;B_TIME_gen_altspec:altspec
Model_000016 ASC:no_seg;B_COST_gen_altspec:generic;B_TIME:COMMUTERS;B_TIME_gen_altspec:generic
Model_000017 ASC:LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:no_seg;B_TIME_gen_altspec:generic
Model_000018 ASC:GA;B_COST_gen_altspec:generic;B_TIME:no_seg;B_TIME_gen_altspec:altspec
Model_000019 ASC:no_seg;B_COST_gen_altspec:generic;B_TIME:FIRST;B_TIME_gen_altspec:altspec
Model_000020 ASC:GA;B_COST_gen_altspec:generic;B_TIME:no_seg;B_TIME_gen_altspec:generic
Model_000021 ASC:GA-LUGGAGE;B_COST_gen_altspec:generic;B_TIME:COMMUTERS;B_TIME_gen_altspec:generic
Model_000022 ASC:LUGGAGE;B_COST_gen_altspec:generic;B_TIME:no_seg;B_TIME_gen_altspec:altspec
Model_000023 ASC:LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:COMMUTERS;B_TIME_gen_altspec:generic
Model_000024 ASC:GA-LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:no_seg;B_TIME_gen_altspec:altspec
Model_000025 ASC:GA;B_COST_gen_altspec:altspec;B_TIME:no_seg;B_TIME_gen_altspec:altspec
Model_000026 ASC:GA;B_COST_gen_altspec:generic;B_TIME:COMMUTERS;B_TIME_gen_altspec:altspec
Model_000027 ASC:GA;B_COST_gen_altspec:altspec;B_TIME:COMMUTERS;B_TIME_gen_altspec:generic
Model_000028 ASC:LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:no_seg;B_TIME_gen_altspec:altspec
Model_000029 ASC:GA-LUGGAGE;B_COST_gen_altspec:generic;B_TIME:FIRST;B_TIME_gen_altspec:altspec
Model_000030 ASC:GA-LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:COMMUTERS;B_TIME_gen_altspec:altspec
Model_000031 ASC:no_seg;B_COST_gen_altspec:generic;B_TIME:FIRST;B_TIME_gen_altspec:generic
Model_000032 ASC:GA-LUGGAGE;B_COST_gen_altspec:generic;B_TIME:no_seg;B_TIME_gen_altspec:altspec
Model_000033 ASC:LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:COMMUTERS;B_TIME_gen_altspec:altspec
Model_000034 ASC:no_seg;B_COST_gen_altspec:generic;B_TIME:COMMUTERS;B_TIME_gen_altspec:altspec
Model_000035 ASC:GA;B_COST_gen_altspec:altspec;B_TIME:COMMUTERS;B_TIME_gen_altspec:altspec
Model_000036 ASC:GA-LUGGAGE;B_COST_gen_altspec:generic;B_TIME:FIRST;B_TIME_gen_altspec:generic
Model_000037 ASC:GA-LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:FIRST;B_TIME_gen_altspec:altspec
Model_000038 ASC:LUGGAGE;B_COST_gen_altspec:generic;B_TIME:FIRST;B_TIME_gen_altspec:generic
Model_000039 ASC:no_seg;B_COST_gen_altspec:altspec;B_TIME:COMMUTERS;B_TIME_gen_altspec:generic
Model_000040 ASC:GA;B_COST_gen_altspec:altspec;B_TIME:FIRST;B_TIME_gen_altspec:generic
Model_000041 ASC:no_seg;B_COST_gen_altspec:altspec;B_TIME:FIRST;B_TIME_gen_altspec:altspec
Model_000042 ASC:LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:FIRST;B_TIME_gen_altspec:generic
Model_000043 ASC:no_seg;B_COST_gen_altspec:altspec;B_TIME:no_seg;B_TIME_gen_altspec:generic
Model_000044 ASC:no_seg;B_COST_gen_altspec:generic;B_TIME:no_seg;B_TIME_gen_altspec:generic
Model_000045 ASC:GA-LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:FIRST;B_TIME_gen_altspec:generic
Model_000046 ASC:GA;B_COST_gen_altspec:generic;B_TIME:FIRST;B_TIME_gen_altspec:generic
Model_000047 ASC:GA-LUGGAGE;B_COST_gen_altspec:generic;B_TIME:no_seg;B_TIME_gen_altspec:generic
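Each glossary entry is a string of name:choice pairs separated by semicolons, as the printed lines show. A small helper can turn such a string into a dictionary, which is convenient for filtering models by a particular choice. The function `parse_specification` is a hypothetical convenience written for this illustration and is not part of Biogeme.

```python
def parse_specification(spec):
    """Split 'name:choice;name:choice;...' into a dictionary."""
    return dict(item.split(':', 1) for item in spec.split(';'))


# First entry of the glossary printed above.
spec = 'ASC:GA;B_COST_gen_altspec:altspec;B_TIME:no_seg;B_TIME_gen_altspec:generic'
choices = parse_specification(spec)
print(choices['ASC'], choices['B_COST_gen_altspec'])
```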
Estimation results of the Pareto optimal models.
pareto_results = pareto_optimal(dict_of_results)
compiled_pareto_results, pareto_specs = compile_estimation_results(
    pareto_results, use_short_names=True
)
display(compiled_pareto_results)
                                 Model_000000    ...  Model_000008
Number of estimated parameters   5               ...  9
Sample size                      10719           ...  10719
Final log likelihood             -8598.531023    ...  -8213.705718
Akaike Information Criterion     17207.062045    ...  16445.411435
Bayesian Information Criterion   17243.460911    ...  16510.929394
ASC_CAR (t-test)                 0.0091 (0.245)  ...  -0.449 (-7.17)
ASC_TRAIN (t-test)               -0.707 (-12.5)  ...  -0.997 (-11.1)
B_COST (t-test)                  -0.87 (-15.6)   ...
B_TIME (t-test)                  -0.916 (-10.9)  ...  -0.957 (-10.7)
B_TIME_1st_class (t-test)        -0.688 (-9.57)  ...  -0.692 (-9.34)
ASC_CAR_GA (t-test)                              ...  -1.02 (-6.5)
ASC_TRAIN_GA (t-test)                            ...  1.38 (14.6)
B_TIME_CAR (t-test)                              ...
B_TIME_CAR_commuters (t-test)                    ...
B_TIME_SM (t-test)                               ...
B_TIME_SM_commuters (t-test)                     ...
B_TIME_TRAIN (t-test)                            ...
B_TIME_TRAIN_commuters (t-test)                  ...
ASC_CAR_one_lugg (t-test)                        ...
ASC_CAR_several_lugg (t-test)                    ...
ASC_TRAIN_one_lugg (t-test)                      ...
ASC_TRAIN_several_lugg (t-test)                  ...
B_COST_CAR (t-test)                              ...  -0.329 (-4.21)
B_COST_SM (t-test)                               ...  -0.872 (-14.4)
B_COST_TRAIN (t-test)                            ...  -1.02 (-9.66)
[25 rows x 9 columns]
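Glossary of the Pareto optimal models. Note that the short names are assigned anew, so Model_000000 here is not the same specification as Model_000000 in the full list.
for short_name, spec in pareto_specs.items():
    print(f'{short_name}\t{spec}')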
Model_000000 ASC:no_seg;B_COST_gen_altspec:generic;B_TIME:FIRST;B_TIME_gen_altspec:generic
Model_000001 ASC:GA;B_COST_gen_altspec:generic;B_TIME:COMMUTERS;B_TIME_gen_altspec:altspec
Model_000002 ASC:no_seg;B_COST_gen_altspec:generic;B_TIME:no_seg;B_TIME_gen_altspec:generic
Model_000003 ASC:GA-LUGGAGE;B_COST_gen_altspec:generic;B_TIME:COMMUTERS;B_TIME_gen_altspec:altspec
Model_000004 ASC:GA;B_COST_gen_altspec:generic;B_TIME:no_seg;B_TIME_gen_altspec:generic
Model_000005 ASC:GA;B_COST_gen_altspec:altspec;B_TIME:COMMUTERS;B_TIME_gen_altspec:altspec
Model_000006 ASC:GA;B_COST_gen_altspec:generic;B_TIME:FIRST;B_TIME_gen_altspec:generic
Model_000007 ASC:GA-LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:COMMUTERS;B_TIME_gen_altspec:altspec
Model_000008 ASC:GA;B_COST_gen_altspec:altspec;B_TIME:FIRST;B_TIME_gen_altspec:generic
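As a reminder of what Pareto optimality means, the sketch below filters a handful of made-up models, keeping those that no other model dominates. It assumes two criteria, the final log likelihood (higher is better) and the number of parameters (lower is better). Both the criteria and the numbers are assumptions made for illustration, and the code is not the implementation behind pareto_optimal.

```python
def dominates(a, b):
    """True if model `a` is at least as good as `b` on both criteria
    and strictly better on at least one. Models are (loglike, n_params)."""
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(models):
    """Keep the models that are not dominated by any other model."""
    return {
        name: model
        for name, model in models.items()
        if not any(
            dominates(other, model)
            for other_name, other in models.items()
            if other_name != name
        )
    }


# Hypothetical (final log likelihood, number of parameters) pairs.
candidates = {
    'A': (-8600.0, 4),
    'B': (-8300.0, 8),
    'C': (-8350.0, 8),  # same size as B but a worse fit: dominated
    'D': (-8200.0, 10),
}
front = pareto_front(candidates)
print(sorted(front))
```

A model on the front is not necessarily the best one overall. It only means that no other candidate fits better with as few or fewer parameters.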
Total running time of the script: (0 minutes 18.528 seconds)