Segmentations and alternative specific specification
We consider 4 specifications for the constants:
Not segmented
Segmented by GA (yearly subscription to public transport)
Segmented by luggage
Segmented both by GA and luggage
We consider 6 specifications for the time coefficients:
Generic and not segmented
Generic and segmented with first class
Generic and segmented with trip purpose
Alternative specific and not segmented
Alternative specific and segmented with first class
Alternative specific and segmented with trip purpose
We consider 2 specifications for the cost coefficients:
Generic
Alternative specific
We obtain a total of 48 specifications. See Bierlaire and Ortelli (2023).
Michel Bierlaire, EPFL Sun Apr 27 2025, 15:54:33
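The total of 48 is the product of the three independent choices listed above (4 x 6 x 2). As a quick check, the following sketch enumerates the combinations in plain Python. The labels are illustrative names chosen here, not identifiers used by Biogeme.

```python
from itertools import product

# Illustrative labels only (not Biogeme identifiers).
constants = ['not segmented', 'GA', 'luggage', 'GA and luggage']
time_coefficients = [
    (form, segmentation)
    for form in ('generic', 'alternative specific')
    for segmentation in ('not segmented', 'first class', 'trip purpose')
]
cost_coefficients = ['generic', 'alternative specific']

# One specification per combination of the three choices.
specifications = list(product(constants, time_coefficients, cost_coefficients))
print(len(specifications))  # 4 * 6 * 2 = 48
```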
import numpy as np
from IPython.core.display_functions import display
from biogeme.biogeme import BIOGEME
from biogeme.catalog import generic_alt_specific_catalogs, segmentation_catalogs
from biogeme.data.swissmetro import (
    CAR_AV_SP,
    CAR_CO_SCALED,
    CAR_TT_SCALED,
    CHOICE,
    SM_AV,
    SM_COST_SCALED,
    SM_TT_SCALED,
    TRAIN_AV_SP,
    TRAIN_COST_SCALED,
    TRAIN_TT_SCALED,
    read_data,
)
from biogeme.expressions import Beta
from biogeme.models import loglogit
from biogeme.results_processing import compile_estimation_results, pareto_optimal
Read the data
database = read_data()
Definition of the segmentations.
segmentation_ga = database.generate_segmentation(
    variable='GA', mapping={0: 'noGA', 1: 'GA'}
)
segmentation_luggage = database.generate_segmentation(
    variable='LUGGAGE', mapping={0: 'no_lugg', 1: 'one_lugg', 3: 'several_lugg'}
)
segmentation_first = database.generate_segmentation(
    variable='FIRST', mapping={0: '2nd_class', 1: '1st_class'}
)
We consider two trip purposes: ‘commuters’ and all other purposes. This requires defining a binary variable first.
database.dataframe['COMMUTERS'] = np.where(database.dataframe['PURPOSE'] == 1, 1, 0)
segmentation_purpose = database.generate_segmentation(
    variable='COMMUTERS', mapping={0: 'non_commuters', 1: 'commuters'}
)
Parameters to be estimated.
asc_car = Beta('asc_car', 0, None, None, 0)
asc_train = Beta('asc_train', 0, None, None, 0)
b_time = Beta('b_time', 0, None, None, 0)
b_cost = Beta('b_cost', 0, None, None, 0)
Catalogs for the alternative specific constants.
asc_train_catalog, asc_car_catalog = segmentation_catalogs(
    generic_name='asc',
    beta_parameters=[asc_train, asc_car],
    potential_segmentations=(
        segmentation_ga,
        segmentation_luggage,
    ),
    maximum_number=2,
)
Catalog for the travel time coefficient. Note that the function returns a list. Here, the list contains a single element: a dictionary that maps each alternative to its catalog. This is why the left-hand side is written as a one-element tuple, with a comma after “b_time_catalog_dict”.
(b_time_catalog_dict,) = generic_alt_specific_catalogs(
    generic_name='b_time',
    beta_parameters=[b_time],
    alternatives=('train', 'swissmetro', 'car'),
    potential_segmentations=(
        segmentation_first,
        segmentation_purpose,
    ),
    maximum_number=1,
)
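The one-element unpacking used above is ordinary Python. A minimal illustration follows, using a hypothetical stand-in function rather than the Biogeme call.

```python
def returns_one_item_list():
    """Hypothetical stand-in for a function returning a list with one dict."""
    return [{'train': 'b_time_train', 'car': 'b_time_car'}]


# Without the trailing comma, the name is bound to the whole list.
as_list = returns_one_item_list()

# With the trailing comma, the single element of the list is unpacked.
(as_dict,) = returns_one_item_list()

print(type(as_list).__name__)  # list
print(type(as_dict).__name__)  # dict
```

Unpacking in this way also raises a ValueError if the list ever contains more than one element, which makes the assumption explicit.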
Catalog for the travel cost coefficient.
(b_cost_catalog_dict,) = generic_alt_specific_catalogs(
    generic_name='b_cost',
    beta_parameters=[b_cost],
    alternatives=('train', 'swissmetro', 'car'),
)
Definition of the utility functions.
v_train = (
    asc_train_catalog
    + b_time_catalog_dict['train'] * TRAIN_TT_SCALED
    + b_cost_catalog_dict['train'] * TRAIN_COST_SCALED
)
v_swissmetro = (
    b_time_catalog_dict['swissmetro'] * SM_TT_SCALED
    + b_cost_catalog_dict['swissmetro'] * SM_COST_SCALED
)
v_car = (
    asc_car_catalog
    + b_time_catalog_dict['car'] * CAR_TT_SCALED
    + b_cost_catalog_dict['car'] * CAR_CO_SCALED
)
Associate utility functions with the numbering of alternatives.
v = {1: v_train, 2: v_swissmetro, 3: v_car}
Associate the availability conditions with the alternatives.
av = {1: TRAIN_AV_SP, 2: SM_AV, 3: CAR_AV_SP}
Definition of the model. This is the contribution of each observation to the log likelihood function.
log_probability = loglogit(v, av, CHOICE)
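For reference, the logit log probability of the chosen alternative c, restricted to the available alternatives, is V_c - log(sum over available j of exp(V_j)). The sketch below evaluates this formula for a single observation with plain numbers. It only illustrates the formula and is not how Biogeme implements loglogit; the function name and the utility values are made up.

```python
import math


def log_logit(utilities, availability, chosen):
    """Log probability of `chosen` under a logit model, for one observation.

    utilities: dict mapping alternative id -> utility value (float)
    availability: dict mapping alternative id -> 1 if available, 0 otherwise
    """
    available = [j for j in utilities if availability[j]]
    if chosen not in available:
        raise ValueError('The chosen alternative is not available.')
    # Subtract the maximum utility for numerical stability (log-sum-exp).
    v_max = max(utilities[j] for j in available)
    log_sum = v_max + math.log(
        sum(math.exp(utilities[j] - v_max) for j in available)
    )
    return utilities[chosen] - log_sum


# Made-up utilities for train (1), Swissmetro (2) and car (3); car unavailable.
v_example = {1: -1.0, 2: 0.0, 3: -0.5}
av_example = {1: 1, 2: 1, 3: 0}
print(math.exp(log_logit(v_example, av_example, chosen=2)))
```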
Create the Biogeme object.
the_biogeme = BIOGEME(
    database, log_probability, generate_html=False, generate_yaml=False
)
the_biogeme.model_name = 'b05alt_spec_segmentation'
Estimate the parameters.
dict_of_results = the_biogeme.estimate_catalog()
Number of estimated models.
print(f'A total of {len(dict_of_results)} models have been estimated')
A total of 48 models have been estimated
All estimation results
compiled_results, specs = compile_estimation_results(
    dict_of_results, use_short_names=True
)
display('All estimated models')
display(compiled_results)
All estimated models
Model_000000 ... Model_000047
Number of estimated parameters 15 ... 5
Sample size 10719 ... 10719
Final log likelihood -9654.356 ... -11093.63
Akaike Information Criterion 19338.71 ... 22197.25
Bayesian Information Criterion 19447.91 ... 22233.65
asc_train_ref (t-test) -0.694 (-9.55) ...
asc_train_diff_one_lugg (t-test) 0.772 (13.9) ...
asc_train_diff_several_lugg (t-test) 0.679 (4.89) ...
b_time_train_ref (t-test) 0 (0) ...
b_time_train_diff_commuters (t-test) 0 (0) ...
b_cost_train (t-test) -1.76 (-19.7) ...
b_time_swissmetro_ref (t-test) 0 (0) ...
b_time_swissmetro_diff_commuters (t-test) 0 (0) ...
b_cost_swissmetro (t-test) -0.819 (-13.9) ...
asc_car_ref (t-test) -0.371 (-5.27) ...
asc_car_diff_one_lugg (t-test) -0.0975 (-1.86) ...
asc_car_diff_several_lugg (t-test) -0.54 (-2.56) ...
b_time_car_ref (t-test) 0 (0) ...
b_time_car_diff_commuters (t-test) 0 (0) ...
b_cost_car (t-test) -0.366 (-5.18) ...
asc_train_diff_GA (t-test) ...
b_time_train (t-test) ...
b_time_swissmetro (t-test) ...
asc_car_diff_GA (t-test) ...
b_time_car (t-test) ...
b_cost (t-test) ... 0 (0)
b_time_ref (t-test) ... 0 (0)
b_time_diff_commuters (t-test) ...
asc_train (t-test) ... 0 (0)
asc_car (t-test) ... 0 (0)
b_time (t-test) ...
b_time_diff_1st_class (t-test) ... 0 (0)
b_time_train_diff_1st_class (t-test) ...
b_time_swissmetro_diff_1st_class (t-test) ...
b_time_car_diff_1st_class (t-test) ...
[35 rows x 48 columns]
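The two information criteria in the table follow from the final log likelihood LL, the number of estimated parameters K and the sample size N, using the standard definitions AIC = 2K - 2LL and BIC = K ln(N) - 2LL. The sketch below checks them against the Model_000000 column, using the rounded values printed above; the helper names are chosen here for illustration.

```python
import math


def aic(log_likelihood, n_parameters):
    """Akaike Information Criterion: 2K - 2LL."""
    return 2 * n_parameters - 2 * log_likelihood


def bic(log_likelihood, n_parameters, sample_size):
    """Bayesian Information Criterion: K ln(N) - 2LL."""
    return n_parameters * math.log(sample_size) - 2 * log_likelihood


# Values copied from the Model_000000 column of the table above.
ll, k, n = -9654.356, 15, 10719
print(round(aic(ll, k), 2))     # 19338.71
print(round(bic(ll, k, n), 2))  # 19447.91
```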
Glossary
for short_name, spec in specs.items():
    print(f'{short_name}\t{spec}')
Model_000000 asc:LUGGAGE;b_cost_gen_altspec:altspec;b_time:COMMUTERS;b_time_gen_altspec:altspec
Model_000001 asc:GA;b_cost_gen_altspec:altspec;b_time:no_seg;b_time_gen_altspec:altspec
Model_000002 asc:GA;b_cost_gen_altspec:altspec;b_time:COMMUTERS;b_time_gen_altspec:altspec
Model_000003 asc:GA-LUGGAGE;b_cost_gen_altspec:generic;b_time:COMMUTERS;b_time_gen_altspec:altspec
Model_000004 asc:GA-LUGGAGE;b_cost_gen_altspec:generic;b_time:COMMUTERS;b_time_gen_altspec:generic
Model_000005 asc:GA-LUGGAGE;b_cost_gen_altspec:altspec;b_time:COMMUTERS;b_time_gen_altspec:altspec
Model_000006 asc:no_seg;b_cost_gen_altspec:altspec;b_time:no_seg;b_time_gen_altspec:altspec
Model_000007 asc:LUGGAGE;b_cost_gen_altspec:generic;b_time:no_seg;b_time_gen_altspec:generic
Model_000008 asc:no_seg;b_cost_gen_altspec:altspec;b_time:no_seg;b_time_gen_altspec:generic
Model_000009 asc:GA;b_cost_gen_altspec:altspec;b_time:FIRST;b_time_gen_altspec:generic
Model_000010 asc:GA-LUGGAGE;b_cost_gen_altspec:altspec;b_time:FIRST;b_time_gen_altspec:generic
Model_000011 asc:GA-LUGGAGE;b_cost_gen_altspec:altspec;b_time:no_seg;b_time_gen_altspec:generic
Model_000012 asc:LUGGAGE;b_cost_gen_altspec:altspec;b_time:FIRST;b_time_gen_altspec:altspec
Model_000013 asc:GA;b_cost_gen_altspec:altspec;b_time:FIRST;b_time_gen_altspec:altspec
Model_000014 asc:LUGGAGE;b_cost_gen_altspec:generic;b_time:COMMUTERS;b_time_gen_altspec:altspec
Model_000015 asc:no_seg;b_cost_gen_altspec:altspec;b_time:FIRST;b_time_gen_altspec:generic
Model_000016 asc:no_seg;b_cost_gen_altspec:altspec;b_time:FIRST;b_time_gen_altspec:altspec
Model_000017 asc:LUGGAGE;b_cost_gen_altspec:generic;b_time:no_seg;b_time_gen_altspec:altspec
Model_000018 asc:no_seg;b_cost_gen_altspec:altspec;b_time:COMMUTERS;b_time_gen_altspec:altspec
Model_000019 asc:LUGGAGE;b_cost_gen_altspec:altspec;b_time:FIRST;b_time_gen_altspec:generic
Model_000020 asc:LUGGAGE;b_cost_gen_altspec:generic;b_time:FIRST;b_time_gen_altspec:altspec
Model_000021 asc:GA-LUGGAGE;b_cost_gen_altspec:altspec;b_time:FIRST;b_time_gen_altspec:altspec
Model_000022 asc:GA-LUGGAGE;b_cost_gen_altspec:altspec;b_time:COMMUTERS;b_time_gen_altspec:generic
Model_000023 asc:GA;b_cost_gen_altspec:generic;b_time:COMMUTERS;b_time_gen_altspec:generic
Model_000024 asc:no_seg;b_cost_gen_altspec:generic;b_time:FIRST;b_time_gen_altspec:altspec
Model_000025 asc:no_seg;b_cost_gen_altspec:generic;b_time:COMMUTERS;b_time_gen_altspec:altspec
Model_000026 asc:LUGGAGE;b_cost_gen_altspec:altspec;b_time:no_seg;b_time_gen_altspec:altspec
Model_000027 asc:no_seg;b_cost_gen_altspec:generic;b_time:COMMUTERS;b_time_gen_altspec:generic
Model_000028 asc:no_seg;b_cost_gen_altspec:generic;b_time:no_seg;b_time_gen_altspec:altspec
Model_000029 asc:GA-LUGGAGE;b_cost_gen_altspec:generic;b_time:FIRST;b_time_gen_altspec:generic
Model_000030 asc:GA;b_cost_gen_altspec:generic;b_time:COMMUTERS;b_time_gen_altspec:altspec
Model_000031 asc:GA;b_cost_gen_altspec:altspec;b_time:COMMUTERS;b_time_gen_altspec:generic
Model_000032 asc:GA-LUGGAGE;b_cost_gen_altspec:altspec;b_time:no_seg;b_time_gen_altspec:altspec
Model_000033 asc:no_seg;b_cost_gen_altspec:altspec;b_time:COMMUTERS;b_time_gen_altspec:generic
Model_000034 asc:GA;b_cost_gen_altspec:generic;b_time:no_seg;b_time_gen_altspec:altspec
Model_000035 asc:LUGGAGE;b_cost_gen_altspec:altspec;b_time:no_seg;b_time_gen_altspec:generic
Model_000036 asc:GA;b_cost_gen_altspec:generic;b_time:FIRST;b_time_gen_altspec:altspec
Model_000037 asc:GA-LUGGAGE;b_cost_gen_altspec:generic;b_time:FIRST;b_time_gen_altspec:altspec
Model_000038 asc:GA-LUGGAGE;b_cost_gen_altspec:generic;b_time:no_seg;b_time_gen_altspec:altspec
Model_000039 asc:GA;b_cost_gen_altspec:generic;b_time:no_seg;b_time_gen_altspec:generic
Model_000040 asc:LUGGAGE;b_cost_gen_altspec:generic;b_time:COMMUTERS;b_time_gen_altspec:generic
Model_000041 asc:GA;b_cost_gen_altspec:altspec;b_time:no_seg;b_time_gen_altspec:generic
Model_000042 asc:no_seg;b_cost_gen_altspec:generic;b_time:no_seg;b_time_gen_altspec:generic
Model_000043 asc:GA-LUGGAGE;b_cost_gen_altspec:generic;b_time:no_seg;b_time_gen_altspec:generic
Model_000044 asc:GA;b_cost_gen_altspec:generic;b_time:FIRST;b_time_gen_altspec:generic
Model_000045 asc:LUGGAGE;b_cost_gen_altspec:generic;b_time:FIRST;b_time_gen_altspec:generic
Model_000046 asc:LUGGAGE;b_cost_gen_altspec:altspec;b_time:COMMUTERS;b_time_gen_altspec:generic
Model_000047 asc:no_seg;b_cost_gen_altspec:generic;b_time:FIRST;b_time_gen_altspec:generic
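The specification strings account for the parameter counts in the results table. Judging from the parameter names in the output (names ending in _ref and _diff_...), a segmented coefficient appears to consist of one reference parameter plus one difference parameter per additional level, and an alternative specific coefficient is replicated for each of the three alternatives. The sketch below encodes that reading as an assumption. It reproduces the counts reported for Model_000000 (15) and Model_000047 (5). Treating a combined segmentation such as GA-LUGGAGE as additive is likewise an assumption.

```python
# Number of levels of each segmentation variable used in this example.
LEVELS = {'GA': 2, 'LUGGAGE': 3, 'FIRST': 2, 'COMMUTERS': 2}
N_ALTERNATIVES = 3  # train, swissmetro, car


def n_params_segmented(segmentations):
    """Assumed rule: one 'ref' parameter plus one 'diff' per extra level."""
    return 1 + sum(LEVELS[name] - 1 for name in segmentations)


def count_parameters(asc, b_time, b_time_altspec, b_cost_altspec):
    """Count the parameters of one specification (hypothetical helper)."""
    n_asc = 2 * n_params_segmented(asc)  # asc_train and asc_car
    n_time = n_params_segmented(b_time) * (N_ALTERNATIVES if b_time_altspec else 1)
    n_cost = N_ALTERNATIVES if b_cost_altspec else 1
    return n_asc + n_time + n_cost


# Model_000000: asc:LUGGAGE, b_cost altspec, b_time:COMMUTERS altspec
print(count_parameters(['LUGGAGE'], ['COMMUTERS'], True, True))  # 15
# Model_000047: asc:no_seg, b_cost generic, b_time:FIRST generic
print(count_parameters([], ['FIRST'], False, False))  # 5
```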
Estimation results of the Pareto optimal models.
pareto_results = pareto_optimal(dict_of_results)
compiled_pareto_results, pareto_specs = compile_estimation_results(
    pareto_results, use_short_names=True
)
display('Non dominated models')
display(compiled_pareto_results)
Non dominated models
Model_000000 ... Model_000002
Number of estimated parameters 11 ... 4
Sample size 10719 ... 10719
Final log likelihood -8382.084 ... -8821.77
Akaike Information Criterion 16786.17 ... 17651.54
Bayesian Information Criterion 16866.24 ... 17680.66
asc_train_ref (t-test) -0.689 (-6.92) ...
asc_train_diff_one_lugg (t-test) 0.768 (9.67) ...
asc_train_diff_several_lugg (t-test) 0.672 (3.76) ...
b_time_ref (t-test) -1.27 (-12.3) ...
b_time_diff_1st_class (t-test) -0.00497 (-0.0617) ...
b_cost_train (t-test) -1.77 (-16.1) ...
b_cost_swissmetro (t-test) -0.814 (-15.2) ...
asc_car_ref (t-test) -0.367 (-5.59) ...
asc_car_diff_one_lugg (t-test) -0.0982 (-1.97) ...
asc_car_diff_several_lugg (t-test) -0.548 (-2.55) ...
b_cost_car (t-test) -0.369 (-4.87) ...
asc_train_diff_GA (t-test) ...
b_cost (t-test) ... -1.44 (-17.3)
asc_car_diff_GA (t-test) ...
asc_train (t-test) ... -0.58 (-9.64)
b_time (t-test) ... -1.59 (-19.4)
asc_car (t-test) ... -0.0397 (-0.981)
[22 rows x 3 columns]
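The _ref and _diff_ rows can be combined to obtain the value of a segmented coefficient in each segment. The sketch below does this for the train constant of Model_000000, assuming that the value in a segment is the reference value plus the corresponding difference and that the reference segment is the level without a _diff_ row (here no_lugg). Both points are inferred from the parameter names, not stated in this example.

```python
# Estimates copied from the Model_000000 column of the table above.
asc_train_ref = -0.689
asc_train_diff = {'one_lugg': 0.768, 'several_lugg': 0.672}

# Assumption: 'no_lugg' is the reference segment (it has no 'diff' row).
asc_train_by_segment = {'no_lugg': asc_train_ref}
for segment, diff in asc_train_diff.items():
    # Assumption: segment value = reference value + difference.
    asc_train_by_segment[segment] = asc_train_ref + diff

for segment, value in asc_train_by_segment.items():
    print(f'{segment}: {value:.3f}')
```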
Glossary.
for short_name, spec in pareto_specs.items():
    print(f'{short_name}\t{spec}')
Model_000000 asc:LUGGAGE;b_cost_gen_altspec:altspec;b_time:FIRST;b_time_gen_altspec:generic
Model_000001 asc:GA;b_cost_gen_altspec:generic;b_time:FIRST;b_time_gen_altspec:generic
Model_000002 asc:no_seg;b_cost_gen_altspec:generic;b_time:no_seg;b_time_gen_altspec:generic
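The three models listed above are the non-dominated ones. A model is non-dominated if no other model is at least as good on every criterion and strictly better on at least one. The sketch below implements this generic filter for criteria that are minimized. The choice of criteria (negative log likelihood and number of parameters) is an assumption made for illustration, the candidate values are made up, and the code is not Biogeme's pareto_optimal implementation.

```python
def dominates(a, b):
    """True if point a dominates point b (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(candidates):
    """Keep the entries (name -> objectives) that no other entry dominates."""
    return {
        name: point
        for name, point in candidates.items()
        if not any(dominates(other, point) for other in candidates.values())
    }


# Made-up (negative log likelihood, number of parameters) pairs.
toy_models = {
    'A': (8400.0, 11),
    'B': (8600.0, 6),
    'C': (8800.0, 4),
    'D': (8700.0, 9),  # dominated by B: worse fit and more parameters
}
print(sorted(non_dominated(toy_models)))  # ['A', 'B', 'C']
```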
Total running time of the script: (0 minutes 55.995 seconds)