Segmentations and alternative specific specification

We consider 4 specifications for the constants:

Not segmented
Segmented by GA (yearly subscription to public transport)
Segmented by luggage
Segmented both by GA and luggage

We consider 6 specifications for the time coefficients:

Generic and not segmented
Generic and segmented with first class
Generic and segmented with trip purpose
Alternative specific and not segmented
Alternative specific and segmented with first class
Alternative specific and segmented with trip purpose

We consider 2 specifications for the cost coefficients:

Generic
Alternative specific

Combining them yields 4 × 6 × 2 = 48 specifications in total. See Bierlaire and Ortelli (2023).
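The count can be checked by enumerating the combinations directly. The following is a minimal sketch in plain Python; the labels simply mirror the lists above and are not Biogeme identifiers.

```python
from itertools import product

# Illustrative labels mirroring the lists above (not Biogeme identifiers).
constants = ['not segmented', 'GA', 'luggage', 'GA and luggage']
time_coefficients = [
    (form, segmentation)
    for form in ('generic', 'alternative specific')
    for segmentation in ('not segmented', 'first class', 'trip purpose')
]
cost_coefficients = ['generic', 'alternative specific']

# One specification = one choice for each of the three components.
specifications = list(product(constants, time_coefficients, cost_coefficients))
print(len(specifications))
```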

Author: Michel Bierlaire, EPFL
Date: Thu Jul 13 16:18:10 2023

import numpy as np
from IPython.core.display_functions import display

import biogeme.biogeme as bio
from biogeme import models
from biogeme.expressions import Beta
from biogeme.catalog import segmentation_catalogs, generic_alt_specific_catalogs
from biogeme.results import compile_estimation_results, pareto_optimal

from biogeme.data.swissmetro import (
    read_data,
    CHOICE,
    SM_AV,
    CAR_AV_SP,
    TRAIN_AV_SP,
    TRAIN_TT_SCALED,
    TRAIN_COST_SCALED,
    SM_TT_SCALED,
    SM_COST_SCALED,
    CAR_TT_SCALED,
    CAR_CO_SCALED,
)

Read the data

database = read_data()

Definition of the segmentations.

segmentation_ga = database.generate_segmentation(
    variable='GA', mapping={0: 'noGA', 1: 'GA'}
)

segmentation_luggage = database.generate_segmentation(
    variable='LUGGAGE', mapping={0: 'no_lugg', 1: 'one_lugg', 3: 'several_lugg'}
)

segmentation_first = database.generate_segmentation(
    variable='FIRST', mapping={0: '2nd_class', 1: '1st_class'}
)

We distinguish two trip purposes: commuting (PURPOSE equal to 1) and everything else. This requires defining a binary variable first.

database.data['COMMUTERS'] = np.where(database.data['PURPOSE'] == 1, 1, 0)

segmentation_purpose = database.generate_segmentation(
    variable='COMMUTERS', mapping={0: 'non_commuters', 1: 'commuters'}
)
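To illustrate what a segmentation does to a parameter, here is a minimal sketch in plain Python. It assumes, judging from the parameter names in the results (for instance ASC_CAR alongside ASC_CAR_GA), that the first category serves as the reference and every other category adds a deviation to it. The function name segmented_value is hypothetical, and the two numbers are the estimates reported for Model_000000 in the first results table.

```python
def segmented_value(reference, deviations, segment):
    """Value of a segmented parameter for an individual in `segment`.

    reference: value that applies to the reference segment.
    deviations: dict mapping each non-reference segment to its additive shift.
    """
    return reference + deviations.get(segment, 0.0)


# Assumption: additive deviations relative to the 'noGA' reference segment.
asc_car = -0.328
asc_car_deviations = {'GA': -1.11}

asc_car_no_ga = segmented_value(asc_car, asc_car_deviations, 'noGA')
asc_car_ga = segmented_value(asc_car, asc_car_deviations, 'GA')
print(asc_car_no_ga, round(asc_car_ga, 3))
```

Under this reading, a segmentation with K categories adds K - 1 parameters per segmented coefficient.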

Parameters to be estimated.

ASC_CAR = Beta('ASC_CAR', 0, None, None, 0)
ASC_TRAIN = Beta('ASC_TRAIN', 0, None, None, 0)
B_TIME = Beta('B_TIME', 0, None, None, 0)
B_COST = Beta('B_COST', 0, None, None, 0)

Catalogs for the alternative specific constants.

ASC_TRAIN_catalog, ASC_CAR_catalog = segmentation_catalogs(
    generic_name='ASC',
    beta_parameters=[ASC_TRAIN, ASC_CAR],
    potential_segmentations=(
        segmentation_ga,
        segmentation_luggage,
    ),
    maximum_number=2,
)

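Catalog for the travel time coefficient. Note that the function returns a list containing one dictionary of catalogs per parameter, keyed by the name of the alternative. Here, the list contains only one dictionary. This is why there is a comma after “B_TIME_catalog_dict”: it unpacks the single element of the list.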

(B_TIME_catalog_dict,) = generic_alt_specific_catalogs(
    generic_name='B_TIME',
    beta_parameters=[B_TIME],
    alternatives=('TRAIN', 'SM', 'CAR'),
    potential_segmentations=(
        segmentation_first,
        segmentation_purpose,
    ),
    maximum_number=1,
)
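The single-element unpacking used above is ordinary Python and can be tried in isolation. In this sketch, make_catalogs is a hypothetical stand-in that returns one dictionary per requested name; it is not part of Biogeme.

```python
def make_catalogs(names):
    # Hypothetical stand-in: one dictionary per name, keyed by alternative.
    return [{'TRAIN': f'{name}_TRAIN', 'SM': f'{name}_SM'} for name in names]


result = make_catalogs(['B_TIME'])        # a list holding a single dictionary
(only_dict,) = make_catalogs(['B_TIME'])  # the dictionary itself

print(type(result).__name__, type(only_dict).__name__)
```

The unpacking form also fails loudly with a ValueError if the list ever contains more or fewer than one element, which indexing with [0] would not do.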

Catalog for the travel cost coefficient.

(B_COST_catalog_dict,) = generic_alt_specific_catalogs(
    generic_name='B_COST', beta_parameters=[B_COST], alternatives=('TRAIN', 'SM', 'CAR')
)

Definition of the utility functions.

V1 = (
    ASC_TRAIN_catalog
    + B_TIME_catalog_dict['TRAIN'] * TRAIN_TT_SCALED
    + B_COST_catalog_dict['TRAIN'] * TRAIN_COST_SCALED
)
V2 = (
    B_TIME_catalog_dict['SM'] * SM_TT_SCALED
    + B_COST_catalog_dict['SM'] * SM_COST_SCALED
)
V3 = (
    ASC_CAR_catalog
    + B_TIME_catalog_dict['CAR'] * CAR_TT_SCALED
    + B_COST_catalog_dict['CAR'] * CAR_CO_SCALED
)
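The difference between a generic and an alternative specific coefficient can be seen on a toy calculation. The sketch below uses plain Python and made-up numbers; the helper linear_utilities is hypothetical and only mimics the structure of V1, V2 and V3.

```python
def linear_utilities(times, costs, asc, b_time, b_cost):
    """Linear utilities; b_time and b_cost are dicts keyed by alternative."""
    return {
        alt: asc.get(alt, 0.0) + b_time[alt] * times[alt] + b_cost[alt] * costs[alt]
        for alt in times
    }


# Made-up attribute values and coefficients, for illustration only.
times = {'TRAIN': 1.0, 'SM': 0.5, 'CAR': 1.5}
costs = {'TRAIN': 0.5, 'SM': 0.6, 'CAR': 0.8}
asc = {'TRAIN': -0.5, 'CAR': 0.1}  # SM has no constant (normalization)

# Generic: the same coefficient in every utility function.
generic = dict.fromkeys(times, -1.0)
v_generic = linear_utilities(times, costs, asc, generic, generic)

# Alternative specific: one time coefficient per alternative.
altspec_time = {'TRAIN': -1.5, 'SM': -1.0, 'CAR': -0.5}
v_altspec = linear_utilities(times, costs, asc, altspec_time, generic)

print(v_generic)
print(v_altspec)
```

With three alternatives, the alternative specific version estimates three time coefficients instead of one.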

Associate utility functions with the numbering of alternatives.

V = {1: V1, 2: V2, 3: V3}

Associate the availability conditions with the alternatives.

av = {1: TRAIN_AV_SP, 2: SM_AV, 3: CAR_AV_SP}

Definition of the model. This is the contribution of each observation to the log likelihood function.

logprob = models.loglogit(V, av, CHOICE)
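For intuition, the quantity computed for one observation can be sketched in plain Python as the log of the logit probability of the chosen alternative, restricted to the available ones. This is not Biogeme's implementation: log_logit is a hypothetical helper, the utilities are made-up numbers, and raising an error for an unavailable chosen alternative is a choice made for this sketch only.

```python
import math


def log_logit(utilities, availability, chosen):
    """Log of the logit probability of `chosen` among available alternatives."""
    if not availability[chosen]:
        raise ValueError('The chosen alternative is not available.')
    available = [alt for alt, is_av in availability.items() if is_av]
    # Subtract the largest utility before exponentiating, for numerical stability.
    v_max = max(utilities[alt] for alt in available)
    log_denominator = v_max + math.log(
        sum(math.exp(utilities[alt] - v_max) for alt in available)
    )
    return utilities[chosen] - log_denominator


v = {1: -2.0, 2: -1.1, 3: -2.2}  # made-up utilities
av = {1: 1, 2: 1, 3: 0}          # alternative 3 is unavailable
print(log_logit(v, av, chosen=2))
```

Summing this quantity over all observations gives the log likelihood that the estimation maximizes.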

Create the Biogeme object.

the_biogeme = bio.BIOGEME(database, logprob)
the_biogeme.modelName = 'b05alt_spec_segmentation'
the_biogeme.generate_html = False
the_biogeme.generate_pickle = False

Estimate the parameters.

dict_of_results = the_biogeme.estimate_catalog()

Number of estimated models.

print(f'A total of {len(dict_of_results)} models have been estimated')

A total of 48 models have been estimated

All estimation results

compiled_results, specs = compile_estimation_results(
    dict_of_results, use_short_names=True
)

display(compiled_results)

                                    Model_000000  ...       Model_000047
Number of estimated parameters                 8  ...                 10
Sample size                                10719  ...              10719
Final log likelihood                 -8279.12343  ...       -8281.995886
Akaike Information Criterion         16574.24686  ...       16583.991773
Bayesian Information Criterion      16632.485046  ...       16656.789504
ASC_CAR (t-test)                 -0.328  (-5.42)  ...    0.0273  (0.551)
ASC_CAR_GA (t-test)               -1.11  (-7.11)  ...     -1.23  (-7.85)
ASC_TRAIN (t-test)                -0.85  (-9.83)  ...        -1.5  (-18)
ASC_TRAIN_GA (t-test)               1.29  (13.5)  ...         1.37  (19)
B_COST_CAR (t-test)              -0.354  (-4.62)  ...
B_COST_SM (t-test)               -0.757  (-13.9)  ...
B_COST_TRAIN (t-test)             -1.04  (-9.99)  ...
B_TIME (t-test)                   -1.28  (-17.1)  ...     -1.19  (-18.2)
B_TIME_1st_class (t-test)                         ...
B_COST (t-test)                                   ...    -0.704  (-13.5)
B_TIME_CAR (t-test)                               ...
B_TIME_SM (t-test)                                ...
B_TIME_TRAIN (t-test)                             ...
ASC_CAR_one_lugg (t-test)                         ...  -0.0298  (-0.592)
ASC_CAR_several_lugg (t-test)                     ...    -0.455  (-2.09)
ASC_TRAIN_one_lugg (t-test)                       ...      0.562  (7.05)
ASC_TRAIN_several_lugg (t-test)                   ...      0.649  (3.75)
B_TIME_CAR_commuters (t-test)                     ...
B_TIME_SM_commuters (t-test)                      ...
B_TIME_TRAIN_commuters (t-test)                   ...
B_TIME_commuters (t-test)                         ...
B_TIME_CAR_1st_class (t-test)                     ...
B_TIME_SM_1st_class (t-test)                      ...
B_TIME_TRAIN_1st_class (t-test)                   ...

[29 rows x 48 columns]
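The information criteria in the table follow from the log likelihood, the number of parameters and the sample size. The sketch below uses the standard definitions, AIC = 2K - 2LL and BIC = K ln(N) - 2LL, and checks them against the values reported for Model_000000. The function names are chosen for this illustration.

```python
import math


def aic(log_likelihood, n_parameters):
    """Akaike Information Criterion."""
    return 2 * n_parameters - 2 * log_likelihood


def bic(log_likelihood, n_parameters, sample_size):
    """Bayesian Information Criterion."""
    return n_parameters * math.log(sample_size) - 2 * log_likelihood


# Values reported for Model_000000 in the table above.
log_likelihood, n_parameters, sample_size = -8279.12343, 8, 10719
print(aic(log_likelihood, n_parameters))
print(bic(log_likelihood, n_parameters, sample_size))
```

Both results agree with the table up to rounding of the reported log likelihood.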

Glossary: the specification behind each short model name.

for short_name, spec in specs.items():
    print(f'{short_name}\t{spec}')

Model_000000    ASC:GA;B_COST_gen_altspec:altspec;B_TIME:no_seg;B_TIME_gen_altspec:generic
Model_000001    ASC:no_seg;B_COST_gen_altspec:altspec;B_TIME:FIRST;B_TIME_gen_altspec:generic
Model_000002    ASC:no_seg;B_COST_gen_altspec:generic;B_TIME:no_seg;B_TIME_gen_altspec:altspec
Model_000003    ASC:no_seg;B_COST_gen_altspec:altspec;B_TIME:no_seg;B_TIME_gen_altspec:altspec
Model_000004    ASC:LUGGAGE;B_COST_gen_altspec:generic;B_TIME:COMMUTERS;B_TIME_gen_altspec:altspec
Model_000005    ASC:LUGGAGE;B_COST_gen_altspec:generic;B_TIME:no_seg;B_TIME_gen_altspec:generic
Model_000006    ASC:no_seg;B_COST_gen_altspec:altspec;B_TIME:COMMUTERS;B_TIME_gen_altspec:altspec
Model_000007    ASC:LUGGAGE;B_COST_gen_altspec:generic;B_TIME:COMMUTERS;B_TIME_gen_altspec:generic
Model_000008    ASC:GA;B_COST_gen_altspec:generic;B_TIME:COMMUTERS;B_TIME_gen_altspec:generic
Model_000009    ASC:GA;B_COST_gen_altspec:generic;B_TIME:FIRST;B_TIME_gen_altspec:altspec
Model_000010    ASC:LUGGAGE;B_COST_gen_altspec:generic;B_TIME:FIRST;B_TIME_gen_altspec:altspec
Model_000011    ASC:GA-LUGGAGE;B_COST_gen_altspec:generic;B_TIME:COMMUTERS;B_TIME_gen_altspec:altspec
Model_000012    ASC:GA-LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:no_seg;B_TIME_gen_altspec:generic
Model_000013    ASC:GA;B_COST_gen_altspec:altspec;B_TIME:FIRST;B_TIME_gen_altspec:altspec
Model_000014    ASC:GA-LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:COMMUTERS;B_TIME_gen_altspec:generic
Model_000015    ASC:LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:FIRST;B_TIME_gen_altspec:altspec
Model_000016    ASC:no_seg;B_COST_gen_altspec:generic;B_TIME:COMMUTERS;B_TIME_gen_altspec:generic
Model_000017    ASC:LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:no_seg;B_TIME_gen_altspec:generic
Model_000018    ASC:GA;B_COST_gen_altspec:generic;B_TIME:no_seg;B_TIME_gen_altspec:altspec
Model_000019    ASC:no_seg;B_COST_gen_altspec:generic;B_TIME:FIRST;B_TIME_gen_altspec:altspec
Model_000020    ASC:GA;B_COST_gen_altspec:generic;B_TIME:no_seg;B_TIME_gen_altspec:generic
Model_000021    ASC:GA-LUGGAGE;B_COST_gen_altspec:generic;B_TIME:COMMUTERS;B_TIME_gen_altspec:generic
Model_000022    ASC:LUGGAGE;B_COST_gen_altspec:generic;B_TIME:no_seg;B_TIME_gen_altspec:altspec
Model_000023    ASC:LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:COMMUTERS;B_TIME_gen_altspec:generic
Model_000024    ASC:GA-LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:no_seg;B_TIME_gen_altspec:altspec
Model_000025    ASC:GA;B_COST_gen_altspec:altspec;B_TIME:no_seg;B_TIME_gen_altspec:altspec
Model_000026    ASC:GA;B_COST_gen_altspec:generic;B_TIME:COMMUTERS;B_TIME_gen_altspec:altspec
Model_000027    ASC:GA;B_COST_gen_altspec:altspec;B_TIME:COMMUTERS;B_TIME_gen_altspec:generic
Model_000028    ASC:LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:no_seg;B_TIME_gen_altspec:altspec
Model_000029    ASC:GA-LUGGAGE;B_COST_gen_altspec:generic;B_TIME:FIRST;B_TIME_gen_altspec:altspec
Model_000030    ASC:GA-LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:COMMUTERS;B_TIME_gen_altspec:altspec
Model_000031    ASC:no_seg;B_COST_gen_altspec:generic;B_TIME:FIRST;B_TIME_gen_altspec:generic
Model_000032    ASC:GA-LUGGAGE;B_COST_gen_altspec:generic;B_TIME:no_seg;B_TIME_gen_altspec:altspec
Model_000033    ASC:LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:COMMUTERS;B_TIME_gen_altspec:altspec
Model_000034    ASC:no_seg;B_COST_gen_altspec:generic;B_TIME:COMMUTERS;B_TIME_gen_altspec:altspec
Model_000035    ASC:GA;B_COST_gen_altspec:altspec;B_TIME:COMMUTERS;B_TIME_gen_altspec:altspec
Model_000036    ASC:GA-LUGGAGE;B_COST_gen_altspec:generic;B_TIME:FIRST;B_TIME_gen_altspec:generic
Model_000037    ASC:GA-LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:FIRST;B_TIME_gen_altspec:altspec
Model_000038    ASC:LUGGAGE;B_COST_gen_altspec:generic;B_TIME:FIRST;B_TIME_gen_altspec:generic
Model_000039    ASC:no_seg;B_COST_gen_altspec:altspec;B_TIME:COMMUTERS;B_TIME_gen_altspec:generic
Model_000040    ASC:GA;B_COST_gen_altspec:altspec;B_TIME:FIRST;B_TIME_gen_altspec:generic
Model_000041    ASC:no_seg;B_COST_gen_altspec:altspec;B_TIME:FIRST;B_TIME_gen_altspec:altspec
Model_000042    ASC:LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:FIRST;B_TIME_gen_altspec:generic
Model_000043    ASC:no_seg;B_COST_gen_altspec:altspec;B_TIME:no_seg;B_TIME_gen_altspec:generic
Model_000044    ASC:no_seg;B_COST_gen_altspec:generic;B_TIME:no_seg;B_TIME_gen_altspec:generic
Model_000045    ASC:GA-LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:FIRST;B_TIME_gen_altspec:generic
Model_000046    ASC:GA;B_COST_gen_altspec:generic;B_TIME:FIRST;B_TIME_gen_altspec:generic
Model_000047    ASC:GA-LUGGAGE;B_COST_gen_altspec:generic;B_TIME:no_seg;B_TIME_gen_altspec:generic
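Each specification string can be split into its components for further processing. The sketch below assumes the format seen in the output above, namely semicolon-separated name:choice pairs; parse_specification is a hypothetical helper, not a Biogeme function.

```python
def parse_specification(spec):
    """Split 'name:choice;name:choice;...' into a dictionary."""
    return dict(item.split(':', 1) for item in spec.split(';'))


# Specification of Model_000000, copied from the glossary above.
spec = 'ASC:GA;B_COST_gen_altspec:altspec;B_TIME:no_seg;B_TIME_gen_altspec:generic'
choices = parse_specification(spec)
print(choices['ASC'], choices['B_TIME_gen_altspec'])
```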

Estimation results of the Pareto optimal models.

pareto_results = pareto_optimal(dict_of_results)
compiled_pareto_results, pareto_specs = compile_estimation_results(
    pareto_results, use_short_names=True
)

display(compiled_pareto_results)

                                    Model_000000  ...     Model_000008
Number of estimated parameters                 5  ...                9
Sample size                                10719  ...            10719
Final log likelihood                -8598.531023  ...     -8213.705718
Akaike Information Criterion        17207.062045  ...     16445.411435
Bayesian Information Criterion      17243.460911  ...     16510.929394
ASC_CAR (t-test)                 0.0091  (0.245)  ...  -0.449  (-7.17)
ASC_TRAIN (t-test)               -0.707  (-12.5)  ...  -0.997  (-11.1)
B_COST (t-test)                   -0.87  (-15.6)  ...
B_TIME (t-test)                  -0.916  (-10.9)  ...  -0.957  (-10.7)
B_TIME_1st_class (t-test)        -0.688  (-9.57)  ...  -0.692  (-9.34)
ASC_CAR_GA (t-test)                               ...    -1.02  (-6.5)
ASC_TRAIN_GA (t-test)                             ...     1.38  (14.6)
B_TIME_CAR (t-test)                               ...
B_TIME_CAR_commuters (t-test)                     ...
B_TIME_SM (t-test)                                ...
B_TIME_SM_commuters (t-test)                      ...
B_TIME_TRAIN (t-test)                             ...
B_TIME_TRAIN_commuters (t-test)                   ...
ASC_CAR_one_lugg (t-test)                         ...
ASC_CAR_several_lugg (t-test)                     ...
ASC_TRAIN_one_lugg (t-test)                       ...
ASC_TRAIN_several_lugg (t-test)                   ...
B_COST_CAR (t-test)                               ...  -0.329  (-4.21)
B_COST_SM (t-test)                                ...  -0.872  (-14.4)
B_COST_TRAIN (t-test)                             ...   -1.02  (-9.66)

[25 rows x 9 columns]
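A model is Pareto optimal when no other model is at least as good on every objective and strictly better on one. The sketch below shows the idea in plain Python. It assumes two objectives to minimize, the negative log likelihood and the number of parameters; the objectives actually used by the library are not stated in this example and may differ. Models A and B use the rounded values of the first and last columns of the table above, and model C is hypothetical.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` everywhere and strictly better once.

    Each argument is a tuple of objectives to be minimized.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates."""
    return {
        name: objectives
        for name, objectives in candidates.items()
        if not any(
            dominates(other, objectives)
            for other_name, other in candidates.items()
            if other_name != name
        )
    }


# (negative log likelihood, number of parameters)
models = {
    'A': (8598.53, 5),
    'B': (8213.71, 9),
    'C': (8600.00, 6),  # hypothetical: worse fit and more parameters than A
}
front = pareto_front(models)
print(sorted(front))
```

A and B both survive because each is better than the other on one objective, while C is dominated by A.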

Glossary: the specification behind each Pareto optimal model.

for short_name, spec in pareto_specs.items():
    print(f'{short_name}\t{spec}')

Model_000000    ASC:no_seg;B_COST_gen_altspec:generic;B_TIME:FIRST;B_TIME_gen_altspec:generic
Model_000001    ASC:GA;B_COST_gen_altspec:generic;B_TIME:COMMUTERS;B_TIME_gen_altspec:altspec
Model_000002    ASC:no_seg;B_COST_gen_altspec:generic;B_TIME:no_seg;B_TIME_gen_altspec:generic
Model_000003    ASC:GA-LUGGAGE;B_COST_gen_altspec:generic;B_TIME:COMMUTERS;B_TIME_gen_altspec:altspec
Model_000004    ASC:GA;B_COST_gen_altspec:generic;B_TIME:no_seg;B_TIME_gen_altspec:generic
Model_000005    ASC:GA;B_COST_gen_altspec:altspec;B_TIME:COMMUTERS;B_TIME_gen_altspec:altspec
Model_000006    ASC:GA;B_COST_gen_altspec:generic;B_TIME:FIRST;B_TIME_gen_altspec:generic
Model_000007    ASC:GA-LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:COMMUTERS;B_TIME_gen_altspec:altspec
Model_000008    ASC:GA;B_COST_gen_altspec:altspec;B_TIME:FIRST;B_TIME_gen_altspec:generic

Total running time of the script: (0 minutes 18.528 seconds)
