Segmentations and alternative specific specification

We consider 4 specifications for the constants:

  • Not segmented

  • Segmented by GA (yearly subscription to public transport)

  • Segmented by luggage

  • Segmented both by GA and luggage

We consider 6 specifications for the time coefficients:

  • Generic and not segmented

  • Generic and segmented with first class

  • Generic and segmented with trip purpose

  • Alternative specific and not segmented

  • Alternative specific and segmented with first class

  • Alternative specific and segmented with trip purpose

We consider 2 specifications for the cost coefficients:

  • Generic

  • Alternative specific

Combining them yields a total of 4 × 6 × 2 = 48 specifications. See Bierlaire and Ortelli (2023).
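The count can be checked by enumerating the combinations directly. The sketch below is only an illustration: the labels are borrowed from the specification names printed later on this page, and the six time specifications are split into two independent choices (generic or alternative specific, and the segmentation used).

```python
from itertools import product

# Labels are illustrative; they mirror the names in the glossary output.
constant_segmentations = ['no_seg', 'GA', 'LUGGAGE', 'GA-LUGGAGE']
time_forms = ['generic', 'altspec']
time_segmentations = ['no_seg', 'FIRST', 'COMMUTERS']
cost_forms = ['generic', 'altspec']

# Every specification is one choice from each of the four lists.
all_specifications = list(
    product(constant_segmentations, time_forms, time_segmentations, cost_forms)
)
n_specifications = len(all_specifications)
print(n_specifications)
```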

author: Michel Bierlaire, EPFL

date: Thu Jul 13 16:18:10 2023

import numpy as np
from IPython.core.display_functions import display

import biogeme.biogeme as bio
from biogeme import models
from biogeme.expressions import Beta
from biogeme.catalog import segmentation_catalogs, generic_alt_specific_catalogs
from biogeme.results import compile_estimation_results, pareto_optimal

from biogeme.data.swissmetro import (
    read_data,
    CHOICE,
    SM_AV,
    CAR_AV_SP,
    TRAIN_AV_SP,
    TRAIN_TT_SCALED,
    TRAIN_COST_SCALED,
    SM_TT_SCALED,
    SM_COST_SCALED,
    CAR_TT_SCALED,
    CAR_CO_SCALED,
)

Read the data

database = read_data()

Definition of the segmentations.

segmentation_ga = database.generate_segmentation(
    variable='GA', mapping={0: 'noGA', 1: 'GA'}
)

segmentation_luggage = database.generate_segmentation(
    variable='LUGGAGE', mapping={0: 'no_lugg', 1: 'one_lugg', 3: 'several_lugg'}
)

segmentation_first = database.generate_segmentation(
    variable='FIRST', mapping={0: '2nd_class', 1: '1st_class'}
)
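To see what a segmentation does to a parameter, here is a minimal sketch in plain Python. It assumes that the reference segment keeps the base parameter and that every other segment adds its own offset, which is what the parameter names in the results further down this page suggest (for instance ASC_TRAIN and ASC_TRAIN_GA). The function `segmented_value` is hypothetical and is not part of Biogeme; the two numbers are the ASC_TRAIN and ASC_TRAIN_GA estimates reported for the first model of the full results table.

```python
def segmented_value(reference, offsets, category):
    """Return the coefficient that applies to one observation.

    reference: value of the base parameter (reference segment).
    offsets: dict mapping a segment name to its additive offset.
    category: segment of the observation.
    """
    # The reference segment has no entry in offsets, so nothing is added.
    return reference + offsets.get(category, 0.0)


asc_train = -0.85                  # ASC_TRAIN, reference segment (no GA)
asc_train_offsets = {'GA': 1.29}   # ASC_TRAIN_GA

asc_train_no_ga = segmented_value(asc_train, asc_train_offsets, 'noGA')
asc_train_ga = segmented_value(asc_train, asc_train_offsets, 'GA')
print(asc_train_no_ga, round(asc_train_ga, 2))
```

Under this reading, the constant of the train alternative is negative for travelers without a GA and positive for GA holders.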

We consider two trip purposes: ‘commuters’ and anything else. As PURPOSE is not binary, we first need to define a binary variable.

database.data['COMMUTERS'] = np.where(database.data['PURPOSE'] == 1, 1, 0)

segmentation_purpose = database.generate_segmentation(
    variable='COMMUTERS', mapping={0: 'non_commuters', 1: 'commuters'}
)

Parameters to be estimated.

ASC_CAR = Beta('ASC_CAR', 0, None, None, 0)
ASC_TRAIN = Beta('ASC_TRAIN', 0, None, None, 0)
B_TIME = Beta('B_TIME', 0, None, None, 0)
B_COST = Beta('B_COST', 0, None, None, 0)

Catalogs for the alternative specific constants.

ASC_TRAIN_catalog, ASC_CAR_catalog = segmentation_catalogs(
    generic_name='ASC',
    beta_parameters=[ASC_TRAIN, ASC_CAR],
    potential_segmentations=(
        segmentation_ga,
        segmentation_luggage,
    ),
    maximum_number=2,
)
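The argument maximum_number limits how many segmentations can be combined at once. The sketch below reproduces that idea in plain Python; it is not Biogeme's implementation, and the helper names are hypothetical. The labels follow the convention visible in the glossary output on this page.

```python
from itertools import combinations


def segmentation_subsets(names, maximum_number):
    """List every subset of names with at most maximum_number elements."""
    subsets = []
    for size in range(maximum_number + 1):
        subsets.extend(combinations(names, size))
    return subsets


def label(subset):
    """Build a readable label; the empty subset means no segmentation."""
    return '-'.join(subset) if subset else 'no_seg'


# Constants: GA and LUGGAGE, combined up to two at a time.
asc_labels = [label(s) for s in segmentation_subsets(['GA', 'LUGGAGE'], 2)]
# Time coefficient: FIRST and COMMUTERS, at most one at a time.
time_labels = [label(s) for s in segmentation_subsets(['FIRST', 'COMMUTERS'], 1)]
print(asc_labels)
print(time_labels)
```

With maximum_number=2 the constants get four variants, and with maximum_number=1 the time coefficient gets three.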

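Catalog for the travel time coefficient. Note that the function returns a list with one entry per parameter, each entry being a dictionary of catalogs indexed by alternative. Here, the list contains only one entry. This is why there is a comma after “B_TIME_catalog_dict”: the single entry is unpacked from the list.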

(B_TIME_catalog_dict,) = generic_alt_specific_catalogs(
    generic_name='B_TIME',
    beta_parameters=[B_TIME],
    alternatives=('TRAIN', 'SM', 'CAR'),
    potential_segmentations=(
        segmentation_first,
        segmentation_purpose,
    ),
    maximum_number=1,
)
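The trailing comma is ordinary Python sequence unpacking. The sketch below uses a hypothetical stand-in function, `make_catalogs`, that returns one dictionary per parameter and plain strings instead of catalogs; it only illustrates the unpacking pattern, not the Biogeme function.

```python
def make_catalogs(parameter_names, alternatives):
    """Hypothetical stand-in: one dict per parameter, keyed by alternative."""
    return [
        {alt: f'{name}_{alt}' for alt in alternatives}
        for name in parameter_names
    ]


alternatives = ('TRAIN', 'SM', 'CAR')

# One parameter: the list has a single entry, unpacked with a trailing comma.
(time_dict,) = make_catalogs(['B_TIME'], alternatives)
time_label = time_dict['SM']

# Two parameters: the list has two entries, unpacked into two names.
time_dict_2, cost_dict_2 = make_catalogs(['B_TIME', 'B_COST'], alternatives)
cost_label = cost_dict_2['CAR']
print(time_label, cost_label)
```

Without the comma, the name would be bound to the whole list, and indexing it with 'SM' would fail.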

Catalog for the travel cost coefficient.

(B_COST_catalog_dict,) = generic_alt_specific_catalogs(
    generic_name='B_COST', beta_parameters=[B_COST], alternatives=('TRAIN', 'SM', 'CAR')
)

Definition of the utility functions.

V1 = (
    ASC_TRAIN_catalog
    + B_TIME_catalog_dict['TRAIN'] * TRAIN_TT_SCALED
    + B_COST_catalog_dict['TRAIN'] * TRAIN_COST_SCALED
)
V2 = (
    B_TIME_catalog_dict['SM'] * SM_TT_SCALED
    + B_COST_catalog_dict['SM'] * SM_COST_SCALED
)
V3 = (
    ASC_CAR_catalog
    + B_TIME_catalog_dict['CAR'] * CAR_TT_SCALED
    + B_COST_catalog_dict['CAR'] * CAR_CO_SCALED
)

Associate utility functions with the numbering of alternatives.

V = {1: V1, 2: V2, 3: V3}

Associate the availability conditions with the alternatives.

av = {1: TRAIN_AV_SP, 2: SM_AV, 3: CAR_AV_SP}

Definition of the model. This is the contribution of each observation to the log likelihood function.

logprob = models.loglogit(V, av, CHOICE)
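For intuition, the quantity computed for one observation can be sketched in plain Python: the utility of the chosen alternative minus the logarithm of the sum of exponentiated utilities over the available alternatives. The function `log_logit` and the numbers below are illustrative assumptions; the sketch works on floats, whereas Biogeme builds a symbolic expression.

```python
import math


def log_logit(utilities, availability, chosen):
    """Log of the logit probability of the chosen alternative.

    utilities: dict alternative -> utility value (float).
    availability: dict alternative -> 1 if available, 0 otherwise.
    chosen: key of the chosen alternative, assumed to be available.
    """
    available = [v for alt, v in utilities.items() if availability[alt]]
    # Subtract the maximum before exponentiating, for numerical stability.
    max_v = max(available)
    log_denominator = max_v + math.log(sum(math.exp(v - max_v) for v in available))
    return utilities[chosen] - log_denominator


toy_utilities = {1: -0.5, 2: 0.0, 3: -1.0}
toy_availability = {1: 1, 2: 1, 3: 0}   # alternative 3 is not available
toy_logprob = log_logit(toy_utilities, toy_availability, chosen=2)
print(toy_logprob)
```

Since alternative 3 is unavailable in this toy case, it does not enter the denominator, and the probabilities of alternatives 1 and 2 sum to one.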

Create the Biogeme object.

the_biogeme = bio.BIOGEME(database, logprob)
the_biogeme.modelName = 'b05alt_spec_segmentation'
the_biogeme.generate_html = False
the_biogeme.generate_pickle = False

Estimate the parameters.

dict_of_results = the_biogeme.estimate_catalog()

Number of estimated models.

print(f'A total of {len(dict_of_results)} models have been estimated')
A total of 48 models have been estimated

All estimation results

compiled_results, specs = compile_estimation_results(
    dict_of_results, use_short_names=True
)
display(compiled_results)
                                    Model_000000  ...       Model_000047
Number of estimated parameters                 8  ...                 10
Sample size                                10719  ...              10719
Final log likelihood                 -8279.12343  ...       -8281.995886
Akaike Information Criterion         16574.24686  ...       16583.991773
Bayesian Information Criterion      16632.485046  ...       16656.789504
ASC_CAR (t-test)                 -0.328  (-5.42)  ...    0.0273  (0.551)
ASC_CAR_GA (t-test)               -1.11  (-7.11)  ...     -1.23  (-7.85)
ASC_TRAIN (t-test)                -0.85  (-9.83)  ...        -1.5  (-18)
ASC_TRAIN_GA (t-test)               1.29  (13.5)  ...         1.37  (19)
B_COST_CAR (t-test)              -0.354  (-4.62)  ...
B_COST_SM (t-test)               -0.757  (-13.9)  ...
B_COST_TRAIN (t-test)             -1.04  (-9.99)  ...
B_TIME (t-test)                   -1.28  (-17.1)  ...     -1.19  (-18.2)
B_TIME_1st_class (t-test)                         ...
B_COST (t-test)                                   ...    -0.704  (-13.5)
B_TIME_CAR (t-test)                               ...
B_TIME_SM (t-test)                                ...
B_TIME_TRAIN (t-test)                             ...
ASC_CAR_one_lugg (t-test)                         ...  -0.0298  (-0.592)
ASC_CAR_several_lugg (t-test)                     ...    -0.455  (-2.09)
ASC_TRAIN_one_lugg (t-test)                       ...      0.562  (7.05)
ASC_TRAIN_several_lugg (t-test)                   ...      0.649  (3.75)
B_TIME_CAR_commuters (t-test)                     ...
B_TIME_SM_commuters (t-test)                      ...
B_TIME_TRAIN_commuters (t-test)                   ...
B_TIME_commuters (t-test)                         ...
B_TIME_CAR_1st_class (t-test)                     ...
B_TIME_SM_1st_class (t-test)                      ...
B_TIME_TRAIN_1st_class (t-test)                   ...

[29 rows x 48 columns]
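The two information criteria in the table are consistent with the standard definitions, AIC = 2K - 2LL and BIC = K ln(N) - 2LL, where K is the number of estimated parameters, N the sample size and LL the final log likelihood. The short check below applies these formulas to the first column of the table; the function names are ours.

```python
import math


def aic(log_likelihood, n_parameters):
    """Akaike Information Criterion."""
    return 2 * n_parameters - 2 * log_likelihood


def bic(log_likelihood, n_parameters, sample_size):
    """Bayesian Information Criterion."""
    return n_parameters * math.log(sample_size) - 2 * log_likelihood


# Values reported for the first model of the table.
aic_first = aic(-8279.12343, 8)
bic_first = bic(-8279.12343, 8, 10719)
print(aic_first, bic_first)
```

Both values agree with the reported 16574.24686 and 16632.485046 up to rounding of the log likelihood.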

Glossary

for short_name, spec in specs.items():
    print(f'{short_name}\t{spec}')
Model_000000    ASC:GA;B_COST_gen_altspec:altspec;B_TIME:no_seg;B_TIME_gen_altspec:generic
Model_000001    ASC:no_seg;B_COST_gen_altspec:altspec;B_TIME:FIRST;B_TIME_gen_altspec:generic
Model_000002    ASC:no_seg;B_COST_gen_altspec:generic;B_TIME:no_seg;B_TIME_gen_altspec:altspec
Model_000003    ASC:no_seg;B_COST_gen_altspec:altspec;B_TIME:no_seg;B_TIME_gen_altspec:altspec
Model_000004    ASC:LUGGAGE;B_COST_gen_altspec:generic;B_TIME:COMMUTERS;B_TIME_gen_altspec:altspec
Model_000005    ASC:LUGGAGE;B_COST_gen_altspec:generic;B_TIME:no_seg;B_TIME_gen_altspec:generic
Model_000006    ASC:no_seg;B_COST_gen_altspec:altspec;B_TIME:COMMUTERS;B_TIME_gen_altspec:altspec
Model_000007    ASC:LUGGAGE;B_COST_gen_altspec:generic;B_TIME:COMMUTERS;B_TIME_gen_altspec:generic
Model_000008    ASC:GA;B_COST_gen_altspec:generic;B_TIME:COMMUTERS;B_TIME_gen_altspec:generic
Model_000009    ASC:GA;B_COST_gen_altspec:generic;B_TIME:FIRST;B_TIME_gen_altspec:altspec
Model_000010    ASC:LUGGAGE;B_COST_gen_altspec:generic;B_TIME:FIRST;B_TIME_gen_altspec:altspec
Model_000011    ASC:GA-LUGGAGE;B_COST_gen_altspec:generic;B_TIME:COMMUTERS;B_TIME_gen_altspec:altspec
Model_000012    ASC:GA-LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:no_seg;B_TIME_gen_altspec:generic
Model_000013    ASC:GA;B_COST_gen_altspec:altspec;B_TIME:FIRST;B_TIME_gen_altspec:altspec
Model_000014    ASC:GA-LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:COMMUTERS;B_TIME_gen_altspec:generic
Model_000015    ASC:LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:FIRST;B_TIME_gen_altspec:altspec
Model_000016    ASC:no_seg;B_COST_gen_altspec:generic;B_TIME:COMMUTERS;B_TIME_gen_altspec:generic
Model_000017    ASC:LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:no_seg;B_TIME_gen_altspec:generic
Model_000018    ASC:GA;B_COST_gen_altspec:generic;B_TIME:no_seg;B_TIME_gen_altspec:altspec
Model_000019    ASC:no_seg;B_COST_gen_altspec:generic;B_TIME:FIRST;B_TIME_gen_altspec:altspec
Model_000020    ASC:GA;B_COST_gen_altspec:generic;B_TIME:no_seg;B_TIME_gen_altspec:generic
Model_000021    ASC:GA-LUGGAGE;B_COST_gen_altspec:generic;B_TIME:COMMUTERS;B_TIME_gen_altspec:generic
Model_000022    ASC:LUGGAGE;B_COST_gen_altspec:generic;B_TIME:no_seg;B_TIME_gen_altspec:altspec
Model_000023    ASC:LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:COMMUTERS;B_TIME_gen_altspec:generic
Model_000024    ASC:GA-LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:no_seg;B_TIME_gen_altspec:altspec
Model_000025    ASC:GA;B_COST_gen_altspec:altspec;B_TIME:no_seg;B_TIME_gen_altspec:altspec
Model_000026    ASC:GA;B_COST_gen_altspec:generic;B_TIME:COMMUTERS;B_TIME_gen_altspec:altspec
Model_000027    ASC:GA;B_COST_gen_altspec:altspec;B_TIME:COMMUTERS;B_TIME_gen_altspec:generic
Model_000028    ASC:LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:no_seg;B_TIME_gen_altspec:altspec
Model_000029    ASC:GA-LUGGAGE;B_COST_gen_altspec:generic;B_TIME:FIRST;B_TIME_gen_altspec:altspec
Model_000030    ASC:GA-LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:COMMUTERS;B_TIME_gen_altspec:altspec
Model_000031    ASC:no_seg;B_COST_gen_altspec:generic;B_TIME:FIRST;B_TIME_gen_altspec:generic
Model_000032    ASC:GA-LUGGAGE;B_COST_gen_altspec:generic;B_TIME:no_seg;B_TIME_gen_altspec:altspec
Model_000033    ASC:LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:COMMUTERS;B_TIME_gen_altspec:altspec
Model_000034    ASC:no_seg;B_COST_gen_altspec:generic;B_TIME:COMMUTERS;B_TIME_gen_altspec:altspec
Model_000035    ASC:GA;B_COST_gen_altspec:altspec;B_TIME:COMMUTERS;B_TIME_gen_altspec:altspec
Model_000036    ASC:GA-LUGGAGE;B_COST_gen_altspec:generic;B_TIME:FIRST;B_TIME_gen_altspec:generic
Model_000037    ASC:GA-LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:FIRST;B_TIME_gen_altspec:altspec
Model_000038    ASC:LUGGAGE;B_COST_gen_altspec:generic;B_TIME:FIRST;B_TIME_gen_altspec:generic
Model_000039    ASC:no_seg;B_COST_gen_altspec:altspec;B_TIME:COMMUTERS;B_TIME_gen_altspec:generic
Model_000040    ASC:GA;B_COST_gen_altspec:altspec;B_TIME:FIRST;B_TIME_gen_altspec:generic
Model_000041    ASC:no_seg;B_COST_gen_altspec:altspec;B_TIME:FIRST;B_TIME_gen_altspec:altspec
Model_000042    ASC:LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:FIRST;B_TIME_gen_altspec:generic
Model_000043    ASC:no_seg;B_COST_gen_altspec:altspec;B_TIME:no_seg;B_TIME_gen_altspec:generic
Model_000044    ASC:no_seg;B_COST_gen_altspec:generic;B_TIME:no_seg;B_TIME_gen_altspec:generic
Model_000045    ASC:GA-LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:FIRST;B_TIME_gen_altspec:generic
Model_000046    ASC:GA;B_COST_gen_altspec:generic;B_TIME:FIRST;B_TIME_gen_altspec:generic
Model_000047    ASC:GA-LUGGAGE;B_COST_gen_altspec:generic;B_TIME:no_seg;B_TIME_gen_altspec:generic
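Each specification string in the glossary is a list of `name:value` pairs separated by semicolons. If the strings need to be filtered or tabulated, a small helper such as the hypothetical `parse_specification` below turns one of them into a dictionary. It relies only on the format visible in the output above.

```python
def parse_specification(spec):
    """Split 'name:value;name:value' into a dict of name -> value."""
    return dict(item.split(':', 1) for item in spec.split(';'))


parsed = parse_specification(
    'ASC:GA;B_COST_gen_altspec:altspec;B_TIME:no_seg;B_TIME_gen_altspec:generic'
)
print(parsed['ASC'], parsed['B_TIME_gen_altspec'])
```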

Estimation results of the Pareto optimal models.

pareto_results = pareto_optimal(dict_of_results)
compiled_pareto_results, pareto_specs = compile_estimation_results(
    pareto_results, use_short_names=True
)
display(compiled_pareto_results)
                                    Model_000000  ...     Model_000008
Number of estimated parameters                 5  ...                9
Sample size                                10719  ...            10719
Final log likelihood                -8598.531023  ...     -8213.705718
Akaike Information Criterion        17207.062045  ...     16445.411435
Bayesian Information Criterion      17243.460911  ...     16510.929394
ASC_CAR (t-test)                 0.0091  (0.245)  ...  -0.449  (-7.17)
ASC_TRAIN (t-test)               -0.707  (-12.5)  ...  -0.997  (-11.1)
B_COST (t-test)                   -0.87  (-15.6)  ...
B_TIME (t-test)                  -0.916  (-10.9)  ...  -0.957  (-10.7)
B_TIME_1st_class (t-test)        -0.688  (-9.57)  ...  -0.692  (-9.34)
ASC_CAR_GA (t-test)                               ...    -1.02  (-6.5)
ASC_TRAIN_GA (t-test)                             ...     1.38  (14.6)
B_TIME_CAR (t-test)                               ...
B_TIME_CAR_commuters (t-test)                     ...
B_TIME_SM (t-test)                                ...
B_TIME_SM_commuters (t-test)                      ...
B_TIME_TRAIN (t-test)                             ...
B_TIME_TRAIN_commuters (t-test)                   ...
ASC_CAR_one_lugg (t-test)                         ...
ASC_CAR_several_lugg (t-test)                     ...
ASC_TRAIN_one_lugg (t-test)                       ...
ASC_TRAIN_several_lugg (t-test)                   ...
B_COST_CAR (t-test)                               ...  -0.329  (-4.21)
B_COST_SM (t-test)                                ...  -0.872  (-14.4)
B_COST_TRAIN (t-test)                             ...   -1.02  (-9.66)

[25 rows x 9 columns]

Glossary.

for short_name, spec in pareto_specs.items():
    print(f'{short_name}\t{spec}')
Model_000000    ASC:no_seg;B_COST_gen_altspec:generic;B_TIME:FIRST;B_TIME_gen_altspec:generic
Model_000001    ASC:GA;B_COST_gen_altspec:generic;B_TIME:COMMUTERS;B_TIME_gen_altspec:altspec
Model_000002    ASC:no_seg;B_COST_gen_altspec:generic;B_TIME:no_seg;B_TIME_gen_altspec:generic
Model_000003    ASC:GA-LUGGAGE;B_COST_gen_altspec:generic;B_TIME:COMMUTERS;B_TIME_gen_altspec:altspec
Model_000004    ASC:GA;B_COST_gen_altspec:generic;B_TIME:no_seg;B_TIME_gen_altspec:generic
Model_000005    ASC:GA;B_COST_gen_altspec:altspec;B_TIME:COMMUTERS;B_TIME_gen_altspec:altspec
Model_000006    ASC:GA;B_COST_gen_altspec:generic;B_TIME:FIRST;B_TIME_gen_altspec:generic
Model_000007    ASC:GA-LUGGAGE;B_COST_gen_altspec:altspec;B_TIME:COMMUTERS;B_TIME_gen_altspec:altspec
Model_000008    ASC:GA;B_COST_gen_altspec:altspec;B_TIME:FIRST;B_TIME_gen_altspec:generic

Total running time of the script: (0 minutes 18.528 seconds)
