Logit

Estimation of a logit model using sampling of alternatives.

author: Michel Bierlaire

date: Wed Nov 1 17:39:47 2023

import pandas as pd
from biogeme.sampling_of_alternatives import (
    SamplingContext,
    ChoiceSetsGeneration,
    GenerateModel,
    generate_segment_size,
)
import biogeme.biogeme_logging as blog
import biogeme.biogeme as bio
from compare import compare
from specification import V, combined_variables
from alternatives import (
    alternatives,
    ID_COLUMN,
    partitions,
)
Number of asian restaurants: 33
logger = blog.get_screen_logger(level=blog.INFO)

The data file contains several columns associated with synthetic choices. Here we arbitrarily select logit_4.

CHOICE_COLUMN = 'logit_4'
SAMPLE_SIZE = 10
PARTITION = 'asian'
MODEL_NAME = f'logit_{PARTITION}_{SAMPLE_SIZE}_alt'
FILE_NAME = f'{MODEL_NAME}.dat'
OBS_FILE = 'obs_choice.dat'
the_partition = partitions.get(PARTITION)
if the_partition is None:
    raise ValueError(f'Unknown partition: {PARTITION}')
segment_sizes = generate_segment_size(SAMPLE_SIZE, the_partition.number_of_segments())
observations = pd.read_csv(OBS_FILE)
context = SamplingContext(
    the_partition=the_partition,
    sample_sizes=segment_sizes,
    individuals=observations,
    choice_column=CHOICE_COLUMN,
    alternatives=alternatives,
    id_column=ID_COLUMN,
    biogeme_file_name=FILE_NAME,
    utility_function=V,
    combined_variables=combined_variables,
)
logger.info(context.reporting())
Size of the choice set: 100
Main partition: 2 segment(s) of size 33, 67
Main sample: 10: 5/33, 5/67
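
The report above says that the 10 sampled alternatives are split as 5 out of the 33 Asian restaurants and 5 out of the 67 others. The following is a minimal sketch of that kind of stratified draw in plain Python. It does not use Biogeme: the helpers `split_sample_size` and `draw_stratified_sample` are hypothetical names introduced for illustration, the even split is an assumption that happens to reproduce the 5/5 allocation, and the sketch ignores the requirement that the chosen alternative be part of the sample.

```python
import random


def split_sample_size(sample_size, number_of_segments):
    """Split a sample size as evenly as possible across segments (assumed rule)."""
    base, remainder = divmod(sample_size, number_of_segments)
    return [base + (1 if i < remainder else 0) for i in range(number_of_segments)]


def draw_stratified_sample(segments, sizes, rng):
    """Draw, without replacement, the requested number of items from each segment."""
    sample = []
    for segment, size in zip(segments, sizes):
        sample.extend(rng.sample(segment, size))
    return sample


# Illustrative alternative identifiers: 33 in the first segment, 67 in the second.
asian = list(range(33))
others = list(range(33, 100))

sizes = split_sample_size(10, 2)
sample = draw_stratified_sample([asian, others], sizes, random.Random(0))
print(sizes, sample)
```

Because each segment is sampled at a different rate (5/33 versus 5/67), the alternatives do not all have the same probability of entering the choice set, which is why the sampling scheme has to be passed to the model generation step.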
the_data_generation = ChoiceSetsGeneration(context=context)
the_model_generation = GenerateModel(context=context)
biogeme_database = the_data_generation.sample_and_merge(recycle=False)
Generating 10 alternatives for 10000 observations

100%|██████████| 10000/10000 [00:12<00:00, 792.30it/s]
Define new variables

Defining new variables...: 100%|██████████| 10/10 [00:01<00:00,  9.77it/s]
File logit_asian_10_alt.dat has been created.
logprob = the_model_generation.get_logit()
the_biogeme = bio.BIOGEME(biogeme_database, logprob)
the_biogeme.modelName = MODEL_NAME
File biogeme.toml has been parsed.

Calculate the null log likelihood for reporting.

the_biogeme.calculateNullLoglikelihood({i: 1 for i in range(SAMPLE_SIZE)})
-23025.850929942502
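
This value can be checked by hand. Under the null model, every one of the 10 sampled alternatives is equally likely, so each of the 10000 observations contributes ln(1/10) to the log likelihood. A minimal sketch in plain Python (the variable names are illustrative and not part of Biogeme):

```python
import math

number_of_observations = 10_000  # rows in the estimation database
sample_size = 10                 # alternatives in each sampled choice set

# Equal-probability model: each observation contributes ln(1 / sample_size).
null_loglikelihood = number_of_observations * math.log(1 / sample_size)
print(null_loglikelihood)
```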

Estimate the parameters.

results = the_biogeme.estimate(recycle=False)
*** Initial values of the parameters are obtained from the file __logit_asian_10_alt.iter
Parameter values restored from __logit_asian_10_alt.iter
Optimization algorithm: hybrid Newton/BFGS with simple bounds [simple_bounds]
** Optimization: Newton with trust region for simple bounds
Iter.    beta_chinese  beta_ethiopian     beta_french     beta_indian   beta_japanese     beta_korean   beta_lebanese   beta_log_dist    beta_mexican      beta_price     beta_rating     Function    Relgrad   Radius      Rho
    0            0.62            0.44            0.64            0.93             1.2            0.73            0.71            -0.6             1.2           -0.41            0.76      1.8e+04    1.9e-05       10        1   ++
    1            0.62            0.44            0.64            0.93             1.2            0.73            0.71            -0.6             1.2           -0.41            0.76      1.8e+04    1.5e-09       10        1   ++
Results saved in file logit_asian_10_alt~00.html
Results saved in file logit_asian_10_alt~00.pickle
print(results.short_summary())
Results for model logit_asian_10_alt
Nbr of parameters:              11
Sample size:                    10000
Excluded data:                  0
Null log likelihood:            -23025.85
Final log likelihood:           -18419.15
Likelihood ratio test (null):           9213.409
Rho square (null):                      0.2
Rho bar square (null):                  0.2
Akaike Information Criterion:   36860.29
Bayesian Information Criterion: 36939.61
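
The goodness-of-fit figures in this summary follow from the two log likelihoods, the number of parameters, and the sample size. The sketch below recomputes them with the standard textbook definitions, which is an assumption about what Biogeme reports, although the results agree with the summary up to rounding. The inputs are the rounded values printed above, so the last digit may differ.

```python
import math

null_ll = -23025.85      # null log likelihood, as reported
final_ll = -18419.15     # final log likelihood, as reported (rounded)
n_parameters = 11
n_observations = 10_000

# Likelihood ratio test against the null model.
likelihood_ratio = -2 * (null_ll - final_ll)

# Rho square, and its version penalized by the number of parameters.
rho_square = 1 - final_ll / null_ll
rho_bar_square = 1 - (final_ll - n_parameters) / null_ll

# Information criteria.
aic = 2 * n_parameters - 2 * final_ll
bic = n_parameters * math.log(n_observations) - 2 * final_ll

print(f'LR test:        {likelihood_ratio:.2f}')
print(f'Rho square:     {rho_square:.3f}')
print(f'Rho bar square: {rho_bar_square:.3f}')
print(f'AIC:            {aic:.2f}')
print(f'BIC:            {bic:.2f}')
```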
estimated_parameters = results.getEstimatedParameters()
estimated_parameters
                    Value  Rob. Std err  Rob. t-test  Rob. p-value
beta_chinese     0.624533      0.050571    12.349548           0.0
beta_ethiopian   0.441458      0.050674     8.711710           0.0
beta_french      0.641939      0.062615    10.252160           0.0
beta_indian      0.927575      0.042896    21.623899           0.0
beta_japanese    1.191176      0.046636    25.541986           0.0
beta_korean      0.726871      0.042680    17.030570           0.0
beta_lebanese    0.708292      0.062616    11.311743           0.0
beta_log_dist   -0.595134      0.015042   -39.564050           0.0
beta_mexican     1.216204      0.036573    33.254283           0.0
beta_price      -0.405947      0.012733   -31.882617           0.0
beta_rating      0.759850      0.015470    49.116968           0.0


df, msg = compare(estimated_parameters)
print(df)
              Name  True Value  Estimated Value    T-Test
0      beta_rating        0.75         0.759850 -0.636700
1       beta_price       -0.40        -0.405947  0.467069
2     beta_chinese        0.75         0.624533  2.480988
3    beta_japanese        1.25         1.191176  1.261347
4      beta_korean        0.75         0.726871  0.541905
5      beta_indian        1.00         0.927575  1.688384
6      beta_french        0.75         0.641939  1.725793
7     beta_mexican        1.25         1.216204  0.924075
8    beta_lebanese        0.75         0.708292  0.666092
9   beta_ethiopian        0.50         0.441458  1.155270
10   beta_log_dist       -0.60        -0.595134 -0.323468
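
The T-Test column appears to be the difference between the true and the estimated value, divided by the robust standard error. The `compare` module is local to this example and is not shown here, so the formula below is an assumption inferred from the printed numbers, not its actual implementation. The sketch recomputes three of the rows in plain Python:

```python
# True values and (estimate, robust standard error) pairs copied from the
# two tables above. Only three parameters are reproduced for brevity.
true_values = {
    'beta_rating': 0.75,
    'beta_price': -0.40,
    'beta_chinese': 0.75,
}
estimates = {
    'beta_rating': (0.759850, 0.015470),
    'beta_price': (-0.405947, 0.012733),
    'beta_chinese': (0.624533, 0.050571),
}

# Assumed formula: (true value - estimated value) / robust standard error.
t_tests = {
    name: (true_values[name] - value) / std_err
    for name, (value, std_err) in estimates.items()
}

for name, t_test in t_tests.items():
    print(f'{name:>14}  {t_test: .4f}')
```

With this reading, an absolute value below 1.96 means the true value lies within the 95% confidence interval of the estimate. Among all the rows in the table, only beta_chinese (2.48) falls outside it.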
print(msg)
Parameters not estimated: ['mu_asian', 'mu_downtown']

Total running time of the script: (0 minutes 16.782 seconds)
