Estimation and simulation of a nested logit model

We estimate a nested logit model and we perform simulation using the estimated model.

author:

Michel Bierlaire, EPFL

date:

Wed Apr 12 21:05:16 2023

from IPython.core.display_functions import display

from biogeme import models
import biogeme.biogeme as bio
from biogeme.data.optima import read_data
from scenarios import scenario

Obtain the specification for the default scenario. The definition of the scenarios is available in Specification of a nested logit model.

V, nests, Choice, _ = scenario()

The choice model is a nested logit, with availability conditions For estimation, we need the log of the probability.

logprob = models.lognested(util=V, availability=None, nests=nests, choice=Choice)

Get the database

database = read_data()

Create the Biogeme object for estimation.

the_biogeme = bio.BIOGEME(database, logprob)
the_biogeme.modelName = 'b02estimation'

Estimate the parameters. Perform bootstrapping.

the_biogeme.bootstrap_samples = 100
results = the_biogeme.estimate(run_bootstrap=True)
  0%|          | 0/100 [00:00<?, ?it/s]
  1%|          | 1/100 [00:00<00:25,  3.84it/s]
  2%|▏         | 2/100 [00:00<00:22,  4.41it/s]
  3%|▎         | 3/100 [00:00<00:24,  3.92it/s]
  4%|▍         | 4/100 [00:00<00:22,  4.18it/s]
  5%|▌         | 5/100 [00:01<00:23,  4.00it/s]
  6%|▌         | 6/100 [00:01<00:23,  3.99it/s]
  7%|▋         | 7/100 [00:01<00:21,  4.30it/s]
  8%|▊         | 8/100 [00:01<00:22,  4.00it/s]
  9%|▉         | 9/100 [00:02<00:23,  3.80it/s]
 10%|█         | 10/100 [00:02<00:23,  3.79it/s]
 11%|█         | 11/100 [00:02<00:23,  3.84it/s]
 12%|█▏        | 12/100 [00:03<00:25,  3.46it/s]
 13%|█▎        | 13/100 [00:03<00:27,  3.17it/s]
 14%|█▍        | 14/100 [00:03<00:26,  3.26it/s]
 15%|█▌        | 15/100 [00:04<00:25,  3.32it/s]
 16%|█▌        | 16/100 [00:04<00:28,  2.96it/s]
 17%|█▋        | 17/100 [00:04<00:25,  3.23it/s]
 18%|█▊        | 18/100 [00:05<00:27,  2.95it/s]
 19%|█▉        | 19/100 [00:05<00:29,  2.76it/s]
 20%|██        | 20/100 [00:05<00:26,  3.06it/s]
 21%|██        | 21/100 [00:06<00:25,  3.06it/s]
 22%|██▏       | 22/100 [00:06<00:23,  3.28it/s]
 23%|██▎       | 23/100 [00:06<00:24,  3.21it/s]
 24%|██▍       | 24/100 [00:06<00:21,  3.47it/s]
 25%|██▌       | 25/100 [00:07<00:19,  3.79it/s]
 26%|██▌       | 26/100 [00:07<00:20,  3.60it/s]
 27%|██▋       | 27/100 [00:07<00:19,  3.84it/s]
 28%|██▊       | 28/100 [00:07<00:19,  3.70it/s]
 29%|██▉       | 29/100 [00:08<00:20,  3.48it/s]
 30%|███       | 30/100 [00:08<00:21,  3.30it/s]
 31%|███       | 31/100 [00:08<00:19,  3.49it/s]
 32%|███▏      | 32/100 [00:09<00:23,  2.95it/s]
 33%|███▎      | 33/100 [00:09<00:26,  2.57it/s]
 34%|███▍      | 34/100 [00:10<00:24,  2.66it/s]
 35%|███▌      | 35/100 [00:10<00:23,  2.77it/s]
 36%|███▌      | 36/100 [00:10<00:21,  2.93it/s]
 37%|███▋      | 37/100 [00:11<00:22,  2.83it/s]
 38%|███▊      | 38/100 [00:11<00:22,  2.75it/s]
 39%|███▉      | 39/100 [00:12<00:23,  2.61it/s]
 40%|████      | 40/100 [00:12<00:21,  2.75it/s]
 41%|████      | 41/100 [00:12<00:21,  2.69it/s]
 42%|████▏     | 42/100 [00:12<00:19,  2.99it/s]
 43%|████▎     | 43/100 [00:13<00:16,  3.48it/s]
 44%|████▍     | 44/100 [00:13<00:14,  3.77it/s]
 45%|████▌     | 45/100 [00:13<00:14,  3.86it/s]
 46%|████▌     | 46/100 [00:13<00:15,  3.46it/s]
 47%|████▋     | 47/100 [00:14<00:15,  3.43it/s]
 48%|████▊     | 48/100 [00:14<00:15,  3.30it/s]
 49%|████▉     | 49/100 [00:14<00:15,  3.24it/s]
 50%|█████     | 50/100 [00:15<00:15,  3.13it/s]
 51%|█████     | 51/100 [00:15<00:14,  3.34it/s]
 52%|█████▏    | 52/100 [00:15<00:14,  3.31it/s]
 53%|█████▎    | 53/100 [00:16<00:13,  3.43it/s]
 54%|█████▍    | 54/100 [00:16<00:13,  3.37it/s]
 55%|█████▌    | 55/100 [00:16<00:11,  3.96it/s]
 56%|█████▌    | 56/100 [00:16<00:12,  3.59it/s]
 57%|█████▋    | 57/100 [00:17<00:12,  3.55it/s]
 58%|█████▊    | 58/100 [00:17<00:12,  3.42it/s]
 59%|█████▉    | 59/100 [00:17<00:10,  3.77it/s]
 60%|██████    | 60/100 [00:18<00:10,  3.64it/s]
 61%|██████    | 61/100 [00:18<00:11,  3.49it/s]
 62%|██████▏   | 62/100 [00:18<00:10,  3.69it/s]
 63%|██████▎   | 63/100 [00:18<00:10,  3.62it/s]
 64%|██████▍   | 64/100 [00:19<00:10,  3.41it/s]
 65%|██████▌   | 65/100 [00:19<00:10,  3.42it/s]
 66%|██████▌   | 66/100 [00:19<00:11,  3.00it/s]
 67%|██████▋   | 67/100 [00:20<00:10,  3.27it/s]
 68%|██████▊   | 68/100 [00:20<00:09,  3.40it/s]
 69%|██████▉   | 69/100 [00:20<00:09,  3.34it/s]
 70%|███████   | 70/100 [00:21<00:09,  3.19it/s]
 71%|███████   | 71/100 [00:21<00:08,  3.29it/s]
 72%|███████▏  | 72/100 [00:21<00:09,  3.09it/s]
 73%|███████▎  | 73/100 [00:21<00:08,  3.23it/s]
 74%|███████▍  | 74/100 [00:22<00:07,  3.25it/s]
 75%|███████▌  | 75/100 [00:22<00:07,  3.25it/s]
 76%|███████▌  | 76/100 [00:22<00:07,  3.39it/s]
 77%|███████▋  | 77/100 [00:23<00:06,  3.31it/s]
 78%|███████▊  | 78/100 [00:23<00:06,  3.64it/s]
 79%|███████▉  | 79/100 [00:23<00:05,  3.87it/s]
 80%|████████  | 80/100 [00:23<00:05,  3.51it/s]
 81%|████████  | 81/100 [00:24<00:05,  3.32it/s]
 82%|████████▏ | 82/100 [00:24<00:04,  3.60it/s]
 83%|████████▎ | 83/100 [00:24<00:05,  3.39it/s]
 84%|████████▍ | 84/100 [00:25<00:04,  3.64it/s]
 85%|████████▌ | 85/100 [00:25<00:04,  3.72it/s]
 86%|████████▌ | 86/100 [00:25<00:03,  3.88it/s]
 87%|████████▋ | 87/100 [00:25<00:03,  3.55it/s]
 88%|████████▊ | 88/100 [00:26<00:03,  3.65it/s]
 89%|████████▉ | 89/100 [00:26<00:03,  3.51it/s]
 90%|█████████ | 90/100 [00:26<00:02,  3.35it/s]
 91%|█████████ | 91/100 [00:27<00:02,  3.39it/s]
 92%|█████████▏| 92/100 [00:27<00:02,  3.29it/s]
 93%|█████████▎| 93/100 [00:27<00:02,  3.20it/s]
 94%|█████████▍| 94/100 [00:28<00:01,  3.19it/s]
 95%|█████████▌| 95/100 [00:28<00:01,  3.21it/s]
 96%|█████████▌| 96/100 [00:28<00:01,  3.10it/s]
 97%|█████████▋| 97/100 [00:29<00:00,  3.08it/s]
 98%|█████████▊| 98/100 [00:29<00:00,  3.37it/s]
 99%|█████████▉| 99/100 [00:29<00:00,  3.31it/s]
100%|██████████| 100/100 [00:29<00:00,  3.30it/s]
100%|██████████| 100/100 [00:29<00:00,  3.34it/s]

Get the results in a pandas table

pandas_results = results.get_estimated_parameters()
display(pandas_results)
                         Value  Rob. Std err  Rob. t-test  Rob. p-value
ASC_CAR               0.261239      0.100102     2.609731  9.061333e-03
ASC_SM                0.059004      0.216071     0.273076  7.847944e-01
BETA_COST            -0.715995      0.138326    -5.176125  2.265416e-07
BETA_DIST_FEMALE     -0.830051      0.192377    -4.314718  1.598070e-05
BETA_DIST_MALE       -0.685047      0.160237    -4.275212  1.909554e-05
BETA_DIST_UNREPORTED -0.700981      0.194954    -3.595632  3.236051e-04
BETA_TIME_FULLTIME   -1.596798      0.332759    -4.798660  1.597306e-06
BETA_TIME_OTHER      -0.577377      0.296434    -1.947740  5.144604e-02
mu_nocar              1.530883      0.305456     5.011799  5.392354e-07

Simulation

simulated_choices = logprob.get_value_c(
    betas=results.get_beta_values(), database=database
)
display(simulated_choices)
[-0.65046923 -1.4335904  -0.13265185 ... -0.37245933 -0.29545036
 -0.26736437]
loglikelihood = logprob.get_value_c(
    betas=results.get_beta_values(),
    database=database,
    aggregation=True,
)
print(f'Final log likelihood:     {results.data.logLike}')
print(f'Simulated log likelihood: {loglikelihood}')
Final log likelihood:     -1298.4984028081865
Simulated log likelihood: -1298.4984028081865

Total running time of the script: (0 minutes 31.239 seconds)

Gallery generated by Sphinx-Gallery