Simulation of a choice model
We use an estimated model to perform various simulations.
- author:
Michel Bierlaire, EPFL
- date:
Wed Apr 12 21:04:33 2023
import sys
import time
import pandas as pd
from biogeme import models
import biogeme.biogeme as bio
import biogeme.exceptions as excep
import biogeme.results as res
from optima_data import database, normalized_weight
from scenarios import scenario
Obtain the specification for the default scenario. The definition of the scenarios is available in Specification of a nested logit model.
V, nests, _, _ = scenario()
V_PT = V[0]
V_CAR = V[1]
V_SM = V[2]
Obtain the expression for the choice probability of each alternative.
prob_PT = models.nested(V, None, nests, 0)
prob_CAR = models.nested(V, None, nests, 1)
prob_SM = models.nested(V, None, nests, 2)
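To make the nested logit probability concrete, here is a minimal NumPy sketch of the formula for a single observation. It is not Biogeme's implementation. The function name `nested_logit_probabilities` and the nest format (a list of `(mu_m, [alternative ids])` pairs) are hypothetical, and the sketch assumes that the top-level scale is normalized to 1 and that all alternatives are available.

```python
import numpy as np


def nested_logit_probabilities(utilities, nests):
    """Nested logit probabilities for one observation (illustrative only).

    utilities: dict mapping alternative id -> utility value.
    nests: list of (mu_m, [alternative ids]); hypothetical format.
    Assumes the top-level scale parameter is normalized to 1.
    """
    # Sum of exp(mu_m * V_j) within each nest.
    within_sums = [
        sum(np.exp(mu * utilities[j]) for j in alts) for mu, alts in nests
    ]
    # Inclusive value ("logsum") of each nest: (1 / mu_m) * log(sum).
    logsums = [np.log(s) / mu for (mu, _), s in zip(nests, within_sums)]
    # Denominator of the nest-level (marginal) probability.
    denominator = sum(np.exp(ls) for ls in logsums)

    probabilities = {}
    for (mu, alts), s, ls in zip(nests, within_sums, logsums):
        prob_nest = np.exp(ls) / denominator
        for i in alts:
            # P(i) = P(nest) * P(i | nest)
            probabilities[i] = prob_nest * np.exp(mu * utilities[i]) / s
    return probabilities


# Hypothetical utilities for three alternatives, with alternatives 1 and 2
# sharing a nest.
example = nested_logit_probabilities(
    {0: -0.2, 1: 0.1, 2: -1.0}, [(1.0, [0]), (1.5, [1, 2])]
)
print(example)
```

When every nest parameter equals 1, the expression reduces to the multinomial logit.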
# Read the estimation results from the file
try:
    results = res.bioResults(pickleFile='saved_results/b02estimation.pickle')
except excep.BiogemeError:
    sys.exit(
        'Run first the script b02estimation.py '
        'in order to generate the '
        'file b02estimation.pickle.'
    )
We now simulate various expressions on the database, and store the results in a Pandas dataframe.
start_time = time.time()
simulate_formulas = {
    'weight': normalized_weight.getValue_c(
        betas=results.getBetaValues(), database=database, prepareIds=True
    ),
    'Utility PT': V_PT.getValue_c(
        betas=results.getBetaValues(), database=database, prepareIds=True
    ),
    'Utility car': V_CAR.getValue_c(
        betas=results.getBetaValues(), database=database, prepareIds=True
    ),
    'Utility SM': V_SM.getValue_c(
        betas=results.getBetaValues(), database=database, prepareIds=True
    ),
    'Prob. PT': prob_PT.getValue_c(
        betas=results.getBetaValues(), database=database, prepareIds=True
    ),
    'Prob. car': prob_CAR.getValue_c(
        betas=results.getBetaValues(), database=database, prepareIds=True
    ),
    'Prob. SM': prob_SM.getValue_c(
        betas=results.getBetaValues(), database=database, prepareIds=True
    ),
}
simulated_values = pd.DataFrame.from_dict(
    simulate_formulas,
)
print(
    f'--- Execution time with getValue_c: '
    f'{time.time() - start_time:.2f} seconds ---'
)
--- Execution time with getValue_c: 0.32 seconds ---
We now perform the same simulation using Biogeme. The results are identical. Biogeme recycles calculations performed for one expression when evaluating the others, which is intended to reduce the execution time, although in the run reported here it took 0.42 seconds, against 0.32 seconds with getValue_c.
A dictionary with the requested expressions must be provided to Biogeme.
simulate = {
    'weight': normalized_weight,
    'Utility PT': V_PT,
    'Utility car': V_CAR,
    'Utility SM': V_SM,
    'Prob. PT': prob_PT,
    'Prob. car': prob_CAR,
    'Prob. SM': prob_SM,
}
start_time = time.time()
the_biogeme = bio.BIOGEME(database, simulate)
biogeme_simulation = the_biogeme.simulate(results.getBetaValues())
print(
    f'--- Execution time with Biogeme: '
    f'{time.time() - start_time:.2f} seconds ---'
)
--- Execution time with Biogeme: 0.42 seconds ---
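The general idea of recycling calculations across expressions can be illustrated outside Biogeme with a toy NumPy sketch. It does not reproduce Biogeme's internals. The utilities are randomly generated, the names `prob_separately`, `separate` and `joint` are hypothetical, and a plain logit is used for simplicity. Evaluating each probability on its own recomputes the exponentials and the denominator every time, whereas a joint evaluation computes them once and shares them.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical utilities: 1000 observations, 3 alternatives.
utilities = rng.normal(size=(1000, 3))


def prob_separately(values, alternative):
    """Logit probability of one alternative, recomputing everything."""
    exp_values = np.exp(values)
    return exp_values[:, alternative] / exp_values.sum(axis=1)


# Each call repeats the exponentials and the row sums.
separate = np.column_stack(
    [prob_separately(utilities, i) for i in range(3)]
)

# Joint evaluation: exponentials and row sums are computed once and reused.
exp_utilities = np.exp(utilities)
joint = exp_utilities / exp_utilities.sum(axis=1, keepdims=True)

# Both approaches yield the same probabilities.
print(np.allclose(separate, joint))
```

On small data sets the saving may be offset by fixed setup costs, so measured timings can go either way.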
Let’s print the two results to show that the simulated values are identical. Only the row labels differ.
Without Biogeme
print(simulated_values)
weight Utility PT Utility car ... Prob. PT Prob. car Prob. SM
0 0.886023 -0.245379 -0.156238 ... 0.476807 0.521791 0.001402
1 0.861136 -0.451576 0.198134 ... 0.238681 0.563268 0.198051
2 0.861136 -2.027748 -0.047179 ... 0.119136 0.875824 0.005040
3 0.957386 -2.290720 0.030607 ... 0.051136 0.814110 0.134754
4 0.861136 -1.022414 0.009467 ... 0.257192 0.732059 0.010749
... ... ... ... ... ... ... ...
1901 2.036009 -1.172087 -0.256681 ... 0.285872 0.714085 0.000043
1902 0.861136 -2.141291 -0.408666 ... 0.149688 0.849040 0.001272
1903 0.861136 -0.996681 0.068780 ... 0.205901 0.689083 0.105016
1904 0.957386 -1.157102 0.010095 ... 0.219850 0.744274 0.035876
1905 0.957386 -1.306555 -0.048721 ... 0.209149 0.765468 0.025383
[1906 rows x 7 columns]
With Biogeme
print(biogeme_simulation)
weight Utility PT Utility car ... Prob. PT Prob. car Prob. SM
0 0.886023 -0.245379 -0.156238 ... 0.476807 0.521791 0.001402
2 0.861136 -0.451576 0.198134 ... 0.238681 0.563268 0.198051
3 0.861136 -2.027748 -0.047179 ... 0.119136 0.875824 0.005040
4 0.957386 -2.290720 0.030607 ... 0.051136 0.814110 0.134754
5 0.861136 -1.022414 0.009467 ... 0.257192 0.732059 0.010749
... ... ... ... ... ... ... ...
2259 2.036009 -1.172087 -0.256681 ... 0.285872 0.714085 0.000043
2261 0.861136 -2.141291 -0.408666 ... 0.149688 0.849040 0.001272
2262 0.861136 -0.996681 0.068780 ... 0.205901 0.689083 0.105016
2263 0.957386 -1.157102 0.010095 ... 0.219850 0.744274 0.035876
2264 0.957386 -1.306555 -0.048721 ... 0.209149 0.765468 0.025383
[1906 rows x 7 columns]
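Rather than comparing the printed tables by eye, the two data frames can be compared programmatically. Here is a minimal sketch, assuming that the rows are in the same order in both frames and that only the row labels differ, as in the printouts above. The helper `same_values` and its tolerance are hypothetical. The toy frames use two columns of the first two printed rows.

```python
import numpy as np
import pandas as pd


def same_values(left, right, tol=1e-6):
    """Compare two data frames by position, ignoring the row labels."""
    if left.shape != right.shape:
        return False
    if list(left.columns) != list(right.columns):
        return False
    return bool(np.allclose(left.to_numpy(), right.to_numpy(), atol=tol))


# Toy frames: same values, different row labels.
frame_a = pd.DataFrame(
    {'Prob. PT': [0.476807, 0.238681], 'Prob. car': [0.521791, 0.563268]}
)
frame_b = frame_a.copy()
frame_b.index = [0, 2]

print(same_values(frame_a, frame_b))
```

In the script above, the call would be `same_values(simulated_values, biogeme_simulation)`.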
Total running time of the script: (0 minutes 0.777 seconds)