Simulation of a choice model
We use an estimated model to perform various simulations.
Michel Bierlaire, EPFL Sat Jun 28 2025, 16:56:26
import sys
import time
import pandas as pd
from biogeme.biogeme import BIOGEME
from biogeme.calculator import get_value_c
from biogeme.data.optima import normalized_weight, read_data
from biogeme.models import nested
from biogeme.results_processing import EstimationResults
from scenarios import scenario
Obtain the specification for the default scenario. The definition of the scenarios is available in Specification of a nested logit model.
v, nests, _, _ = scenario()
v_pt = v[0]
v_car = v[1]
v_sm = v[2]
Obtain the expression for the choice probability of each alternative.
prob_pt = nested(v, None, nests, 0)
prob_car = nested(v, None, nests, 1)
prob_sm = nested(v, None, nests, 2)
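For intuition, the kind of probability that these expressions represent can be sketched numerically. The block below is a minimal NumPy sketch of the standard nested logit formula, assuming the top-level scale parameter is normalized to 1. It is not Biogeme's implementation: the function name `nested_logit_probabilities`, the nest structure, the scale value 1.5, and the utility values are all hypothetical and are not taken from the Optima model.

```python
import numpy as np


def nested_logit_probabilities(utilities, nests):
    """Nested logit probabilities for one observation (illustrative sketch).

    utilities: dict mapping alternative id -> utility value.
    nests: list of (mu_m, [alternative ids]) tuples, one per nest.
    The top-level scale is assumed to be normalized to 1.
    """
    # Scaled log-sum of each nest: (1 / mu_m) * log(sum_j exp(mu_m * V_j)).
    nest_terms = []
    for mu_m, members in nests:
        logsum = np.log(sum(np.exp(mu_m * utilities[j]) for j in members))
        nest_terms.append(logsum / mu_m)
    denominator = sum(np.exp(term) for term in nest_terms)

    probabilities = {}
    for (mu_m, members), term in zip(nests, nest_terms):
        # Probability of choosing the nest.
        prob_nest = np.exp(term) / denominator
        # Probability of the alternative, conditional on the nest.
        within = sum(np.exp(mu_m * utilities[j]) for j in members)
        for i in members:
            probabilities[i] = prob_nest * np.exp(mu_m * utilities[i]) / within
    return probabilities


# Hypothetical values: three alternatives, two of them sharing a nest.
example_v = {0: -0.5, 1: 0.2, 2: -1.0}
example_nests = [(1.0, [2]), (1.5, [0, 1])]
example_probs = nested_logit_probabilities(example_v, example_nests)
print(example_probs)
```

The probabilities sum to one, and when every nest parameter equals 1 the formula collapses to the multinomial logit.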
# Read the estimation results from the file
try:
    results = EstimationResults.from_yaml_file(
        filename='saved_results/b02estimation.yaml'
    )
except FileNotFoundError:
    # The results file is produced by the estimation script.
    sys.exit(
        'Run first the script b02estimation.py '
        'in order to generate the '
        'file b02estimation.yaml.'
    )
Read the database.
database = read_data()
We now simulate various expressions on the database, and store the results in a Pandas dataframe.
start_time = time.time()
simulate_formulas = {
    'weight': get_value_c(
        expression=normalized_weight,
        betas=results.get_beta_values(),
        database=database,
        numerically_safe=False,
        use_jit=True,
    ),
    'Utility PT': get_value_c(
        expression=v_pt,
        betas=results.get_beta_values(),
        database=database,
        numerically_safe=False,
        use_jit=True,
    ),
    'Utility car': get_value_c(
        expression=v_car,
        betas=results.get_beta_values(),
        database=database,
        numerically_safe=False,
        use_jit=True,
    ),
    'Utility SM': get_value_c(
        expression=v_sm,
        betas=results.get_beta_values(),
        database=database,
        numerically_safe=False,
        use_jit=True,
    ),
    'Prob. PT': get_value_c(
        expression=prob_pt,
        betas=results.get_beta_values(),
        database=database,
        numerically_safe=False,
        use_jit=True,
    ),
    'Prob. car': get_value_c(
        expression=prob_car,
        betas=results.get_beta_values(),
        database=database,
        numerically_safe=False,
        use_jit=True,
    ),
    'Prob. SM': get_value_c(
        expression=prob_sm,
        betas=results.get_beta_values(),
        database=database,
        numerically_safe=False,
        use_jit=True,
    ),
}
simulated_values = pd.DataFrame.from_dict(simulate_formulas)
end_time = time.time()
print(
    f'--- Execution time without Biogeme: '
    f'{end_time - start_time:.2f} seconds ---'
)
--- Execution time without Biogeme: 0.77 seconds ---
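The one-expression-at-a-time pattern used above can be written more compactly with a dictionary comprehension. The sketch below illustrates only that pattern: `simulate_each`, `toy_evaluate`, and the toy data are hypothetical stand-ins, and each "expression" here is a plain Python function rather than a Biogeme expression. With Biogeme, one could presumably pass something like `functools.partial(get_value_c, betas=..., database=..., numerically_safe=False, use_jit=True)` as the evaluator, but that usage is an assumption and is not tested here.

```python
from functools import partial

import pandas as pd


def simulate_each(expressions, evaluate):
    """Evaluate every named expression independently, one column per name."""
    return pd.DataFrame(
        {name: evaluate(expr) for name, expr in expressions.items()}
    )


# Toy stand-in for a database (hypothetical values).
toy_data = pd.DataFrame({'time': [10.0, 20.0], 'cost': [1.0, 3.0]})


def toy_evaluate(expression, data):
    """Hypothetical evaluator: apply a function of the data, return an array."""
    return expression(data).to_numpy()


toy_expressions = {
    'Utility A': lambda d: -0.1 * d['time'],
    'Utility B': lambda d: -0.5 * d['cost'],
}

toy_result = simulate_each(toy_expressions, partial(toy_evaluate, data=toy_data))
print(toy_result)
```

Each expression is still evaluated separately, so nothing is shared between the evaluations.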
We now perform the same simulation using the BIOGEME object. The results are identical, but the syntax is simpler and, in this run, the execution is slightly faster (0.62 seconds versus 0.77 seconds). Indeed, Biogeme reuses the calculations performed for one expression when evaluating the others.
A dictionary with the requested expressions must be provided to Biogeme.
simulate = {
    'weight': normalized_weight,
    'Utility PT': v_pt,
    'Utility car': v_car,
    'Utility SM': v_sm,
    'Prob. PT': prob_pt,
    'Prob. car': prob_car,
    'Prob. SM': prob_sm,
}
start_time = time.time()
the_biogeme = BIOGEME(database, simulate)
# Retrieve the estimated parameter values once and reuse them.
the_betas = results.get_beta_values()
biogeme_simulation = the_biogeme.simulate(the_betas)
end_time = time.time()
print(
    f'--- Execution time with Biogeme: '
    f'{end_time - start_time:.2f} seconds ---'
)
--- Execution time with Biogeme: 0.62 seconds ---
We print both results to show that they are identical.
Without Biogeme
print(simulated_values)
        weight  Utility PT  Utility car  ...  Prob. PT  Prob. car  Prob. SM
0     0.893779   -0.234985    -0.156370  ...  0.479431   0.519165  0.001404
1     0.868674   -0.442406     0.195938  ...  0.241046   0.560868  0.198086
2     0.868674   -2.021524    -0.048079  ...  0.119893   0.875040  0.005067
3     0.965766   -2.293563     0.027404  ...  0.051229   0.813507  0.135264
4     0.868674   -1.010973     0.008391  ...  0.259609   0.729616  0.010775
...        ...         ...          ...  ...       ...        ...       ...
1894  2.053830   -1.156962    -0.256143  ...  0.288859   0.711097  0.000044
1895  0.868674   -2.145229    -0.412661  ...  0.149688   0.849018  0.001294
1896  0.868674   -0.998305     0.065662  ...  0.205928   0.688197  0.105875
1897  0.965766   -1.145931     0.009557  ...  0.222025   0.742202  0.035772
1898  0.965766   -1.293037    -0.048959  ...  0.211584   0.763156  0.025259

[1899 rows x 7 columns]
With Biogeme
print(biogeme_simulation)
        weight  Utility PT  Utility car  ...  Prob. PT  Prob. car  Prob. SM
0     0.893779   -0.234985    -0.156370  ...  0.479431   0.519165  0.001404
1     0.868674   -0.442406     0.195938  ...  0.241046   0.560868  0.198086
2     0.868674   -2.021524    -0.048079  ...  0.119893   0.875040  0.005067
3     0.965766   -2.293563     0.027404  ...  0.051229   0.813507  0.135264
4     0.868674   -1.010973     0.008391  ...  0.259609   0.729616  0.010775
...        ...         ...          ...  ...       ...        ...       ...
1894  2.053830   -1.156962    -0.256143  ...  0.288859   0.711097  0.000044
1895  0.868674   -2.145229    -0.412661  ...  0.149688   0.849018  0.001294
1896  0.868674   -0.998305     0.065662  ...  0.205928   0.688197  0.105875
1897  0.965766   -1.145931     0.009557  ...  0.222025   0.742202  0.035772
1898  0.965766   -1.293037    -0.048959  ...  0.211584   0.763156  0.025259

[1899 rows x 7 columns]
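Comparing two printed dataframes by eye covers only the rows and columns that are displayed. A programmatic check is more reliable. The sketch below assumes that both results are Pandas dataframes with the same column order, as `simulated_values` and `biogeme_simulation` appear to be. The helper `frames_match` and its tolerances are hypothetical, and the demonstration uses a few values copied from the output above rather than the real simulation results.

```python
import numpy as np
import pandas as pd


def frames_match(left, right, rtol=1e-9, atol=1e-12):
    """Return True if both dataframes have the same layout and close values."""
    if list(left.columns) != list(right.columns) or len(left) != len(right):
        return False
    return bool(
        np.allclose(left.to_numpy(), right.to_numpy(), rtol=rtol, atol=atol)
    )


# Illustrative data: the first two rows of two columns printed above.
demo_a = pd.DataFrame(
    {'Prob. PT': [0.479431, 0.241046], 'Prob. car': [0.519165, 0.560868]}
)
# A copy perturbed at the level of floating-point noise.
demo_b = demo_a + 1e-13

print(frames_match(demo_a, demo_b))
```

In the script itself, the equivalent call would be `frames_match(simulated_values, biogeme_simulation)`.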
Total running time of the script: (0 minutes 2.120 seconds)