Simulation of a choice model
We use an estimated model to perform various simulations.
Michel Bierlaire, EPFL Sat Jun 28 2025, 16:56:26
import sys
import time
import pandas as pd
from scenarios import scenario
from biogeme.biogeme import BIOGEME
from biogeme.data.optima import normalized_weight, read_data
from biogeme.jax_calculator import get_value_c
from biogeme.models import nested
from biogeme.results_processing import EstimationResults
Obtain the specification for the default scenario. The definition of the scenarios is available in Specification of a nested logit model.
v, nests, _, _ = scenario()
v_pt = v[0]
v_car = v[1]
v_sm = v[2]
Obtain the expression for the choice probability of each alternative.
prob_pt = nested(v, None, nests, 0)
prob_car = nested(v, None, nests, 1)
prob_sm = nested(v, None, nests, 2)
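To make the probability expressions above more concrete, here is a minimal pure-Python sketch of the standard nested logit formula with the top-level scale normalized to 1. It is not Biogeme's implementation: the function name `nested_logit_probabilities`, the utility values, and the nest structure with its scale parameters are all hypothetical and chosen only for illustration.

```python
import math


def nested_logit_probabilities(utilities, nests):
    """Return {alternative: probability} for a nested logit model (mu = 1).

    utilities: dict mapping alternative id -> systematic utility V_i.
    nests: list of (mu_m, [alternative ids]) pairs. Each alternative must
           belong to exactly one nest; an alternative that is not nested
           is represented by a singleton nest with mu_m = 1.
    """
    # Inclusive value of each nest: (1 / mu_m) * log(sum_j exp(mu_m * V_j)).
    logsums = []
    for mu_m, members in nests:
        total = sum(math.exp(mu_m * utilities[j]) for j in members)
        logsums.append(math.log(total) / mu_m)

    # Denominator of the nest-level logit over the inclusive values.
    denom = sum(math.exp(ls) for ls in logsums)

    probabilities = {}
    for (mu_m, members), ls in zip(nests, logsums):
        prob_nest = math.exp(ls) / denom
        within = sum(math.exp(mu_m * utilities[j]) for j in members)
        for i in members:
            # P(i) = P(nest) * P(i | nest)
            probabilities[i] = prob_nest * math.exp(mu_m * utilities[i]) / within
    return probabilities


# Hypothetical utilities and nests (not the ones returned by scenario()).
example_utilities = {0: -0.5, 1: 0.2, 2: -1.0}
example_nests = [(1.5, [0, 1]), (1.0, [2])]
example_probs = nested_logit_probabilities(example_utilities, example_nests)
print(example_probs)
```

When every nest parameter equals 1, this sketch reduces to the multinomial logit model, which is a convenient sanity check.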
# Read the estimation results from the file
try:
    results = EstimationResults.from_yaml_file(
        filename='saved_results/b02estimation.yaml'
    )
except FileNotFoundError:
    sys.exit(
        'Run first the script plot_b02estimation.py '
        'in order to generate the '
        'file b02estimation.yaml.'
    )
Read the database
database = read_data()
We now simulate various expressions on the database and store the results in a pandas DataFrame. Each expression is evaluated by a separate call to get_value_c.
start_time = time.time()
simulate_formulas = {
    'weight': get_value_c(
        expression=normalized_weight,
        betas=results.get_beta_values(),
        database=database,
        numerically_safe=False,
        use_jit=True,
    ),
    'Utility PT': get_value_c(
        expression=v_pt,
        betas=results.get_beta_values(),
        database=database,
        numerically_safe=False,
        use_jit=True,
    ),
    'Utility car': get_value_c(
        expression=v_car,
        betas=results.get_beta_values(),
        database=database,
        numerically_safe=False,
        use_jit=True,
    ),
    'Utility SM': get_value_c(
        expression=v_sm,
        betas=results.get_beta_values(),
        database=database,
        numerically_safe=False,
        use_jit=True,
    ),
    'Prob. PT': get_value_c(
        expression=prob_pt,
        betas=results.get_beta_values(),
        database=database,
        numerically_safe=False,
        use_jit=True,
    ),
    'Prob. car': get_value_c(
        expression=prob_car,
        betas=results.get_beta_values(),
        database=database,
        numerically_safe=False,
        use_jit=True,
    ),
    'Prob. SM': get_value_c(
        expression=prob_sm,
        betas=results.get_beta_values(),
        database=database,
        numerically_safe=False,
        use_jit=True,
    ),
}
simulated_values = pd.DataFrame.from_dict(simulate_formulas)
end_time = time.time()
print(f'--- Execution time without Biogeme: {end_time - start_time:.2f} seconds ---')
--- Execution time without Biogeme: 0.82 seconds ---
We now perform the same simulation using Biogeme. The results are identical, but the syntax is simpler and, in this run, execution is slightly faster (0.63 seconds versus 0.82 seconds), because Biogeme reuses calculations performed for one expression when evaluating the others.
A dictionary containing the requested expressions must be provided to Biogeme.
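The benefit of reusing calculations across expressions can be illustrated with a small pure-Python sketch. It is only an analogy under stated assumptions, not a description of Biogeme's internal mechanism: it uses a plain logit formula, hypothetical utilities, and a hypothetical counter (`calls`) to show how many exponentials are evaluated when each probability is computed from scratch versus when the shared terms are computed once.

```python
import math

# Hypothetical counter tracking how many exponentials are evaluated.
calls = {'exp': 0}


def counted_exp(x):
    """Exponential that records each evaluation in the counter."""
    calls['exp'] += 1
    return math.exp(x)


def logit_probability(utilities, chosen):
    """Evaluate one probability from scratch, as independent calls would."""
    exps = [counted_exp(v) for v in utilities]
    return exps[chosen] / sum(exps)


def all_logit_probabilities(utilities):
    """Evaluate every probability while computing the shared terms once."""
    exps = [counted_exp(v) for v in utilities]
    denom = sum(exps)
    return [e / denom for e in exps]


utilities = [-0.5, 0.2, -1.0]  # hypothetical values

# One independent evaluation per alternative: 3 x 3 exponentials.
separate = [logit_probability(utilities, i) for i in range(len(utilities))]
separate_calls = calls['exp']

# Joint evaluation: each exponential is computed only once.
calls['exp'] = 0
joint = all_logit_probabilities(utilities)
joint_calls = calls['exp']

print(separate_calls, joint_calls)  # prints "9 3"
```

Both approaches return the same probabilities; only the amount of repeated work differs.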
simulate = {
'weight': normalized_weight,
'Utility PT': v_pt,
'Utility car': v_car,
'Utility SM': v_sm,
'Prob. PT': prob_pt,
'Prob. car': prob_car,
'Prob. SM': prob_sm,
}
start_time = time.time()
the_biogeme = BIOGEME(database, simulate)
the_betas = results.get_beta_values()
biogeme_simulation = the_biogeme.simulate(the_betas)
end_time = time.time()
print(f'--- Execution time with Biogeme: {end_time - start_time:.2f} seconds ---')
--- Execution time with Biogeme: 0.63 seconds ---
Let’s print both results to show that they are identical.
Without Biogeme
print(simulated_values)
weight Utility PT Utility car ... Prob. PT Prob. car Prob. SM
0 0.893779 -0.234999 -0.156395 ... 0.479434 0.519162 0.001404
1 0.868674 -0.442428 0.195927 ... 0.241032 0.560860 0.198109
2 0.868674 -2.021605 -0.048100 ... 0.119886 0.875047 0.005067
3 0.965766 -2.293552 0.027394 ... 0.051229 0.813507 0.135264
4 0.868674 -1.011018 0.008372 ... 0.259604 0.729621 0.010774
... ... ... ... ... ... ... ...
1894 2.053830 -1.157014 -0.256173 ... 0.288854 0.711102 0.000044
1895 0.868674 -2.145238 -0.412673 ... 0.149688 0.849017 0.001294
1896 0.868674 -0.998306 0.065653 ... 0.205929 0.688197 0.105874
1897 0.965766 -1.145981 0.009538 ... 0.222020 0.742208 0.035772
1898 0.965766 -1.293093 -0.048980 ... 0.211578 0.763163 0.025259
[1899 rows x 7 columns]
With Biogeme
print(biogeme_simulation)
weight Utility PT Utility car ... Prob. PT Prob. car Prob. SM
0 0.893779 -0.234999 -0.156395 ... 0.479434 0.519162 0.001404
1 0.868674 -0.442428 0.195927 ... 0.241032 0.560860 0.198109
2 0.868674 -2.021605 -0.048100 ... 0.119886 0.875047 0.005067
3 0.965766 -2.293552 0.027394 ... 0.051229 0.813507 0.135264
4 0.868674 -1.011018 0.008372 ... 0.259604 0.729621 0.010774
... ... ... ... ... ... ... ...
1894 2.053830 -1.157014 -0.256173 ... 0.288854 0.711102 0.000044
1895 0.868674 -2.145238 -0.412673 ... 0.149688 0.849017 0.001294
1896 0.868674 -0.998306 0.065653 ... 0.205929 0.688197 0.105874
1897 0.965766 -1.145981 0.009538 ... 0.222020 0.742208 0.035772
1898 0.965766 -1.293093 -0.048980 ... 0.211578 0.763163 0.025259
[1899 rows x 7 columns]
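The printed output shows only the first and last rows and a subset of the columns. A programmatic check over every entry could look like the sketch below, which relies on `pandas.testing.assert_frame_equal` with a numerical tolerance. The helper name `frames_match`, the tolerances, and the two small data frames are illustrative assumptions; in the script above, the arguments would be `simulated_values` and `biogeme_simulation`.

```python
import pandas as pd


def frames_match(left, right, rtol=1e-6, atol=1e-8):
    """Return True if both frames have the same labels and close values."""
    try:
        pd.testing.assert_frame_equal(
            left,
            right,
            check_exact=False,  # compare floats up to the tolerances below
            check_dtype=False,  # ignore differences such as float32 vs float64
            rtol=rtol,
            atol=atol,
        )
    except AssertionError:
        return False
    return True


# Toy stand-ins for the two simulation results.
frame_a = pd.DataFrame(
    {'Prob. PT': [0.479434, 0.241032], 'Prob. car': [0.519162, 0.560860]}
)
frame_b = frame_a + 1e-12  # differs only by rounding-level noise
frame_c = frame_a.copy()
frame_c.loc[0, 'Prob. PT'] = 0.5  # a genuinely different value

print(frames_match(frame_a, frame_b))
print(frames_match(frame_a, frame_c))
```

A tolerance-based comparison is used here because two computation paths may differ in the last digits even when they implement the same formula.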
Total running time of the script: (0 minutes 3.567 seconds)