Simulation of a choice model
We use an estimated model to perform various simulations.
Michel Bierlaire, EPFL Sat Jun 28 2025, 16:56:26
import sys
import time
import pandas as pd
from biogeme.biogeme import BIOGEME
from biogeme.data.optima import normalized_weight, read_data
from biogeme.jax_calculator import get_value_c
from biogeme.models import nested
from biogeme.results_processing import EstimationResults
from scenarios import scenario
Obtain the specification for the default scenario. The definition of the scenarios is available in Specification of a nested logit model.
v, nests, _, _ = scenario()
v_pt = v[0]
v_car = v[1]
v_sm = v[2]
Obtain the expression for the choice probability of each alternative.
prob_pt = nested(v, None, nests, 0)
prob_car = nested(v, None, nests, 1)
prob_sm = nested(v, None, nests, 2)
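To make concrete what such a probability expression represents, here is a minimal NumPy sketch of nested logit probabilities for non-overlapping nests, with the top-level scale normalized to 1. It is an illustration under stated assumptions, not Biogeme's implementation: the function name `nested_logit_probabilities`, the representation of a nest as a `(mu_m, [alternative ids])` pair, and all numerical values are hypothetical.

```python
import numpy as np


def nested_logit_probabilities(utilities, nests):
    """Nested logit probabilities (illustrative sketch, not Biogeme code).

    utilities: dict mapping alternative id -> array of shape (n_obs,)
    nests: list of (mu_m, [alternative ids]); each alternative belongs
           to exactly one nest, and the top-level scale is fixed to 1.
    """
    # Log-sum of each nest: log(sum_{j in m} exp(mu_m * V_j)),
    # computed with a shift for numerical stability.
    logsums = []
    for mu_m, alternatives in nests:
        scaled = np.stack([mu_m * utilities[a] for a in alternatives])
        shift = scaled.max(axis=0)
        logsums.append(shift + np.log(np.exp(scaled - shift).sum(axis=0)))

    # Probability of each nest: softmax of logsum_m / mu_m across nests.
    nest_values = np.stack(
        [logsum / mu_m for logsum, (mu_m, _) in zip(logsums, nests)]
    )
    nest_values -= nest_values.max(axis=0)
    nest_probs = np.exp(nest_values)
    nest_probs /= nest_probs.sum(axis=0)

    # P(i) = P(i | nest m) * P(nest m)
    probabilities = {}
    for m, (mu_m, alternatives) in enumerate(nests):
        for a in alternatives:
            conditional = np.exp(mu_m * utilities[a] - logsums[m])
            probabilities[a] = conditional * nest_probs[m]
    return probabilities


# Hypothetical utilities for two observations and three alternatives.
utilities = {
    0: np.array([-0.2, -1.0]),
    1: np.array([-0.1, 0.1]),
    2: np.array([-3.0, -0.5]),
}
# Hypothetical nest structure: alternatives 0 and 2 share a nest.
nests = [(1.5, [0, 2]), (1.0, [1])]
probs = nested_logit_probabilities(utilities, nests)
total = probs[0] + probs[1] + probs[2]
print(total)
```

When every nest parameter equals 1, the sketch reduces to the multinomial logit model, which is a convenient sanity check.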
# Read the estimation results from the file
try:
    results = EstimationResults.from_yaml_file(
        filename='saved_results/b02estimation.yaml'
    )
except FileNotFoundError:
    sys.exit(
        'Run first the script b02estimation.py '
        'in order to generate the '
        'file b02estimation.yaml.'
    )
Read the database
database = read_data()
We now simulate various expressions on the database, and store the results in a Pandas dataframe.
start_time = time.time()
simulate_formulas = {
    'weight': get_value_c(
        expression=normalized_weight,
        betas=results.get_beta_values(),
        database=database,
        numerically_safe=False,
        use_jit=True,
    ),
    'Utility PT': get_value_c(
        expression=v_pt,
        betas=results.get_beta_values(),
        database=database,
        numerically_safe=False,
        use_jit=True,
    ),
    'Utility car': get_value_c(
        expression=v_car,
        betas=results.get_beta_values(),
        database=database,
        numerically_safe=False,
        use_jit=True,
    ),
    'Utility SM': get_value_c(
        expression=v_sm,
        betas=results.get_beta_values(),
        database=database,
        numerically_safe=False,
        use_jit=True,
    ),
    'Prob. PT': get_value_c(
        expression=prob_pt,
        betas=results.get_beta_values(),
        database=database,
        numerically_safe=False,
        use_jit=True,
    ),
    'Prob. car': get_value_c(
        expression=prob_car,
        betas=results.get_beta_values(),
        database=database,
        numerically_safe=False,
        use_jit=True,
    ),
    'Prob. SM': get_value_c(
        expression=prob_sm,
        betas=results.get_beta_values(),
        database=database,
        numerically_safe=False,
        use_jit=True,
    ),
}
simulated_values = pd.DataFrame.from_dict(simulate_formulas)
end_time = time.time()
print(
    f'--- Execution time without Biogeme: '
    f'{end_time - start_time:.2f} seconds ---'
)
--- Execution time without Biogeme: 0.83 seconds ---
We now perform the same simulation using Biogeme. The results are identical, but the syntax is simpler and, in this run, the execution is slightly faster (0.66 seconds versus 0.83 seconds). Indeed, Biogeme reuses the calculations performed for one expression when evaluating the other expressions.
A dictionary associating a name with each requested expression must be provided to Biogeme.
simulate = {
    'weight': normalized_weight,
    'Utility PT': v_pt,
    'Utility car': v_car,
    'Utility SM': v_sm,
    'Prob. PT': prob_pt,
    'Prob. car': prob_car,
    'Prob. SM': prob_sm,
}
start_time = time.time()
the_biogeme = BIOGEME(database, simulate)
the_betas = results.get_beta_values()
biogeme_simulation = the_biogeme.simulate(the_betas)
end_time = time.time()
print(
    f'--- Execution time with Biogeme: '
    f'{end_time - start_time:.2f} seconds ---'
)
--- Execution time with Biogeme: 0.66 seconds ---
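A single wall-clock measurement can include one-off costs (for instance, just-in-time compilation may be triggered on a first call when `use_jit=True`), so a difference of a fraction of a second should be read with care. The sketch below shows one way to obtain a steadier measurement: repeat the call and keep the best time, using `time.perf_counter`, which is designed for measuring short durations. The helper name `best_of` and the stand-in workload are hypothetical and not part of Biogeme.

```python
import time


def best_of(func, repeats=5):
    """Call func() `repeats` times.

    Returns the smallest elapsed time in seconds and the last result.
    """
    best = float('inf')
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = func()
        best = min(best, time.perf_counter() - start)
    return best, result


# Hypothetical stand-in workload; replace it with the call to be timed.
elapsed, value = best_of(lambda: sum(i * i for i in range(100_000)))
print(f'--- Best of 5 runs: {elapsed:.4f} seconds ---')
```

In this script, the workload could be, for example, `lambda: the_biogeme.simulate(results.get_beta_values())`.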
Let’s print the two results to show that they are identical.
Without Biogeme
print(simulated_values)
        weight  Utility PT  Utility car  ...  Prob. PT  Prob. car  Prob. SM
0     0.893779   -0.234892    -0.156281  ...  0.479432   0.519163  0.001404
1     0.868674   -0.442323     0.196006  ...  0.241059   0.560879  0.198061
2     0.868674   -2.021461    -0.047998  ...  0.119891   0.875042  0.005068
3     0.965766   -2.293242     0.027489  ...  0.051242   0.813492  0.135266
4     0.868674   -1.010867     0.008470  ...  0.259614   0.729611  0.010775
...        ...         ...          ...  ...       ...        ...       ...
1894  2.053830   -1.156823    -0.256049  ...  0.288868   0.711088  0.000044
1895  0.868674   -2.145009    -0.412530  ...  0.149699   0.849006  0.001294
1896  0.868674   -0.998189     0.065745  ...  0.205937   0.688196  0.105867
1897  0.965766   -1.145827     0.009641  ...  0.222026   0.742196  0.035778
1898  0.965766   -1.292911    -0.048872  ...  0.211589   0.763148  0.025264

[1899 rows x 7 columns]
With Biogeme
print(biogeme_simulation)
        weight  Utility PT  Utility car  ...  Prob. PT  Prob. car  Prob. SM
0     0.893779   -0.234892    -0.156281  ...  0.479432   0.519163  0.001404
1     0.868674   -0.442323     0.196006  ...  0.241059   0.560879  0.198061
2     0.868674   -2.021461    -0.047998  ...  0.119891   0.875042  0.005068
3     0.965766   -2.293242     0.027489  ...  0.051242   0.813492  0.135266
4     0.868674   -1.010867     0.008470  ...  0.259614   0.729611  0.010775
...        ...         ...          ...  ...       ...        ...       ...
1894  2.053830   -1.156823    -0.256049  ...  0.288868   0.711088  0.000044
1895  0.868674   -2.145009    -0.412530  ...  0.149699   0.849006  0.001294
1896  0.868674   -0.998189     0.065745  ...  0.205937   0.688196  0.105867
1897  0.965766   -1.145827     0.009641  ...  0.222026   0.742196  0.035778
1898  0.965766   -1.292911    -0.048872  ...  0.211589   0.763148  0.025264

[1899 rows x 7 columns]
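Comparing printed output by eye only covers the rows and columns that Pandas displays. A programmatic check is more reliable. The sketch below uses `pandas.testing.assert_frame_equal` to compare two dataframes within a tolerance, and also verifies that the three probabilities sum to one for each observation. To keep it self-contained, the first five displayed rows are copied from the output above, and the second dataframe is simply a copy of the first. In the script itself, the two frames to compare would be `simulated_values` and `biogeme_simulation`.

```python
import numpy as np
import pandas as pd

# First five rows, copied from the printed output (visible columns only).
columns = [
    'weight', 'Utility PT', 'Utility car',
    'Prob. PT', 'Prob. car', 'Prob. SM',
]
rows = [
    [0.893779, -0.234892, -0.156281, 0.479432, 0.519163, 0.001404],
    [0.868674, -0.442323, 0.196006, 0.241059, 0.560879, 0.198061],
    [0.868674, -2.021461, -0.047998, 0.119891, 0.875042, 0.005068],
    [0.965766, -2.293242, 0.027489, 0.051242, 0.813492, 0.135266],
    [0.868674, -1.010867, 0.008470, 0.259614, 0.729611, 0.010775],
]
simulated_values = pd.DataFrame(rows, columns=columns)
# Stand-in for the Biogeme result: a copy, for illustration only.
biogeme_simulation = simulated_values.copy()

# Raises an AssertionError describing the first mismatch, if any.
pd.testing.assert_frame_equal(
    simulated_values, biogeme_simulation, check_exact=False, rtol=1e-6
)

# The choice probabilities of one observation must sum to one
# (up to the rounding of the printed values).
prob_columns = ['Prob. PT', 'Prob. car', 'Prob. SM']
row_sums = simulated_values[prob_columns].sum(axis=1)
sums_to_one = np.allclose(row_sums, 1.0, atol=1e-5)
print(f'Probabilities sum to one: {sums_to_one}')
```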
Total running time of the script: (0 minutes 2.229 seconds)