Simulation of a choice model

We use an estimated model to perform various simulations.

Michel Bierlaire, EPFL Sat Jun 28 2025, 16:56:26

import sys
import time

import pandas as pd
from scenarios import scenario

from biogeme.biogeme import BIOGEME
from biogeme.data.optima import normalized_weight, read_data
from biogeme.jax_calculator import get_value_c
from biogeme.models import nested
from biogeme.results_processing import EstimationResults

Obtain the specification for the default scenario. The definition of the scenarios is available in "Specification of a nested logit model".

v, nests, _, _ = scenario()

v_pt = v[0]
v_car = v[1]
v_sm = v[2]

Obtain the expression for the choice probability of each alternative.

prob_pt = nested(v, None, nests, 0)
prob_car = nested(v, None, nests, 1)
prob_sm = nested(v, None, nests, 2)
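To make concrete what these probability expressions represent, here is a minimal sketch in plain Python of nested logit choice probabilities. It is not Biogeme's implementation. It assumes that each alternative belongs to exactly one nest and that the top-level scale is normalized to 1. The function name, the utility values, and the nest structure are hypothetical and chosen only for illustration.

```python
import math


def nested_logit_probabilities(utilities, nests):
    """Return the nested logit probability of each alternative.

    utilities: dict mapping alternative id -> utility value.
    nests: list of (mu_m, members) pairs, where mu_m is the nest scale
           parameter and members lists the alternatives in the nest.
    Assumption: every alternative appears in exactly one nest.
    """
    nest_terms = []
    for mu_m, members in nests:
        # Sum of scaled exponentials inside the nest.
        inner = sum(math.exp(mu_m * utilities[j]) for j in members)
        # Exponential of the nest logsum: exp((1 / mu_m) * ln(inner)).
        nest_terms.append((mu_m, members, inner, inner ** (1.0 / mu_m)))

    denominator = sum(term[3] for term in nest_terms)

    probabilities = {}
    for mu_m, members, inner, nest_value in nest_terms:
        prob_nest = nest_value / denominator  # probability of the nest
        for i in members:
            # Probability of the alternative, conditional on its nest.
            prob_in_nest = math.exp(mu_m * utilities[i]) / inner
            probabilities[i] = prob_in_nest * prob_nest
    return probabilities


# Hypothetical utilities for three alternatives and a hypothetical nesting.
example_utilities = {0: -0.2, 1: -0.1, 2: -1.0}
example_nests = [(1.5, [0, 2]), (1.0, [1])]
example_probabilities = nested_logit_probabilities(example_utilities, example_nests)
print(example_probabilities)
```

When every nest parameter equals 1, the formula collapses to the multinomial logit model, which is a convenient sanity check.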

# Read the estimation results from the file
try:
    results = EstimationResults.from_yaml_file(
        filename='saved_results/b02estimation.yaml'
    )
except FileNotFoundError:
    sys.exit(
        'First run the script plot_b02estimation.py '
        'to generate the '
        'file b02estimation.yaml.'
    )

Read the database.

database = read_data()

We now simulate various expressions on the database, and store the results in a pandas DataFrame.

start_time = time.time()
simulate_formulas = {
    'weight': get_value_c(
        expression=normalized_weight,
        betas=results.get_beta_values(),
        database=database,
        numerically_safe=False,
        use_jit=True,
    ),
    'Utility PT': get_value_c(
        expression=v_pt,
        betas=results.get_beta_values(),
        database=database,
        numerically_safe=False,
        use_jit=True,
    ),
    'Utility car': get_value_c(
        expression=v_car,
        betas=results.get_beta_values(),
        database=database,
        numerically_safe=False,
        use_jit=True,
    ),
    'Utility SM': get_value_c(
        expression=v_sm,
        betas=results.get_beta_values(),
        database=database,
        numerically_safe=False,
        use_jit=True,
    ),
    'Prob. PT': get_value_c(
        expression=prob_pt,
        betas=results.get_beta_values(),
        database=database,
        numerically_safe=False,
        use_jit=True,
    ),
    'Prob. car': get_value_c(
        expression=prob_car,
        betas=results.get_beta_values(),
        database=database,
        numerically_safe=False,
        use_jit=True,
    ),
    'Prob. SM': get_value_c(
        expression=prob_sm,
        betas=results.get_beta_values(),
        database=database,
        numerically_safe=False,
        use_jit=True,
    ),
}

simulated_values = pd.DataFrame.from_dict(simulate_formulas)
end_time = time.time()
print(f'--- Execution time without Biogeme:    {end_time - start_time:.2f} seconds ---')
--- Execution time without Biogeme:    0.82 seconds ---
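The start/stop timing pattern used here can be wrapped in a small reusable helper. The sketch below is one possible way to do it, not part of Biogeme: the name `timed` and its `sink` parameter are hypothetical. It uses `time.perf_counter`, a monotonic clock intended for measuring short intervals, instead of `time.time`.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, sink=print):
    """Report the wall-clock time spent inside the `with` block.

    label: text inserted in the report.
    sink: callable receiving the formatted message (hypothetical parameter,
          useful for logging or testing).
    """
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        sink(f'--- Execution time {label}: {elapsed:.2f} seconds ---')


# Toy workload standing in for the simulation calls.
with timed('of a toy workload'):
    total = sum(i * i for i in range(100_000))
```

Because the report is produced in the `finally` clause, the elapsed time is printed even if the timed block raises an exception.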

We now perform the same simulation using the BIOGEME object. The results are identical, but the syntax is simpler and the execution is slightly faster (0.63 seconds versus 0.82 seconds in this run), because Biogeme reuses the calculations performed for one expression when evaluating the others.

A dictionary with the requested expressions must be provided to Biogeme.

simulate = {
    'weight': normalized_weight,
    'Utility PT': v_pt,
    'Utility car': v_car,
    'Utility SM': v_sm,
    'Prob. PT': prob_pt,
    'Prob. car': prob_car,
    'Prob. SM': prob_sm,
}
start_time = time.time()
the_biogeme = BIOGEME(database, simulate)
the_betas = results.get_beta_values()
# Reuse the parameter values retrieved above instead of fetching them again.
biogeme_simulation = the_biogeme.simulate(the_betas)
end_time = time.time()
print(
    f'--- Execution time with Biogeme:       {end_time - start_time:.2f} seconds ---'
)
--- Execution time with Biogeme:       0.63 seconds ---

Let’s print the two results to show that they are identical.

Without Biogeme

print(simulated_values)
        weight  Utility PT  Utility car  ...  Prob. PT  Prob. car  Prob. SM
0     0.893779   -0.234999    -0.156395  ...  0.479434   0.519162  0.001404
1     0.868674   -0.442428     0.195927  ...  0.241032   0.560860  0.198109
2     0.868674   -2.021605    -0.048100  ...  0.119886   0.875047  0.005067
3     0.965766   -2.293552     0.027394  ...  0.051229   0.813507  0.135264
4     0.868674   -1.011018     0.008372  ...  0.259604   0.729621  0.010774
...        ...         ...          ...  ...       ...        ...       ...
1894  2.053830   -1.157014    -0.256173  ...  0.288854   0.711102  0.000044
1895  0.868674   -2.145238    -0.412673  ...  0.149688   0.849017  0.001294
1896  0.868674   -0.998306     0.065653  ...  0.205929   0.688197  0.105874
1897  0.965766   -1.145981     0.009538  ...  0.222020   0.742208  0.035772
1898  0.965766   -1.293093    -0.048980  ...  0.211578   0.763163  0.025259

[1899 rows x 7 columns]

With Biogeme

print(biogeme_simulation)
        weight  Utility PT  Utility car  ...  Prob. PT  Prob. car  Prob. SM
0     0.893779   -0.234999    -0.156395  ...  0.479434   0.519162  0.001404
1     0.868674   -0.442428     0.195927  ...  0.241032   0.560860  0.198109
2     0.868674   -2.021605    -0.048100  ...  0.119886   0.875047  0.005067
3     0.965766   -2.293552     0.027394  ...  0.051229   0.813507  0.135264
4     0.868674   -1.011018     0.008372  ...  0.259604   0.729621  0.010774
...        ...         ...          ...  ...       ...        ...       ...
1894  2.053830   -1.157014    -0.256173  ...  0.288854   0.711102  0.000044
1895  0.868674   -2.145238    -0.412673  ...  0.149688   0.849017  0.001294
1896  0.868674   -0.998306     0.065653  ...  0.205929   0.688197  0.105874
1897  0.965766   -1.145981     0.009538  ...  0.222020   0.742208  0.035772
1898  0.965766   -1.293093    -0.048980  ...  0.211578   0.763163  0.025259

[1899 rows x 7 columns]
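The printed output elides some columns and most rows, so a numerical comparison is a more reliable check than visual inspection. The following sketch uses `pandas.testing.assert_frame_equal` with a tolerance. The helper name `frames_match`, the tolerance values, and the two small data frames are hypothetical stand-ins; in the script above, the comparison would be applied to `simulated_values` and `biogeme_simulation`.

```python
import pandas as pd

# Hypothetical stand-ins for the two simulation results.
without_biogeme = pd.DataFrame(
    {
        'Prob. PT': [0.479434, 0.241032],
        'Prob. car': [0.519162, 0.560860],
    }
)
# Same values, up to tiny floating-point noise.
with_biogeme = without_biogeme + 1.0e-12


def frames_match(left, right, rtol=1.0e-6, atol=1.0e-9):
    """Return True if both frames have the same labels and close values."""
    try:
        pd.testing.assert_frame_equal(
            left, right, check_exact=False, rtol=rtol, atol=atol
        )
    except AssertionError:
        return False
    return True


print(frames_match(without_biogeme, with_biogeme))
```

A tolerance-based comparison is preferable to strict equality here, since two computation paths may differ in the last floating-point digits.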

Total running time of the script: (0 minutes 3.567 seconds)
