Re-estimation of best models
After running the assisted specification algorithm for the 432 specifications in Combination of many specifications, we use post-processing to re-estimate all Pareto optimal models, and display some information about the algorithm. See Bierlaire and Ortelli (2023).
Michel Bierlaire, EPFL Sun Apr 27 2025, 18:38:57
from IPython.core.display_functions import display
from biogeme.biogeme import BIOGEME
from biogeme.results_processing import get_pandas_estimated_parameters
try:
    import matplotlib.pyplot as plt

    can_plot = True
except ModuleNotFoundError:
    can_plot = False
import biogeme.biogeme_logging as blog
from biogeme.assisted import ParetoPostProcessing
from everything_spec import model_catalog, database
logger = blog.get_screen_logger(level=blog.INFO)
logger.info('Example b08selected_specification')
PARETO_FILE_NAME = 'saved_results/b07everything_assisted.pareto'
Example b08selected_specification
Create the biogeme object from the catalog.
the_biogeme = BIOGEME(database, model_catalog)
the_biogeme.model_name = 'b09post_processing'
Biogeme parameters read from biogeme.toml.
Create the post-processing object.
post_processing = ParetoPostProcessing(
    biogeme_object=the_biogeme, pareto_file_name=PARETO_FILE_NAME
)
Pareto set initialized from file with 162 elements [13 Pareto] and 0 invalid elements.
Re-estimate the models.
all_results = post_processing.reestimate(recycle=True)
Biogeme parameters provided by the user.
Estimation results read from b09post_processing_000000.yaml. There is no guarantee that they correspond to the specified model.
Biogeme parameters provided by the user.
Estimation results read from b09post_processing_000001.yaml. There is no guarantee that they correspond to the specified model.
Biogeme parameters provided by the user.
Estimation results read from b09post_processing_000002.yaml. There is no guarantee that they correspond to the specified model.
Biogeme parameters provided by the user.
Estimation results read from b09post_processing_000003.yaml. There is no guarantee that they correspond to the specified model.
Biogeme parameters provided by the user.
Estimation results read from b09post_processing_000004.yaml. There is no guarantee that they correspond to the specified model.
Biogeme parameters provided by the user.
Estimation results read from b09post_processing_000005.yaml. There is no guarantee that they correspond to the specified model.
Biogeme parameters provided by the user.
Estimation results read from b09post_processing_000006.yaml. There is no guarantee that they correspond to the specified model.
Biogeme parameters provided by the user.
Estimation results read from b09post_processing_000007.yaml. There is no guarantee that they correspond to the specified model.
Biogeme parameters provided by the user.
Estimation results read from b09post_processing_000008.yaml. There is no guarantee that they correspond to the specified model.
Biogeme parameters provided by the user.
Estimation results read from b09post_processing_000009.yaml. There is no guarantee that they correspond to the specified model.
Biogeme parameters provided by the user.
Estimation results read from b09post_processing_000010.yaml. There is no guarantee that they correspond to the specified model.
Biogeme parameters provided by the user.
Estimation results read from b09post_processing_000011.yaml. There is no guarantee that they correspond to the specified model.
Biogeme parameters provided by the user.
Estimation results read from b09post_processing_000012.yaml. There is no guarantee that they correspond to the specified model.
We retrieve the first estimation results for illustration.
spec, results = next(iter(all_results.items()))
print(spec)
asc:GA-LUGGAGE;b_cost_gen_altspec:altspec;b_time:COMMUTERS;b_time_gen_altspec:altspec;model_catalog:nested existing;train_tt_catalog:power
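The identifier printed above appears to be a semicolon-separated list of `catalog:choice` pairs. The following minimal sketch splits such a string into a dictionary. It assumes that this format holds, which is inferred from the printed string only; `parse_spec` is a hypothetical helper, not part of Biogeme.

```python
def parse_spec(spec_id: str) -> dict[str, str]:
    """Split 'name:choice;name:choice' into a dictionary (hypothetical helper)."""
    pairs = (item.split(':', 1) for item in spec_id.split(';') if item)
    return {name: choice for name, choice in pairs}


# Identifier copied from the output above.
spec_id = (
    'asc:GA-LUGGAGE;b_cost_gen_altspec:altspec;b_time:COMMUTERS;'
    'b_time_gen_altspec:altspec;model_catalog:nested existing;'
    'train_tt_catalog:power'
)

choices = parse_spec(spec_id)
for name, choice in choices.items():
    print(f'{name:>20} -> {choice}')
```

Under that assumption, each key names a catalog of the specification and each value names the alternative selected in it.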
print(results.short_summary())
Results for model b09post_processing_000000
Nbr of parameters: 16
Sample size: 10719
Excluded data: 9
Final log likelihood: -8062.586
Akaike Information Criterion: 16157.17
Bayesian Information Criterion: 16273.65
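The two information criteria in the summary follow from the final log likelihood, the number of parameters and the sample size. The sketch below applies the standard definitions, AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L, to the figures reported above. The helper names `aic` and `bic` are illustrative and are not Biogeme functions.

```python
import math


def aic(log_likelihood: float, n_parameters: int) -> float:
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * n_parameters - 2 * log_likelihood


def bic(log_likelihood: float, n_parameters: int, sample_size: int) -> float:
    """Bayesian Information Criterion: k ln(n) - 2 ln L."""
    return n_parameters * math.log(sample_size) - 2 * log_likelihood


# Figures copied from the summary above.
final_log_likelihood = -8062.586
n_parameters = 16
sample_size = 10719

aic_value = aic(final_log_likelihood, n_parameters)
bic_value = bic(final_log_likelihood, n_parameters, sample_size)
print(f'AIC: {aic_value:.2f}')
print(f'BIC: {bic_value:.2f}')
```

Up to rounding of the reported log likelihood, these values agree with the summary.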
estimated_parameters = get_pandas_estimated_parameters(estimation_results=results)
display(estimated_parameters)
Name Value ... Robust t-stat. Robust p-value
0 asc_train_ref -0.855115 ... -8.524096 0.000000e+00
1 asc_train_diff_GA 0.972271 ... 12.327532 0.000000e+00
2 asc_train_diff_one_lugg 0.326023 ... 4.900363 9.565975e-07
3 asc_train_diff_several_lugg 0.239988 ... 1.540490 1.234409e-01
4 b_time_train_ref -1.349547 ... -21.225551 0.000000e+00
5 b_time_train_diff_commuters 0.441164 ... 3.234451 1.218768e-03
6 b_cost -0.635630 ... -13.634236 0.000000e+00
7 mu_existing 1.728921 ... 15.527111 0.000000e+00
8 asc_car_ref -0.497087 ... -5.543355 2.967311e-08
9 asc_car_diff_GA -0.314420 ... -2.495324 1.258422e-02
10 asc_car_diff_one_lugg -0.070352 ... -1.485807 1.373301e-01
11 asc_car_diff_several_lugg -0.355340 ... -1.999446 4.556013e-02
12 b_time_car_ref -1.005688 ... -17.517694 0.000000e+00
13 b_time_car_diff_commuters 0.681442 ... 3.543630 3.946583e-04
14 b_time_swissmetro_ref -1.648466 ... -23.863946 0.000000e+00
15 b_time_swissmetro_diff_commuters 1.639449 ... 8.447529 0.000000e+00
[16 rows x 5 columns]
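The robust p-values in the table are consistent with a two-sided test based on the standard normal distribution, p = 2(1 - Φ(|t|)). The sketch below recomputes them from a few of the robust t-statistics listed above and flags the parameters that are not significant at the 5% level. The function `two_sided_p_value` and the 5% threshold are choices made for this illustration.

```python
import math


def two_sided_p_value(t_stat: float) -> float:
    """Two-sided p-value under a standard normal: 2 * (1 - Phi(|t|))."""
    return math.erfc(abs(t_stat) / math.sqrt(2.0))


# Robust t-statistics copied from the table above (subset).
robust_t_stats = {
    'asc_train_diff_several_lugg': 1.540490,
    'asc_car_diff_GA': -2.495324,
    'asc_car_diff_one_lugg': -1.485807,
    'asc_car_diff_several_lugg': -1.999446,
}

p_values = {name: two_sided_p_value(t) for name, t in robust_t_stats.items()}

# Threshold chosen for illustration only.
threshold = 0.05
not_significant = [name for name, p in p_values.items() if p > threshold]

for name, p in p_values.items():
    print(f'{name:<30} p = {p:.6f}')
print('Not significant at 5%:', not_significant)
```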
The following plot illustrates all models that have been estimated. Each dot corresponds to a model. In the call below, the x-coordinate is the number of parameters (objective 1) and the y-coordinate is the negative log likelihood (objective 0). A circle denotes a Pareto optimal model. A cross denotes a model that was Pareto optimal at some point during the algorithm and was later removed because a new model dominating it was found. A star denotes a model that has been deemed invalid.
if can_plot:
    _ = post_processing.plot(
        label_x='Nbr of parameters',
        label_y='Negative log likelihood',
        objective_x=1,
        objective_y=0,
    )
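The distinction between Pareto optimal and dominated models in the plot can be sketched with a small dominance check. The example assumes two objectives to minimize, named after the axis labels of the plot call (negative log likelihood, number of parameters). The model names and objective values are made up for illustration, and `dominates` is not the Biogeme implementation.

```python
def dominates(a: tuple, b: tuple) -> bool:
    """True if a is no worse than b on every objective and strictly
    better on at least one (all objectives are minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


# Made-up objectives: (negative log likelihood, number of parameters).
models = {
    'A': (8100.0, 12),
    'B': (8062.6, 16),
    'C': (8090.0, 16),  # dominated by B: same size, worse fit
    'D': (8150.0, 12),  # dominated by A: same size, worse fit
    'E': (8060.0, 20),
}

# A model is Pareto optimal if no other model dominates it.
front = sorted(
    name
    for name, objectives in models.items()
    if not any(dominates(other, objectives) for other in models.values())
)
print('Pareto optimal models:', front)
```

In this toy set, models C and D would be drawn as crosses if they had once been Pareto optimal and were later replaced by B and A, respectively.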

Total running time of the script: (0 minutes 1.024 seconds)