21c. Re-estimate the Pareto optimal models

The assisted specification algorithm generates a file containing the Pareto optimal specifications. This script re-estimates the corresponding Pareto optimal models. The catalog of specifications is defined in 21b. Specification of a catalog of models.

Michel Bierlaire, EPFL Sat Jun 28 2025, 20:58:22

import biogeme.biogeme_logging as blog
from biogeme.results_processing import compile_estimation_results

try:
    import matplotlib.pyplot as plt

    can_plot = True
except ModuleNotFoundError:
    can_plot = False
from biogeme_optimization.exceptions import OptimizationError
from biogeme.assisted import ParetoPostProcessing
from plot_b21b_multiple_models_spec import the_biogeme, PARETO_FILE_NAME

PATH_PARETO_FILE_NAME = f'saved_results/{PARETO_FILE_NAME}'

logger = blog.get_screen_logger(blog.INFO)
logger.info('Example b21c_process_pareto.py')

CSV_FILE = 'b21_process_pareto.csv'
SEP_CSV = ','
income_segmentation=INCOME: [{0: 'inc-zero', 1: 'inc-under50', 2: 'inc-50-100', 3: 'inc-100+', 4: 'inc-unknown'}] ref: inc-zero
Example b21c_process_pareto.py

The constructor of the Pareto post processing object takes two arguments:

  • the biogeme object,

  • the name of the file where the algorithm has stored the estimated models.

the_pareto_post = ParetoPostProcessing(
    biogeme_object=the_biogeme,
    pareto_file_name=PATH_PARETO_FILE_NAME,
)
Unable to read file saved_results/b21_multiple_models.pareto. Pareto set empty.
the_pareto_post.log_statistics()
Pareto: 0
Considered: 0
Removed: 0
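
The log above reports that the Pareto file could not be read, so every later step works on an empty set. As a minimal sketch, a script could check for the file before building the post-processing object. The helper name `pareto_file_is_readable` is hypothetical and not part of Biogeme; it relies only on the standard library.

```python
from pathlib import Path


def pareto_file_is_readable(file_name: str) -> bool:
    """Hypothetical helper: True if the file exists and is not empty."""
    path = Path(file_name)
    return path.is_file() and path.stat().st_size > 0


# Same path as PATH_PARETO_FILE_NAME in the script above.
if not pareto_file_is_readable('saved_results/b21_multiple_models.pareto'):
    print('Pareto file not found: run the assisted specification algorithm first.')
```

With such a guard, an empty summary table and a missing plot can be traced back to the missing file right away.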

Complete re-estimation of the best models, including the calculation of the statistics.

all_results = the_pareto_post.reestimate(recycle=False)
summary, description = compile_estimation_results(all_results, use_short_names=True)
print(summary)
Empty DataFrame
Columns: []
Index: []
print(f'Summary table available in {CSV_FILE}')
summary.to_csv(CSV_FILE, sep=SEP_CSV)
Summary table available in b21_process_pareto.csv

Explanation of the short names of the models.

with open(CSV_FILE, 'a', encoding='utf-8') as f:
    print('\n\n', file=f)
    for k, v in description.items():
        if k != v:
            print(f'{k}: {v}')
            print(f'{k}{SEP_CSV}{v}', file=f)
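
Model descriptions may themselves contain the separator (the specification logged at the start of this example contains commas), in which case rows written with a plain f-string are split into extra columns when the file is read back. The following sketch, offered as one possible alternative rather than as the script's method, uses the standard `csv` module so that such fields are quoted. The `description` dictionary here is made-up illustrative data, and an in-memory buffer stands in for the CSV file.

```python
import csv
import io

# Made-up example data: the long name contains a comma.
description = {'Model_000000': 'asc:no_seg, train_cost:linear'}

buffer = io.StringIO()  # stands in for the opened CSV file
writer = csv.writer(buffer, delimiter=',')
for short_name, full_name in description.items():
    if short_name != full_name:
        # csv.writer quotes any field that contains the delimiter
        writer.writerow([short_name, full_name])

csv_text = buffer.getvalue()
```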

The following plot illustrates all models that have been estimated. Each dot corresponds to a model. The x-coordinate is the negative log-likelihood, and the y-coordinate is the number of parameters. A circle denotes a Pareto optimal model. A cross denotes a model that was Pareto optimal at some point during the algorithm and was later removed because a new model dominating it was found.
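
To make the notion of dominance concrete, here is a small sketch in plain Python, independent of Biogeme. Each model is represented as a pair (negative log-likelihood, number of parameters), both to be minimized. The function names and the numerical values are illustrative assumptions, not output of the algorithm.

```python
def dominates(a, b):
    """True if a is no worse than b on both objectives and differs from b."""
    return a[0] <= b[0] and a[1] <= b[1] and a != b


def pareto_front(models):
    """Keep the models that are not dominated by any other model."""
    return [m for m in models if not any(dominates(other, m) for other in models)]


# Made-up values: (negative log-likelihood, number of parameters)
models = [(5300.0, 4), (5250.0, 6), (5310.0, 5), (5250.0, 8), (5200.0, 9)]
front = pareto_front(models)
print(front)
```

In this example, (5310.0, 5) is dominated by (5300.0, 4), and (5250.0, 8) is dominated by (5250.0, 6); on the plot, such models would be drawn as crosses if they had been in the Pareto set before being removed.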

if can_plot:
    try:
        _ = the_pareto_post.plot(
            label_x='Negative loglikelihood', label_y='Number of parameters'
        )
        plt.show()
    except OptimizationError as e:
        print(f'No plot available: {e}')
No plot available: Cannot plot an empty Pareto set

Total running time of the script: (0 minutes 0.081 seconds)
