.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/swissmetro/plot_b21process_pareto.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_swissmetro_plot_b21process_pareto.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_swissmetro_plot_b21process_pareto.py:

.. _plot_b21process_pareto:

Re-estimate the Pareto optimal models
=====================================

The assisted specification algorithm generates a file containing the
Pareto optimal specifications. This script re-estimates the
corresponding Pareto optimal models. The catalog of specifications is
defined in :ref:`plot_b21multiple_models_spec`.

:author: Michel Bierlaire, EPFL
:date: Wed Apr 12 17:46:14 2023

.. GENERATED FROM PYTHON SOURCE LINES 15-37

.. code-block:: Python


    import biogeme.biogeme_logging as blog

    try:
        import matplotlib.pyplot as plt

        can_plot = True
    except ModuleNotFoundError:
        can_plot = False
    from biogeme_optimization.exceptions import OptimizationError
    from biogeme.assisted import ParetoPostProcessing
    from biogeme.results import compile_estimation_results
    from plot_b21multiple_models_spec import the_biogeme

    PARETO_FILE_NAME = 'saved_results/b21multiple_models.pareto'

    logger = blog.get_screen_logger(blog.INFO)
    logger.info('Example b21process_pareto.py')

    CSV_FILE = 'b21process_pareto.csv'
    SEP_CSV = ','


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Example b21process_pareto.py 


.. GENERATED FROM PYTHON SOURCE LINES 38-43

The constructor of the Pareto post-processing object takes two arguments:

   - the biogeme object,
   - the name of the file where the algorithm has stored the
     estimated models.

.. GENERATED FROM PYTHON SOURCE LINES 43-48

.. code-block:: Python

    the_pareto_post = ParetoPostProcessing(
        biogeme_object=the_biogeme,
        pareto_file_name=PARETO_FILE_NAME,
    )


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Pareto set initialized from file with 36 elements [8 Pareto] and 0 invalid elements. 


.. GENERATED FROM PYTHON SOURCE LINES 49-51

.. code-block:: Python

    the_pareto_post.log_statistics()


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Pareto: 8  
    Considered: 36  
    Removed: 4 


.. GENERATED FROM PYTHON SOURCE LINES 52-54

Complete re-estimation of the Pareto optimal models, including the
calculation of the statistics.

.. GENERATED FROM PYTHON SOURCE LINES 54-56

.. code-block:: Python

    all_results = the_pareto_post.reestimate(recycle=False)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Biogeme parameters provided by the user. 
    As the model is not too complex, we activate the calculation of second derivatives. If you want to change it, change the name of the algorithm in the TOML file from "automatic" to "simple_bounds" 
    *** Initial values of the parameters are obtained from the file __b21multiple_models_000000.iter 
    Parameter values restored from __b21multiple_models_000000.iter 
    As the model is not too complex, we activate the calculation of second derivatives. If you want to change it, change the name of the algorithm in the TOML file from "automatic" to "simple_bounds" 
    Optimization algorithm: hybrid Newton/BFGS with simple bounds [simple_bounds] 
    ** Optimization: Newton with trust region for simple bounds 
    Iter.         ASC_CAR      ASC_CAR_GA       ASC_TRAIN    ASC_TRAIN_GA          B_COST          B_TIME     lambda_time     Function    Relgrad   Radius      Rho      
        0          -0.062           -0.34           -0.94             1.9            -1.1            -1.7            0.36        5e+03     0.0063       10        1   ++ 
        1          -0.064           -0.31              -1               2            -1.1            -1.7            0.38        5e+03    0.00022    1e+02        1   ++ 
        2          -0.064           -0.31              -1               2            -1.1            -1.7            0.38        5e+03    2.2e-07    1e+02        1   ++ 
    Results saved in file b21multiple_models_000000~00.html 
    Results saved in file b21multiple_models_000000~00.pickle 
    Biogeme parameters provided by the user. 
    As the model is not too complex, we activate the calculation of second derivatives. If you want to change it, change the name of the algorithm in the TOML file from "automatic" to "simple_bounds" 
    *** Initial values of the parameters are obtained from the file __b21multiple_models_000001.iter 
    Parameter values restored from __b21multiple_models_000001.iter 
    As the model is not too complex, we activate the calculation of second derivatives. If you want to change it, change the name of the algorithm in the TOML file from "automatic" to "simple_bounds" 
    Optimization algorithm: hybrid Newton/BFGS with simple bounds [simple_bounds] 
    ** Optimization: Newton with trust region for simple bounds 
    Iter.         ASC_CAR      ASC_CAR_GA    ASC_CAR_male       ASC_TRAIN    ASC_TRAIN_GA  ASC_TRAIN_male          B_COST          B_TIME     lambda_time     Function    Relgrad   Radius      Rho      
        0           -0.42           -0.45            0.41           -0.22               2            -1.1            -1.1            -1.7            0.34      4.9e+03     0.0002       10        1   ++ 
        1           -0.42           -0.45            0.41           -0.22               2            -1.1            -1.1            -1.7            0.34      4.9e+03    1.9e-07       10        1   ++ 
    Results saved in file b21multiple_models_000001~00.html 
    Results saved in file b21multiple_models_000001~00.pickle 
    Biogeme parameters provided by the user. 
    As the model is not too complex, we activate the calculation of second derivatives. If you want to change it, change the name of the algorithm in the TOML file from "automatic" to "simple_bounds" 
    *** Initial values of the parameters are obtained from the file __b21multiple_models_000002.iter 
    Parameter values restored from __b21multiple_models_000002.iter 
    As the model is not too complex, we activate the calculation of second derivatives. If you want to change it, change the name of the algorithm in the TOML file from "automatic" to "simple_bounds" 
    Optimization algorithm: hybrid Newton/BFGS with simple bounds [simple_bounds] 
    ** Optimization: Newton with trust region for simple bounds 
    Results saved in file b21multiple_models_000002~00.html 
    Results saved in file b21multiple_models_000002~00.pickle 
    Biogeme parameters provided by the user. 
    As the model is not too complex, we activate the calculation of second derivatives. If you want to change it, change the name of the algorithm in the TOML file from "automatic" to "simple_bounds" 
    *** Initial values of the parameters are obtained from the file __b21multiple_models_000003.iter 
    Parameter values restored from __b21multiple_models_000003.iter 
    As the model is not too complex, we activate the calculation of second derivatives. If you want to change it, change the name of the algorithm in the TOML file from "automatic" to "simple_bounds" 
    Optimization algorithm: hybrid Newton/BFGS with simple bounds [simple_bounds] 
    ** Optimization: Newton with trust region for simple bounds 
    Iter.         ASC_CAR      ASC_CAR_GA    ASC_CAR_male       ASC_TRAIN    ASC_TRAIN_GA  ASC_TRAIN_male          B_COST       B_COST_GA          B_TIME     lambda_time     Function    Relgrad   Radius      Rho      
        0            -0.4            -0.8            0.39           -0.25             1.9            -1.1              -1            0.89            -1.6            0.39      4.9e+03      0.012       10        1   ++ 
        1           -0.42              -1            0.41           -0.22               2            -1.2            -1.1            0.92            -1.7            0.33      4.9e+03     0.0004    1e+02        1   ++ 
        2           -0.42              -1            0.41           -0.22               2            -1.2            -1.1            0.92            -1.7            0.33      4.9e+03    2.5e-06    1e+02        1   ++ 
    Results saved in file b21multiple_models_000003~00.html 
    Results saved in file b21multiple_models_000003~00.pickle 
    Biogeme parameters provided by the user. 
    As the model is not too complex, we activate the calculation of second derivatives. If you want to change it, change the name of the algorithm in the TOML file from "automatic" to "simple_bounds" 
    *** Initial values of the parameters are obtained from the file __b21multiple_models_000004.iter 
    Parameter values restored from __b21multiple_models_000004.iter 
    As the model is not too complex, we activate the calculation of second derivatives. If you want to change it, change the name of the algorithm in the TOML file from "automatic" to "simple_bounds" 
    Optimization algorithm: hybrid Newton/BFGS with simple bounds [simple_bounds] 
    ** Optimization: Newton with trust region for simple bounds 
    Iter.         ASC_CAR      ASC_CAR_GA    ASC_CAR_male       ASC_TRAIN    ASC_TRAIN_GA  ASC_TRAIN_male          B_COST B_COST_inc-100+ B_COST_inc-50-1 B_COST_inc-unde B_COST_inc-unkn          B_TIME     lambda_time     Function    Relgrad   Radius      Rho      
        0           -0.46           -0.32            0.45           -0.28               2            -1.1            -1.5            0.58             0.2           -0.62            0.79            -1.7            0.33      4.9e+03     0.0021       10        1   ++ 
        1           -0.46           -0.32            0.45           -0.28               2            -1.1            -1.5            0.58             0.2           -0.62            0.79            -1.7            0.33      4.9e+03    3.5e-05       10        1   ++ 
    Results saved in file b21multiple_models_000004~00.html 
    Results saved in file b21multiple_models_000004~00.pickle 
    Biogeme parameters provided by the user. 
    As the model is not too complex, we activate the calculation of second derivatives. If you want to change it, change the name of the algorithm in the TOML file from "automatic" to "simple_bounds" 
    *** Initial values of the parameters are obtained from the file __b21multiple_models_000005.iter 
    Parameter values restored from __b21multiple_models_000005.iter 
    As the model is not too complex, we activate the calculation of second derivatives. If you want to change it, change the name of the algorithm in the TOML file from "automatic" to "simple_bounds" 
    Optimization algorithm: hybrid Newton/BFGS with simple bounds [simple_bounds] 
    ** Optimization: Newton with trust region for simple bounds 
    Iter.         ASC_CAR       ASC_TRAIN          B_COST          B_TIME     Function    Relgrad   Radius      Rho      
        0          -0.072           -0.73           -0.93            -1.2      5.3e+03      0.017        1     0.82    + 
        1           -0.15           -0.71            -1.1            -1.3      5.3e+03     0.0009       10        1   ++ 
        2           -0.15           -0.71            -1.1            -1.3      5.3e+03    3.9e-06       10        1   ++ 
    Results saved in file b21multiple_models_000005~00.html 
    Results saved in file b21multiple_models_000005~00.pickle 
    Biogeme parameters provided by the user. 
    As the model is not too complex, we activate the calculation of second derivatives. If you want to change it, change the name of the algorithm in the TOML file from "automatic" to "simple_bounds" 
    *** Initial values of the parameters are obtained from the file __b21multiple_models_000006.iter 
    Parameter values restored from __b21multiple_models_000006.iter 
    As the model is not too complex, we activate the calculation of second derivatives. If you want to change it, change the name of the algorithm in the TOML file from "automatic" to "simple_bounds" 
    Optimization algorithm: hybrid Newton/BFGS with simple bounds [simple_bounds] 
    ** Optimization: Newton with trust region for simple bounds 
    Iter.         ASC_CAR       ASC_TRAIN          B_COST          B_TIME     lambda_time     Function    Relgrad   Radius      Rho      
        0         -0.0036           -0.37            -1.1            -1.7             0.5      5.3e+03      0.016        1     0.83    + 
        1         -0.0049           -0.48            -1.1            -1.7            0.51      5.3e+03    0.00057       10        1   ++ 
        2         -0.0049           -0.48            -1.1            -1.7            0.51      5.3e+03    8.2e-07       10        1   ++ 
    Results saved in file b21multiple_models_000006~00.html 
    Results saved in file b21multiple_models_000006~00.pickle 
    Biogeme parameters provided by the user. 
    As the model is not too complex, we activate the calculation of second derivatives. If you want to change it, change the name of the algorithm in the TOML file from "automatic" to "simple_bounds" 
    *** Initial values of the parameters are obtained from the file __b21multiple_models_000007.iter 
    Parameter values restored from __b21multiple_models_000007.iter 
    As the model is not too complex, we activate the calculation of second derivatives. If you want to change it, change the name of the algorithm in the TOML file from "automatic" to "simple_bounds" 
    Optimization algorithm: hybrid Newton/BFGS with simple bounds [simple_bounds] 
    ** Optimization: Newton with trust region for simple bounds 
    Iter.         ASC_CAR      ASC_CAR_GA    ASC_CAR_male       ASC_TRAIN    ASC_TRAIN_GA  ASC_TRAIN_male          B_COST          B_TIME     Function    Relgrad   Radius      Rho      
        0           -0.42           -0.45            0.41           -0.22               2            -1.2            -1.1            -1.7      4.9e+03    3.7e-05        1        1      
    Results saved in file b21multiple_models_000007~00.html 
    Results saved in file b21multiple_models_000007~00.pickle 


.. GENERATED FROM PYTHON SOURCE LINES 57-60

.. code-block:: Python

    summary, description = compile_estimation_results(all_results, use_short_names=True)
    print(summary)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

                                       Model_000000  ...     Model_000007
    Number of estimated parameters                7  ...                8
    Sample size                                6768  ...             6768
    Final log likelihood               -4995.755387  ...     -4900.883444
    Akaike Information Criterion       10005.510775  ...      9817.766888
    Bayesian Information Criterion     10053.250501  ...      9872.326575
    ASC_CAR (t-test)                -0.064  (-1.22)  ...  -0.389  (-3.95)
    ASC_CAR_GA (t-test)             -0.313  (-1.59)  ...  -0.415  (-2.02)
    ASC_TRAIN (t-test)               -1.03  (-13.9)  ...  -0.203  (-2.23)
    ASC_TRAIN_GA (t-test)              2.04  (22.8)  ...     2.03  (22.4)
    B_COST (t-test)                   -1.1  (-14.8)  ...   -1.06  (-15.2)
    B_TIME (t-test)                  -1.67  (-21.3)  ...    -1.7  (-21.5)
    lambda_time (t-test)              0.382  (5.18)  ...                 
    ASC_CAR_male (t-test)                            ...    0.377  (3.65)
    ASC_TRAIN_male (t-test)                          ...    -1.2  (-14.1)
    B_COST_GA (t-test)                               ...                 
    B_COST_inc-100+ (t-test)                         ...                 
    B_COST_inc-50-100 (t-test)                       ...                 
    B_COST_inc-under50 (t-test)                      ...                 
    B_COST_inc-unknown (t-test)                      ...                 

    [19 rows x 8 columns]
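
The information criteria reported in the table can be cross-checked
from the final log likelihood, the number of estimated parameters and
the sample size. The sketch below assumes the standard definitions
(AIC = 2K - 2LL and BIC = K ln(N) - 2LL). The helper names ``aic`` and
``bic`` are illustrative and are not part of Biogeme.

.. code-block:: Python

    import math


    def aic(loglike: float, n_params: int) -> float:
        """Akaike Information Criterion (standard definition, assumed here)."""
        return 2 * n_params - 2 * loglike


    def bic(loglike: float, n_params: int, sample_size: int) -> float:
        """Bayesian Information Criterion (standard definition, assumed here)."""
        return n_params * math.log(sample_size) - 2 * loglike


    # Values copied from the Model_000000 column of the summary table.
    loglike, n_params, sample_size = -4995.755387, 7, 6768
    print(f'AIC = {aic(loglike, n_params):.3f}')
    print(f'BIC = {bic(loglike, n_params, sample_size):.3f}')

Both values should agree with the corresponding entries of the table
up to rounding.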


.. GENERATED FROM PYTHON SOURCE LINES 61-64

.. code-block:: Python

    print(f'Summary table available in {CSV_FILE}')
    summary.to_csv(CSV_FILE, sep=SEP_CSV)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Summary table available in b21process_pareto.csv


.. GENERATED FROM PYTHON SOURCE LINES 65-66

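Explanation of the short names of the models. The mapping from each
short name to its full specification is printed and appended to the
CSV file.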

.. GENERATED FROM PYTHON SOURCE LINES 66-73

.. code-block:: Python

    with open(CSV_FILE, 'a', encoding='utf-8') as f:
        print('\n\n', file=f)
        for k, v in description.items():
            if k != v:
                print(f'{k}: {v}')
                print(f'{k}{SEP_CSV}{v}', file=f)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Model_000000: ASC:GA;B_COST:no_seg;TRAIN_TT:boxcox
    Model_000001: ASC:MALE-GA;B_COST:no_seg;TRAIN_TT:boxcox
    Model_000002: ASC:GA;B_COST:no_seg;TRAIN_TT:log
    Model_000003: ASC:MALE-GA;B_COST:GA;TRAIN_TT:boxcox
    Model_000004: ASC:MALE-GA;B_COST:INCOME;TRAIN_TT:boxcox
    Model_000005: ASC:no_seg;B_COST:no_seg;TRAIN_TT:linear
    Model_000006: ASC:no_seg;B_COST:no_seg;TRAIN_TT:boxcox
    Model_000007: ASC:MALE-GA;B_COST:no_seg;TRAIN_TT:log
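
Each description printed above appears to be a list of
``component:choice`` pairs separated by semicolons. Assuming this
format holds, a description can be split into a dictionary for
further processing. The function ``parse_description`` is a
hypothetical helper, not a Biogeme function.

.. code-block:: Python

    def parse_description(description: str) -> dict[str, str]:
        """Split 'A:x;B:y' into {'A': 'x', 'B': 'y'} (assumed format)."""
        pairs = (item.split(':', 1) for item in description.split(';'))
        return {component: choice for component, choice in pairs}


    # Description of Model_000000, copied from the output above.
    spec = parse_description('ASC:GA;B_COST:no_seg;TRAIN_TT:boxcox')
    for component, choice in spec.items():
        print(f'{component} -> {choice}')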


.. GENERATED FROM PYTHON SOURCE LINES 74-82

The following plot shows all models that have been estimated. Each
marker corresponds to a model: the x-coordinate is its negative
log-likelihood and the y-coordinate is its number of parameters.
A circle denotes a Pareto optimal model. A cross denotes a model that
was Pareto optimal at some point during the algorithm and was later
removed because a new model dominating it was found.
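
The notion of dominance used here can be illustrated with a minimal
sketch, assuming that both objectives (negative log-likelihood and
number of parameters) are minimized. This is not Biogeme's
implementation: the names ``Candidate``, ``dominates`` and
``pareto_front`` are illustrative, and the third candidate is a
made-up model added only to show a dominated point.

.. code-block:: Python

    from typing import NamedTuple


    class Candidate(NamedTuple):
        name: str
        neg_loglike: float  # to be minimized
        n_params: int  # to be minimized


    def dominates(a: Candidate, b: Candidate) -> bool:
        """True if a is no worse than b on both objectives and better on one."""
        no_worse = a.neg_loglike <= b.neg_loglike and a.n_params <= b.n_params
        better = a.neg_loglike < b.neg_loglike or a.n_params < b.n_params
        return no_worse and better


    def pareto_front(candidates: list[Candidate]) -> list[Candidate]:
        """Keep the candidates that no other candidate dominates."""
        return [
            c for c in candidates if not any(dominates(o, c) for o in candidates)
        ]


    candidates = [
        # First two entries: values taken from the summary table.
        Candidate('Model_000000', 4995.755387, 7),
        Candidate('Model_000007', 4900.883444, 8),
        # Hypothetical model: worse fit than Model_000007, same size.
        Candidate('hypothetical', 5000.0, 8),
    ]
    for c in pareto_front(candidates):
        print(c.name)

In this sketch, the two models from the summary table are mutually
non-dominated (one fits better, the other has fewer parameters),
whereas the hypothetical model is dominated and would be drawn as a
cross if it had once belonged to the Pareto set.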

.. GENERATED FROM PYTHON SOURCE LINES 82-90

.. code-block:: Python

    if can_plot:
        try:
            _ = the_pareto_post.plot(
                label_x='Negative loglikelihood', label_y='Number of parameters'
            )
            plt.show()
        except OptimizationError as e:
            print(f'No plot available: {e}')


.. image-sg:: /auto_examples/swissmetro/images/sphx_glr_plot_b21process_pareto_001.png
   :alt: plot b21process pareto
   :srcset: /auto_examples/swissmetro/images/sphx_glr_plot_b21process_pareto_001.png
   :class: sphx-glr-single-img


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 1.547 seconds)


.. _sphx_glr_download_auto_examples_swissmetro_plot_b21process_pareto.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_b21process_pareto.ipynb <plot_b21process_pareto.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_b21process_pareto.py <plot_b21process_pareto.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_b21process_pareto.zip <plot_b21process_pareto.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_