.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/programmers/plot_biogeme.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end `
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_programmers_plot_biogeme.py:


biogeme.biogeme
===============

Examples of use of several functions.

This is designed for programmers who need examples of use of the
functions of the module. The examples are designed to illustrate the
syntax. They do not correspond to any meaningful model.

:author: Michel Bierlaire
:date: Thu Nov 16 18:36:35 2023

.. GENERATED FROM PYTHON SOURCE LINES 15-23

.. code-block:: default


    import biogeme.version as ver
    import biogeme.biogeme as bio
    import biogeme.database as db
    import pandas as pd
    from biogeme.expressions import Beta, Variable, exp
    import biogeme.biogeme_logging as blog

.. GENERATED FROM PYTHON SOURCE LINES 24-25

Version of Biogeme.

.. GENERATED FROM PYTHON SOURCE LINES 25-28

.. code-block:: default

    print(ver.getText())

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    biogeme 3.2.13 [2023-12-23]
    Home page: http://biogeme.epfl.ch
    Submit questions to https://groups.google.com/d/forum/biogeme
    Michel Bierlaire, Transport and Mobility Laboratory, Ecole Polytechnique Fédérale de Lausanne (EPFL)

.. GENERATED FROM PYTHON SOURCE LINES 29-30

Logger.

.. GENERATED FROM PYTHON SOURCE LINES 30-34

.. code-block:: default

    logger = blog.get_screen_logger(level=blog.INFO)
    logger.info('Logger initalized')

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Logger initalized

.. GENERATED FROM PYTHON SOURCE LINES 35-36

Definition of a database

.. GENERATED FROM PYTHON SOURCE LINES 36-50

.. code-block:: default

    df = pd.DataFrame(
        {
            'Person': [1, 1, 1, 2, 2],
            'Exclude': [0, 0, 1, 0, 1],
            'Variable1': [1, 2, 3, 4, 5],
            'Variable2': [10, 20, 30, 40, 50],
            'Choice': [1, 2, 3, 1, 2],
            'Av1': [0, 1, 1, 1, 1],
            'Av2': [1, 1, 1, 1, 1],
            'Av3': [0, 1, 1, 1, 1],
        }
    )
    myData = db.Database('test', df)

.. GENERATED FROM PYTHON SOURCE LINES 51-52

Data

.. GENERATED FROM PYTHON SOURCE LINES 52-55

.. code-block:: default

    myData.data

.. raw:: html
Person Exclude Variable1 Variable2 Choice Av1 Av2 Av3
0 1 0 1 10 1 0 1 0
1 1 0 2 20 2 1 1 1
2 1 1 3 30 3 1 1 1
3 2 0 4 40 1 1 1 1
4 2 1 5 50 2 1 1 1


.. GENERATED FROM PYTHON SOURCE LINES 56-57

Definition of various expressions.

.. GENERATED FROM PYTHON SOURCE LINES 57-65

.. code-block:: default

    Variable1 = Variable('Variable1')
    Variable2 = Variable('Variable2')
    beta1 = Beta('beta1', -1.0, -3, 3, 0)
    beta2 = Beta('beta2', 2.0, -3, 10, 0)
    likelihood = -(beta1**2) * Variable1 - exp(beta2 * beta1) * Variable2 - beta2**4
    simul = beta1 / Variable1 + beta2 / Variable2
    dictOfExpressions = {'loglike': likelihood, 'beta1': beta1, 'simul': simul}

.. GENERATED FROM PYTHON SOURCE LINES 66-67

Creation of the BIOGEME object.

.. GENERATED FROM PYTHON SOURCE LINES 67-71

.. code-block:: default

    myBiogeme = bio.BIOGEME(myData, dictOfExpressions)
    myBiogeme.modelName = 'simple_example'
    print(myBiogeme)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    File biogeme.toml has been created
    simple_example: database [test]{'loglike': ((((-(Beta('beta1', -1.0, -3, 3, 0) ** `2.0`)) * Variable1) - (exp((Beta('beta2', 2.0, -3, 10, 0) * Beta('beta1', -1.0, -3, 3, 0))) * Variable2)) - (Beta('beta2', 2.0, -3, 10, 0) ** `4.0`)), 'beta1': Beta('beta1', -1.0, -3, 3, 0), 'simul': ((Beta('beta1', -1.0, -3, 3, 0) / Variable1) + (Beta('beta2', 2.0, -3, 10, 0) / Variable2))}

.. GENERATED FROM PYTHON SOURCE LINES 72-73

The data is stored in the Biogeme object.

.. GENERATED FROM PYTHON SOURCE LINES 73-75

.. code-block:: default

    myBiogeme.database.data

.. raw:: html
Person Exclude Variable1 Variable2 Choice Av1 Av2 Av3
0 1 0 1 10 1 0 1 0
1 1 0 2 20 2 1 1 1
2 1 1 3 30 3 1 1 1
3 2 0 4 40 1 1 1 1
4 2 1 5 50 2 1 1 1


.. GENERATED FROM PYTHON SOURCE LINES 76-77

Log likelihood with the initial values of the parameters.

.. GENERATED FROM PYTHON SOURCE LINES 77-79

.. code-block:: default

    myBiogeme.calculateInitLikelihood()

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    -115.30029248549191

.. GENERATED FROM PYTHON SOURCE LINES 80-82

Calculate the log likelihood with different values of the parameters.
We retrieve the current values and add 1 to each of them.

.. GENERATED FROM PYTHON SOURCE LINES 82-86

.. code-block:: default

    x = myBiogeme.id_manager.free_betas_values
    xplus = [v + 1 for v in x]
    print(xplus)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [0.0, 3.0]

.. GENERATED FROM PYTHON SOURCE LINES 87-89

.. code-block:: default

    myBiogeme.calculateLikelihood(xplus, scaled=True)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    -111.0

.. GENERATED FROM PYTHON SOURCE LINES 90-91

Calculate the log-likelihood function and its derivatives.

.. GENERATED FROM PYTHON SOURCE LINES 91-94

.. code-block:: default

    f, g, h, bhhh = myBiogeme.calculateLikelihoodAndDerivatives(
        xplus, scaled=True, hessian=True, bhhh=True
    )

.. GENERATED FROM PYTHON SOURCE LINES 95-97

.. code-block:: default

    print(f'f = {f}')

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    f = -111.0

.. GENERATED FROM PYTHON SOURCE LINES 98-100

.. code-block:: default

    print(f'g = {g}')

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    g = [ -90. -108.]

.. GENERATED FROM PYTHON SOURCE LINES 101-103

.. code-block:: default

    pd.DataFrame(h)

.. raw:: html
0 1
0 -270.0 -30.0
1 -30.0 -108.0


.. GENERATED FROM PYTHON SOURCE LINES 104-106

.. code-block:: default

    pd.DataFrame(bhhh)

.. raw:: html
0 1
0 9900.0 9720.0
1 9720.0 11664.0


.. GENERATED FROM PYTHON SOURCE LINES 107-108

Now the unscaled version.

.. GENERATED FROM PYTHON SOURCE LINES 108-111

.. code-block:: default

    f, g, h, bhhh = myBiogeme.calculateLikelihoodAndDerivatives(
        xplus, scaled=False, hessian=True, bhhh=True
    )

.. GENERATED FROM PYTHON SOURCE LINES 112-114

.. code-block:: default

    print(f'f = {f}')

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    f = -555.0

.. GENERATED FROM PYTHON SOURCE LINES 115-117

.. code-block:: default

    print(f'g = {g}')

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    g = [-450. -540.]

.. GENERATED FROM PYTHON SOURCE LINES 118-120

.. code-block:: default

    pd.DataFrame(h)

.. raw:: html
0 1
0 -1350.0 -150.0
1 -150.0 -540.0


.. GENERATED FROM PYTHON SOURCE LINES 121-124

.. code-block:: default

    pd.DataFrame(bhhh)

.. raw:: html
0 1
0 49500.0 48600.0
1 48600.0 58320.0
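
The scaled and unscaled values above differ by a factor of 5, the number of observations in the example database. The following NumPy sketch does not use Biogeme: it recomputes the unscaled log likelihood, gradient, and BHHH matrix at ``xplus = [0.0, 3.0]`` from hand-derived formulas for the example expression. The helper names ``loglike_terms`` and ``gradient_terms`` are introduced here for illustration, and dividing by the sample size is an assumption about what ``scaled=True`` does; it happens to reproduce the reported numbers.

```python
import numpy as np

# Columns of the example database that appear in the 'loglike' expression.
variable1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
variable2 = np.array([10.0, 20.0, 30.0, 40.0, 50.0])


def loglike_terms(beta1, beta2):
    """Contribution of each observation to the example expression."""
    return -(beta1**2) * variable1 - np.exp(beta2 * beta1) * variable2 - beta2**4


def gradient_terms(beta1, beta2):
    """Hand-derived gradient of each contribution, one row per observation."""
    e = np.exp(beta2 * beta1)
    d_beta1 = -2.0 * beta1 * variable1 - beta2 * e * variable2
    d_beta2 = -beta1 * e * variable2 - 4.0 * beta2**3
    return np.column_stack([d_beta1, d_beta2])


beta1, beta2 = 0.0, 3.0  # the point called xplus in the example
sample_size = len(variable1)

f_unscaled = loglike_terms(beta1, beta2).sum()
scores = gradient_terms(beta1, beta2)
g_unscaled = scores.sum(axis=0)
# BHHH: sum over observations of the outer product of each gradient row.
bhhh_unscaled = scores.T @ scores

# Assumption: the scaled quantities are the unscaled ones divided by the sample size.
f_scaled = f_unscaled / sample_size
g_scaled = g_unscaled / sample_size
bhhh_scaled = bhhh_unscaled / sample_size

print(f'f: unscaled = {f_unscaled}, scaled = {f_scaled}')
print(f'g: unscaled = {g_unscaled}, scaled = {g_scaled}')
print(bhhh_unscaled)
print(bhhh_scaled)
```

Because the BHHH matrix is a sum of outer products of per-observation gradients, it scales by the same factor as the log likelihood and the gradient.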


.. GENERATED FROM PYTHON SOURCE LINES 125-126

Calculate the Hessian of the log-likelihood function using finite differences.

.. GENERATED FROM PYTHON SOURCE LINES 126-129

.. code-block:: default

    fin_diff_hessian = myBiogeme.likelihoodFiniteDifferenceHessian(xplus)
    pd.DataFrame(fin_diff_hessian)

.. raw:: html
0 1
0 -1380.000202 -150.000000
1 -150.000045 -540.000054


.. GENERATED FROM PYTHON SOURCE LINES 130-133

Numerically check the implementation of the derivatives. The analytical
derivatives are compared to the numerical derivatives obtained by
finite differences.

.. GENERATED FROM PYTHON SOURCE LINES 133-135

.. code-block:: default

    f, g, h, gdiff, hdiff = myBiogeme.checkDerivatives(xplus, verbose=True)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    x         Gradient        FinDiff         Difference
    beta1     -4.500000E+02   -4.500001E+02   +6.934970E-05
    beta2     -5.400000E+02   -5.400001E+02   +8.087011E-05
    Row       Col       Hessian         FinDiff         Difference
    beta1     beta1     -1.350000E+03   -1.380000E+03   +3.000020E+01
    beta1     beta2     -1.500000E+02   -1.500000E+02   +2.425509E-10
    beta2     beta1     -1.500000E+02   -1.500000E+02   +4.509602E-05
    beta2     beta2     -5.400000E+02   -5.400001E+02   +5.396423E-05

.. GENERATED FROM PYTHON SOURCE LINES 136-137

.. code-block:: default

    print(f'f = {f}')

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    f = -555.0

.. GENERATED FROM PYTHON SOURCE LINES 138-139

.. code-block:: default

    print(f'g = {g}')

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    g = [-450. -540.]

.. GENERATED FROM PYTHON SOURCE LINES 140-141

.. code-block:: default

    pd.DataFrame(h)

.. raw:: html
0 1
0 -1350.0 -150.0
1 -150.0 -540.0


.. GENERATED FROM PYTHON SOURCE LINES 142-144

.. code-block:: default

    pd.DataFrame(gdiff)
    # print(f'gdiff = {gdiff}')

.. raw:: html
0
0 0.000069
1 0.000081


.. GENERATED FROM PYTHON SOURCE LINES 145-148

.. code-block:: default

    pd.DataFrame(hdiff)
    # print(f'hdiff = {hdiff}')

.. raw:: html
0 1
0 30.000202 2.425509e-10
1 0.000045 5.396423e-05
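
As a rough illustration of this kind of check, the sketch below compares a hand-derived Hessian of the example expression with a central finite-difference approximation built from the hand-derived gradient. It uses only NumPy; the function names and the step size are choices made here and are not Biogeme's implementation. With these formulas, the (``beta1``, ``beta1``) entry at ``xplus`` evaluates to -1380, the value in the ``FinDiff`` column, which suggests that the difference of about 30 reported above is not finite-difference noise.

```python
import numpy as np

# Columns of the example database that appear in the 'loglike' expression.
variable1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
variable2 = np.array([10.0, 20.0, 30.0, 40.0, 50.0])


def gradient(beta):
    """Hand-derived gradient of the summed example expression."""
    b1, b2 = beta
    e = np.exp(b1 * b2)
    return np.array(
        [
            np.sum(-2.0 * b1 * variable1 - b2 * e * variable2),
            np.sum(-b1 * e * variable2 - 4.0 * b2**3),
        ]
    )


def hessian(beta):
    """Hand-derived Hessian of the summed example expression."""
    b1, b2 = beta
    e = np.exp(b1 * b2)
    h11 = np.sum(-2.0 * variable1 - b2**2 * e * variable2)
    h12 = np.sum(-(1.0 + b1 * b2) * e * variable2)
    h22 = np.sum(-(b1**2) * e * variable2 - 12.0 * b2**2)
    return np.array([[h11, h12], [h12, h22]])


def finite_difference_hessian(beta, step=1.0e-6):
    """Central differences of the gradient, one column per parameter."""
    beta = np.asarray(beta, dtype=float)
    size = beta.size
    result = np.zeros((size, size))
    for j in range(size):
        shift = np.zeros(size)
        shift[j] = step
        result[:, j] = (gradient(beta + shift) - gradient(beta - shift)) / (2.0 * step)
    return result


xplus = [0.0, 3.0]
analytical = hessian(xplus)
numerical = finite_difference_hessian(xplus)
print(analytical)
print(numerical)
print(np.abs(analytical - numerical).max())
```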


.. GENERATED FROM PYTHON SOURCE LINES 149-151

Estimation
----------

.. GENERATED FROM PYTHON SOURCE LINES 153-154

Estimation of the parameters, with bootstrapping

.. GENERATED FROM PYTHON SOURCE LINES 154-157

.. code-block:: default

    myBiogeme.bootstrap_samples = 10
    results = myBiogeme.estimate(run_bootstrap=True)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    *** Initial values of the parameters are obtained from the file __simple_example.iter
    Parameter values restored from __simple_example.iter
    Optimization algorithm: hybrid Newton/BFGS with simple bounds [simple_bounds]
    ** Optimization: Newton with trust region for simple bounds
    Iter.   beta1   beta2   Function   Relgrad   Radius   Rho
        0   -0.23     2.1    1.8e+02     0.019       10   1.2   ++
        1    -0.6     1.5         93     0.013    1e+02   1.2   ++
        2      -1     1.3         70      0.01    1e+03   1.2   ++
        3    -1.2     1.3         67    0.0039    1e+04   1.1   ++
        4    -1.3     1.2         67   6.8e-05    1e+05     1   ++
        5    -1.3     1.2         67   1.2e-08    1e+05     1   ++
    Re-estimate the model 10 times for bootstrapping
    0%| | 0/10 [00:00
Value Rob. Std err Rob. t-test Rob. p-value
beta1 -1.273264 0.013724 -92.776769 0.0
beta2 1.248769 0.059086 21.134795 0.0


.. GENERATED FROM PYTHON SOURCE LINES 161-164

If the model has already been estimated, it is possible to recycle the
estimation results. In that case, the other arguments are ignored,
and the results are whatever is in the file.

.. GENERATED FROM PYTHON SOURCE LINES 166-168

.. code-block:: default

    recycled_results = myBiogeme.estimate(recycle=True, run_bootstrap=True)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Estimation results read from simple_example.pickle. There is no guarantee that they correspond to the specified model.

.. GENERATED FROM PYTHON SOURCE LINES 169-171

.. code-block:: default

    print(recycled_results.short_summary())

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Results for model simple_example
    Nbr of parameters:              2
    Sample size:                    5
    Excluded data:                  0
    Final log likelihood:           -67.06549
    Akaike Information Criterion:   138.131
    Bayesian Information Criterion: 137.3499

.. GENERATED FROM PYTHON SOURCE LINES 172-174

.. code-block:: default

    recycled_results.getEstimatedParameters()

.. raw:: html
Value Rob. Std err Rob. t-test Rob. p-value
beta1 -1.273264 0.013724 -92.776769 0.0
beta2 1.248769 0.059086 21.134795 0.0
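
The information criteria in the short summary printed earlier in this section are consistent with the usual definitions, AIC = 2K - 2LL and BIC = K ln(N) - 2LL, where K is the number of parameters, N the sample size, and LL the final log likelihood. The sketch below recomputes them from the values copied from that summary; the variable names are chosen here for illustration.

```python
import math

# Values copied from the short summary of the estimation results.
final_loglike = -67.06549
n_parameters = 2
sample_size = 5

# Usual definitions of the two criteria (lower is better).
aic = 2 * n_parameters - 2 * final_loglike
bic = n_parameters * math.log(sample_size) - 2 * final_loglike

print(f'AIC = {aic:.3f}')
print(f'BIC = {bic:.4f}')
```

With only five observations, ln(N) is smaller than 2, which is why the BIC is lower than the AIC in this toy example.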


.. GENERATED FROM PYTHON SOURCE LINES 175-177

Simulation
----------

.. GENERATED FROM PYTHON SOURCE LINES 179-180

Simulate with the initial values for the parameters.

.. GENERATED FROM PYTHON SOURCE LINES 180-183

.. code-block:: default

    simulation_with_default_betas = myBiogeme.simulate(myBiogeme.loglike.get_beta_values())
    simulation_with_default_betas

.. raw:: html
loglike beta1 simul
0 -101.0 0.0 0.15
1 -131.0 0.0 0.06
2 -131.0 0.0 0.06
3 -131.0 0.0 0.06
4 -101.0 0.0 0.15


.. GENERATED FROM PYTHON SOURCE LINES 184-185

Simulate with the estimated values for the parameters.

.. GENERATED FROM PYTHON SOURCE LINES 187-189

.. code-block:: default

    print(results.getBetaValues())

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    {'beta1': -1.273263915009374, 'beta2': 1.248768825523196}

.. GENERATED FROM PYTHON SOURCE LINES 190-193

.. code-block:: default

    simulation_with_estimated_betas = myBiogeme.simulate(results.getBetaValues())
    simulation_with_estimated_betas

.. raw:: html
loglike beta1 simul
0 -9.752666 -1.273264 -0.574194
1 -20.733962 -1.273264 -0.229677
2 -20.733962 -1.273264 -0.229677
3 -20.733962 -1.273264 -0.229677
4 -9.752666 -1.273264 -0.574194


.. GENERATED FROM PYTHON SOURCE LINES 194-196

Confidence intervals.
First, we extract the values of betas from the bootstrapping draws.

.. GENERATED FROM PYTHON SOURCE LINES 196-202

.. code-block:: default

    draws_from_betas = results.getBetasForSensitivityAnalysis(
        myBiogeme.id_manager.free_betas.names
    )
    for draw in draws_from_betas:
        print(draw)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    {'beta1': -1.304007541668053, 'beta2': 1.1122455742294828}
    {'beta1': -1.264979774201378, 'beta2': 1.2842631765105155}
    {'beta1': -1.292557821467689, 'beta2': 1.1643222175104226}
    {'beta1': -1.2873325336195227, 'beta2': 1.1875198317201634}
    {'beta1': -1.2690260405244749, 'beta2': 1.26696688383924}
    {'beta1': -1.292557821467689, 'beta2': 1.1643222175104226}
    {'beta1': -1.273263915009374, 'beta2': 1.248768825523196}
    {'beta1': -1.2690260405244749, 'beta2': 1.26696688383924}
    {'beta1': -1.264979774201378, 'beta2': 1.2842631765105155}
    {'beta1': -1.2573978799517176, 'beta2': 1.3165120810083317}

.. GENERATED FROM PYTHON SOURCE LINES 203-205

Then, we calculate the confidence intervals. The default interval size
is 0.9. Here, we use a different one.

.. GENERATED FROM PYTHON SOURCE LINES 205-208

.. code-block:: default

    left, right = myBiogeme.confidenceIntervals(draws_from_betas, interval_size=0.95)
    left

.. raw:: html
loglike beta1 simul
0 -9.958158 -1.301431 -0.594518
1 -21.652272 -1.301431 -0.237807
2 -21.652272 -1.301431 -0.237807
3 -21.652272 -1.301431 -0.237807
4 -9.958158 -1.301431 -0.594518


.. GENERATED FROM PYTHON SOURCE LINES 209-211

.. code-block:: default

    right

.. raw:: html
loglike beta1 simul
0 -9.619739 -1.259104 -0.564089
1 -20.485149 -1.259104 -0.225636
2 -20.485149 -1.259104 -0.225636
3 -20.485149 -1.259104 -0.225636
4 -9.619739 -1.259104 -0.564089
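
The ``beta1`` column of ``left`` and ``right`` can be reproduced from the bootstrap draws printed earlier, assuming that the bounds are empirical quantiles of the simulated values with linear interpolation (NumPy's default). This is a sketch of the idea, not the Biogeme implementation; the draws are copied from the output above and the variable names are chosen here.

```python
import numpy as np

# Values of beta1 in the ten bootstrap draws printed above.
beta1_draws = np.array(
    [
        -1.304007541668053,
        -1.264979774201378,
        -1.292557821467689,
        -1.2873325336195227,
        -1.2690260405244749,
        -1.292557821467689,
        -1.273263915009374,
        -1.2690260405244749,
        -1.264979774201378,
        -1.2573978799517176,
    ]
)

interval_size = 0.95
# Probability mass left out in each tail of the interval.
tail = (1.0 - interval_size) / 2.0

# Empirical quantiles with NumPy's default linear interpolation.
left_bound, right_bound = np.quantile(beta1_draws, [tail, 1.0 - tail])
print(f'left = {left_bound:.6f}, right = {right_bound:.6f}')
```

Since ``beta1`` does not depend on the data, its bounds are the same in every row of the two tables, whereas the bounds for ``loglike`` and ``simul`` vary with the observation.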


.. GENERATED FROM PYTHON SOURCE LINES 212-222

Validation
----------
Validation consists of splitting the data into several randomly
defined slices of about the same size. Each slice is considered as a
validation dataset. The model is then re-estimated using all the data
except the slice, and the estimated model is applied to the validation
set (i.e., the slice). The value of the log likelihood for each
observation in the validation set is reported in a dataframe. As this
is done for each slice, the output is a list of dataframes, each
corresponding to one of these exercises.

.. GENERATED FROM PYTHON SOURCE LINES 224-227

.. code-block:: default

    validationData = myData.split(slices=5)
    validation_results = myBiogeme.validate(results, validationData)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    /Users/bierlair/venv312/lib/python3.12/site-packages/numpy/core/fromnumeric.py:59: FutureWarning: 'DataFrame.swapaxes' is deprecated and will be removed in a future version. Please use 'DataFrame.transpose' instead.
      return bound(*args, **kwds)
    File biogeme.toml has been parsed.
    *** Initial values of the parameters are obtained from the file __simple_example_val_est_1.iter
    Cannot read file __simple_example_val_est_1.iter. Statement is ignored.
    Optimization algorithm: hybrid Newton/BFGS with simple bounds [simple_bounds]
    ** Optimization: Newton with trust region for simple bounds
    Iter.   beta1   beta2   Function   Relgrad   Radius   Rho
        0    -1.3     1.2         50   0.00048       10     1   ++
        1    -1.3     1.2         50   2.6e-08       10     1   ++
    Results saved in file simple_example_val_est_1.html
    Results saved in file simple_example_val_est_1.pickle
    File biogeme.toml has been parsed.
    File biogeme.toml has been parsed.
    *** Initial values of the parameters are obtained from the file __simple_example_val_est_2.iter
    Cannot read file __simple_example_val_est_2.iter. Statement is ignored.
    Optimization algorithm: hybrid Newton/BFGS with simple bounds [simple_bounds]
    ** Optimization: Newton with trust region for simple bounds
    Results saved in file simple_example_val_est_2.html
    Results saved in file simple_example_val_est_2.pickle
    File biogeme.toml has been parsed.
    File biogeme.toml has been parsed.
    *** Initial values of the parameters are obtained from the file __simple_example_val_est_3.iter
    Cannot read file __simple_example_val_est_3.iter. Statement is ignored.
    Optimization algorithm: hybrid Newton/BFGS with simple bounds [simple_bounds]
    ** Optimization: Newton with trust region for simple bounds
    Iter.   beta1   beta2   Function   Relgrad   Radius   Rho
        0    -1.3     1.3         57   0.00035       10     1   ++
        1    -1.3     1.3         57   1.3e-08       10     1   ++
    Results saved in file simple_example_val_est_3.html
    Results saved in file simple_example_val_est_3.pickle
    File biogeme.toml has been parsed.
    File biogeme.toml has been parsed.
    *** Initial values of the parameters are obtained from the file __simple_example_val_est_4.iter
    Cannot read file __simple_example_val_est_4.iter. Statement is ignored.
    Optimization algorithm: hybrid Newton/BFGS with simple bounds [simple_bounds]
    ** Optimization: Newton with trust region for simple bounds
    Iter.   beta1   beta2   Function   Relgrad   Radius   Rho
        0    -1.3     1.3         61    0.0012       10  0.99   ++
        1    -1.3     1.3         61   1.5e-07       10     1   ++
    Results saved in file simple_example_val_est_4.html
    Results saved in file simple_example_val_est_4.pickle
    File biogeme.toml has been parsed.
    File biogeme.toml has been parsed.
    *** Initial values of the parameters are obtained from the file __simple_example_val_est_5.iter
    Cannot read file __simple_example_val_est_5.iter. Statement is ignored.
    Optimization algorithm: hybrid Newton/BFGS with simple bounds [simple_bounds]
    ** Optimization: Newton with trust region for simple bounds
    Iter.   beta1   beta2   Function   Relgrad   Radius   Rho
        0    -1.3     1.2         46    0.0022       10     1   ++
        1    -1.3     1.2         46     6e-07       10     1   ++
    Results saved in file simple_example_val_est_5.html
    Results saved in file simple_example_val_est_5.pickle
    File biogeme.toml has been parsed.
    Simulation results saved in file simple_example_validation.pickle

.. GENERATED FROM PYTHON SOURCE LINES 228-235

.. code-block:: default

    for slide in validation_results:
        print(
            f'Log likelihood for {slide.shape[0]} '
            f'validation data: {slide["Loglikelihood"].sum()}'
        )

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Log likelihood for 1 validation data: -17.145326446024075
    Log likelihood for 1 validation data: -13.413098095892746
    Log likelihood for 1 validation data: -9.81771976465043
    Log likelihood for 1 validation data: -6.341108765392212
    Log likelihood for 1 validation data: -21.03742136293277

.. GENERATED FROM PYTHON SOURCE LINES 236-238

The following tool is used to find files with the model name and a
specific extension.

.. GENERATED FROM PYTHON SOURCE LINES 238-239

.. code-block:: default

    myBiogeme.files_of_type('pickle')

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    ['simple_example.pickle']

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.232 seconds)


.. _sphx_glr_download_auto_examples_programmers_plot_biogeme.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_biogeme.py `

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_biogeme.ipynb `


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery `_