.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/swissmetro/plot_b04validation.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_swissmetro_plot_b04validation.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_swissmetro_plot_b04validation.py:


Out-of-sample validation
========================

Example of the out-of-sample validation of a logit model.

Michel Bierlaire, EPFL
Wed Jun 18 2025, 11:27:07

.. GENERATED FROM PYTHON SOURCE LINES 11-17

.. code-block:: Python

    from biogeme.biogeme import BIOGEME
    from biogeme.expressions import Beta
    from biogeme.models import loglogit
    from biogeme.validation import ValidationResult


.. GENERATED FROM PYTHON SOURCE LINES 18-19

See the data processing script: :ref:`swissmetro_data`.

.. GENERATED FROM PYTHON SOURCE LINES 19-33

.. code-block:: Python

    from swissmetro_data import (
        CAR_AV_SP,
        CAR_CO_SCALED,
        CAR_TT_SCALED,
        CHOICE,
        SM_AV,
        SM_COST_SCALED,
        SM_TT_SCALED,
        TRAIN_AV_SP,
        TRAIN_COST_SCALED,
        TRAIN_TT_SCALED,
        database,
    )


.. GENERATED FROM PYTHON SOURCE LINES 34-35

Parameters to be estimated. The constant ``asc_sm`` is held fixed at its
initial value 0 (its last argument is 1); the other parameters are estimated.

.. GENERATED FROM PYTHON SOURCE LINES 35-41

.. code-block:: Python

    asc_car = Beta('asc_car', 0, None, None, 0)
    asc_train = Beta('asc_train', 0, None, None, 0)
    asc_sm = Beta('asc_sm', 0, None, None, 1)
    b_time = Beta('b_time', 0, None, None, 0)
    b_cost = Beta('b_cost', 0, None, None, 0)


.. GENERATED FROM PYTHON SOURCE LINES 42-43

Definition of the utility functions.

.. GENERATED FROM PYTHON SOURCE LINES 43-47

.. code-block:: Python

    v_train = asc_train + b_time * TRAIN_TT_SCALED + b_cost * TRAIN_COST_SCALED
    v_swissmetro = asc_sm + b_time * SM_TT_SCALED + b_cost * SM_COST_SCALED
    v_car = asc_car + b_time * CAR_TT_SCALED + b_cost * CAR_CO_SCALED


.. GENERATED FROM PYTHON SOURCE LINES 48-49

Associate utility functions with the numbering of alternatives.

.. GENERATED FROM PYTHON SOURCE LINES 49-51

.. code-block:: Python

    v = {1: v_train, 2: v_swissmetro, 3: v_car}


.. GENERATED FROM PYTHON SOURCE LINES 52-53

Associate the availability conditions with the alternatives.

.. GENERATED FROM PYTHON SOURCE LINES 53-55

.. code-block:: Python

    av = {1: TRAIN_AV_SP, 2: SM_AV, 3: CAR_AV_SP}


.. GENERATED FROM PYTHON SOURCE LINES 56-58

Definition of the model. This is the contribution of each
observation to the log likelihood function.

.. GENERATED FROM PYTHON SOURCE LINES 58-60

.. code-block:: Python

    logprob = loglogit(v, av, CHOICE)


.. GENERATED FROM PYTHON SOURCE LINES 61-62

Create the Biogeme object.

.. GENERATED FROM PYTHON SOURCE LINES 62-65

.. code-block:: Python

    the_biogeme = BIOGEME(database, logprob)
    the_biogeme.model_name = 'b04validation'


.. GENERATED FROM PYTHON SOURCE LINES 66-67

Estimate the parameters.

.. GENERATED FROM PYTHON SOURCE LINES 67-69

.. code-block:: Python

    results = the_biogeme.estimate()


.. GENERATED FROM PYTHON SOURCE LINES 70-78

Validation splits the data at random into several slices of about the
same size. Each slice serves in turn as the validation dataset: the
model is re-estimated using all the data except that slice, and the
estimated model is then applied to the slice. The value of the log
likelihood for each observation in the validation set is reported in a
data frame. Because this is done for each slice, the output is a list
with one result per slice, each holding the data frame for its
validation set.

.. GENERATED FROM PYTHON SOURCE LINES 78-85

.. code-block:: Python

    validation_results: list[ValidationResult] = the_biogeme.validate(results, slices=5)

    for validation_result in validation_results:
        print(
            f'Log likelihood for {validation_result.simulated_values.shape[0]} validation data: '
            f'{validation_result.simulated_values.iloc[:, 0].sum()}'
        )


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Log likelihood for 1354 validation data: -1088.076750429419
    Log likelihood for 1354 validation data: -1055.4349569657106
    Log likelihood for 1354 validation data: -1089.0420376157017
    Log likelihood for 1353 validation data: -1040.6517241488427
    Log likelihood for 1353 validation data: -1063.6932326057577


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 2.904 seconds)


.. _sphx_glr_download_auto_examples_swissmetro_plot_b04validation.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_b04validation.ipynb <plot_b04validation.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_b04validation.py <plot_b04validation.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_b04validation.zip <plot_b04validation.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
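
The slicing step of the validation procedure described in this example can be illustrated without Biogeme. The sketch below is only an assumption about how such a split could be done, not Biogeme's actual implementation: the helper names ``make_slices`` and ``estimation_and_validation`` and the ``seed`` argument are hypothetical. The number of observations, 6768, is the sum of the five validation-set sizes printed in the output of this example.

```python
import random
from collections.abc import Iterator


def make_slices(n_obs: int, n_slices: int, seed: int = 0) -> list[list[int]]:
    """Randomly partition the row indices 0..n_obs-1 into n_slices groups.

    Hypothetical helper: Biogeme may assign rows to slices differently.
    """
    indices = list(range(n_obs))
    random.Random(seed).shuffle(indices)
    # Slice k takes every n_slices-th shuffled index, starting at position k,
    # so the slice sizes differ by at most one.
    return [indices[k::n_slices] for k in range(n_slices)]


def estimation_and_validation(
    n_obs: int, n_slices: int, seed: int = 0
) -> Iterator[tuple[list[int], list[int]]]:
    """Yield (estimation rows, validation rows), one pair per slice."""
    slices = make_slices(n_obs, n_slices, seed)
    for k, validation in enumerate(slices):
        # The estimation set is everything outside the current slice.
        estimation = [i for j, s in enumerate(slices) if j != k for i in s]
        yield estimation, validation


sizes = [len(validation) for _, validation in estimation_and_validation(6768, 5)]
print(sizes)  # [1354, 1354, 1354, 1353, 1353]
```

In each of the five rounds, a model would be estimated on the estimation rows and its log likelihood evaluated on the validation rows. Every observation belongs to exactly one validation set, so each one is used for out-of-sample evaluation exactly once.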