.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/bayesian_swissmetro/plot_b16_panel_discrete_socio_eco.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_bayesian_swissmetro_plot_b16_panel_discrete_socio_eco.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_bayesian_swissmetro_plot_b16_panel_discrete_socio_eco.py:


16. Latent class model with panel data
======================================

Bayesian estimation of a discrete mixture of logit models, also called latent class model.
The class membership model includes socio-economic variables.
The datafile is organized as panel data.

Michel Bierlaire, EPFL
Sat Nov 15 2025, 18:12:05

.. GENERATED FROM PYTHON SOURCE LINES 13-27

.. code-block:: Python


    from IPython.core.display_functions import display

    import biogeme.biogeme_logging as blog
    from biogeme.bayesian_estimation import BayesianResults, get_pandas_estimated_parameters
    from biogeme.biogeme import BIOGEME
    from biogeme.expressions import (
        Beta,
        DistributedParameter,
        Draws,
        exp,
        log,
    )
    from biogeme.models import logit


.. GENERATED FROM PYTHON SOURCE LINES 28-29

See the data processing script: :ref:`swissmetro_panel`.

.. GENERATED FROM PYTHON SOURCE LINES 29-47

.. code-block:: Python

    from swissmetro_panel import (
        CAR_AV_SP,
        CAR_CO_SCALED,
        CAR_TT_SCALED,
        CHOICE,
        INCOME,
        SM_AV,
        SM_COST_SCALED,
        SM_TT_SCALED,
        TRAIN_AV_SP,
        TRAIN_COST_SCALED,
        TRAIN_TT_SCALED,
        database,
    )

    logger = blog.get_screen_logger(level=blog.INFO)
    logger.info('Example b16_panel_discrete_socio_eco.py')


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Example b16_panel_discrete_socio_eco.py 


.. GENERATED FROM PYTHON SOURCE LINES 48-49

Parameters to be estimated. One version for each latent_old class.

.. GENERATED FROM PYTHON SOURCE LINES 49-52

.. code-block:: Python

    NUMBER_OF_CLASSES = 2
    b_cost = [Beta(f'b_cost_class{i}', 0, None, None, 0) for i in range(NUMBER_OF_CLASSES)]


.. GENERATED FROM PYTHON SOURCE LINES 53-55

Define a random parameter, normally distributed across individuals,
designed to be used for Monte-Carlo simulation.

.. GENERATED FROM PYTHON SOURCE LINES 55-57

.. code-block:: Python

    b_time = [Beta(f'b_time_class{i}', 0, None, None, 0) for i in range(NUMBER_OF_CLASSES)]


.. GENERATED FROM PYTHON SOURCE LINES 58-59

It is advised not to use 0 as starting value for the following parameter.

.. GENERATED FROM PYTHON SOURCE LINES 59-70

.. code-block:: Python

    b_time_s = [
        Beta(f'b_time_s_class{i}', 1, None, None, 0) for i in range(NUMBER_OF_CLASSES)
    ]
    b_time_rnd = [
        DistributedParameter(
            f'b_time_rnd_class{i}',
            b_time[i] + b_time_s[i] * Draws(f'b_time_eps_class{i}', 'NORMAL'),
        )
        for i in range(NUMBER_OF_CLASSES)
    ]


.. GENERATED FROM PYTHON SOURCE LINES 71-72

We do the same for the constants, to address serial correlation.

.. GENERATED FROM PYTHON SOURCE LINES 72-112

.. code-block:: Python

    asc_car = [
        Beta(f'asc_car_class{i}', 0, None, None, 0) for i in range(NUMBER_OF_CLASSES)
    ]
    asc_car_s = [
        Beta(f'asc_car_s_class{i}', 1, None, None, 0) for i in range(NUMBER_OF_CLASSES)
    ]
    asc_car_rnd = [
        DistributedParameter(
            f'asc_car_rnd_class{i}',
            asc_car[i] + asc_car_s[i] * Draws(f'asc_car_eps_class{i}', 'NORMAL'),
        )
        for i in range(NUMBER_OF_CLASSES)
    ]

    asc_train = [
        Beta(f'asc_train_class{i}', 0, None, None, 0) for i in range(NUMBER_OF_CLASSES)
    ]
    asc_train_s = [
        Beta(f'asc_train_s_class{i}', 1, None, None, 0) for i in range(NUMBER_OF_CLASSES)
    ]
    asc_train_rnd = [
        DistributedParameter(
            f'asc_train_rnd_class{i}',
            asc_train[i] + asc_train_s[i] * Draws(f'asc_train_eps_class{i}', 'NORMAL'),
        )
        for i in range(NUMBER_OF_CLASSES)
    ]

    asc_sm = [Beta(f'asc_sm_class{i}', 0, None, None, 1) for i in range(NUMBER_OF_CLASSES)]
    asc_sm_s = [
        Beta(f'asc_sm_s_class{i}', 1, None, None, 0) for i in range(NUMBER_OF_CLASSES)
    ]
    asc_sm_rnd = [
        DistributedParameter(
            f'asc_sm_rnd_class{i}',
            asc_sm[i] + asc_sm_s[i] * Draws(f'asc_sm_eps_class{i}', 'NORMAL'),
        )
        for i in range(NUMBER_OF_CLASSES)
    ]


.. GENERATED FROM PYTHON SOURCE LINES 113-114

Parameters for the class membership model.

.. GENERATED FROM PYTHON SOURCE LINES 114-117

.. code-block:: Python

    class_cte = Beta('class_cte', 0, None, None, 0)
    class_inc = Beta('class_inc', 0, None, None, 0)


.. GENERATED FROM PYTHON SOURCE LINES 118-119

In class 0, it is assumed that the time coefficient is zero

.. GENERATED FROM PYTHON SOURCE LINES 119-121

.. code-block:: Python

    b_time_rnd[0] = 0


.. GENERATED FROM PYTHON SOURCE LINES 122-123

Utility functions.

.. GENERATED FROM PYTHON SOURCE LINES 123-140

.. code-block:: Python

    v_train_per_class = [
        asc_train_rnd[i] + b_time_rnd[i] * TRAIN_TT_SCALED + b_cost[i] * TRAIN_COST_SCALED
        for i in range(NUMBER_OF_CLASSES)
    ]
    v_swissmetro_per_class = [
        asc_sm_rnd[i] + b_time_rnd[i] * SM_TT_SCALED + b_cost[i] * SM_COST_SCALED
        for i in range(NUMBER_OF_CLASSES)
    ]
    v_car_per_class = [
        asc_car_rnd[i] + b_time_rnd[i] * CAR_TT_SCALED + b_cost[i] * CAR_CO_SCALED
        for i in range(NUMBER_OF_CLASSES)
    ]
    v_per_class = [
        {1: v_train_per_class[i], 2: v_swissmetro_per_class[i], 3: v_car_per_class[i]}
        for i in range(NUMBER_OF_CLASSES)
    ]


.. GENERATED FROM PYTHON SOURCE LINES 141-142

Associate the availability conditions with the alternatives

.. GENERATED FROM PYTHON SOURCE LINES 142-144

.. code-block:: Python

    av = {1: TRAIN_AV_SP, 2: SM_AV, 3: CAR_AV_SP}


.. GENERATED FROM PYTHON SOURCE LINES 145-147

The choice model is a discrete mixture of logit, with availability conditions
We calculate the conditional probability for each class.

.. GENERATED FROM PYTHON SOURCE LINES 147-151

.. code-block:: Python

    conditional_probability_per_class = [
        logit(v_per_class[i], av, CHOICE) for i in range(NUMBER_OF_CLASSES)
    ]


.. GENERATED FROM PYTHON SOURCE LINES 152-153

Class membership model.

.. GENERATED FROM PYTHON SOURCE LINES 153-157

.. code-block:: Python

    score_class_0 = class_cte + class_inc * INCOME
    probability_class_1 = 1 / (1 + exp(score_class_0))
    probability_class_0 = 1 - probability_class_1


.. GENERATED FROM PYTHON SOURCE LINES 158-159

Conditional on the random variables, likelihood for the individual.

.. GENERATED FROM PYTHON SOURCE LINES 159-164

.. code-block:: Python

    conditional_choice_probability = (
        probability_class_0 * conditional_probability_per_class[0]
        + probability_class_1 * conditional_probability_per_class[1]
    )


.. GENERATED FROM PYTHON SOURCE LINES 165-166

We need the log probability per observation

.. GENERATED FROM PYTHON SOURCE LINES 166-168

.. code-block:: Python

    conditional_log_probability = log(conditional_choice_probability)


.. GENERATED FROM PYTHON SOURCE LINES 169-170

%%

.. GENERATED FROM PYTHON SOURCE LINES 170-178

.. code-block:: Python

    the_biogeme = BIOGEME(
        database,
        conditional_log_probability,
        warmup=40,
        bayesian_draws=40,
        chains=1,
    )
    the_biogeme.model_name = 'b16_panel_discrete_socio_eco'


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Biogeme parameters read from biogeme.toml. 


.. GENERATED FROM PYTHON SOURCE LINES 179-180

Estimate the parameters.

.. GENERATED FROM PYTHON SOURCE LINES 180-187

.. code-block:: Python

    try:
        results = BayesianResults.from_netcdf(
            filename=f'saved_results/{the_biogeme.model_name}.nc'
        )
    except FileNotFoundError:
        results = the_biogeme.bayesian_estimation()


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Loaded NetCDF file size: 10.9 MB 
    load finished in 324 ms


.. GENERATED FROM PYTHON SOURCE LINES 188-190

.. code-block:: Python

    print(results.short_summary())


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    /Users/bierlair/python_envs/venv313/lib/python3.13/site-packages/arviz/stats/stats.py:1667: UserWarning: For one or more samples the posterior variance of the log predictive densities exceeds 0.4. This could be indication of WAIC starting to fail. 
    See http://arxiv.org/abs/1507.04544 for details
      warnings.warn(
    /Users/bierlair/python_envs/venv313/lib/python3.13/site-packages/arviz/stats/stats.py:797: UserWarning: Estimated shape parameter of Pareto distribution is greater than 0.38 for one or more samples. You should consider using a more robust model, this is because importance sampling is less likely to work well if the marginal posterior and LOO posterior are very different. This is more likely to happen with a non-robust model and highly influential observations.
      warnings.warn(
    loo_res finished in 52 ms
    loo finished in 52 ms
    Sample size                                              6768
    Sampler                                                  NUTS
    Number of chains                                         1
    Number of draws per chain                                40
    Total number of draws                                    40
    Acceptance rate target                                   0.9
    Run time                                                 0:01:06.651876
    Posterior predictive log-likelihood (sum of log mean p)  -2154.03
    Expected log-likelihood E[log L(Y|θ)]                    -2347.36
    Best-draw log-likelihood (posterior upper bound)         -2289.28
    WAIC (Widely Applicable Information Criterion)           -2744.28
    WAIC Standard Error                                      80.29
    Effective number of parameters (p_WAIC)                  590.25
    LOO (Leave-One-Out Cross-Validation)                     -2651.28
    LOO Standard Error                                       73.43
    Effective number of parameters (p_LOO)                   497.24


.. GENERATED FROM PYTHON SOURCE LINES 191-193

.. code-block:: Python

    pandas_results = get_pandas_estimated_parameters(estimation_results=results)
    display(pandas_results)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Shape validation failed: input_shape: (1, 40), minimum_shape: (chains=2, draws=4)
    Diagnostics computation took 18.3 seconds (cached). 
                      Name  Value (mean)  ...  ESS (bulk)  ESS (tail)
    0            class_cte     -1.985520  ...    1.993462   17.407238
    1            class_inc     -0.265144  ...    2.829976   21.105249
    2     asc_train_class0     -6.728808  ...    4.910675   21.105249
    3   asc_train_s_class0     -1.427167  ...    1.691975   24.157661
    4        b_cost_class0     -8.732684  ...    3.386415   18.886680
    5      asc_sm_s_class0      7.462313  ...    3.103692   44.679600
    6       asc_car_class0     -3.999915  ...    4.905896   40.190375
    7     asc_car_s_class0     -0.819028  ...   17.224894   40.190375
    8     asc_train_class1      0.203905  ...    6.125934   44.679600
    9   asc_train_s_class1      2.625066  ...    3.883404   40.190375
    10       b_time_class1     -8.874934  ...    1.623690   16.681299
    11     b_time_s_class1      5.466661  ...    1.944467   22.003474
    12       b_cost_class1     -4.827650  ...    2.680831   44.679600
    13     asc_sm_s_class1     -1.767727  ...    2.155726   16.681299
    14      asc_car_class1      1.067482  ...    2.194947   21.105249
    15    asc_car_s_class1      5.277336  ...    1.990461   16.681299

    [16 rows x 12 columns]


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 18.722 seconds)


.. _sphx_glr_download_auto_examples_bayesian_swissmetro_plot_b16_panel_discrete_socio_eco.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_b16_panel_discrete_socio_eco.ipynb <plot_b16_panel_discrete_socio_eco.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_b16_panel_discrete_socio_eco.py <plot_b16_panel_discrete_socio_eco.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_b16_panel_discrete_socio_eco.zip <plot_b16_panel_discrete_socio_eco.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_