.. DO NOT EDIT. .. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY. .. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE: .. "auto_examples/bayesian_swissmetro/plot_b16_panel_discrete_socio_eco.py" .. LINE NUMBERS ARE GIVEN BELOW. .. only:: html .. note:: :class: sphx-glr-download-link-note :ref:`Go to the end ` to download the full example code. .. rst-class:: sphx-glr-example-title .. _sphx_glr_auto_examples_bayesian_swissmetro_plot_b16_panel_discrete_socio_eco.py: 16. Latent class model with panel data ====================================== Bayesian estimation of a discrete mixture of logit models, also called latent class model. The class membership model includes socio-economic variables. The datafile is organized as panel data. Michel Bierlaire, EPFL Sat Nov 15 2025, 18:12:05 .. GENERATED FROM PYTHON SOURCE LINES 13-27 .. code-block:: Python from IPython.core.display_functions import display import biogeme.biogeme_logging as blog from biogeme.bayesian_estimation import BayesianResults, get_pandas_estimated_parameters from biogeme.biogeme import BIOGEME from biogeme.expressions import ( Beta, DistributedParameter, Draws, exp, log, ) from biogeme.models import logit .. GENERATED FROM PYTHON SOURCE LINES 28-29 See the data processing script: :ref:`swissmetro_panel`. .. GENERATED FROM PYTHON SOURCE LINES 29-47 .. code-block:: Python from swissmetro_panel import ( CAR_AV_SP, CAR_CO_SCALED, CAR_TT_SCALED, CHOICE, INCOME, SM_AV, SM_COST_SCALED, SM_TT_SCALED, TRAIN_AV_SP, TRAIN_COST_SCALED, TRAIN_TT_SCALED, database, ) logger = blog.get_screen_logger(level=blog.INFO) logger.info('Example b16_panel_discrete_socio_eco.py') .. rst-class:: sphx-glr-script-out .. code-block:: none Example b16_panel_discrete_socio_eco.py .. GENERATED FROM PYTHON SOURCE LINES 48-49 Parameters to be estimated. One version for each latent_old class. .. GENERATED FROM PYTHON SOURCE LINES 49-52 .. code-block:: Python NUMBER_OF_CLASSES = 2 b_cost = [Beta(f'b_cost_class{i}', 0, None, None, 0) for i in range(NUMBER_OF_CLASSES)] .. GENERATED FROM PYTHON SOURCE LINES 53-55 Define a random parameter, normally distributed across individuals, designed to be used for Monte-Carlo simulation. .. GENERATED FROM PYTHON SOURCE LINES 55-57 .. code-block:: Python b_time = [Beta(f'b_time_class{i}', 0, None, None, 0) for i in range(NUMBER_OF_CLASSES)] .. GENERATED FROM PYTHON SOURCE LINES 58-59 It is advised not to use 0 as starting value for the following parameter. .. GENERATED FROM PYTHON SOURCE LINES 59-70 .. code-block:: Python b_time_s = [ Beta(f'b_time_s_class{i}', 1, None, None, 0) for i in range(NUMBER_OF_CLASSES) ] b_time_rnd = [ DistributedParameter( f'b_time_rnd_class{i}', b_time[i] + b_time_s[i] * Draws(f'b_time_eps_class{i}', 'NORMAL'), ) for i in range(NUMBER_OF_CLASSES) ] .. GENERATED FROM PYTHON SOURCE LINES 71-72 We do the same for the constants, to address serial correlation. .. GENERATED FROM PYTHON SOURCE LINES 72-112 .. code-block:: Python asc_car = [ Beta(f'asc_car_class{i}', 0, None, None, 0) for i in range(NUMBER_OF_CLASSES) ] asc_car_s = [ Beta(f'asc_car_s_class{i}', 1, None, None, 0) for i in range(NUMBER_OF_CLASSES) ] asc_car_rnd = [ DistributedParameter( f'asc_car_rnd_class{i}', asc_car[i] + asc_car_s[i] * Draws(f'asc_car_eps_class{i}', 'NORMAL'), ) for i in range(NUMBER_OF_CLASSES) ] asc_train = [ Beta(f'asc_train_class{i}', 0, None, None, 0) for i in range(NUMBER_OF_CLASSES) ] asc_train_s = [ Beta(f'asc_train_s_class{i}', 1, None, None, 0) for i in range(NUMBER_OF_CLASSES) ] asc_train_rnd = [ DistributedParameter( f'asc_train_rnd_class{i}', asc_train[i] + asc_train_s[i] * Draws(f'asc_train_eps_class{i}', 'NORMAL'), ) for i in range(NUMBER_OF_CLASSES) ] asc_sm = [Beta(f'asc_sm_class{i}', 0, None, None, 1) for i in range(NUMBER_OF_CLASSES)] asc_sm_s = [ Beta(f'asc_sm_s_class{i}', 1, None, None, 0) for i in range(NUMBER_OF_CLASSES) ] asc_sm_rnd = [ DistributedParameter( f'asc_sm_rnd_class{i}', asc_sm[i] + asc_sm_s[i] * Draws(f'asc_sm_eps_class{i}', 'NORMAL'), ) for i in range(NUMBER_OF_CLASSES) ] .. GENERATED FROM PYTHON SOURCE LINES 113-114 Parameters for the class membership model. .. GENERATED FROM PYTHON SOURCE LINES 114-117 .. code-block:: Python class_cte = Beta('class_cte', 0, None, None, 0) class_inc = Beta('class_inc', 0, None, None, 0) .. GENERATED FROM PYTHON SOURCE LINES 118-119 In class 0, it is assumed that the time coefficient is zero .. GENERATED FROM PYTHON SOURCE LINES 119-121 .. code-block:: Python b_time_rnd[0] = 0 .. GENERATED FROM PYTHON SOURCE LINES 122-123 Utility functions. .. GENERATED FROM PYTHON SOURCE LINES 123-140 .. code-block:: Python v_train_per_class = [ asc_train_rnd[i] + b_time_rnd[i] * TRAIN_TT_SCALED + b_cost[i] * TRAIN_COST_SCALED for i in range(NUMBER_OF_CLASSES) ] v_swissmetro_per_class = [ asc_sm_rnd[i] + b_time_rnd[i] * SM_TT_SCALED + b_cost[i] * SM_COST_SCALED for i in range(NUMBER_OF_CLASSES) ] v_car_per_class = [ asc_car_rnd[i] + b_time_rnd[i] * CAR_TT_SCALED + b_cost[i] * CAR_CO_SCALED for i in range(NUMBER_OF_CLASSES) ] v_per_class = [ {1: v_train_per_class[i], 2: v_swissmetro_per_class[i], 3: v_car_per_class[i]} for i in range(NUMBER_OF_CLASSES) ] .. GENERATED FROM PYTHON SOURCE LINES 141-142 Associate the availability conditions with the alternatives .. GENERATED FROM PYTHON SOURCE LINES 142-144 .. code-block:: Python av = {1: TRAIN_AV_SP, 2: SM_AV, 3: CAR_AV_SP} .. GENERATED FROM PYTHON SOURCE LINES 145-147 The choice model is a discrete mixture of logit, with availability conditions We calculate the conditional probability for each class. .. GENERATED FROM PYTHON SOURCE LINES 147-151 .. code-block:: Python conditional_probability_per_class = [ logit(v_per_class[i], av, CHOICE) for i in range(NUMBER_OF_CLASSES) ] .. GENERATED FROM PYTHON SOURCE LINES 152-153 Class membership model. .. GENERATED FROM PYTHON SOURCE LINES 153-157 .. code-block:: Python score_class_0 = class_cte + class_inc * INCOME probability_class_1 = 1 / (1 + exp(score_class_0)) probability_class_0 = 1 - probability_class_1 .. GENERATED FROM PYTHON SOURCE LINES 158-159 Conditional on the random variables, likelihood for the individual. .. GENERATED FROM PYTHON SOURCE LINES 159-164 .. code-block:: Python conditional_choice_probability = ( probability_class_0 * conditional_probability_per_class[0] + probability_class_1 * conditional_probability_per_class[1] ) .. GENERATED FROM PYTHON SOURCE LINES 165-166 We need the log probability per observation .. GENERATED FROM PYTHON SOURCE LINES 166-168 .. code-block:: Python conditional_log_probability = log(conditional_choice_probability) .. GENERATED FROM PYTHON SOURCE LINES 169-170 %% .. GENERATED FROM PYTHON SOURCE LINES 170-178 .. code-block:: Python the_biogeme = BIOGEME( database, conditional_log_probability, warmup=40, bayesian_draws=40, chains=1, ) the_biogeme.model_name = 'b16_panel_discrete_socio_eco' .. rst-class:: sphx-glr-script-out .. code-block:: none Biogeme parameters read from biogeme.toml. .. GENERATED FROM PYTHON SOURCE LINES 179-180 Estimate the parameters. .. GENERATED FROM PYTHON SOURCE LINES 180-187 .. code-block:: Python try: results = BayesianResults.from_netcdf( filename=f'saved_results/{the_biogeme.model_name}.nc' ) except FileNotFoundError: results = the_biogeme.bayesian_estimation() .. rst-class:: sphx-glr-script-out .. code-block:: none Loaded NetCDF file size: 10.9 MB load finished in 324 ms .. GENERATED FROM PYTHON SOURCE LINES 188-190 .. code-block:: Python print(results.short_summary()) .. rst-class:: sphx-glr-script-out .. code-block:: none /Users/bierlair/python_envs/venv313/lib/python3.13/site-packages/arviz/stats/stats.py:1667: UserWarning: For one or more samples the posterior variance of the log predictive densities exceeds 0.4. This could be indication of WAIC starting to fail. See http://arxiv.org/abs/1507.04544 for details warnings.warn( /Users/bierlair/python_envs/venv313/lib/python3.13/site-packages/arviz/stats/stats.py:797: UserWarning: Estimated shape parameter of Pareto distribution is greater than 0.38 for one or more samples. You should consider using a more robust model, this is because importance sampling is less likely to work well if the marginal posterior and LOO posterior are very different. This is more likely to happen with a non-robust model and highly influential observations. warnings.warn( loo_res finished in 52 ms loo finished in 52 ms Sample size 6768 Sampler NUTS Number of chains 1 Number of draws per chain 40 Total number of draws 40 Acceptance rate target 0.9 Run time 0:01:06.651876 Posterior predictive log-likelihood (sum of log mean p) -2154.03 Expected log-likelihood E[log L(Y|θ)] -2347.36 Best-draw log-likelihood (posterior upper bound) -2289.28 WAIC (Widely Applicable Information Criterion) -2744.28 WAIC Standard Error 80.29 Effective number of parameters (p_WAIC) 590.25 LOO (Leave-One-Out Cross-Validation) -2651.28 LOO Standard Error 73.43 Effective number of parameters (p_LOO) 497.24 .. GENERATED FROM PYTHON SOURCE LINES 191-193 .. code-block:: Python pandas_results = get_pandas_estimated_parameters(estimation_results=results) display(pandas_results) .. rst-class:: sphx-glr-script-out .. code-block:: none Shape validation failed: input_shape: (1, 40), minimum_shape: (chains=2, draws=4) Diagnostics computation took 18.3 seconds (cached). Name Value (mean) ... ESS (bulk) ESS (tail) 0 class_cte -1.985520 ... 1.993462 17.407238 1 class_inc -0.265144 ... 2.829976 21.105249 2 asc_train_class0 -6.728808 ... 4.910675 21.105249 3 asc_train_s_class0 -1.427167 ... 1.691975 24.157661 4 b_cost_class0 -8.732684 ... 3.386415 18.886680 5 asc_sm_s_class0 7.462313 ... 3.103692 44.679600 6 asc_car_class0 -3.999915 ... 4.905896 40.190375 7 asc_car_s_class0 -0.819028 ... 17.224894 40.190375 8 asc_train_class1 0.203905 ... 6.125934 44.679600 9 asc_train_s_class1 2.625066 ... 3.883404 40.190375 10 b_time_class1 -8.874934 ... 1.623690 16.681299 11 b_time_s_class1 5.466661 ... 1.944467 22.003474 12 b_cost_class1 -4.827650 ... 2.680831 44.679600 13 asc_sm_s_class1 -1.767727 ... 2.155726 16.681299 14 asc_car_class1 1.067482 ... 2.194947 21.105249 15 asc_car_s_class1 5.277336 ... 1.990461 16.681299 [16 rows x 12 columns] .. rst-class:: sphx-glr-timing **Total running time of the script:** (0 minutes 18.722 seconds) .. _sphx_glr_download_auto_examples_bayesian_swissmetro_plot_b16_panel_discrete_socio_eco.py: .. only:: html .. container:: sphx-glr-footer sphx-glr-footer-example .. container:: sphx-glr-download sphx-glr-download-jupyter :download:`Download Jupyter notebook: plot_b16_panel_discrete_socio_eco.ipynb ` .. container:: sphx-glr-download sphx-glr-download-python :download:`Download Python source code: plot_b16_panel_discrete_socio_eco.py ` .. container:: sphx-glr-download sphx-glr-download-zip :download:`Download zipped: plot_b16_panel_discrete_socio_eco.zip ` .. only:: html .. rst-class:: sphx-glr-signature `Gallery generated by Sphinx-Gallery `_