6. Hybrid choice model - Bayesian estimation

This script estimates the full hybrid choice model, combining:

  • a discrete choice model, and

  • a MIMIC model with two latent variables (structural and measurement equations),

using Bayesian estimation in Biogeme.

It represents the most complete specification in the model family and is primarily used to:

  • study identification and normalization under Bayesian inference,

  • analyze posterior distributions of both choice and latent-variable parameters,

  • compare Bayesian and maximum likelihood hybrid models, and

  • assess the added value of latent variables relative to simpler specifications.

The configuration is defined locally in this file and passed to the generic estimation pipeline via estimate_model().

Michel Bierlaire Thu Dec 25 2025, 08:27:43

Biogeme parameters read from biogeme.toml.
Loaded NetCDF file size: 3.7 GB
load finished in 19330 ms (19.33 s)
Results are read from the file saved_results/b06_hybrid_bayes.nc.
posterior_predictive_loglike finished in 408 ms
expected_log_likelihood finished in 15 ms
best_draw_log_likelihood finished in 14 ms
/Users/bierlair/python_envs/venv313/lib/python3.13/site-packages/arviz/stats/stats.py:1667: UserWarning: For one or more samples the posterior variance of the log predictive densities exceeds 0.4. This could be indication of WAIC starting to fail.
See http://arxiv.org/abs/1507.04544 for details
  warnings.warn(
waic_res finished in 914 ms
waic finished in 914 ms
/Users/bierlair/python_envs/venv313/lib/python3.13/site-packages/arviz/stats/stats.py:1057: RuntimeWarning: overflow encountered in exp
  weights = 1 / np.exp(len_scale - len_scale[:, None]).sum(axis=1)
/Users/bierlair/python_envs/venv313/lib/python3.13/site-packages/numpy/_core/_methods.py:52: RuntimeWarning: overflow encountered in reduce
  return umr_sum(a, axis, dtype, out, keepdims, initial, where)
/Users/bierlair/python_envs/venv313/lib/python3.13/site-packages/arviz/stats/stats.py:797: UserWarning: Estimated shape parameter of Pareto distribution is greater than 0.70 for one or more samples. You should consider using a more robust model, this is because importance sampling is less likely to work well if the marginal posterior and LOO posterior are very different. This is more likely to happen with a non-robust model and highly influential observations.
  warnings.warn(
loo_res finished in 19065 ms (19.07 s)
loo finished in 19066 ms (19.07 s)
Sample size                                              896
Sampler                                                  NUTS
Number of chains                                         4
Number of draws per chain                                20000
Total number of draws                                    80000
Acceptance rate target                                   0.9
Run time                                                 4:17:39.810887
Posterior predictive log-likelihood (sum of log mean p)  -16196.80
Expected log-likelihood E[log L(Y|θ)]                    -16503.16
Best-draw log-likelihood (posterior upper bound)         -16361.16
WAIC (Widely Applicable Information Criterion)           -17080.21
WAIC Standard Error                                      134.87
Effective number of parameters (p_WAIC)                  883.41
LOO (Leave-One-Out Cross-Validation)                     -17452.85
LOO Standard Error                                       135.73
Effective number of parameters (p_LOO)                   1256.06
Diagnostics computation took 172.1 seconds (cached).
                                                Name  ...    ESS (tail)
0                                      choice_asc_pt  ...   3743.562786
1          struct_environmental_attitude_childSuburb  ...  30809.469465
2         struct_environmental_attitude_ScaledIncome  ...  18784.263822
3   struct_environmental_attitude_city_center_as_kid  ...  32551.841797
4             struct_environmental_attitude_artisans  ...  35371.503161
..                                               ...  ...           ...
70                                      cars_delta_1  ...  57354.812362
71                                      cars_delta_2  ...  57499.121838
72                         measurement_Mobil12_sigma  ...  37294.927753
73                         measurement_Envir02_sigma  ...  13433.071677
74                         measurement_Mobil10_sigma  ...  38530.742443

[75 rows x 12 columns]
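
The log-likelihood summaries in the report above can all be derived from a matrix of pointwise log-likelihoods, with one row per posterior draw and one column per observation. The sketch below is not Biogeme's or ArviZ's implementation; it is a minimal standard-library illustration of the usual definitions. It assumes WAIC is reported on the log (elpd) scale and that the penalty uses the population variance. The function names and the toy numbers are made up for the example. Under these definitions the WAIC equals the posterior predictive log-likelihood minus p_WAIC, which agrees with the values reported above: -16196.80 - 883.41 = -17080.21.

```python
import math
from statistics import fmean, pvariance


def log_mean_exp(values):
    """Return log(mean(exp(v))) using the max-shift trick for stability."""
    shift = max(values)
    return shift + math.log(fmean([math.exp(v - shift) for v in values]))


def summarize_pointwise_loglike(loglike):
    """Summarize loglike[s][i], the log-likelihood of observation i under draw s."""
    n_obs = len(loglike[0])
    # Column i collects the values of observation i across all draws.
    columns = [[row[i] for row in loglike] for i in range(n_obs)]
    # Total log-likelihood of the sample under each individual draw.
    per_draw_totals = [sum(row) for row in loglike]

    # Sum over observations of the log of the posterior mean probability.
    lppd = sum(log_mean_exp(col) for col in columns)
    # Assumed penalty: sum of per-observation variances across draws.
    p_waic = sum(pvariance(col) for col in columns)

    return {
        "posterior_predictive_loglike": lppd,
        "expected_loglike": fmean(per_draw_totals),
        "best_draw_loglike": max(per_draw_totals),
        "p_waic": p_waic,
        "elpd_waic": lppd - p_waic,
    }


# Toy input: 3 posterior draws x 2 observations (made-up numbers).
toy_loglike = [
    [-1.0, -2.0],
    [-1.5, -1.8],
    [-0.9, -2.5],
]
summary = summarize_pointwise_loglike(toy_loglike)
for key, value in summary.items():
    print(f"{key:32s} {value:10.4f}")
```

By Jensen's inequality the posterior predictive log-likelihood is never below the expected log-likelihood, and the best-draw value is never below the expected one either. The figures reported above follow that pattern.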

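import biogeme.biogeme_logging as blog

# Local modules of this example series: configuration object and generic pipeline.
from config import Config
from estimate import estimate_model

# Print Biogeme log messages of level INFO and above to the screen.
logger = blog.get_screen_logger(level=blog.INFO)

# Full hybrid specification: two latent variables plus the choice model,
# estimated by Bayesian inference.
the_config = Config(
    name='b06_hybrid_bayes',
    latent_variables='two',
    choice_model='yes',
    estimation='bayes',
    number_of_bayesian_draws_per_chain=20_000,
    number_of_monte_carlo_draws=20_000,
)

# Run the estimation pipeline with this configuration.
estimate_model(config=the_config)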
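
The actual Config class lives in config.py and is not reproduced on this page. As a rough illustration of how such a configuration object could be structured, the sketch below defines a hypothetical frozen dataclass with the same field names as the call above. The class name HypotheticalConfig, the sets of allowed values, the validation logic, and the helper total_bayesian_draws() are all assumptions made for this example; only the values "two", "yes" and "bayes" are taken from the script.

```python
from dataclasses import dataclass

# Assumed sets of allowed values (hypothetical; only "yes" and "bayes"
# are confirmed by the script on this page).
ALLOWED_CHOICE_MODEL = {"yes", "no"}
ALLOWED_ESTIMATION = {"ml", "bayes"}


@dataclass(frozen=True)
class HypotheticalConfig:
    """Illustrative stand-in for the Config class imported from config.py."""

    name: str
    latent_variables: str
    choice_model: str
    estimation: str
    number_of_bayesian_draws_per_chain: int = 20_000
    number_of_monte_carlo_draws: int = 20_000

    def __post_init__(self):
        # Reject values outside the assumed sets early, before estimation starts.
        if self.choice_model not in ALLOWED_CHOICE_MODEL:
            raise ValueError(f"Unknown choice_model: {self.choice_model!r}")
        if self.estimation not in ALLOWED_ESTIMATION:
            raise ValueError(f"Unknown estimation: {self.estimation!r}")
        if self.number_of_bayesian_draws_per_chain <= 0:
            raise ValueError("number_of_bayesian_draws_per_chain must be positive")

    def total_bayesian_draws(self, number_of_chains: int) -> int:
        """Total number of posterior draws across all chains."""
        return number_of_chains * self.number_of_bayesian_draws_per_chain


sketch_config = HypotheticalConfig(
    name="b06_hybrid_bayes",
    latent_variables="two",
    choice_model="yes",
    estimation="bayes",
)
# With the 4 chains reported in the output, this matches the 80000 total draws.
print(sketch_config.total_bayesian_draws(number_of_chains=4))
```

Making the dataclass frozen keeps a configuration from being modified after it has been handed to the pipeline, which helps when several scripts in a series share the same estimation code.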

Total running time of the script: (3 minutes 43.419 seconds)
