Configuration parameters
Biogeme can be configured using a parameter file. By default, the file is expected to be named biogeme.toml. If no
such file exists, Biogeme creates one containing the default values. The following table describes all parameters
that can be configured.
| Name | Description | Default | Section | Type |
|---|---|---|---|---|
| largest_neighborhood | int: size of the largest neighborhood considered by the Variable Neighborhood Search (VNS) algorithm. | 20 | AssistedSpecification | int |
| maximum_attempts | int: an attempt consists of selecting a solution in the Pareto set and trying to improve it. The parameter imposes an upper bound on the total number of attempts, irrespective of whether they are successful. | 100 | AssistedSpecification | int |
| maximum_number_parameters | int: maximum number of parameters allowed in a model. Each specification with a higher number is deemed invalid and not estimated. | 50 | AssistedSpecification | int |
| number_of_neighbors | int: maximum number of neighbors that are visited by the VNS algorithm. | 20 | AssistedSpecification | int |
| bayesian_draws | Number of draws per chain from the posterior distribution. | 2000 | Bayesian | int |
| calculate_likelihood | Calculates likelihood-based statistics from the posterior draws. | True | Bayesian | bool |
| calculate_loo | Calculates the Leave-One-Out Cross-Validation (LOO). | True | Bayesian | bool |
| calculate_waic | Calculates the Widely Applicable Information Criterion (WAIC). | False | Bayesian | bool |
| chains | Number of independent Markov chains to run in parallel. | 4 | Bayesian | int |
| mcmc_sampling_strategy | Defines how MCMC sampling is performed: ‘automatic’ (selected based on hardware), ‘numpyro-parallel’ (one chain per device), ‘numpyro-vectorized’ (all chains on one device), ‘pymc’ (default PyMC sampler on CPU). | automatic | Bayesian | str |
| sample_from_prior | bool: if “True”, samples from the prior distributions are generated. This may help in the diagnosis of identification issues. | True | Bayesian | bool |
| target_accept | Target acceptance probability for the No-U-Turn Sampler (NUTS) algorithm. Higher values like 0.9 or 0.95 often work better for problematic posteriors. | 0.9 | Bayesian | float |
| warmup | Number of warm-up / burn-in iterations per chain that are used only to adapt the sampler, not to estimate the posterior. | 2000 | Bayesian | int |
| version | Version of Biogeme that created the TOML file. Do not modify this value. | 3.3.3 | Biogeme | str |
| number_of_jobs | int: the maximum number of concurrently running jobs. If -1 is given, joblib tries to use all CPUs. | 2 | Bootstrap | int |
| bootstrap_samples | int: number of re-estimations for bootstrap sampling. | 100 | Estimation | int |
| calculating_second_derivatives | Defines how to calculate the second derivatives: analytical, finite_differences, never. | analytical | Estimation | str |
| large_data_set | If the number of observations is larger than this value, the data set is deemed large, and the default estimation algorithm will not use second derivatives. | 100000 | Estimation | int |
| max_number_parameters_to_report | int: maximum number of parameters to report during the estimation. | 15 | Estimation | int |
| maximum_number_catalog_expressions | If the expression contains catalogs, the parameter sets an upper bound on the total number of possible combinations that can be estimated in the same loop. | 100 | Estimation | int |
| optimization_algorithm | str: optimization algorithm to be used for estimation. Valid values: [‘automatic’, ‘scipy’, ‘LS-newton’, ‘TR-newton’, ‘LS-BFGS’, ‘TR-BFGS’, ‘simple_bounds’, ‘simple_bounds_newton’, ‘simple_bounds_BFGS’] | automatic | Estimation | str |
| save_iterations | bool: if True, the current iterate is saved after each iteration, in a file named ``__[modelName].iter``, where ``[modelName]`` is the name given to the model. If such a file exists, the starting values for the estimation are replaced by the values saved in the file. | True | Estimation | bool |
| number_of_draws | int: number of draws for Monte-Carlo integration. | 10000 | MonteCarlo | int |
| seed | int: seed used for the pseudo-random number generation. It is useful only when each run should generate the exact same result. If 0, a new seed is used at each run. | 0 | MonteCarlo | int |
| number_of_threads | int: number of threads/processors to be used. If the parameter is 0, the number of available threads is calculated using cpu_count(). | 0 | MultiThreading | int |
| generate_html | bool: “True” if the HTML file with the results must be generated. | True | Output | bool |
| generate_netcdf | bool: “True” if the netcdf file with the Bayesian estimation results must be generated. | True | Output | bool |
| generate_yaml | bool: “True” if the yaml file with the results must be generated. | True | Output | bool |
| identification_threshold | float: if the smallest eigenvalue of the second derivative matrix is less than or equal to this parameter, the model is considered not identified. The corresponding eigenvector is then reported to identify the parameters involved in the issue. | 1e-05 | Output | float |
| only_robust_stats | bool: “True” if only the robust statistics need to be reported. If “False”, the statistics from the Rao-Cramer bound are also reported. | True | Output | bool |
| save_validation_results | bool: “True” if the validation results are saved in CSV files. | True | Output | bool |
| enlarging_factor | If an iteration is very successful, the radius of the trust region is multiplied by this factor. | 10 | SimpleBounds | float |
| infeasible_cg | If True, the conjugate gradient algorithm may generate infeasible solutions until termination. The result will then be projected on the feasible domain. If False, the algorithm stops as soon as an infeasible iterate is generated. | False | SimpleBounds | bool |
| initial_radius | Initial radius of the trust region. | 1 | SimpleBounds | float |
| max_iterations | int: maximum number of iterations. | 1000 | SimpleBounds | int |
| second_derivatives | float: proportion (between 0 and 1) of iterations when the analytical Hessian is calculated. | 1.0 | SimpleBounds | float |
| steptol | The algorithm stops when the relative change in x is below this threshold. As a rule of thumb, if p significant digits of x are needed, steptol should be set to 1.0e-p. | 3.666852862501036e-11 | SimpleBounds | float |
| tolerance | float: the algorithm stops when this precision is reached. | 6.055454452393343e-06 | SimpleBounds | float |
| missing_data | number: if a variable has this value, the data is assumed to be missing and an exception is triggered. | 99999 | Specification | int |
| numerically_safe | If True, Biogeme does its best to deal with numerical issues, such as division by a number close to zero, at the possible expense of speed. | False | Specification | bool |
| use_jit | If True, the model is compiled using jit (just-in-time) to speed up the calculation. For complex models, compilation time may exceed the gain due to compilation, so that it is worth turning it off. | True | Specification | bool |
| dogleg | bool: choice of the method to solve the trust region subproblem. True: dogleg. False: truncated conjugate gradient. | True | TrustRegion | bool |

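The `steptol` entry gives a rule of thumb (p significant digits of x correspond to 1.0e-p), and the default values of `tolerance` and `steptol` are numerically close to powers of the double-precision machine epsilon. The short Python sketch below checks both points. The helper name `steptol_for_digits` is hypothetical, and the link to machine epsilon is only an observation about the listed values, not a statement about how Biogeme computes them.

```python
import math
import sys

# Default values copied from the SimpleBounds section of the parameter table.
DEFAULT_TOLERANCE = 6.055454452393343e-06
DEFAULT_STEPTOL = 3.666852862501036e-11


def steptol_for_digits(p: int) -> float:
    """Hypothetical helper: steptol for p significant digits of x (1.0e-p)."""
    return 10.0 ** (-p)


# Double-precision machine epsilon (about 2.22e-16).
eps = sys.float_info.epsilon

# Observation only: the defaults are close to eps**(1/3) and eps**(2/3).
tolerance_matches = math.isclose(DEFAULT_TOLERANCE, eps ** (1 / 3), rel_tol=1e-9)
steptol_matches = math.isclose(DEFAULT_STEPTOL, eps ** (2 / 3), rel_tol=1e-9)

print(steptol_for_digits(6), tolerance_matches, steptol_matches)
```

Under this reading, asking for about six significant digits of x would mean setting `steptol` to 1.0e-6, which is much looser than the default.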
The structure of the biogeme.toml file is as follows.

    # Default parameter file for Biogeme 3.3.3
    # Automatically created on June 17, 2026. 18:20:01

    [Specification]
    missing_data = 99999 # number: If one variable has this value, it is assumed that
                         # a data is missing and an exception will be triggered.
    numerically_safe = "False" # If true, Biogeme is doing its best to deal with
                               # numerical issues, such as division by a number close
                               # to zero, at the possible expense of speed.
    use_jit = "True" # If True, the model is compiled using jit (just-in-time) to speed
                     # up the calculation. For complex models, compilation time may
                     # exceed the gain due to compilation, so that it is worth
                     # turning it off.

    [MultiThreading]
    number_of_threads = 0 # int: Number of threads/processors to be used. If the
                          # parameter is 0, the number of available threads is
                          # calculated using cpu_count().

    [Output]
    identification_threshold = 1e-05 # float: if the smallest eigenvalue of the
                                     # second derivative matrix is lesser or equal to
                                     # this parameter, the model is considered not
                                     # identified. The corresponding eigenvector is
                                     # then reported to identify the parameters
                                     # involved in the issue.
    only_robust_stats = "True" # bool: "True" if only the robust statistics need to be
                               # reported. If "False", the statistics from the
                               # Rao-Cramer bound are also reported.
    generate_html = "True" # bool: "True" if the HTML file with the results must be
                           # generated.
    generate_yaml = "True" # bool: "True" if the yaml file with the results must be
                           # generated.
    generate_netcdf = "True" # bool: "True" if the netcdf file with the Bayesian
                             # estimation results must be generated.
    save_validation_results = "True" # bool: "True" if the validation results are saved
                                     # in CSV files.

    [Estimation]
    bootstrap_samples = 100 # int: number of re-estimations for bootstrap sampling.
    calculating_second_derivatives = "analytical" # Defines how to calculate the second
                                                  # derivatives:
                                                  # analytical,finite_differences,never.
                                                  #
    large_data_set = 100000 # If the number of observations is larger than this
                            # value, the data set is deemed large, and the default
                            # estimation algorithm will not use second derivatives.
    max_number_parameters_to_report = 15 # int: maximum number of parameters to
                                         # report during the estimation.
    save_iterations = "True" # bool: If True, the current iterate is saved after each
                             # iteration, in a file named ``__[modelName].iter``,
                             # where ``[modelName]`` is the name given to the model.
                             # If such a file exists, the starting values for the
                             # estimation are replaced by the values saved in the
                             # file.
    maximum_number_catalog_expressions = 100 # If the expression contains catalogs,
                                             # the parameter sets an upper bound of
                                             # the total number of possible
                                             # combinations that can be estimated in
                                             # the same loop.
    optimization_algorithm = "automatic" # str: optimization algorithm to be used for
                                         # estimation. Valid values: ['automatic',
                                         # 'scipy', 'LS-newton', 'TR-newton',
                                         # 'LS-BFGS', 'TR-BFGS', 'simple_bounds',
                                         # 'simple_bounds_newton',
                                         # 'simple_bounds_BFGS']

    [SimpleBounds]
    second_derivatives = 1.0 # float: proportion (between 0 and 1) of iterations when
                             # the analytical Hessian is calculated
    tolerance = 6.055454452393343e-06 # float: the algorithm stops when this
                                      # precision is reached
    max_iterations = 1000 # int: maximum number of iterations
    infeasible_cg = "False" # If True, the conjugate gradient algorithm may generate
                            # infeasible solutions until termination. The result
                            # will then be projected on the feasible domain. If
                            # False, the algorithm stops as soon as an infeasible
                            # iterate is generated
    initial_radius = 1 # Initial radius of the trust region
    steptol = 3.666852862501036e-11 # The algorithm stops when the relative change in
                                    # x is below this threshold. Basically, if p
                                    # significant digits of x are needed, steptol
                                    # should be set to 1.0e-p.
    enlarging_factor = 10 # If an iteration is very successful, the radius of the
                          # trust region is multiplied by this factor

    [Bayesian]
    mcmc_sampling_strategy = "automatic" # Defines how MCMC sampling is performed:
                                         # 'automatic' (selected based on hardware),
                                         # 'numpyro-parallel' (one chain per device),
                                         # 'numpyro-vectorized' (all chains on one
                                         # device), 'pymc' (default PyMC sampler on
                                         # CPU)
    sample_from_prior = "True" # bool: if "True", samples from the prior distributions
                               # are generated. This may help in the diagnostic of
                               # indentification issues.
    bayesian_draws = 2000 # Number of draws per chain from the posterior distribution
    warmup = 2000 # Number of warm-up / burn-in iterations per chain that are used
                  # only to adapt the sampler, not to estimate the posterior.
    chains = 4 # Number of independent Markov chains to run in parallel.
    target_accept = 0.9 # Target acceptance probability for the No-U-Turn Sampler
                        # (NUTS) algorithm. Higher values like 0.9 or 0.95 often work
                        # better for problematic posteriors.
    calculate_waic = "False" # Calculates the Widely Applicable Information Criterion
                             # (WAIC)
    calculate_loo = "True" # Calculates the Leave-One-Out Cross-Validation (LOO)
    calculate_likelihood = "True" # Calculates likelihood-based statistics from the
                                  # posterior draws

    [MonteCarlo]
    number_of_draws = 10000 # int: Number of draws for Monte-Carlo integration.
    seed = 0 # int: Seed used for the pseudo-random number generation. It is useful
             # only when each run should generate the exact same result. If 0, a new
             # seed is used at each run.

    [Biogeme]
    version = "3.3.3" # Version of Biogeme that created the TOML file. Do not modify
                      # this value.

    [Bootstrap]
    number_of_jobs = 2 # int: The maximum number of concurrently running jobs. If -1
                       # is given, joblib tries to use all CPUs.

    [TrustRegion]
    dogleg = "True" # bool: choice of the method to solve the trust region subproblem.
                    # True: dogleg. False: truncated conjugate gradient.

    [AssistedSpecification]
    maximum_number_parameters = 50 # int: maximum number of parameters allowed in a
                                   # model. Each specification with a higher number
                                   # is deemed invalid and not estimated.
    number_of_neighbors = 20 # int: maximum number of neighbors that are visited by
                             # the VNS algorithm.
    largest_neighborhood = 20 # int: size of the largest neighborhood considered by
                              # the Variable Neighborhood Search (VNS) algorithm.
    maximum_attempts = 100 # int: an attempts consists in selecting a solution in the
                           # Pareto set, and trying to improve it. The parameter
                           # imposes an upper bound on the total number of attempts,
                           # irrespectively if they are successful or not.