Configuration parameters

Biogeme can be configured using a parameter file. By default, the file is expected to be named biogeme.toml. If no such file exists, Biogeme creates one containing the default values. The following table describes all parameters that can be configured.

| Name | Description | Default | Section | Type |
| --- | --- | --- | --- | --- |
| largest_neighborhood | Size of the largest neighborhood considered by the Variable Neighborhood Search (VNS) algorithm. | 20 | AssistedSpecification | int |
| maximum_attempts | An attempt consists of selecting a solution in the Pareto set and trying to improve it. This parameter imposes an upper bound on the total number of attempts, whether or not they are successful. | 100 | AssistedSpecification | int |
| maximum_number_parameters | Maximum number of parameters allowed in a model. Any specification with more parameters is deemed invalid and is not estimated. | 50 | AssistedSpecification | int |
| number_of_neighbors | Maximum number of neighbors visited by the VNS algorithm. | 20 | AssistedSpecification | int |
| bayesian_draws | Number of draws per chain from the posterior distribution. | 2000 | Bayesian | int |
| calculate_likelihood | If True, likelihood-based statistics are calculated from the posterior draws. | True | Bayesian | bool |
| calculate_loo | If True, the Leave-One-Out Cross-Validation (LOO) is calculated. | True | Bayesian | bool |
| calculate_waic | If True, the Widely Applicable Information Criterion (WAIC) is calculated. | True | Bayesian | bool |
| chains | Number of independent Markov chains to run in parallel. | 4 | Bayesian | int |
| mcmc_sampling_strategy | Defines how MCMC sampling is performed: ‘automatic’ (selected based on hardware), ‘numpyro-parallel’ (one chain per device), ‘numpyro-vectorized’ (all chains on one device), ‘pymc’ (default PyMC sampler on CPU). | automatic | Bayesian | str |
| sample_from_prior | If True, samples from the prior distributions are generated. This may help diagnose identification issues. | True | Bayesian | bool |
| target_accept | Target acceptance probability for the No-U-Turn Sampler (NUTS) algorithm. Higher values like 0.9 or 0.95 often work better for problematic posteriors. | 0.9 | Bayesian | float |
| warmup | Number of warm-up / burn-in iterations per chain that are used only to adapt the sampler, not to estimate the posterior. | 2000 | Bayesian | int |
| version | Version of Biogeme that created the TOML file. Do not modify this value. | 3.3.2 | Biogeme | str |
| number_of_jobs | Maximum number of concurrently running jobs. If -1 is given, joblib tries to use all CPUs. | 2 | Bootstrap | int |
| bootstrap_samples | Number of re-estimations for bootstrap sampling. | 100 | Estimation | int |
| calculating_second_derivatives | Defines how the second derivatives are calculated: analytical, finite_differences, or never. | analytical | Estimation | str |
| large_data_set | If the number of observations is larger than this value, the data set is deemed large, and the default estimation algorithm will not use second derivatives. | 100000 | Estimation | int |
| max_number_parameters_to_report | Maximum number of parameters to report during the estimation. | 15 | Estimation | int |
| maximum_number_catalog_expressions | If the expression contains catalogs, this parameter sets an upper bound on the total number of possible combinations that can be estimated in the same loop. | 100 | Estimation | int |
| optimization_algorithm | Optimization algorithm to be used for estimation. Valid values: [‘automatic’, ‘scipy’, ‘LS-newton’, ‘TR-newton’, ‘LS-BFGS’, ‘TR-BFGS’, ‘simple_bounds’, ‘simple_bounds_newton’, ‘simple_bounds_BFGS’]. | automatic | Estimation | str |
| save_iterations | If True, the current iterate is saved after each iteration in a file named __[modelName].iter, where [modelName] is the name given to the model. If such a file exists, the starting values for the estimation are replaced by the values saved in the file. | True | Estimation | bool |
| number_of_draws | Number of draws for Monte-Carlo integration. | 10000 | MonteCarlo | int |
| seed | Seed used for the pseudo-random number generation. It is useful only when each run should generate the exact same result. If 0, a new seed is used at each run. | 0 | MonteCarlo | int |
| number_of_threads | Number of threads/processors to be used. If the parameter is 0, the number of available threads is calculated using cpu_count(). | 0 | MultiThreading | int |
| generate_html | If True, the HTML file with the results is generated. | True | Output | bool |
| generate_netcdf | If True, the netcdf file with the Bayesian estimation results is generated. | True | Output | bool |
| generate_yaml | If True, the yaml file with the results is generated. | True | Output | bool |
| identification_threshold | If the smallest eigenvalue of the second derivative matrix is less than or equal to this parameter, the model is considered not identified. The corresponding eigenvector is then reported to identify the parameters involved in the issue. | 1e-05 | Output | float |
| only_robust_stats | If True, only the robust statistics are reported. If False, the statistics from the Rao-Cramer bound are also reported. | True | Output | bool |
| save_validation_results | If True, the validation results are saved in CSV files. | True | Output | bool |
| enlarging_factor | If an iteration is very successful, the radius of the trust region is multiplied by this factor. | 10 | SimpleBounds | float |
| infeasible_cg | If True, the conjugate gradient algorithm may generate infeasible solutions until termination. The result is then projected onto the feasible domain. If False, the algorithm stops as soon as an infeasible iterate is generated. | False | SimpleBounds | bool |
| initial_radius | Initial radius of the trust region. | 1 | SimpleBounds | float |
| max_iterations | Maximum number of iterations. | 1000 | SimpleBounds | int |
| second_derivatives | Proportion (between 0 and 1) of iterations when the analytical Hessian is calculated. | 1.0 | SimpleBounds | float |
| steptol | The algorithm stops when the relative change in x is below this threshold. If p significant digits of x are needed, steptol should be set to 1.0e-p. | 3.666852862501036e-11 | SimpleBounds | float |
| tolerance | The algorithm stops when this precision is reached. | 6.055454452393343e-06 | SimpleBounds | float |
| missing_data | If a variable has this value, the corresponding data item is assumed to be missing and an exception is triggered. | 99999 | Specification | int |
| numerically_safe | If True, Biogeme does its best to deal with numerical issues, such as division by a number close to zero, at the possible expense of speed. | False | Specification | bool |
| use_jit | If True, the model is compiled using jit (just-in-time) to speed up the calculation. For complex models, compilation time may exceed the gain due to compilation, in which case it is worth turning it off. | True | Specification | bool |
| dogleg | Choice of the method to solve the trust region subproblem. True: dogleg. False: truncated conjugate gradient. | True | TrustRegion | bool |
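The SimpleBounds defaults for tolerance and steptol listed above are not round numbers. They coincide numerically with the cube root of the double-precision machine epsilon and with its power 2/3, respectively. The short Python sketch below checks this observation and illustrates the 1.0e-p rule given for steptol. Whether Biogeme actually derives its defaults this way is an assumption, and the helper `steptol_for_digits` is a hypothetical name introduced only for illustration.

```python
import math
import sys

# Machine epsilon for double-precision floats (about 2.22e-16).
EPS = sys.float_info.epsilon

# Default values as listed in the table for the SimpleBounds section.
DEFAULT_TOLERANCE = 6.055454452393343e-06
DEFAULT_STEPTOL = 3.666852862501036e-11


def steptol_for_digits(p: int) -> float:
    """Hypothetical helper: steptol for p significant digits of x (1.0e-p)."""
    return 10.0 ** (-p)


# Compare the listed defaults with powers of the machine epsilon.
tolerance_matches = math.isclose(EPS ** (1 / 3), DEFAULT_TOLERANCE, rel_tol=1e-9)
steptol_matches = math.isclose(EPS ** (2 / 3), DEFAULT_STEPTOL, rel_tol=1e-9)

print("tolerance matches eps**(1/3):", tolerance_matches)
print("steptol matches eps**(2/3):", steptol_matches)
print("steptol for 6 significant digits:", steptol_for_digits(6))
```

Under the 1.0e-p rule, the default steptol corresponds to between 10 and 11 significant digits of x; a looser value such as the one returned by `steptol_for_digits(6)` stops the algorithm earlier.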

The structure of the biogeme.toml file is as follows.

# Default parameter file for Biogeme 3.3.2
# Automatically created on December 25, 2025. 21:51:51

[MonteCarlo]
number_of_draws = 10000 # int: Number of draws for Monte-Carlo integration.
seed = 0 # int: Seed used for the pseudo-random number generation. It is useful
         # only when each run should generate the exact same result. If 0, a new
         # seed is used at each run.

[TrustRegion]
dogleg = "True" # bool: choice of the method to solve the trust region subproblem.
                # True: dogleg. False: truncated conjugate gradient.

[Specification]
missing_data = 99999 # number: If one variable has this value, it is assumed that
                     # a data is missing and an exception will be triggered.
numerically_safe = "False" # If true, Biogeme is doing its best to deal with
                           # numerical issues, such as division by a number close
                           # to zero, at the possible expense of speed.
use_jit = "True" # If True, the model is compiled using jit (just-in-time) to speed
                 # up the calculation. For complex models, compilation time may
                 # exceed the gain due to compilation, so that it is worth
                 # turning it off.

[Estimation]
bootstrap_samples = 100 # int: number of re-estimations for bootstrap sampling.
calculating_second_derivatives = "analytical" # Defines how to calculate the second
                                              # derivatives:
                                              # analytical,finite_differences,never.
                                              #
large_data_set = 100000 # If the number of observations is larger than this
                        # value, the data set is deemed large, and the default
                        # estimation algorithm will not use second derivatives.
max_number_parameters_to_report = 15 # int: maximum number of parameters to
                                     # report during the estimation.
save_iterations = "True" # bool: If True, the current iterate is saved after each
                         # iteration, in a file named ``__[modelName].iter``,
                         # where ``[modelName]`` is the name given to the model.
                         # If such a file exists, the starting values for the
                         # estimation are replaced by the values saved in the
                         # file.
maximum_number_catalog_expressions = 100 # If the expression contains catalogs,
                                         # the parameter sets an upper bound of
                                         # the total number of possible
                                         # combinations that can be estimated in
                                         # the same loop.
optimization_algorithm = "automatic" # str: optimization algorithm to be used for
                                     # estimation. Valid values: ['automatic',
                                     # 'scipy', 'LS-newton', 'TR-newton',
                                     # 'LS-BFGS', 'TR-BFGS', 'simple_bounds',
                                     # 'simple_bounds_newton',
                                     # 'simple_bounds_BFGS']

[Bayesian]
mcmc_sampling_strategy = "automatic" # Defines how MCMC sampling is performed:
                                     # 'automatic' (selected based on hardware),
                                     # 'numpyro-parallel' (one chain per device),
                                     # 'numpyro-vectorized' (all chains on one
                                     # device), 'pymc' (default PyMC sampler on
                                     # CPU)
sample_from_prior = "True" # bool: if "True", samples from the prior distributions
                           # are generated. This may help in the diagnostic of
                           # indentification issues.
bayesian_draws = 2000 # Number of draws per chain from the posterior distribution
warmup = 2000 # Number of warm-up / burn-in iterations per chain that are used
              # only to adapt the sampler, not to estimate the posterior.
chains = 4 # Number of independent Markov chains to run in parallel.
target_accept = 0.9 # Target acceptance probability for the No-U-Turn Sampler
                    # (NUTS) algorithm. Higher values like 0.9 or 0.95 often work
                    # better for problematic posteriors.
calculate_waic = "True" # Calculates the Widely Applicable Information Criterion
                        # (WAIC)
calculate_loo = "True" # Calculates the Leave-One-Out Cross-Validation (LOO)
calculate_likelihood = "True" # Calculates likelihood-based statistics from the
                              # posterior draws

[Output]
identification_threshold = 1e-05 # float: if the smallest eigenvalue of the
                                 # second derivative matrix is lesser or equal to
                                 # this parameter, the model is considered not
                                 # identified. The corresponding eigenvector is
                                 # then reported to identify the parameters
                                 # involved in the issue.
only_robust_stats = "True" # bool: "True" if only the robust statistics need to be
                           # reported. If "False", the statistics from the
                           # Rao-Cramer bound are also reported.
generate_html = "True" # bool: "True" if the HTML file with the results must be
                       # generated.
generate_yaml = "True" # bool: "True" if the yaml file with the results must be
                       # generated.
generate_netcdf = "True" # bool: "True" if the netcdf file with the Bayesian
                         # estimation results must be generated.
save_validation_results = "True" # bool: "True" if the validation results are saved
                                 # in CSV files.

[MultiThreading]
number_of_threads = 0 # int: Number of threads/processors to be used. If the
                      # parameter is 0, the number of available threads is
                      # calculated using cpu_count().

[Bootstrap]
number_of_jobs = 2 # int: The maximum number of concurrently running jobs. If -1
                   # is given, joblib tries to use all CPUs.

[Biogeme]
version = "3.3.2" # Version of Biogeme that created the TOML file. Do not modify
                  # this value.

[AssistedSpecification]
maximum_number_parameters = 50 # int: maximum number of parameters allowed in a
                               # model. Each specification with a higher number
                               # is deemed invalid and not estimated.
number_of_neighbors = 20 # int: maximum number of neighbors that are visited by
                         # the VNS algorithm.
largest_neighborhood = 20 # int: size of the largest neighborhood considered by
                          # the Variable Neighborhood Search (VNS) algorithm.
maximum_attempts = 100 # int: an attempts consists in selecting a solution in the
                       # Pareto set, and trying to improve it. The parameter
                       # imposes an upper bound on the total number of attempts,
                       # irrespectively if they are successful or not.

[SimpleBounds]
second_derivatives = 1.0 # float: proportion (between 0 and 1) of iterations when
                         # the analytical Hessian is calculated
tolerance = 6.055454452393343e-06 # float: the algorithm stops when this
                                  # precision is reached
max_iterations = 1000 # int: maximum number of iterations
infeasible_cg = "False" # If True, the conjugate gradient algorithm may generate
                        # infeasible solutions until termination.  The result
                        # will then be projected on the feasible domain.  If
                        # False, the algorithm stops as soon as an infeasible
                        # iterate is generated
initial_radius = 1 # Initial radius of the trust region
steptol = 3.666852862501036e-11 # The algorithm stops when the relative change in
                                # x is below this threshold. Basically, if p
                                # significant digits of x are needed, steptol
                                # should be set to 1.0e-p.
enlarging_factor = 10 # If an iteration is very successful, the radius of the
                      # trust region is multiplied by this factor
