1a. Estimation of a multinomial logit model
This example estimates a multinomial logit model using the Swissmetro stated-preference dataset.
Three transportation alternatives are considered:
Train,
Swissmetro,
Car.
The utility functions include alternative-specific constants and generic coefficients associated with travel time and travel cost. The Swissmetro alternative is used as the reference alternative, and its alternative-specific constant is therefore fixed to zero for identification purposes.
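The normalization is needed because only differences in utility affect logit probabilities. As a minimal sketch in plain Python (the helper `logit_probabilities` and the numerical utilities are illustrative and not part of Biogeme or of the Swissmetro data), shifting all three utilities by the same constant leaves the choice probabilities unchanged, so one constant can be fixed without loss of generality:

```python
import math


def logit_probabilities(utilities):
    """Return logit choice probabilities for a dict {alternative: utility}."""
    # Subtract the largest utility before exponentiating, for numerical stability.
    v_max = max(utilities.values())
    exp_v = {alt: math.exp(v - v_max) for alt, v in utilities.items()}
    denominator = sum(exp_v.values())
    return {alt: e / denominator for alt, e in exp_v.items()}


# Hypothetical utilities for one observation (illustrative values only).
base = {'train': -1.2, 'sm': -0.4, 'car': -0.9}

# Add the same constant to every alternative, as if all three
# alternative-specific constants were shifted together.
shifted = {alt: v + 5.0 for alt, v in base.items()}

p_base = logit_probabilities(base)
p_shifted = logit_probabilities(shifted)

for alt in base:
    print(alt, round(p_base[alt], 6), round(p_shifted[alt], 6))
```

Both columns printed by the loop agree, which is why the data cannot distinguish between sets of constants that differ by a common shift.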
The script illustrates the complete workflow:
Import and prepare the data.
Define the model parameters.
Specify the utility functions and availability conditions.
Formulate the log-likelihood function.
Estimate the model using Biogeme.
Display a summary of the estimation results.
Export the estimated parameters as a pandas table.
The # %% markers separate the script into notebook cells when the example gallery is converted into Jupyter notebooks.
Tested with Biogeme 3.3.3.
Michel Bierlaire, EPFL Tue Jun 09 2026, 14:30:00
from IPython.core.display_functions import display
Import the variables and the database prepared in the Swissmetro data processing example: Data preparation for Swissmetro.
from swissmetro_data import (
CAR_AV_SP,
CAR_CO_SCALED,
CAR_TT_SCALED,
CHOICE,
SM_AV,
SM_COST_SCALED,
SM_TT_SCALED,
TRAIN_AV_SP,
TRAIN_COST_SCALED,
TRAIN_TT_SCALED,
database,
)
import biogeme.biogeme_logging as blog
from biogeme.biogeme import BIOGEME
from biogeme.expressions import Beta
from biogeme.models import loglogit
from biogeme.results_processing import get_pandas_estimated_parameters
The logger sets the verbosity of Biogeme. By default, Biogeme is quite silent and generates only warnings. To get more information about what is happening behind the scenes, set the level to blog.INFO.
logger = blog.get_screen_logger(level=blog.INFO)
logger.info('Example b01logit_bis.py')
Example b01logit_bis.py
Parameters to be estimated: alternative-specific constants
asc_car = Beta('asc_car', 0, None, None, 0)
asc_train = Beta('asc_train', 0, None, None, 0)
The constant associated with Swissmetro is normalized to zero. It does not need to be defined at all. Here, we illustrate the fact that setting the last argument of the Beta function to 1 fixes the parameter to its default value (here, 0).
asc_sm = Beta('asc_sm', 0, None, None, 1)
Coefficients of the attributes
b_time = Beta('b_time', 0, None, None, 0)
b_cost = Beta('b_cost', 0, None, None, 0)
Definition of the utility functions.
v_train = asc_train + b_time * TRAIN_TT_SCALED + b_cost * TRAIN_COST_SCALED
v_sm = asc_sm + b_time * SM_TT_SCALED + b_cost * SM_COST_SCALED
v_car = asc_car + b_time * CAR_TT_SCALED + b_cost * CAR_CO_SCALED
Associate utility functions with the numbering of alternatives.
v = {1: v_train, 2: v_sm, 3: v_car}
Associate the availability conditions with the alternatives.
av = {1: TRAIN_AV_SP, 2: SM_AV, 3: CAR_AV_SP}
Definition of the model. This is the contribution of each observation to the log likelihood function.
log_probability = loglogit(v, av, CHOICE)
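To make the quantity concrete, here is a minimal sketch in plain Python of the log of a logit probability restricted to the available alternatives. The function `log_logit` and the numbers are hypothetical; this illustrates the standard formula and is not Biogeme's implementation. Raising an error when the chosen alternative is unavailable is a choice made for this sketch only.

```python
import math


def log_logit(utilities, availability, chosen):
    """Log of the logit probability of `chosen` among available alternatives."""
    # Only available alternatives enter the denominator.
    available = [alt for alt, a in availability.items() if a]
    if chosen not in available:
        raise ValueError('The chosen alternative must be available.')
    # Log-sum-exp with the maximum subtracted, for numerical stability.
    v_max = max(utilities[alt] for alt in available)
    log_denominator = v_max + math.log(
        sum(math.exp(utilities[alt] - v_max) for alt in available)
    )
    return utilities[chosen] - log_denominator


# Hypothetical utilities for one observation; the keys mirror the numbering
# 1: train, 2: Swissmetro, 3: car used in the script.
utilities = {1: -1.0, 2: -0.5, 3: -0.8}

# Car is unavailable for this made-up respondent.
availability = {1: 1, 2: 1, 3: 0}

log_p = log_logit(utilities, availability, chosen=2)
print(log_p)
```

Summing such terms over all observations gives the log likelihood that is maximized during estimation.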
Create the Biogeme object.
the_biogeme = BIOGEME(database, log_probability)
the_biogeme.model_name = 'b01a_logit'
Default values of the Biogeme parameters are used.
File biogeme.toml has been created
Calculate the null log likelihood for reporting.
the_biogeme.calculate_null_loglikelihood(av)
-6964.662979192191
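The null model assigns equal probability to every alternative available to a respondent, so each observation contributes minus the log of its number of available alternatives. The sketch below uses four made-up availability patterns, not the Swissmetro data, so it does not reproduce the value above:

```python
import math

# Hypothetical availability indicators for four observations
# (1: train, 2: Swissmetro, 3: car); not taken from the Swissmetro data.
availabilities = [
    {1: 1, 2: 1, 3: 1},
    {1: 1, 2: 1, 3: 0},
    {1: 1, 2: 1, 3: 1},
    {1: 0, 2: 1, 3: 1},
]

null_loglikelihood = 0.0
for row in availabilities:
    n_available = sum(row.values())
    # Equal probability 1 / n_available for whichever alternative was chosen.
    null_loglikelihood += -math.log(n_available)

print(null_loglikelihood)
```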
Estimate the model parameters by maximum likelihood.
results = the_biogeme.estimate()
*** Initial values of the parameters are obtained from the file __b01a_logit.iter
Cannot read file __b01a_logit.iter. Statement is ignored.
Starting values for the algorithm: {}
As the model is not too complex, we activate the calculation of second derivatives. To change this behavior, modify the algorithm to "simple_bounds" in the TOML file.
Optimization algorithm: hybrid Newton/BFGS with simple bounds [simple_bounds]
** Optimization: Newton with trust region for simple bounds
Iter. asc_train b_time b_cost asc_car Function Relgrad Radius Rho
0 -0.92 -0.67 -0.88 -0.49 5.4e+03 0.041 10 1.1 ++
1 -0.73 -1.2 -1 -0.18 5.3e+03 0.0072 1e+02 1.1 ++
2 -0.7 -1.3 -1.1 -0.16 5.3e+03 0.00018 1e+03 1 ++
3 -0.7 -1.3 -1.1 -0.16 5.3e+03 1.1e-07 1e+03 1 ++
Optimization algorithm has converged.
Relative gradient: 1.0595595037775285e-07
Cause of termination: Relative gradient = 1.1e-07 <= 6.1e-06
Number of function evaluations: 13
Number of gradient evaluations: 9
Number of hessian evaluations: 4
Algorithm: Newton with trust region for simple bound constraints
Number of iterations: 4
Proportion of Hessian calculation: 4/4 = 100.0%
Optimization time: 0:00:00.640926
Calculate second derivatives and BHHH
File b01a_logit.html has been generated.
File b01a_logit.yaml has been generated.
Display a short textual summary of the estimation results.
print(results.short_summary())
Results for model b01a_logit
Nbr of parameters: 4
Sample size: 6768
Excluded data: 3960
Null log likelihood: -6964.663
Final log likelihood: -5331.252
Likelihood ratio test (null): 3266.822
Rho square (null): 0.235
Rho bar square (null): 0.234
Akaike Information Criterion: 10670.5
Bayesian Information Criterion: 10697.78
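The goodness-of-fit statistics in this summary can be recomputed from the two log likelihoods, the number of parameters, and the sample size. The sketch below applies the standard textbook definitions, assumed here to match what Biogeme reports; with the rounded inputs from the summary, it reproduces the reported values up to rounding.

```python
import math

# Values copied from the summary above.
null_ll = -6964.663
final_ll = -5331.252
n_parameters = 4
sample_size = 6768

# Likelihood ratio test of the estimated model against the null model.
likelihood_ratio = -2.0 * (null_ll - final_ll)

# Rho square and its version adjusted for the number of parameters.
rho_square = 1.0 - final_ll / null_ll
rho_bar_square = 1.0 - (final_ll - n_parameters) / null_ll

# Information criteria (lower is better).
aic = 2.0 * n_parameters - 2.0 * final_ll
bic = n_parameters * math.log(sample_size) - 2.0 * final_ll

print(f'Likelihood ratio test (null): {likelihood_ratio:.3f}')
print(f'Rho square (null): {rho_square:.3f}')
print(f'Rho bar square (null): {rho_bar_square:.3f}')
print(f'Akaike Information Criterion: {aic:.1f}')
print(f'Bayesian Information Criterion: {bic:.2f}')
```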
Convert the estimation results into a pandas DataFrame.
pandas_results = get_pandas_estimated_parameters(
estimation_results=results,
)
display(pandas_results)
{'Estimated parameters': Name Value Robust std err. Robust t-stat. Robust p-value
0 asc_train -0.701187 0.082562 -8.492857 0.000000
1 b_time -1.277859 0.104254 -12.257120 0.000000
2 b_cost -1.083790 0.068225 -15.885521 0.000000
3 asc_car -0.154633 0.058163 -2.658590 0.007847}
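The last two columns of the table follow from the first two. As a sketch, assuming the t-statistic is the estimate divided by its robust standard error and the p-value is two-sided under a standard normal approximation, the asc_car row can be recomputed as follows. Small differences in the last digits are expected because the inputs are rounded.

```python
import math

# Estimate and robust standard error of asc_car, copied from the table above.
value = -0.154633
robust_std_err = 0.058163

# t-statistic against the null hypothesis that the parameter is zero.
t_stat = value / robust_std_err

# Two-sided p-value under a standard normal approximation:
# 2 * (1 - Phi(|t|)) equals erfc(|t| / sqrt(2)).
p_value = math.erfc(abs(t_stat) / math.sqrt(2.0))

print(f't-stat: {t_stat:.4f}, p-value: {p_value:.4f}')
```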
Total running time of the script: (0 minutes 1.504 seconds)