biogeme.results module

Implementation of class containing and processing the estimation results.

author:: Michel Bierlaire
date:: Tue Mar 26 16:50:01 2019

class biogeme.results.Beta(name, value, bounds)[source]

Bases: object

Class gathering the information related to the parameters of the model

Parameters:

name (str)
value (float)
bounds (tuple[float, float])

bootstrap_pValue: float | None: p-value calculated from bootstrap

bootstrap_tTest: float | None: t-test calculated from bootstrap

is_bound_active(threshold=1e-06)[source]

Check if one of the two bounds is ‘numerically’ active. Being numerically active means that the distance between the value of the parameter and one of its bounds is below the threshold.

Parameters:: threshold (float) – distance below which the bound is considered to be active. Default: \(10^{-6}\)
Returns:: True if one of the two bounds is numerically active.
Return type:: bool
Raises:: BiogemeError – if threshold is negative.
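
The check described above can be sketched in a few lines. This is an illustration only, not the library's code: the function is a hypothetical standalone helper, ValueError stands in for BiogemeError, and treating a bound of None as absent is an assumption.

```python
def is_bound_active(value, lower, upper, threshold=1e-6):
    """Return True if value lies within `threshold` of either bound.

    Hypothetical standalone helper. A bound given as None is treated as
    absent, which is an assumption of this sketch.
    """
    if threshold < 0:
        # The documented method raises BiogemeError; ValueError is a stand-in.
        raise ValueError(f"threshold ({threshold}) must be non-negative")
    if lower is not None and abs(value - lower) <= threshold:
        return True
    if upper is not None and abs(value - upper) <= threshold:
        return True
    return False


# A value sitting almost exactly on its lower bound is numerically active.
print(is_bound_active(1e-8, 0.0, 10.0))
# A value well inside the interval is not.
print(is_bound_active(5.0, 0.0, 10.0))
```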

lb: float: Lower bound

name: str: Name of the parameter

pValue: float | None: p-value

robust_pValue: float | None: Robust p-value

robust_stdErr: float | None: Robust standard error

robust_tTest: float | None: Robust t-test

set_bootstrap_std_err(std_err)[source]

Records the robust standard error calculated by bootstrap, and calculates and records the corresponding t-statistic and p-value

Parameters:: std_err (float) – standard error calculated by bootstrap.

set_robust_std_err(std_err)[source]

Records the robust standard error, and calculates and records the corresponding t-statistic and p-value

Parameters:: std_err (float) – robust standard error

set_std_err(std_err)[source]

Records the standard error, and calculates and records the corresponding t-statistic and p-value

Parameters:: std_err (float) – standard error.
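
A minimal sketch of what recording a standard error involves, assuming the t-statistic is the ratio of the estimate to its standard error (a test against zero) and the p-value is two-sided under a standard normal distribution. BetaSketch and its attribute names are hypothetical stand-ins, not the Beta class itself.

```python
from dataclasses import dataclass
from statistics import NormalDist
from typing import Optional


@dataclass
class BetaSketch:
    """Hypothetical stand-in for a parameter record (not biogeme.results.Beta)."""

    name: str
    value: float
    std_err: Optional[float] = None
    t_test: Optional[float] = None
    p_value: Optional[float] = None

    def set_std_err(self, std_err: float) -> None:
        # Record the standard error, then derive the two dependent statistics.
        self.std_err = std_err
        # Assumption: the t-statistic tests the null hypothesis value == 0.
        self.t_test = self.value / std_err
        # Two-sided p-value under a standard normal distribution.
        self.p_value = 2.0 * (1.0 - NormalDist().cdf(abs(self.t_test)))


beta = BetaSketch(name="beta_time", value=-1.2)  # illustrative values
beta.set_std_err(0.4)
print(beta.t_test, beta.p_value)
```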

stdErr: float | None: Standard error

tTest: float | None: t-test

ub: float: Upper bound

value: float: Current value

class biogeme.results.GeneralStatistic(value, format)[source]

Bases: NamedTuple

Parameters:

value (Any)
format (str)

format: str: Alias for field number 1

value: Any: Alias for field number 0

class biogeme.results.RawResults(the_model, beta_values, f_g_h_b, bootstrap=None)[source]

Bases: object

Class containing the raw results from the estimation

Parameters:

the_model (BIOGEME)
beta_values (list[float])
f_g_h_b (BiogemeFunctionOutput)
bootstrap (np.ndarray | None)

F12FileName: str | None: Name of the F12 output file

H: ndarray: Value of the hessian of the loglik. function

betaNames: tuple[str]: Names of the parameters

betaValues: list[float]: Values of the parameters

betas: list[Beta]: List of objects of type results.Beta

bhhh: ndarray: Value of the BHHH matrix of the loglikelihood function

bootstrap: ndarray

Output of the bootstrapping: a numpy array of size B x K, where

B is the number of bootstrap iterations
K is the number of parameters to estimate

bootstrap_time: Time needed to perform the bootstrap

convergence: bool: Success of the optimization algorithm

dataname: str: Name of the database

drawsProcessingTime: timedelta: Time needed to process the draws

excludedData: int: Number of excluded data

g: ndarray: Value of the gradient of the loglik. function

gradientNorm: float: Norm of the gradient

htmlFileName: str | None: Name of the HTML output file

initLogLike: float: Value of the log likelihood function with the initial value of the parameters

latexFileName: str | None: Name of the LaTeX output file

logLike: float: Value of the loglikelihood function

modelName: str: Name of the model

monte_carlo: bool: True if the model involved Monte Carlo integration

nparam: int: Number of parameters

nullLogLike: float: Value of the log likelihood function with equal probability model

numberOfDraws: int: Number of draws for Monte Carlo integration

numberOfObservations: int: Number of observations

numberOfThreads: int: Number of threads used for parallel computing

optimizationMessages: OptimizationResults: Diagnostics given by the optimization algorithm

pickleFileName: str | None: Name of the pickle output file

sampleSize: int: Sample size (number of individuals if panel data)

typesOfDraws: dict[str, RandomNumberGeneratorTuple]: Types of draws for Monte Carlo integration

userNotes: str: User notes

class biogeme.results.bioResults(the_raw_results=None, pickle_file=None, identification_threshold=None)[source]

Bases: object

Class managing the estimation results

Parameters:

the_raw_results (RawResults | None)
pickle_file (str | None)
identification_threshold (float | None)

algorithm_has_converged()[source]

Reports if the algorithm has indeed converged

Returns:: True if the algorithm has converged.
Return type:: bool

data: Object of type biogeme.results.RawResults containing the raw estimation results.

getBetaValues(my_betas=None)[source]

Warning

This function is deprecated. Use get_beta_values() instead.

Return type:: dict[str, float]
Parameters:: my_betas (list[str] | None)

getBetasForSensitivityAnalysis(my_betas, size=100, use_bootstrap=True)[source]

Warning

This function is deprecated. Use get_betas_for_sensitivity_analysis() instead.

Return type:

list[dict[str, float]]

Parameters:

my_betas (list[str])
size (int)
use_bootstrap (bool)

getBootstrapVarCovar()[source]

Warning

This function is deprecated. Use get_bootstrap_var_covar() instead.

Return type:: DataFrame

getCorrelationResults(subset=None)[source]

Warning

This function is deprecated. Use get_correlation_results() instead.

Return type:: DataFrame
Parameters:: subset (list[str] | None)

getEstimatedParameters(only_robust=True)[source]

Warning

This function is deprecated. Use get_estimated_parameters() instead.

Return type:: DataFrame
Parameters:: only_robust (bool)

getF12(robust_std_err=True)[source]

Warning

This function is deprecated. Use get_f12() instead.

Return type:: str
Parameters:: robust_std_err (bool)

getGeneralStatistics()[source]

Warning

This function is deprecated. Use get_general_statistics() instead.

Return type:: dict[str, GeneralStatistic]

getHtml(only_robust=True)[source]

Warning

This function is deprecated. Use get_html() instead.

Return type:: str
Parameters:: only_robust (bool)

getLaTeX(onlyRobust=True)[source]

Warning

This function is deprecated. Use get_latex() instead.

getRobustVarCovar()[source]

Warning

This function is deprecated. Use get_robust_var_covar() instead.

Return type:: DataFrame

getVarCovar()[source]

Warning

This function is deprecated. Use get_var_covar() instead.

Return type:: DataFrame

get_beta_values(my_betas=None)[source]

Retrieve the values of the estimated parameters, by names.

Parameters:: my_betas (list(string)) – names of the requested parameters. If None, all available parameters will be reported. Default: None.
Returns:: dict containing the values, where the keys are the names.
Return type:: dict(string:float)
Raises:: BiogemeError – if some requested parameters are not available.

get_betas_for_sensitivity_analysis(my_betas, size=100, use_bootstrap=True)[source]

Generate draws from the distribution of the estimates, for sensitivity analysis.

Parameters:

my_betas (list(string)) – names of the parameters for which draws are requested.
size (int) – number of draws. If use_bootstrap is True, the value is ignored and a warning is issued. Default: 100.
use_bootstrap (bool) – if True, the bootstrap estimates are used directly. The advantage is that this does not rely on the assumption that the estimates follow a normal distribution. Default: True.

Raises:

BiogemeError – if use_bootstrap is True and the bootstrap results are not available

Returns:

list of dict. Each dict has as many entries as parameters. The list has as many entries as draws.

Return type:

list(dict)
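
The shape of the returned value in the bootstrap case can be sketched as follows, assuming each row of a B x K bootstrap array becomes one draw. The helper, the parameter names and the numbers are hypothetical, KeyError stands in for BiogemeError, and the normal-distribution case is not shown.

```python
def draws_from_bootstrap(all_names, bootstrap_rows, requested):
    """Turn B rows of K bootstrap estimates into a list of B dicts.

    Hypothetical helper: `all_names` gives the column order of the rows,
    and `requested` selects the parameters to report.
    """
    index = {name: k for k, name in enumerate(all_names)}
    missing = [name for name in requested if name not in index]
    if missing:
        # The documented method raises BiogemeError; KeyError is a stand-in.
        raise KeyError(f"Unknown parameters: {missing}")
    return [
        {name: row[index[name]] for name in requested}
        for row in bootstrap_rows
    ]


names = ["asc_car", "beta_time", "beta_cost"]  # illustrative names
bootstrap = [  # B = 3 bootstrap iterations, K = 3 parameters
    [0.10, -1.20, -0.80],
    [0.15, -1.10, -0.90],
    [0.05, -1.30, -0.70],
]
draws = draws_from_bootstrap(names, bootstrap, ["beta_time", "beta_cost"])
print(draws[0])
```

Because the number of draws equals the number of bootstrap iterations in this case, a separate size argument has nothing to control, which is consistent with it being ignored.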

get_bootstrap_var_covar()[source]

Obtain the bootstrap variance covariance matrix as a Pandas data frame.

Returns:: bootstrap variance covariance matrix, or None if not available
Return type:: pandas.DataFrame

get_correlation_results(subset=None)[source]

Get the statistics about pairs of coefficients as a Pandas dataframe

Parameters:: subset (list(str)) – produce the results only for a subset of parameters. If None, all the parameters are involved. Default: None
Returns:: Pandas data frame with the correlation results
Return type:: pandas.DataFrame

get_estimated_parameters(only_robust=True)[source]

Gather the estimated parameters and the corresponding statistics in a Pandas dataframe.

Parameters:: only_robust (bool) – if True, only the robust statistics are included
Returns:: Pandas dataframe with the results
Return type:: pandas.DataFrame

get_f12(robust_std_err=True)[source]

F12 is a format used by the software ALOGIT to report estimation results.

Parameters:: robust_std_err (bool) – if True, the robust standard errors are reported. If False, the Rao-Cramer standard errors are reported.
Returns:: results in F12 format
Return type:: string

get_general_statistics()[source]

Format the results in a dict

Returns:: dict with the results. The keys describe each content. Each element is a GeneralStatistic tuple, with the value and its preferred formatting.
Return type:: dict[str, GeneralStatistic]

Example of one entry:

'Init log likelihood': (-115.30029248549191, '.7g')

get_html(only_robust=True)[source]

Get the results coded in HTML

Parameters:: only_robust (bool) – if True, only the robust statistics are included
Returns:: HTML code
Return type:: string

get_latex(only_robust=True)[source]

Get the results coded in LaTeX

Parameters:: only_robust (bool) – if True, only the robust statistics are included
Returns:: LaTeX code
Return type:: string

get_robust_var_covar()[source]

Obtain the robust variance covariance matrix as a Pandas data frame.

Returns:: robust variance covariance matrix
Return type:: pandas.DataFrame

get_var_covar()[source]

Obtain the Rao-Cramer variance covariance matrix as a Pandas data frame.

Returns:: Rao-Cramer variance covariance matrix
Return type:: pandas.DataFrame

likelihood_ratio_test(other_model, significance_level=0.05)[source]

This function performs a likelihood ratio test between a restricted and an unrestricted model. The “self” model can be either the restricted or the unrestricted.

Parameters:

other_model (bioResults) – other model to perform the test.
significance_level (float) – level of significance of the test. Default: 0.05

Returns:

a tuple containing:

a message with the outcome of the test
the statistic, that is minus two times the difference between the loglikelihood of the two models
the threshold of the chi square distribution.

Return type:

LRTuple(str, float, float)
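
A minimal sketch of the test, using only the standard library. It reports a p-value rather than the chi-square threshold returned by the method. Comparing the p-value with the significance level is equivalent to comparing the statistic with that threshold. The function names are hypothetical, and taking the degrees of freedom as the difference in the number of parameters is the usual convention, assumed here.

```python
import math


def chi2_sf(x, df):
    """P(X > x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed forms for df = 1 and df = 2 seed the recurrence.
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(half))
    else:
        a, q = 1.0, math.exp(-half)
    # Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2.0:
        q += half**a * math.exp(-half) / math.gamma(a + 1.0)
        a += 1.0
    return q


def lr_test_sketch(ll_restricted, k_restricted, ll_unrestricted,
                   k_unrestricted, significance_level=0.05):
    """Return (statistic, p_value, reject_restricted)."""
    # Minus two times the difference between the two log likelihoods.
    statistic = -2.0 * (ll_restricted - ll_unrestricted)
    df = k_unrestricted - k_restricted
    if df <= 0:
        raise ValueError("the unrestricted model must have more parameters")
    p_value = chi2_sf(statistic, df)
    return statistic, p_value, p_value < significance_level


# Illustrative numbers: two extra parameters improve the fit by 10 units.
stat, p, reject = lr_test_sketch(-110.0, 2, -100.0, 4)
print(stat, p, reject)
```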

numberOfFreeParameters()[source]

Warning

This function is deprecated. Use number_of_free_parameters() instead.

Return type:: int

number_of_free_parameters()[source]

This is the number of estimated parameters, minus those that are at their bounds

Return type:: int

printGeneralStatistics()[source]

Warning

This function is deprecated. Use print_general_statistics() instead.

Return type:: str

print_general_statistics()[source]

Print the general statistics of the estimation.

Returns:

general statistics

Example:

Number of estimated parameters: 2
Sample size:    5
Excluded observations:  0
Init log likelihood:    -67.08858
Final log likelihood:   -67.06549
Likelihood ratio test for the init. model:      0.04618175
Rho-square for the init. model: 0.000344
Rho-square-bar for the init. model:     -0.0295
Akaike Information Criterion:   138.131
Bayesian Information Criterion: 137.3499
Final gradient norm:    3.9005E-07
Bootstrapping time:     0:00:00.042713
Nbr of threads: 16

Return type:

str
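
The figures in the example above are consistent with the standard definitions of these statistics. The sketch below recomputes them from the rounded log likelihoods shown, so the last digits can differ slightly. That the library uses exactly these formulas is an assumption.

```python
import math

# Inputs copied from the example output above.
init_ll = -67.08858   # Init log likelihood
final_ll = -67.06549  # Final log likelihood
k = 2                 # Number of estimated parameters
n = 5                 # Sample size

# Likelihood ratio test against the initial model.
lr = -2.0 * (init_ll - final_ll)
# Rho-square and its version adjusted for the number of parameters.
rho2 = 1.0 - final_ll / init_ll
rho2_bar = 1.0 - (final_ll - k) / init_ll
# Information criteria.
aic = 2.0 * k - 2.0 * final_ll
bic = k * math.log(n) - 2.0 * final_ll

print(f"LR: {lr:.4g}, rho2: {rho2:.3g}, rho2_bar: {rho2_bar:.3g}")
print(f"AIC: {aic:.7g}, BIC: {bic:.7g}")
```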

shortSummary()[source]

Warning

This function is deprecated. Use short_summary() instead.

Provides a short summary of the estimation results. Old syntax

Return type:: str

short_summary()[source]

Provides a short summary of the estimation results

Return type:: str

variance_covariance_missing()[source]

Check if the variance covariance matrix is missing

Returns:: True if missing.
Return type:: bool

writeF12(robust_std_err=True)[source]

Warning

This function is deprecated. Use write_f12() instead.

Return type:: None
Parameters:: robust_std_err (bool)

writeHtml(only_robust=True)[source]

Warning

This function is deprecated. Use write_html() instead.

Return type:: None
Parameters:: only_robust (bool)

writeLaTeX()[source]

Warning

This function is deprecated. Use write_latex() instead.

Return type:: None

writePickle()[source]

Warning

This function is deprecated. Use write_pickle() instead.

Return type:: str

write_f12(robust_std_err=True)[source]

Write the results in an F12 file.

Return type:: None
Parameters:: robust_std_err (bool)

write_html(only_robust=True)[source]

Write the results in an HTML file.

Return type:: None
Parameters:: only_robust (bool)

write_latex()[source]

Write the results in a LaTeX file.

Return type:: None

write_pickle()[source]

Dump the data in a file in pickle format.

Returns:: name of the file.
Return type:: string

biogeme.results.calcPValue(t)[source]

Warning

This function is deprecated. Use calc_p_value() instead.

Return type:: float
Parameters:: t (float)

biogeme.results.calc_p_value(t)[source]

Calculates the p value of a parameter from its t-statistic.

The formula is

\[2(1-\Phi(|t|))\]

where \(\Phi(\cdot)\) is the CDF of the standard normal distribution.

Parameters:: t (float) – t-statistic
Returns:: p-value
Return type:: float
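
The formula can be evaluated with the standard library alone. The sketch below takes \(\Phi\) to be the standard normal CDF (an assumption) and uses the identity \(2(1-\Phi(|t|)) = \operatorname{erfc}(|t|/\sqrt{2})\). The function name is hypothetical.

```python
import math


def p_value_from_t(t):
    """Two-sided p-value of a t-statistic under a standard normal.

    Hypothetical helper: 2 * (1 - Phi(|t|)) is computed as
    erfc(|t| / sqrt(2)), which is the same quantity.
    """
    return math.erfc(abs(t) / math.sqrt(2.0))


# The sign of t does not matter, and |t| near 1.96 gives roughly 0.05.
for t in (0.0, 1.96, -1.96, 3.0):
    print(t, p_value_from_t(t))
```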

biogeme.results.compileEstimationResults(dict_of_results, statistics=('Number of estimated parameters', 'Sample size', 'Final log likelihood', 'Akaike Information Criterion', 'Bayesian Information Criterion'), include_parameter_estimates=True, include_robust_stderr=False, include_robust_ttest=True, formatted=True, use_short_names=False)[source]

Warning

This function is deprecated. Use compile_estimation_results() instead.

Parameters:

dict_of_results (dict[str, bioResults])
statistics (tuple[str, ...])
include_parameter_estimates (bool)
include_robust_stderr (bool)
include_robust_ttest (bool)
formatted (bool)
use_short_names (bool)

biogeme.results.compile_estimation_results(dict_of_results, statistics=('Number of estimated parameters', 'Sample size', 'Final log likelihood', 'Akaike Information Criterion', 'Bayesian Information Criterion'), include_parameter_estimates=True, include_robust_stderr=False, include_robust_ttest=True, formatted=True, use_short_names=False)[source]

Compile estimation results into a common table

Parameters:

dict_of_results (dict(str: bioResults)) – dict of results, containing for each model the name, the ID and the results, or the name of the pickle file containing them.
statistics (tuple(str)) – list of statistics to include in the summary table
include_parameter_estimates (bool) – if True, the parameter estimates are included.
include_robust_stderr (bool) – if True, the robust standard errors of the parameters are included.
include_robust_ttest (bool) – if True, the robust t-tests of the parameters are included.
formatted (bool) – if True, a formatted string is included in the table results. If False, the numerical values are stored. Use “True” if you need to print the results. Use “False” if you need to use them for further calculation.
use_short_names (bool) – if True, short names, such as Model_1, Model_2, are used to identify the models. This is nicer for reporting.

Returns:

pandas dataframe with the requested results, and the specification of each model

Return type:

tuple(pandas.DataFrame, dict(str:dict(str:str)))

biogeme.results.compile_results_in_directory(statistics=('Number of estimated parameters', 'Sample size', 'Final log likelihood', 'Akaike Information Criterion', 'Bayesian Information Criterion'), include_parameter_estimates=True, include_robust_stderr=False, include_robust_ttest=True, formatted=True)[source]

Compile estimation results found in the local directory into a common table. The results are expected to be in files with the pickle extension.

Parameters:

statistics (tuple(str)) – list of statistics to include in the summary table
include_parameter_estimates (bool) – if True, the parameter estimates are included.
include_robust_stderr (bool) – if True, the robust standard errors of the parameters are included.
include_robust_ttest (bool) – if True, the robust t-tests of the parameters are included.
formatted (bool) – if True, a formatted string is included in the table results. If False, the numerical values are stored. Use “True” if you need to print the results. Use “False” if you need to use them for further calculation.

Returns:

pandas dataframe with the requested results, or None if no file was found.

Return type:

pandas.DataFrame

biogeme.results.pareto_optimal(dict_of_results, a_pareto=None)[source]

Identifies the non-dominated models, with respect to maximum log likelihood and minimum number of parameters

Parameters:

dict_of_results (dict(str:bioResults)) – dict of results associated with their config ID
a_pareto (biogeme.pareto.Pareto) – if not None, Pareto set where the results will be inserted.

Returns:

a dict of named results with pareto optimal results

Return type:

dict(str: biogeme.results.bioResults)
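
The notion of non-dominated models can be illustrated on plain tuples. In this sketch each model is reduced to a hypothetical (log likelihood, number of parameters) pair. It does not use bioResults objects or biogeme.pareto.Pareto, and the model names and numbers are made up.

```python
def non_dominated(models):
    """Keep the models that no other model dominates.

    `models` maps a name to (log_likelihood, n_parameters). Model a
    dominates model b if it is at least as good on both criteria
    (higher log likelihood, fewer parameters) and strictly better on one.
    """

    def dominates(a, b):
        ll_a, k_a = a
        ll_b, k_b = b
        return ll_a >= ll_b and k_a <= k_b and (ll_a > ll_b or k_a < k_b)

    return {
        name: model
        for name, model in models.items()
        if not any(
            dominates(other, model)
            for other_name, other in models.items()
            if other_name != name
        )
    }


candidates = {
    "A": (-100.0, 3),
    "B": (-95.0, 5),
    "C": (-101.0, 4),  # worse fit and more parameters than A
    "D": (-95.0, 6),   # same fit as B with one more parameter
}
print(sorted(non_dominated(candidates)))
```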