biogeme.tools

Examples of use of several functions.

This page is designed for programmers who need examples of use of the functions of the module. The examples illustrate the syntax; they do not correspond to any meaningful model.

author:

Michel Bierlaire

date:

Sat Dec 2 13:09:42 2023

import biogeme.biogeme_logging as blog
import numpy as np
import pandas as pd
from IPython.core.display_functions import display
from biogeme.exceptions import BiogemeError
from biogeme.function_output import FunctionOutput
from biogeme.tools import (
    CheckDerivativesResults,
    calculate_prime_numbers,
    check_derivatives,
    count_number_of_groups,
    findiff_g,
    findiff_h,
    get_prime_numbers,
    likelihood_ratio,
)
from biogeme.version import get_text

Version of Biogeme.

print(get_text())
biogeme 3.3.1 [2025-09-03]
Home page: http://biogeme.epfl.ch
Submit questions to https://groups.google.com/d/forum/biogeme
Michel Bierlaire, Transport and Mobility Laboratory, Ecole Polytechnique Fédérale de Lausanne (EPFL)
logger = blog.get_screen_logger(level=blog.INFO)

Define a function and its derivatives:

\[f = \log(x_0) + \exp(x_1),\]
\[\begin{split}g = \left( \begin{array}{c}\frac{1}{x_0} \\ \exp(x_1)\end{array}\right),\end{split}\]
\[\begin{split}h=\left(\begin{array}{cc} -\frac{1}{x_0^2} & 0 \\ 0 & \exp(x_1)\end{array}\right).\end{split}\]
def my_function(
    x: np.ndarray, gradient: bool, hessian: bool, bhhh: bool
) -> FunctionOutput:
    """Implementation of the test function.

    :param x: point at which the function and its derivatives must be evaluated.
    """
    f = np.log(x[0]) + np.exp(x[1])
    g = np.empty(2)
    g[0] = 1.0 / x[0]
    g[1] = np.exp(x[1])
    h = np.empty((2, 2))
    h[0, 0] = -1.0 / x[0] ** 2
    h[0, 1] = 0.0
    h[1, 0] = 0.0
    h[1, 1] = np.exp(x[1])
    return FunctionOutput(
        function=float(f),
        gradient=g if gradient else None,
        hessian=h if hessian else None,
        bhhh=None,
    )

Evaluate the function at the point

\[\begin{split}x = \left( \begin{array}{c}1.1 \\ 1.1 \end{array}\right).\end{split}\]
x = np.array([1.1, 1.1])
the_output = my_function(x, gradient=True, hessian=True, bhhh=False)
display(the_output.function)
3.099476203750758

We use a DataFrame for a nicer display.

pd.DataFrame(the_output.gradient)
0
0 0.909091
1 3.004166


pd.DataFrame(the_output.hessian)
0 1
0 -0.826446 0.000000
1 0.000000 3.004166


Calculate an approximation of the gradient by finite differences.

g_fd = findiff_g(my_function, x)
pd.DataFrame(g_fd)
0
0 0.909091
1 3.004166


Check the precision of the approximation.

pd.DataFrame(the_output.gradient - g_fd)
0
0 0.000000e+00
1 6.486904e-10


Calculate an approximation of the Hessian by finite differences.

h_fd = findiff_h(my_function, x)
pd.DataFrame(h_fd)
0 1
0 -0.826446 0.000000
1 0.000000 3.004166


Check the precision of the approximation.

pd.DataFrame(the_output.hessian - h_fd)
0 1
0 -1.600951e-08 0.000000e+00
1 0.000000e+00 6.486904e-10
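The idea behind these approximations can be sketched with a simple forward difference (an illustration only, not Biogeme's implementation; the function name and the step size are assumptions):

```python
import numpy as np


def forward_difference_gradient(fct, x, step=1.0e-7):
    """Approximate the gradient of fct at x by forward differences.

    Illustrative sketch only; findiff_g in biogeme.tools is the
    function to use in practice.
    """
    f0 = fct(x)
    gradient = np.empty_like(x)
    for i in range(len(x)):
        x_shifted = x.copy()
        x_shifted[i] += step
        # Slope of the secant between x and x + step * e_i.
        gradient[i] = (fct(x_shifted) - f0) / step
    return gradient


x = np.array([1.1, 1.1])
approx = forward_difference_gradient(lambda z: np.log(z[0]) + np.exp(z[1]), x)
```

The approximation error is of the order of the step size, which is why the differences reported above are small but not exactly zero.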


There is a function that checks the analytical derivatives by comparing them with their finite difference approximations.

results: CheckDerivativesResults = check_derivatives(
    my_function, x, names=None, logg=True
)
Comparing first derivatives
x       Gradient    FinDiff    Difference
x[0]    0.909091   0.909091    0
x[1]    3.00417    3.00417     6.4869e-10
Comparing second derivatives
Row    Col      Hessian    FinDiff    Difference
x[0]   x[0]   -0.826446  -0.826446  -1.60095e-08
x[1]   x[0]    0          0          0
x[0]   x[1]    0          0          0
x[1]   x[1]    3.00417    3.00417    6.4869e-10

Difference between the analytical gradient and its finite difference approximation.

display(results.errors_gradient)
[0.00000000e+00 6.48690435e-10]

Difference between the analytical Hessian and its finite difference approximation.

display(results.errors_hessian)
[[-1.60095120e-08  0.00000000e+00]
 [ 0.00000000e+00  6.48690435e-10]]

To make the report easier to read, it is possible to give names to the variables.

named_results: CheckDerivativesResults = check_derivatives(
    my_function, x, names=['First', 'Second'], logg=True
)
Comparing first derivatives
x         Gradient    FinDiff    Difference
First     0.909091   0.909091    0
Second    3.00417    3.00417     6.4869e-10
Comparing second derivatives
Row     Col       Hessian    FinDiff    Difference
First   First   -0.826446  -0.826446  -1.60095e-08
Second  First    0          0          0
First   Second   0          0          0
Second  Second   3.00417    3.00417    6.4869e-10
pd.DataFrame(named_results.errors_gradient)
0
0 0.000000e+00
1 6.486904e-10


display(named_results.errors_hessian)
[[-1.60095120e-08  0.00000000e+00]
 [ 0.00000000e+00  6.48690435e-10]]

Prime numbers: calculate the prime numbers less than or equal to an upper bound.

my_primes = calculate_prime_numbers(10)
display(my_primes)
[2, 3, 5, 7]
my_primes = calculate_prime_numbers(100)
display(my_primes)
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]

Calculate a given number of prime numbers.

my_primes = get_prime_numbers(7)
display(my_primes)
[2, 3, 5, 7, 11, 13, 17]
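The same output can be reproduced with a classical sieve of Eratosthenes (an illustrative sketch, not Biogeme's implementation; the function name is an assumption):

```python
def primes_up_to(upper_bound: int) -> list[int]:
    """Sieve of Eratosthenes: illustrative sketch producing the same
    result as calculate_prime_numbers (not Biogeme's code)."""
    if upper_bound < 2:
        return []
    # is_prime[n] is True until n is found to be a multiple of a prime.
    is_prime = [True] * (upper_bound + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, int(upper_bound**0.5) + 1):
        if is_prime[p]:
            # Cross out the multiples of p, starting at p * p.
            for multiple in range(p * p, upper_bound + 1, p):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]
```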

Counting groups of data: the function counts the number of blocks of consecutive rows that share the same value in a given column.

df = pd.DataFrame(
    {
        'ID': [1, 1, 2, 3, 3, 1, 2, 3],
        'value': [1000, 2000, 3000, 4000, 5000, 5000, 10000, 20000],
    }
)
count_number_of_groups(df, 'ID')
6
count_number_of_groups(df, 'value')
7
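The counts above are consistent with counting blocks of consecutive identical values. A minimal sketch of that logic with pandas (the function name is an assumption; count_number_of_groups is the function to use in practice):

```python
import pandas as pd


def count_consecutive_blocks(df: pd.DataFrame, column: str) -> int:
    """Count blocks of consecutive rows sharing the same value.

    Illustrative sketch of the counting logic, not Biogeme's code.
    """
    # A new block starts whenever the value differs from the previous row.
    return int((df[column] != df[column].shift()).sum())


df = pd.DataFrame(
    {
        'ID': [1, 1, 2, 3, 3, 1, 2, 3],
        'value': [1000, 2000, 3000, 4000, 5000, 5000, 10000, 20000],
    }
)
```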

Likelihood ratio test. Each model is characterized by its final log likelihood and its number of estimated parameters.

model1 = (-1340.8, 5)
model2 = (-1338.49, 7)

A likelihood ratio test is performed. The function returns the outcome of the test, the statistic, and the threshold.

likelihood_ratio.likelihood_ratio_test(model1, model2)
LRTuple(message='H0 cannot be rejected at level 5.0%', statistic=4.619999999999891, threshold=np.float64(5.99146454710798))

The default level of significance is 0.05. It can be changed.

likelihood_ratio.likelihood_ratio_test(model1, model2, significance_level=0.9)
LRTuple(message='H0 can be rejected at level 90.0%', statistic=4.619999999999891, threshold=np.float64(0.21072103131565265))

The order in which the models are presented is irrelevant.

likelihood_ratio.likelihood_ratio_test(model2, model1)
LRTuple(message='H0 cannot be rejected at level 5.0%', statistic=4.619999999999891, threshold=np.float64(5.99146454710798))
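The statistic and the threshold reported above can be reproduced by hand (a sketch, assuming each tuple contains the final log likelihood and the number of parameters; with 2 degrees of freedom, the chi-squared quantile has a closed form):

```python
import math

# Final log likelihood and number of parameters of each model.
loglike_restricted, k_restricted = -1340.8, 5
loglike_unrestricted, k_unrestricted = -1338.49, 7

# Likelihood ratio statistic: -2 (log L_R - log L_U).
statistic = -2.0 * (loglike_restricted - loglike_unrestricted)

# Threshold: 95% quantile of a chi-squared distribution with
# k_U - k_R = 2 degrees of freedom. With 2 degrees of freedom, the
# quantile is -2 ln(1 - p); in general, use scipy.stats.chi2.ppf.
degrees_of_freedom = k_unrestricted - k_restricted
threshold = -2.0 * math.log(1.0 - 0.95)

# The statistic is below the threshold, so H0 (the restricted model)
# cannot be rejected.
print(statistic, threshold)
```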

But the unrestricted model must have a higher log likelihood than the restricted one.

model1 = (-1340.8, 7)
model2 = (-1338.49, 5)
try:
    likelihood_ratio.likelihood_ratio_test(model1, model2)
except BiogemeError as e:
    print(e)
The unrestricted model (-1340.8, 7) has a lower log likelihood than the restricted one (-1338.49, 5)

Total running time of the script: (0 minutes 0.032 seconds)

Gallery generated by Sphinx-Gallery