biogeme.database.audit module

Audit the dataframe

class biogeme.database.audit.ChosenAvailable(chosen, available)

Bases: NamedTuple

Parameters:
  • chosen (int)

  • available (int)

available: int

Alias for field number 1

chosen: int

Alias for field number 0
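As a small illustration of how such a named tuple is read, the sketch below defines a local stand-in with the same two fields. It is not imported from Biogeme, and the numbers are made up for the example.

```python
from typing import NamedTuple


class ChosenAvailable(NamedTuple):
    """Local stand-in mirroring the documented fields (not the Biogeme class)."""

    chosen: int  # field number 0
    available: int  # field number 1


# Illustrative values only.
stats = ChosenAvailable(chosen=12, available=30)

# Fields can be read by name or by position.
by_name = (stats.chosen, stats.available)
by_index = (stats[0], stats[1])

# Like any tuple, it unpacks in field order.
chosen, available = stats
share = chosen / available  # fraction of available cases in which it was chosen
```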

biogeme.database.audit.audit_dataframe(data)

Performs a series of checks and reports warnings and errors for a pandas DataFrame.

Parameters:

data (DataFrame) – The DataFrame to audit.

Return type:

AuditTuple

Returns:

an AuditTuple reporting the errors and warnings found.
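The individual checks are not listed here, so the following is only a rough sketch of what an audit of this kind might look like. The `AuditResult` type, the `audit_sketch` function, and the three checks (non-numeric columns, missing values, empty frame) are all assumptions made for illustration, not the Biogeme implementation.

```python
from typing import NamedTuple

import pandas as pd


class AuditResult(NamedTuple):
    """Hypothetical stand-in for the returned AuditTuple (fields assumed)."""

    errors: list[str]
    warnings: list[str]


def audit_sketch(data: pd.DataFrame) -> AuditResult:
    """Run a few assumed checks and collect messages instead of raising."""
    errors: list[str] = []
    warnings: list[str] = []

    # Assumed check 1: every column should have a numeric dtype.
    non_numeric = [
        str(col)
        for col in data.columns
        if not pd.api.types.is_numeric_dtype(data[col])
    ]
    if non_numeric:
        errors.append(f"Non-numeric columns: {non_numeric}")

    # Assumed check 2: no missing values anywhere.
    n_missing = int(data.isna().sum().sum())
    if n_missing:
        errors.append(f"{n_missing} missing value(s)")

    # Assumed check 3: an empty frame is suspicious but not fatal.
    if data.empty:
        warnings.append("The DataFrame has no rows")

    return AuditResult(errors, warnings)


clean = pd.DataFrame({"a": [1.0, 2.0], "b": [0, 1]})
dirty = pd.DataFrame({"a": [1.0, None], "b": ["x", "y"]})
clean_report = audit_sketch(clean)
dirty_report = audit_sketch(dirty)
```

Collecting messages in two lists lets the caller decide whether warnings alone should stop processing.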

biogeme.database.audit.audit_panel_dataframe(data, id_column)

Performs panel-specific checks on a pandas DataFrame, ensuring entries for the same individual are contiguous.

Parameters:
  • data (DataFrame) – The DataFrame to audit.

  • id_column (str) – The name of the column identifying individuals.

Return type:

tuple[list[str], list[str]]

Returns:

A tuple (list_of_errors, list_of_warnings).
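To make the contiguity requirement concrete, here is a minimal pandas sketch of one way to detect individuals whose rows are split across several blocks. The helper `find_non_contiguous_ids` and the column name `person` are hypothetical and not part of Biogeme.

```python
import pandas as pd


def find_non_contiguous_ids(data: pd.DataFrame, id_column: str) -> list:
    """Return the IDs whose rows are split into more than one block.

    Hypothetical helper for illustration; not part of Biogeme.
    """
    ids = data[id_column]
    # A new block starts wherever the ID differs from the previous row.
    block_starts = ids[ids != ids.shift()]
    # An ID that starts more than one block is not contiguous.
    counts = block_starts.value_counts()
    return sorted(counts[counts > 1].index.tolist())


contiguous = pd.DataFrame({"person": [1, 1, 2, 2, 3], "x": [0, 1, 0, 1, 0]})
scattered = pd.DataFrame({"person": [1, 2, 1, 2, 3], "x": [0, 1, 0, 1, 0]})

ok = find_non_contiguous_ids(contiguous, "person")
bad = find_non_contiguous_ids(scattered, "person")
```

In the second frame, individuals 1 and 2 each appear in two separate blocks, so both would be flagged. Sorting the data by the ID column before building the database is one way to avoid the problem.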

biogeme.database.audit.check_availability_of_chosen_alt(database, avail, choice)

Check if the chosen alternative is available for each entry in the database.

Parameters:
  • database (Database) – object containing the data

  • avail (dict[int, Expression]) – dictionary mapping each alternative ID to the expression that evaluates its availability condition.

  • choice (Expression) – expression for the chosen alternative.

Return type:

Series

Returns:

pandas Series of bool, with one entry per row of the database, containing True if the chosen alternative is available and False otherwise.

Raises:

BiogemeError – if the chosen alternative does not appear in the availability dict
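The logic of this check can be sketched in plain pandas. In the sketch below, column names stand in for Biogeme expressions, a DataFrame stands in for the Database object, and `ValueError` stands in for BiogemeError. The function `chosen_is_available` and the columns `av_1`, `av_2`, `av_3` are hypothetical.

```python
import pandas as pd


def chosen_is_available(
    data: pd.DataFrame, avail: dict[int, str], choice: str
) -> pd.Series:
    """Hypothetical pandas analogue; column names replace Biogeme expressions."""
    unknown = set(data[choice]) - set(avail)
    if unknown:
        # The documented function raises BiogemeError; ValueError is a stand-in.
        raise ValueError(f"Chosen alternatives missing from avail: {sorted(unknown)}")

    result = pd.Series(False, index=data.index)
    for alt_id, avail_column in avail.items():
        is_chosen = data[choice] == alt_id
        is_available = data[avail_column] != 0
        # A row passes if its chosen alternative is also flagged as available.
        result = result | (is_chosen & is_available)
    return result


data = pd.DataFrame(
    {
        "choice": [1, 2, 3, 1],
        "av_1": [1, 1, 1, 0],
        "av_2": [1, 0, 1, 1],
        "av_3": [1, 1, 1, 1],
    }
)
avail = {1: "av_1", 2: "av_2", 3: "av_3"}
flags = chosen_is_available(data, avail, "choice")
```

Rows where the result is False (here the second and fourth) are the ones to inspect, since a model cannot assign positive probability to an unavailable alternative.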

biogeme.database.audit.choice_availability_statistics(database, avail, choice)

Calculates, for each alternative, the number of times it is chosen and the number of times it is available.

Parameters:
  • database (Database) – object containing the data

  • avail (dict[int, Expression]) – dictionary mapping each alternative ID to the expression that evaluates its availability condition.

  • choice (Expression) – expression for the chosen alternative.

Return type:

dict[int, ChosenAvailable]

Returns:

for each alternative, a tuple containing the number of times it is chosen and the number of times it is available.

Raises:

BiogemeError – if the database is empty.
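The shape of the returned dictionary can be illustrated with a plain pandas sketch. As assumptions for the example, column names replace Biogeme expressions, a DataFrame replaces the Database object, `ChosenAvailable` is redefined locally, and `ValueError` stands in for BiogemeError. The function `count_chosen_available` is hypothetical.

```python
from typing import NamedTuple

import pandas as pd


class ChosenAvailable(NamedTuple):
    """Local stand-in with the documented fields (not the Biogeme class)."""

    chosen: int
    available: int


def count_chosen_available(
    data: pd.DataFrame, avail: dict[int, str], choice: str
) -> dict[int, ChosenAvailable]:
    """Hypothetical pandas analogue of the statistics described above."""
    if data.empty:
        # The documented function raises BiogemeError; ValueError is a stand-in.
        raise ValueError("The data set is empty.")
    return {
        alt_id: ChosenAvailable(
            chosen=int((data[choice] == alt_id).sum()),
            available=int((data[avail_column] != 0).sum()),
        )
        for alt_id, avail_column in avail.items()
    }


data = pd.DataFrame(
    {
        "choice": [1, 2, 3, 1],
        "av_1": [1, 1, 1, 0],
        "av_2": [1, 0, 1, 1],
        "av_3": [1, 1, 1, 1],
    }
)
stats = count_chosen_available(data, {1: "av_1", 2: "av_2", 3: "av_3"}, "choice")
```

In this sketch the two counts are independent: a row where an alternative is chosen but flagged unavailable still adds to `chosen`, so comparing the two fields is not a substitute for the per-row availability check.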