biogeme.sampling_of_alternatives.sampling_context module
Defines a class that characterizes the context needed to apply sampling of alternatives.
- author: Michel Bierlaire
- date: Wed Sep 6 14:38:31 2023
- class biogeme.sampling_of_alternatives.sampling_context.CrossVariableTuple(name, formula)
Bases: NamedTuple
A cross variable is a variable that involves socio-economic attributes of the individuals, and attributes of the alternatives. It can only be calculated after the sampling has been made.
- Parameters:
name (str)
formula (Expression)
- formula: Expression
Alias for field number 1
- name: str
Alias for field number 0
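The following is a minimal sketch of why a cross variable can only be evaluated after sampling: its value depends on both the individual and each sampled alternative. It does not use Biogeme itself. `CrossVariableSketch` is a hypothetical stand-in for `CrossVariableTuple`, and a plain Python callable replaces the Biogeme `Expression` that the real `formula` field holds. The attribute names (`home_x`, `x`, and so on) are illustrative assumptions.

```python
import math
from typing import Callable, NamedTuple


class CrossVariableSketch(NamedTuple):
    """Hypothetical stand-in for CrossVariableTuple (not the Biogeme class)."""

    name: str  # field number 0
    formula: Callable[[dict, dict], float]  # field number 1; a callable replaces Expression


# Socio-economic attributes of one individual (illustrative names).
individual = {"home_x": 0.0, "home_y": 0.0}

# Attributes of the alternatives that were sampled for this individual.
sampled_alternatives = [
    {"id": 7, "x": 3.0, "y": 4.0},
    {"id": 2, "x": 6.0, "y": 8.0},
]

# The cross variable mixes both sources of attributes.
distance = CrossVariableSketch(
    name="distance",
    formula=lambda ind, alt: math.hypot(alt["x"] - ind["home_x"], alt["y"] - ind["home_y"]),
)

# One value per sampled alternative, computed only once the sample is known.
values = {alt["id"]: distance.formula(individual, alt) for alt in sampled_alternatives}
```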
- class biogeme.sampling_of_alternatives.sampling_context.SamplingContext(the_partition, sample_sizes, individuals, choice_column, alternatives, id_column, biogeme_file_name, utility_function, combined_variables, mev_partition=None, mev_sample_sizes=None, cnl_nests=None)
Bases: object
Class gathering the data needed to perform an estimation with samples of alternatives
- Parameters:
the_partition (Partition) – partition used for the sampling.
sample_sizes (Iterable[int]) – number of alternatives to draw from each segment.
individuals (DataFrame) – Pandas data frame containing all the individuals as rows. One column must contain the choice of each individual.
choice_column (str) – name of the column containing the choice of each individual.
alternatives (DataFrame) – Pandas data frame containing all the alternatives as rows. One column must contain a unique ID identifying the alternatives. The other columns contain variables to include in the data file.
id_column (str) – name of the column containing the IDs of the alternatives.
utility_function (Expression) – definition of the generic utility function.
combined_variables (list[CrossVariableTuple]) – definition of the interaction variables.
mev_partition (Partition | None) – if a second choice set needs to be sampled for the MEV terms, the corresponding partition is provided here.
biogeme_file_name (str)
mev_sample_sizes (Iterable[int] | None)
cnl_nests (NestsForCrossNestedLogit | None)
- alternatives: DataFrame
- property attributes: set[str]
Set of attributes for the choice model
- biogeme_file_name: str
- check_expression(expression)
Verifies that the variables contained in the expression can be found in the databases.
- Return type:
None
- Parameters:
expression (Expression)
- check_mev_partition()
Check if the partition is a partition of the MEV alternatives. It does not need to cover the full choice set.
- Return type:
None
- check_partition()
Check if the partition is truly a partition. If not, an exception is raised.
- Raises:
BiogemeError – if some elements are present in more than one subset.
BiogemeError – if the size of the union of the subsets does not match the expected total size.
BiogemeError – if an alternative in the partition does not appear in the database of alternatives.
BiogemeError – if a segment is empty.
BiogemeError – if the number of sampled alternatives in a stratum is incorrect, that is, zero or larger than the stratum size.
- Return type:
None
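The conditions listed for check_partition can be sketched in plain Python as follows. This is not the Biogeme implementation: `check_partition_sketch` is a hypothetical helper, `ValueError` stands in for `BiogemeError`, and the "expected total size" is assumed here to be the number of alternatives in the database.

```python
def check_partition_sketch(segments, sample_sizes, all_ids):
    """Raise ValueError if the segments do not form a valid sampling partition.

    segments: list of sets of alternative IDs (hypothetical representation).
    sample_sizes: number of alternatives to draw from each segment.
    all_ids: set of IDs found in the database of alternatives.
    """
    sample_sizes = list(sample_sizes)
    seen = set()
    for segment in segments:
        if not segment:
            raise ValueError("a segment is empty")
        if seen & segment:
            raise ValueError("some elements are present in more than one subset")
        if not segment <= all_ids:
            raise ValueError("an alternative is missing from the database")
        seen |= segment
    # Assumption: the expected total size is the size of the full choice set.
    if len(seen) != len(all_ids):
        raise ValueError("the union of the subsets does not have the expected size")
    for segment, size in zip(segments, sample_sizes):
        if size <= 0 or size > len(segment):
            raise ValueError("incorrect sample size for a stratum")


# Two strata covering alternatives 1 to 5, with 2 and 1 draws respectively.
check_partition_sketch([{1, 2, 3}, {4, 5}], [2, 1], {1, 2, 3, 4, 5})
```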
- check_valid_alternatives(set_of_ids)
Check if the IDs in the set are indeed valid alternatives. Typically used to check if a nest is well defined.
- Parameters:
set_of_ids (set[int]) – set of identifiers to check
- Raises:
BiogemeError – if at least one ID is invalid.
- Return type:
None
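As an illustration of the nest check mentioned above, the test amounts to a set difference against the known alternative IDs. The sketch below is an assumption about the logic, not the Biogeme code: `check_valid_alternatives_sketch` is a hypothetical function, the valid IDs are passed explicitly rather than read from the `alternatives` data frame, and `ValueError` stands in for `BiogemeError`.

```python
def check_valid_alternatives_sketch(set_of_ids, valid_ids):
    """Raise ValueError if at least one ID is not a known alternative."""
    invalid = set_of_ids - valid_ids
    if invalid:
        raise ValueError(f"invalid alternative IDs: {sorted(invalid)}")


# IDs that would come from the ID column of the alternatives (illustrative values).
valid_ids = {1, 2, 3, 4, 5}

# A well-defined nest only refers to existing alternatives, so no exception is raised.
nest = {2, 3}
check_valid_alternatives_sketch(nest, valid_ids)
```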
- choice_column: str
- cnl_nests: NestsForCrossNestedLogit | None = None
- combined_variables: list[CrossVariableTuple]
- id_column: str
- individuals: DataFrame
- property mev_prefix: str
Build the prefix for the MEV columns
- mev_sample_sizes: Iterable[int] | None = None
- property mev_sampling_protocol: list[StratumTuple] | None
Provides a list of strata characterizing the MEV sampling
- sample_sizes: Iterable[int]
- property sampling_protocol: list[StratumTuple]
Provides a list of strata characterizing the sampling
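One plausible reading of the sampling protocol is that each segment of the partition is paired with its sample size to form a stratum. The sketch below illustrates that pairing under this assumption. `StratumSketch` and `build_protocol` are hypothetical names, and the segments are given as a plain list of sets rather than a Biogeme `Partition`.

```python
from typing import NamedTuple


class StratumSketch(NamedTuple):
    """Hypothetical stand-in for StratumTuple (not the Biogeme class)."""

    subset: set[int]
    sample_size: int


def build_protocol(segments, sample_sizes):
    """Pair each segment with the number of alternatives to draw from it."""
    return [
        StratumSketch(subset=segment, sample_size=size)
        for segment, size in zip(segments, sample_sizes)
    ]


protocol = build_protocol([{1, 2, 3}, {4, 5}], [2, 1])

# Summing the sample sizes over the strata gives an overall sample size.
total = sum(stratum.sample_size for stratum in protocol)
```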
- property total_mev_sample_size: int
Total sample size for the MEV terms
- property total_sample_size: int
Total sample size
- utility_function: Expression
- class biogeme.sampling_of_alternatives.sampling_context.StratumTuple(subset, sample_size)
Bases: NamedTuple
A stratum is an element of a partition of the full choice set, combined with the number of alternatives that must be sampled.
- Parameters:
subset (set[int])
sample_size (int)
- sample_size: int
Alias for field number 1
- subset: set[int]
Alias for field number 0
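To make the role of a stratum concrete, the sketch below draws `sample_size` alternatives from `subset`. It only illustrates the idea: `StratumSketch` and `draw_from_stratum` are hypothetical names, and uniform sampling without replacement is an assumption here, not a description of how Biogeme draws its samples.

```python
import random
from typing import NamedTuple


class StratumSketch(NamedTuple):
    """Hypothetical stand-in for StratumTuple (not the Biogeme class)."""

    subset: set[int]  # field number 0
    sample_size: int  # field number 1


def draw_from_stratum(stratum, rng):
    """Draw sample_size distinct alternatives uniformly from the stratum."""
    # random.sample requires a sequence, so the set is sorted first.
    return rng.sample(sorted(stratum.subset), stratum.sample_size)


rng = random.Random(42)  # seeded for reproducibility
stratum = StratumSketch(subset={10, 11, 12, 13}, sample_size=2)
drawn = draw_from_stratum(stratum, rng)
```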