biogeme.sampling_of_alternatives.sampling_context module
Defines a class that characterizes the context in which sampling of alternatives is applied
- author:
Michel Bierlaire
- date:
Wed Sep 6 14:38:31 2023
- class biogeme.sampling_of_alternatives.sampling_context.CrossVariableTuple(name, formula)[source]
Bases:
NamedTuple
A cross variable is a variable that involves socio-economic attributes of the individuals, and attributes of the alternatives. It can only be calculated after the sampling has been made.
- Parameters:
name (str)
formula (Expression)
-
formula:
Expression
Alias for field number 1
-
name:
str
Alias for field number 0
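The point that a cross variable can only be calculated after sampling can be illustrated with a small sketch. It does not use Biogeme: `CrossVariableSketch` is a hypothetical stand-in for `CrossVariableTuple`, a plain Python callable replaces the Biogeme `Expression`, and the individual and alternative data are invented for illustration.

```python
from typing import Callable, NamedTuple


class CrossVariableSketch(NamedTuple):
    """Hypothetical stand-in for CrossVariableTuple (not the Biogeme class)."""

    name: str
    # Assumption: a plain callable replaces the Biogeme Expression.
    formula: Callable[[dict, dict], float]


# Invented data: one individual and the alternatives sampled for them.
individual = {"home_x": 0.0, "home_y": 0.0}
sampled_alternatives = [
    {"id": 3, "x": 3.0, "y": 4.0},
    {"id": 7, "x": 6.0, "y": 8.0},
]

# The variable mixes attributes of the individual (home location) and of
# the alternative (its coordinates).
distance = CrossVariableSketch(
    name="distance",
    formula=lambda ind, alt: (
        (ind["home_x"] - alt["x"]) ** 2 + (ind["home_y"] - alt["y"]) ** 2
    )
    ** 0.5,
)

# The values depend on which alternatives were drawn, so they are
# computed only once the sample is known.
cross_values = {
    alt["id"]: distance.formula(individual, alt) for alt in sampled_alternatives
}
```

As with any named tuple, `distance[0]` is the same object as `distance.name`, and `distance[1]` is the same object as `distance.formula`.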
- class biogeme.sampling_of_alternatives.sampling_context.SamplingContext(the_partition, sample_sizes, individuals, choice_column, alternatives, id_column, biogeme_file_name, utility_function, combined_variables, mev_partition=None, mev_sample_sizes=None, cnl_nests=None)[source]
Bases:
object
Class gathering the data needed to perform an estimation with samples of alternatives
- Parameters:
the_partition (Partition) – partition used for the sampling.
sample_sizes (Iterable[int]) – number of alternatives to draw from each segment.
individuals (DataFrame) – Pandas data frame containing all the individuals as rows. One column must contain the choice of each individual.
choice_column (str) – name of the column containing the choice of each individual.
alternatives (DataFrame) – Pandas data frame containing all the alternatives as rows. One column must contain a unique ID identifying the alternatives. The other columns contain variables to include in the data file.
id_column (str) – name of the column containing the IDs of the alternatives.
utility_function (Expression) – definition of the generic utility function.
combined_variables (list[CrossVariableTuple]) – definition of interaction variables.
mev_partition (Optional[Partition]) – if a second choice set needs to be sampled for the MEV terms, the corresponding partition is provided here.
biogeme_file_name (str)
mev_sample_sizes (Iterable[int] | None)
cnl_nests (NestsForCrossNestedLogit | None)
-
alternatives:
DataFrame
-
biogeme_file_name:
str
- check_expression(expression)[source]
Verifies if the variables contained in the expression can be found in the databases
- Return type:
None
- Parameters:
expression (Expression)
- check_mev_partition()[source]
Check if the partition is a partition of the MEV alternatives. It does not need to cover the full choice set
- Return type:
None
- check_partition()[source]
Check if the partition is truly a partition. If not, an exception is raised
- Raises:
BiogemeError – if some elements are present in more than one subset.
BiogemeError – if the size of the union of the subsets does not match the expected total size
BiogemeError – if an alternative in the partition does not appear in the database of alternatives
BiogemeError – if a segment is empty
BiogemeError – if the number of sampled alternatives in a stratum is incorrect, that is, zero or larger than the stratum size.
- Return type:
None
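The conditions listed above can be sketched in plain Python. This is not Biogeme's implementation: the function name, its arguments, and the use of `ValueError` in place of `BiogemeError` are assumptions made so that the example runs on its own.

```python
def check_partition_sketch(segments, sample_sizes, all_ids):
    """Illustrative partition checks (hypothetical helper, not Biogeme code).

    Raises ValueError where the documented method raises BiogemeError.
    """
    segments = [set(segment) for segment in segments]
    sample_sizes = list(sample_sizes)
    all_ids = set(all_ids)

    # Assumption: one sample size per segment (not stated in the list above).
    if len(segments) != len(sample_sizes):
        raise ValueError("one sample size is needed per segment")

    seen = set()
    for segment, size in zip(segments, sample_sizes):
        if not segment:
            raise ValueError("a segment is empty")
        if seen & segment:
            raise ValueError("some elements are present in more than one subset")
        if not segment <= all_ids:
            raise ValueError("an alternative is missing from the database")
        if size <= 0 or size > len(segment):
            raise ValueError("incorrect number of sampled alternatives")
        seen |= segment

    if len(seen) != len(all_ids):
        raise ValueError("the union of the subsets does not cover all alternatives")


# A valid partition of five alternatives into two strata passes silently.
check_partition_sketch([{1, 2}, {3, 4, 5}], [1, 2], {1, 2, 3, 4, 5})
```

Tracking the IDs seen so far in a single set keeps the overlap test and the final coverage test to one pass over the segments.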
- check_valid_alternatives(set_of_ids)[source]
Check if the IDs in the set are indeed valid alternatives. Typically used to check if a nest is well defined.
- Parameters:
set_of_ids (set[int]) – set of identifiers to check
- Raises:
BiogemeError – if at least one id is invalid.
- Return type:
None
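A check of this kind amounts to a set difference. The sketch below is a plain Python illustration, not the library code: both function names are hypothetical, the valid IDs are passed explicitly instead of being read from the data frame of alternatives, and `ValueError` stands in for `BiogemeError`.

```python
def find_invalid_ids(set_of_ids, valid_ids):
    """Return the IDs that are not known alternatives, sorted for readable messages."""
    return sorted(set(set_of_ids) - set(valid_ids))


def check_valid_alternatives_sketch(set_of_ids, valid_ids):
    """Hypothetical helper: raise ValueError if at least one ID is invalid."""
    invalid = find_invalid_ids(set_of_ids, valid_ids)
    if invalid:
        raise ValueError(f"Invalid alternative IDs: {invalid}")


# Invented example: IDs of the alternatives and a candidate nest.
known_ids = {1, 2, 3, 4, 5}
nest = {2, 4}
check_valid_alternatives_sketch(nest, known_ids)  # well defined, no error
```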
-
choice_column:
str
-
cnl_nests:
Optional[NestsForCrossNestedLogit] = None
-
combined_variables:
list[CrossVariableTuple]
-
id_column:
str
-
individuals:
DataFrame
-
mev_sample_sizes:
Optional[Iterable[int]] = None
-
sample_sizes:
Iterable[int]
-
utility_function:
Expression
- class biogeme.sampling_of_alternatives.sampling_context.StratumTuple(subset, sample_size)[source]
Bases:
NamedTuple
A stratum is an element of a partition of the full choice set, combined with the number of alternatives that must be sampled.
- Parameters:
subset (set[int])
sample_size (int)
-
sample_size:
int
Alias for field number 1
-
subset:
set[int]
Alias for field number 0
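The "alias for field number" notes reflect standard `NamedTuple` behaviour: each field can be read by name or by position. A minimal sketch, using a hypothetical `StratumSketch` with the same two fields as `StratumTuple` so that the example runs without Biogeme installed (the segments and sample sizes are invented):

```python
from typing import NamedTuple


class StratumSketch(NamedTuple):
    """Hypothetical stand-in with the same fields as StratumTuple."""

    subset: set[int]
    sample_size: int


# Invented partition of five alternatives, with one sample size per segment.
segments = [{1, 2, 3}, {4, 5}]
sample_sizes = [2, 1]

# Pair each segment with the number of alternatives to draw from it.
strata = [
    StratumSketch(subset=segment, sample_size=size)
    for segment, size in zip(segments, sample_sizes)
]

first = strata[0]
# Field 0 is `subset` and field 1 is `sample_size`, so both access styles agree.
same_subset = first[0] is first.subset
same_size = first[1] == first.sample_size

# Named tuples also unpack like ordinary tuples.
subset, sample_size = first
```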