biogeme.validation.prepare_validation module
Split data into validation and estimation samples
- class biogeme.validation.prepare_validation.EstimationValidationIndices(estimation, validation)
Bases: NamedTuple
- Parameters:
estimation (Index)
validation (Index)
- estimation: Index
Alias for field number 0
- validation: Index
Alias for field number 1
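Because the class is a NamedTuple, each field can be read by name or by position, and an instance unpacks like an ordinary tuple. The sketch below illustrates this with a local stand-in class, `IndicesSketch`, which is hypothetical and holds plain lists rather than pandas `Index` objects; it is not the Biogeme class itself.

```python
from typing import NamedTuple


class IndicesSketch(NamedTuple):
    """Hypothetical stand-in with the same two fields as the documented class."""

    estimation: list  # row labels used for estimation (training)
    validation: list  # row labels held out for validation


fold = IndicesSketch(estimation=[0, 1, 2, 3], validation=[4, 5])

# Access by field name.
estimation_by_name = fold.estimation

# Access by position: field 0 is estimation, field 1 is validation.
estimation_by_position = fold[0]

# Unpack like any other tuple.
estimation_rows, validation_rows = fold
```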
- biogeme.validation.prepare_validation.split(dataframe, slices, groups=None)
Splits a DataFrame into multiple training and validation index sets for cross-validation.
This function returns a list of EstimationValidationIndices named tuples, each containing the indices for an estimation (training) set and a validation set. If a grouping column is specified, the split ensures that all entries with the same group ID remain in the same fold.
- Parameters:
dataframe (DataFrame) – The full dataset to split.
slices (int) – The number of folds/slices. Must be >= 2.
groups (str | None) – Optional name of the column containing group identifiers. If provided, all rows with the same group ID are kept in the same fold.
- Return type:
list[EstimationValidationIndices]
- Returns:
A list of EstimationValidationIndices tuples containing index sets, one per fold.
- Raises:
ValueError – If slices is less than 2.
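The behaviour described above (one estimation/validation pair per fold, rows sharing a group ID kept together, and a `ValueError` when fewer than two slices are requested) can be illustrated with a small standalone sketch. This is not Biogeme's implementation: `split_sketch` and `FoldIndices` are hypothetical names, the function works on plain lists of row labels instead of a DataFrame, and the round-robin assignment of groups to folds is an assumption chosen for determinism. The library may shuffle or balance folds differently.

```python
from typing import NamedTuple


class FoldIndices(NamedTuple):
    """Hypothetical stand-in for one estimation/validation pair."""

    estimation: list
    validation: list


def split_sketch(row_labels, slices, groups=None):
    """Group-aware k-fold split over row labels (illustrative only)."""
    if slices < 2:
        raise ValueError("slices must be >= 2")

    # Without groups, each row is its own unit; with groups, rows that
    # share a group ID form one unit that must stay in a single fold.
    units = list(row_labels) if groups is None else list(groups)

    # Unique units in order of first appearance, assigned round-robin
    # to folds (an assumption made for this sketch).
    unique_units = list(dict.fromkeys(units))
    fold_of_unit = {unit: i % slices for i, unit in enumerate(unique_units)}

    folds = []
    for k in range(slices):
        validation = [
            label for label, unit in zip(row_labels, units)
            if fold_of_unit[unit] == k
        ]
        estimation = [
            label for label, unit in zip(row_labels, units)
            if fold_of_unit[unit] != k
        ]
        folds.append(FoldIndices(estimation, validation))
    return folds


# Six rows belonging to three households, split into three folds.
row_labels = [0, 1, 2, 3, 4, 5]
household = ["a", "a", "b", "b", "c", "c"]
folds = split_sketch(row_labels, slices=3, groups=household)
```

In this sketch, each fold's validation set holds exactly one household, and no household appears on both sides of the same fold.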