model_selection.split_one_step

One step ahead cross-validation splitting.

Classes

Name	Description
OneStepAheadFold	Class to split time series data into train and test folds for one-step-ahead forecasting.

OneStepAheadFold

model_selection.split_one_step.OneStepAheadFold(
    initial_train_size,
    window_size=None,
    differentiation=None,
    return_all_indexes=False,
    verbose=True,
)

Class to split time series data into train and test folds for one-step-ahead forecasting.

Parameters

Name	Type	Description	Default
initial_train_size	int | str | pd.Timestamp	Size of the initial training set. If an integer, the number of observations used for initial training. If a date string or pandas Timestamp, the last date included in the initial training set.	required
window_size	int	Number of observations needed to generate the autoregressive predictors. Defaults to None.	`None`
differentiation	int	Number of observations to use for differentiation. This is used to extend the `last_window` by as many observations as the differentiation order. Defaults to None.	`None`
return_all_indexes	bool	Whether to return all indexes or only the start and end indexes of each fold. Defaults to False.	`False`
verbose	bool	Whether to print information about generated folds. Defaults to True.	`True`
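
To make the two forms of `initial_train_size` concrete, the following sketch converts a date into the equivalent observation count using plain pandas. It does not call the library and is not its implementation; the helper name `train_size_from_date` and the sample daily index are assumptions made for illustration.

```python
import pandas as pd


def train_size_from_date(index: pd.DatetimeIndex, last_train_date) -> int:
    """Count the observations up to and including `last_train_date`.

    Hypothetical helper: it mirrors the documented meaning of a date-valued
    `initial_train_size` (the last date included in the initial training set).
    """
    last_train_date = pd.Timestamp(last_train_date)
    # Boolean mask of every observation dated on or before the cutoff.
    return int((index <= last_train_date).sum())


# Illustrative daily index with 10 observations: 2024-01-01 .. 2024-01-10.
index = pd.date_range("2024-01-01", periods=10, freq="D")

n_train = train_size_from_date(index, "2024-01-07")
print(n_train)  # 7
```

Under this reading, passing the date `"2024-01-07"` and passing the integer `7` would describe the same initial training set for this index.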

Attributes

Name	Type	Description
initial_train_size	int	Number of observations used for initial training.
window_size	int	Number of observations needed to generate the autoregressive predictors.
differentiation	int	Number of observations to use for differentiation. This is used to extend the `last_window` by as many observations as the differentiation order.
return_all_indexes	bool	Whether to return all indexes or only the start and end indexes of each fold.
verbose	bool	Whether to print information about generated folds.

Methods

Name	Description
split	Split the time series data into train and test folds.

split

model_selection.split_one_step.OneStepAheadFold.split(
    X,
    as_pandas=False,
    externally_fitted=None,
)

Split the time series data into train and test folds.

Parameters

Name	Type	Description	Default
X	pd.Series | pd.DataFrame | pd.Index | dict	Time series data or index to split.	required
as_pandas	bool	If True, the folds are returned as a DataFrame. This is useful to visualize the folds in a more interpretable way. Defaults to False.	`False`
externally_fitted	Any	This argument is not used in this class. It is included for API consistency. Defaults to None.	`None`

Returns

Name	Type	Description
	list | pd.DataFrame	The folds, as a list of lists with the positions of each fold or, if `as_pandas` is `True`, as a DataFrame.

Each fold in the list contains the following elements:

- fold: fold number.
- [train_start, train_end]: list with the start and end positions of the training set.
- [test_start, test_end]: list with the start and end positions of the test set. These are the observations used to evaluate the forecaster.
- fit_forecaster: boolean indicating whether the forecaster should be fitted in this fold.

The returned values are the positions of the observations, not the actual values of the index, so they can be used to slice the data directly with iloc. Following the Python convention, the start position is inclusive and the end position is exclusive, so the observation at the end position is not included in the slice.

If `as_pandas` is `True`, the folds are returned as a DataFrame with the following columns: `fold`, `train_start`, `train_end`, `test_start`, `test_end`, `fit_forecaster`.
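
Because the boundaries are positions with an exclusive end, they can be passed straight to `iloc`. The sketch below uses a hand-written fold in the documented list layout; the numeric values and the sample series are assumptions for illustration, not output produced by the library.

```python
import pandas as pd

# Illustrative series with 10 daily observations.
y = pd.Series(range(10), index=pd.date_range("2024-01-01", periods=10, freq="D"))

# Hand-written fold in the documented layout (values are hypothetical):
# [fold, [train_start, train_end], [test_start, test_end], fit_forecaster]
fold = [0, [0, 7], [7, 10], True]
fold_number, (train_start, train_end), (test_start, test_end), fit_forecaster = fold

# Start is inclusive and end is exclusive, exactly like a Python slice.
y_train = y.iloc[train_start:train_end]
y_test = y.iloc[test_start:test_end]

print(len(y_train), len(y_test))  # 7 3

# The same information arranged under the column names documented for as_pandas=True.
folds_df = pd.DataFrame(
    [[fold_number, train_start, train_end, test_start, test_end, fit_forecaster]],
    columns=["fold", "train_start", "train_end", "test_start", "test_end", "fit_forecaster"],
)
print(folds_df)
```

Since `train_end` equals `test_start` in this example, the two slices are contiguous and do not overlap: the observation at position 7 belongs only to the test set.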