model_selection.validation

Functions

Name	Description
backtesting_forecaster	Backtesting of forecaster model following the folds generated by the TimeSeriesFold class and using the metric(s) provided.
backtesting_forecaster_one_step	Backtesting of forecaster model using one-step-ahead predictions.

backtesting_forecaster

model_selection.validation.backtesting_forecaster(
    forecaster,
    y,
    cv,
    metric,
    exog=None,
    interval=None,
    interval_method='bootstrapping',
    n_boot=250,
    use_in_sample_residuals=True,
    use_binned_residuals=True,
    random_state=123,
    return_predictors=False,
    n_jobs='auto',
    verbose=False,
    show_progress=True,
    suppress_warnings=False,
)

Backtesting of forecaster model following the folds generated by the TimeSeriesFold class and using the metric(s) provided.

If the forecaster is already trained and initial_train_size is set to None in the TimeSeriesFold class, no initial training is performed and all data is used to evaluate the model. However, the first len(forecaster.last_window) observations are needed to create the initial predictors, so no predictions are calculated for them.
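
As a rough illustration of the pre-trained case, the index arithmetic can be sketched in plain Python. The sizes used here (10 observations, a `last_window` of length 3) are made-up values, and nothing in the block calls the library.

```python
# Hypothetical sizes: a 10-observation series and a last_window of length 3.
n_observations = 10
window_size = 3  # stands in for len(forecaster.last_window)

# The first `window_size` positions only feed the initial predictors.
warmup_positions = list(range(window_size))

# Predictions can be produced from position `window_size` onward.
predicted_positions = list(range(window_size, n_observations))

print(warmup_positions)     # [0, 1, 2]
print(predicted_positions)  # [3, 4, 5, 6, 7, 8, 9]
```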

A copy of the original forecaster is created so that it is not modified during the process.

Parameters

Name	Type	Description	Default
forecaster	(`ForecasterRecursive`, `ForecasterDirect`, `ForecasterEquivalentDate`)	Forecaster model.	required
y	pd.Series	Training time series.	required
cv	TimeSeriesFold	TimeSeriesFold object with the information needed to split the data into folds.	required
metric	str \| Callable \| list	Metric used to quantify the goodness of fit of the model. - If `str`: {‘mean_squared_error’, ‘mean_absolute_error’, ‘mean_absolute_percentage_error’, ‘mean_squared_log_error’, ‘mean_absolute_scaled_error’, ‘root_mean_squared_scaled_error’} - If `Callable`: Function with arguments `y_true`, `y_pred` and `y_train` (Optional) that returns a float. - If `list`: List containing multiple strings and/or Callables.	required
exog	pd.Series \| pd.DataFrame	Exogenous variable/s included as predictor/s. Must have the same number of observations as `y` and should be aligned so that y[i] is regressed on exog[i]. Defaults to None.	`None`
interval	float \| list \| tuple \| str \| object	Specifies whether probabilistic predictions should be estimated and the method to use. The following options are supported: - If `float`, represents the nominal (expected) coverage (between 0 and 1). For instance, `interval=0.95` corresponds to `[2.5, 97.5]` percentiles. - If `list` or `tuple`: Sequence of percentiles to compute, each value must be between 0 and 100 inclusive. For example, a 95% prediction interval can be specified as `interval = [2.5, 97.5]` or multiple percentiles (e.g. 10, 50 and 90) as `interval = [10, 50, 90]`. - If ‘bootstrapping’ (str): `n_boot` bootstrapping predictions will be generated. - If scipy.stats distribution object, the distribution parameters will be estimated for each prediction. - If None, no probabilistic predictions are estimated. Defaults to None.	`None`
interval_method	str	Technique used to estimate prediction intervals. Available options: - ‘bootstrapping’: Bootstrapping is used to generate prediction intervals. - ‘conformal’: Employs the conformal prediction split method for interval estimation. Defaults to ‘bootstrapping’.	`'bootstrapping'`
n_boot	int	Number of bootstrapping iterations to perform when estimating prediction intervals. Defaults to 250.	`250`
use_in_sample_residuals	bool	If `True`, residuals from the training data are used as a proxy of the prediction error to create predictions. If `False`, out-of-sample residuals (calibration) are used. Out-of-sample residuals must be precomputed using the forecaster’s `set_out_sample_residuals()` method. Defaults to True.	`True`
use_binned_residuals	bool	If `True`, residuals are selected based on the predicted values (binned selection). If `False`, residuals are selected randomly. Defaults to True.	`True`
random_state	int	Seed for the random number generator to ensure reproducibility. Defaults to 123.	`123`
return_predictors	bool	If `True`, the predictors used to make the predictions are also returned. Defaults to False.	`False`
n_jobs	int \| str	The number of jobs to run in parallel. If `-1`, then the number of jobs is set to the number of cores. If ‘auto’, `n_jobs` is set using the function `skforecast.utils.select_n_jobs_backtesting`. Defaults to ‘auto’.	`'auto'`
verbose	bool	Print number of folds and index of training and validation sets used for backtesting. Defaults to False.	`False`
show_progress	bool	Whether to show a progress bar. Defaults to True.	`True`
suppress_warnings	bool	If `True`, spotforecast warnings will be suppressed during the backtesting process. See `spotforecast.exceptions.warn_skforecast_categories` for more information. Defaults to False.	`False`
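
The `Callable` form of `metric` takes `y_true`, `y_pred` and, optionally, `y_train`, and returns a float. A minimal sketch of two such functions follows. The names `custom_rmse` and `scaled_mae` are hypothetical, and how the library decides whether to pass `y_train` is not shown here.

```python
import math


def custom_rmse(y_true, y_pred, y_train=None):
    """Root mean squared error; `y_train` is accepted but not used."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def scaled_mae(y_true, y_pred, y_train):
    """MAE divided by the in-sample MAE of a naive one-step forecast."""
    y_train = list(y_train)
    absolute_errors = [abs(t - p) for t, p in zip(y_true, y_pred)]
    mae = sum(absolute_errors) / len(absolute_errors)
    # Naive forecast: each training value is predicted by the previous one.
    naive_errors = [abs(b - a) for a, b in zip(y_train, y_train[1:])]
    return mae / (sum(naive_errors) / len(naive_errors))


# Made-up values, only to show the call shape.
print(custom_rmse([1, 2, 3], [1, 2, 5]))
print(scaled_mae([6, 7], [5, 9], [1, 2, 3, 4, 5]))  # 1.5
```

Either function could then be passed as `metric=custom_rmse`, or inside a list alongside string names such as `'mean_absolute_error'`.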

Returns

Name	Type	Description
tuple	(pd.DataFrame, pd.DataFrame)	- metric_values: Value(s) of the metric(s). - backtest_predictions: Value of predictions. The DataFrame includes the following columns: - fold: Indicates the fold number where the prediction was made. - pred: Predicted values for the corresponding series and time steps. If `interval` is not `None`, additional columns are included depending on the method: - For `float`: Columns `lower_bound` and `upper_bound`. - For `list` or `tuple` of 2 elements: Columns `lower_bound` and `upper_bound`. - For `list` or `tuple` with multiple percentiles: One column per percentile (e.g., `p_10`, `p_50`, `p_90`). - For `'bootstrapping'`: One column per bootstrapping iteration (e.g., `pred_boot_0`, `pred_boot_1`, …, `pred_boot_n`). - For `scipy.stats` distribution objects: One column for each estimated parameter of the distribution (e.g., `loc`, `scale`). If `return_predictors` is `True`, one column per predictor is created. Depending on the relation between `steps` and `fold_stride`, the output may include repeated indexes (if `fold_stride < steps`) or gaps (if `fold_stride > steps`). See Notes below for more details.
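
For a `float` interval, the nominal coverage is split evenly between the two tails to obtain the percentiles behind `lower_bound` and `upper_bound`. The helper below is a hypothetical sketch of that arithmetic, not the library's own code.

```python
def coverage_to_percentiles(coverage):
    """Map a nominal coverage in (0, 1) to symmetric [lower, upper] percentiles."""
    if not 0 < coverage < 1:
        raise ValueError("coverage must be strictly between 0 and 1")
    # Half of the uncovered mass goes to each tail; rounding hides float noise.
    tail = round((1 - coverage) / 2 * 100, 6)
    return [tail, 100 - tail]


print(coverage_to_percentiles(0.95))  # [2.5, 97.5]
print(coverage_to_percentiles(0.80))  # [10.0, 90.0]
```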

Notes

Note on fold_stride vs. steps:

If fold_stride == steps, test sets are placed back-to-back without overlap. Each observation appears only once in the output DataFrame, so the index is unique.
If fold_stride < steps, test sets overlap. Multiple forecasts are generated for the same observations and, therefore, the output DataFrame contains repeated indexes.
If fold_stride > steps, there are gaps between consecutive test sets. Some observations in the series will not have associated predictions, so the output DataFrame has non-contiguous indexes.
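
The three cases can be reproduced with a small stand-alone sketch. The function `sketch_test_windows` is hypothetical and only mimics the positions of the test sets. The real TimeSeriesFold class may handle details such as the last incomplete fold differently.

```python
def sketch_test_windows(n_obs, initial_train_size, steps, fold_stride):
    """Return the test positions of each fold for a simplified splitter."""
    windows = []
    start = initial_train_size
    while start < n_obs:
        end = min(start + steps, n_obs)  # the last fold may be shorter
        windows.append(list(range(start, end)))
        start += fold_stride
    return windows


# fold_stride == steps: back-to-back test sets, every position appears once.
print(sketch_test_windows(10, 5, steps=2, fold_stride=2))
# [[5, 6], [7, 8], [9]]

# fold_stride < steps: overlapping test sets, positions are repeated.
print(sketch_test_windows(10, 5, steps=2, fold_stride=1))
# [[5, 6], [6, 7], [7, 8], [8, 9], [9]]

# fold_stride > steps: gaps, position 7 is never predicted.
print(sketch_test_windows(10, 5, steps=2, fold_stride=3))
# [[5, 6], [8, 9]]
```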

Examples

>>> import pandas as pd
>>> from sklearn.ensemble import RandomForestRegressor
>>> from spotforecast2_safe.forecaster.recursive import ForecasterRecursive
>>> from spotforecast2_safe.model_selection import backtesting_forecaster, TimeSeriesFold
>>> y = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
>>> forecaster = ForecasterRecursive(
...     estimator=RandomForestRegressor(random_state=123),
...     lags=5
... )
>>> cv = TimeSeriesFold(
...     steps=2,
...     initial_train_size=5,
...     refit=False
... )
>>> metric_values, backtest_predictions = backtesting_forecaster(
...     forecaster=forecaster,
...     y=y,
...     cv=cv,
...     metric='mean_squared_error'
... )
>>> metric_values
   mean_squared_error
0            0.201334
>>> backtest_predictions
   fold  pred
5     0  5.18
6     0  6.10
7     1  7.36
8     1  8.40
9     2  9.31

backtesting_forecaster_one_step

model_selection.validation.backtesting_forecaster_one_step(
    forecaster,
    y,
    cv,
    metric,
    exog=None,
    interval=None,
    interval_method='bootstrapping',
    n_boot=250,
    use_in_sample_residuals=True,
    use_binned_residuals=True,
    random_state=123,
    return_predictors=False,
    n_jobs='auto',
    verbose=False,
    show_progress=True,
    suppress_warnings=False,
)

Backtesting of forecaster model using one-step-ahead predictions.

Parameters

Name	Type	Description	Default
forecaster	(`ForecasterRecursive`, `ForecasterDirect`, `ForecasterEquivalentDate`)	Forecaster model.	required
y	pd.Series	Training time series.	required
cv	OneStepAheadFold	OneStepAheadFold object with the information needed to split the data into folds.	required
metric	str \| Callable \| list	Metric used to quantify the goodness of fit of the model.	required
exog	pd.Series \| pd.DataFrame	Exogenous variable/s included as predictor/s. Defaults to None.	`None`
interval	float \| list \| tuple \| str \| object	Specifies whether probabilistic predictions should be estimated.	`None`
interval_method	str	Technique used to estimate prediction intervals.	`'bootstrapping'`
n_boot	int	Number of bootstrapping iterations.	`250`
use_in_sample_residuals	bool	Use residuals from training data.	`True`
use_binned_residuals	bool	Use binned residuals for intervals.	`True`
random_state	int	Seed for reproducibility.	`123`
return_predictors	bool	Return predictors used for each prediction.	`False`
n_jobs	int \| str	Number of jobs to run in parallel.	`'auto'`
verbose	bool	Print information about the process.	`False`
show_progress	bool	Whether to show a progress bar.	`True`
suppress_warnings	bool	Suppress spotforecast warnings.	`False`

Returns

Name	Type	Description
tuple	(pd.DataFrame, pd.DataFrame)	- metric_values: Value(s) of the metric(s). - backtest_predictions: Value of predictions.

Notes

This function is designed for one-step-ahead backtesting, where predictions are made for the next time step using the most recent data. The function handles the fitting and prediction process for each fold defined in the OneStepAheadFold object, and calculates the specified metric(s) based on the true and predicted values. Depending on the interval and interval_method parameters, it can also generate probabilistic predictions.
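
To make the contrast with multi-step recursive forecasting concrete, the sketch below uses a stand-in "model" that predicts the mean of the last `lags` values. It is an assumption-laden illustration of which values feed the predictors, not the library's implementation, and all function names are hypothetical.

```python
def mean_of_lags(window):
    """Stand-in model: predict the mean of the values in the window."""
    return sum(window) / len(window)


def one_step_ahead(y, initial_train_size, lags):
    """Predict each test point from the observed values just before it."""
    return [mean_of_lags(y[t - lags:t]) for t in range(initial_train_size, len(y))]


def recursive(y, initial_train_size, lags, steps):
    """Predict `steps` points ahead, feeding earlier predictions back in."""
    history = list(y[:initial_train_size])
    predictions = []
    for _ in range(steps):
        pred = mean_of_lags(history[-lags:])
        predictions.append(pred)
        history.append(pred)  # later steps depend on this prediction
    return predictions


y = [1, 2, 3, 4, 5, 6]
print(one_step_ahead(y, initial_train_size=4, lags=2))      # [3.5, 4.5]
print(recursive(y, initial_train_size=4, lags=2, steps=2))  # [3.5, 3.75]
```

The first prediction is the same in both cases. They diverge from the second step on, because the one-step-ahead variant uses the observed value 5 where the recursive variant reuses its own prediction 3.5.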

Examples

>>> import numpy as np
>>> import pandas as pd
>>> from sklearn.linear_model import LinearRegression
>>> from spotforecast2_safe.forecaster import ForecasterRecursive
>>> from spotforecast2_safe.model_selection.split_one_step import OneStepAheadFold
>>> from spotforecast2_safe.model_selection.validation import backtesting_forecaster_one_step
>>> # Create a random series, a forecaster and a one-step-ahead fold
>>> y = pd.Series(np.random.randn(100), name='y')
>>> forecaster = ForecasterRecursive(estimator=LinearRegression(), lags=5)
>>> cv = OneStepAheadFold(initial_train_size=20, window_size=5)
>>> # Perform backtesting
>>> metric_values, backtest_predictions = backtesting_forecaster_one_step(
...     forecaster=forecaster,
...     y=y,
...     cv=cv,
...     metric='mean_squared_error',
...     exog=None,
...     interval=0.95,
...     interval_method='bootstrapping',
...     n_boot=20,
...     use_in_sample_residuals=True,
...     use_binned_residuals=False,
...     random_state=42,
...     return_predictors=False,
...     n_jobs=1,
...     verbose=True,
...     show_progress=True,
...     suppress_warnings=False
... )
>>> # Note: bootstrapping with binned residuals is only reliable when the series is
>>> # long enough and its values are sufficiently spread out. For random data such
>>> # as the series above, set use_binned_residuals=False.
>>> # TODO: Setting return_predictors=True requires ForecasterRecursive.create_predict_X().