model_selection.validation

Functions

Name Description
backtesting_forecaster Backtesting of forecaster model following the folds generated by the TimeSeriesFold class and using the metric(s) provided.
backtesting_forecaster_one_step Backtesting of forecaster model using one-step-ahead predictions.

backtesting_forecaster

model_selection.validation.backtesting_forecaster(
    forecaster,
    y,
    cv,
    metric,
    exog=None,
    interval=None,
    interval_method='bootstrapping',
    n_boot=250,
    use_in_sample_residuals=True,
    use_binned_residuals=True,
    random_state=123,
    return_predictors=False,
    n_jobs='auto',
    verbose=False,
    show_progress=True,
    suppress_warnings=False,
)

Backtesting of forecaster model following the folds generated by the TimeSeriesFold class and using the metric(s) provided.

If the forecaster is already trained and initial_train_size is set to None in the TimeSeriesFold class, no initial training is performed and all data is used to evaluate the model. However, the first len(forecaster.last_window) observations are needed to create the initial predictors, so no predictions are calculated for them.

A copy of the original forecaster is created so that it is not modified during the process.

Parameters

Name Type Description Default
forecaster (ForecasterRecursive, ForecasterDirect, ForecasterEquivalentDate) Forecaster model. required
y pd.Series Training time series. required
cv TimeSeriesFold TimeSeriesFold object with the information needed to split the data into folds. required
metric str | Callable | list Metric used to quantify the goodness of fit of the model. - If str: {‘mean_squared_error’, ‘mean_absolute_error’, ‘mean_absolute_percentage_error’, ‘mean_squared_log_error’, ‘mean_absolute_scaled_error’, ‘root_mean_squared_scaled_error’} - If Callable: Function with arguments y_true, y_pred and y_train (Optional) that returns a float. - If list: List containing multiple strings and/or Callables. required
exog pd.Series | pd.DataFrame Exogenous variable/s included as predictor/s. Must have the same number of observations as y and should be aligned so that y[i] is regressed on exog[i]. Defaults to None. None
interval float | list | tuple | str | object Specifies whether probabilistic predictions should be estimated and the method to use. The following options are supported: - If float, represents the nominal (expected) coverage (between 0 and 1). For instance, interval=0.95 corresponds to [2.5, 97.5] percentiles. - If list or tuple: Sequence of percentiles to compute, each value must be between 0 and 100 inclusive. For example, a 95% prediction interval can be specified as interval = [2.5, 97.5] or multiple percentiles (e.g. 10, 50 and 90) as interval = [10, 50, 90]. - If ‘bootstrapping’ (str): n_boot bootstrapping predictions will be generated. - If scipy.stats distribution object, the distribution parameters will be estimated for each prediction. - If None, no probabilistic predictions are estimated. Defaults to None. None
interval_method str Technique used to estimate prediction intervals. Available options: - ‘bootstrapping’: Bootstrapping is used to generate prediction intervals. - ‘conformal’: Employs the conformal prediction split method for interval estimation. Defaults to ‘bootstrapping’. 'bootstrapping'
n_boot int Number of bootstrapping iterations to perform when estimating prediction intervals. Defaults to 250. 250
use_in_sample_residuals bool If True, residuals from the training data are used as a proxy for the prediction error to create predictions. If False, out-of-sample residuals (calibration) are used. Out-of-sample residuals must be precomputed using the forecaster’s set_out_sample_residuals() method. Defaults to True. True
use_binned_residuals bool If True, residuals are selected based on the predicted values (binned selection). If False, residuals are selected randomly. Defaults to True. True
random_state int Seed for the random number generator to ensure reproducibility. Defaults to 123. 123
return_predictors bool If True, the predictors used to make the predictions are also returned. Defaults to False. False
n_jobs int | str The number of jobs to run in parallel. If -1, then the number of jobs is set to the number of cores. If ‘auto’, n_jobs is set using the function skforecast.utils.select_n_jobs_backtesting. Defaults to ‘auto’. 'auto'
verbose bool Print number of folds and index of training and validation sets used for backtesting. Defaults to False. False
show_progress bool Whether to show a progress bar. Defaults to True. True
suppress_warnings bool If True, spotforecast warnings will be suppressed during the backtesting process. See spotforecast.exceptions.warn_skforecast_categories for more information. Defaults to False. False
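
The metric parameter accepts a Callable with arguments y_true, y_pred and, optionally, y_train. The following is a minimal sketch of such a function, written in plain Python so that it works on lists, arrays or Series alike. The name scaled_mae_sketch and the sample values are hypothetical; the scaling by the in-sample naive forecast error is only one possible choice and is not taken from the library's own implementation.

```python
def scaled_mae_sketch(y_true, y_pred, y_train):
    """Hypothetical custom metric: forecast MAE divided by the in-sample
    MAE of a naive forecast that repeats the previous observation."""
    y_true, y_pred, y_train = list(y_true), list(y_pred), list(y_train)

    # Mean absolute error of the forecast on the evaluated observations
    forecast_mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

    # Mean absolute one-step change in the training data (naive forecast error)
    naive_mae = sum(
        abs(curr - prev) for prev, curr in zip(y_train, y_train[1:])
    ) / (len(y_train) - 1)

    return float(forecast_mae / naive_mae)


# Made-up values, for illustration only
example_value = scaled_mae_sketch(
    y_true=[8, 10], y_pred=[9, 8], y_train=[1, 2, 4, 7]
)
print(example_value)  # 0.75
```

Under the signature described above, a function like this could be passed as metric=scaled_mae_sketch, or inside a list together with the built-in metric names.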

Returns

Name Type Description
tuple (pd.DataFrame, pd.DataFrame) - metric_values: Value(s) of the metric(s). - backtest_predictions: Value of predictions. The DataFrame includes the following columns: - fold: Indicates the fold number where the prediction was made. - pred: Predicted values for the corresponding series and time steps. If interval is not None, additional columns are included depending on the method: - For float: Columns lower_bound and upper_bound. - For list or tuple of 2 elements: Columns lower_bound and upper_bound. - For list or tuple with multiple percentiles: One column per percentile (e.g., p_10, p_50, p_90). - For 'bootstrapping': One column per bootstrapping iteration (e.g., pred_boot_0, pred_boot_1, …, pred_boot_n). - For scipy.stats distribution objects: One column for each estimated parameter of the distribution (e.g., loc, scale). If return_predictors is True, one column per predictor is created. Depending on the relation between steps and fold_stride, the output may include repeated indexes (if fold_stride < steps) or gaps (if fold_stride > steps). See Notes below for more details.
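
When interval is a float, the lower_bound and upper_bound columns correspond to a symmetric pair of percentiles, as in the interval=0.95 and [2.5, 97.5] example given for the interval parameter. The arithmetic can be sketched as follows; coverage_to_percentiles is a hypothetical helper written for illustration and is not part of the library.

```python
def coverage_to_percentiles(coverage):
    """Hypothetical helper: map a nominal coverage in (0, 1) to the
    symmetric lower and upper percentiles (on a 0-100 scale)."""
    if not 0 < coverage < 1:
        raise ValueError("coverage must be strictly between 0 and 1")
    # Half of the uncovered probability mass goes to each tail
    tail = (1 - coverage) / 2 * 100
    return [tail, 100 - tail]


# Rounded for display to hide floating-point noise
print([round(p, 6) for p in coverage_to_percentiles(0.95)])  # [2.5, 97.5]
print([round(p, 6) for p in coverage_to_percentiles(0.80)])  # [10.0, 90.0]
```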

Notes

Note on fold_stride vs. steps:

  • If fold_stride == steps, test sets are placed back-to-back without overlap. Each observation appears only once in the output DataFrame, so the index is unique.
  • If fold_stride < steps, test sets overlap. Multiple forecasts are generated for the same observations and, therefore, the output DataFrame contains repeated indexes.
  • If fold_stride > steps, there are gaps between consecutive test sets. Some observations in the series will not have associated predictions, so the output DataFrame has non-contiguous indexes.
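
The three cases above can be illustrated with a small sketch that lays out test windows by position. It assumes that the first test set starts right after the initial training window, that each later test set starts fold_stride positions after the previous one, and that the last window is truncated at the end of the series. The helper names are hypothetical and the code does not reproduce the internals of TimeSeriesFold.

```python
from collections import Counter


def fold_test_windows(n_obs, initial_train_size, steps, fold_stride=None):
    """Hypothetical helper: (start, end) positions of each test set,
    with end exclusive. fold_stride defaults to steps."""
    if fold_stride is None:
        fold_stride = steps
    windows = []
    start = initial_train_size
    while start < n_obs:
        windows.append((start, min(start + steps, n_obs)))
        start += fold_stride
    return windows


def repeated_and_missing(windows, initial_train_size, n_obs):
    """Positions predicted more than once, and positions never predicted."""
    counts = Counter(i for start, end in windows for i in range(start, end))
    repeated = sorted(i for i, c in counts.items() if c > 1)
    missing = [i for i in range(initial_train_size, n_obs) if i not in counts]
    return repeated, missing


for stride in (2, 1, 3):
    windows = fold_test_windows(
        n_obs=10, initial_train_size=5, steps=2, fold_stride=stride
    )
    print(stride, windows, repeated_and_missing(windows, 5, 10))
# 2 [(5, 7), (7, 9), (9, 10)] ([], [])
# 1 [(5, 7), (6, 8), (7, 9), (8, 10), (9, 10)] ([6, 7, 8, 9], [])
# 3 [(5, 7), (8, 10)] ([], [7])
```

With fold_stride equal to steps every position appears once, a smaller stride produces repeated positions, and a larger stride leaves position 7 without a prediction.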

Examples

>>> import pandas as pd
>>> from sklearn.ensemble import RandomForestRegressor
>>> from spotforecast2_safe.forecaster.recursive import ForecasterRecursive
>>> from spotforecast2_safe.model_selection import backtesting_forecaster, TimeSeriesFold
>>> y = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
>>> forecaster = ForecasterRecursive(
...     estimator=RandomForestRegressor(random_state=123),
...     lags=5
... )
>>> cv = TimeSeriesFold(
...     steps=2,
...     initial_train_size=5,
...     refit=False
... )
>>> metric_values, backtest_predictions = backtesting_forecaster(
...     forecaster=forecaster,
...     y=y,
...     cv=cv,
...     metric='mean_squared_error'
... )
>>> metric_values
   mean_squared_error
0            0.201334
>>> backtest_predictions
   fold  pred
5     0  5.18
6     0  6.10
7     1  7.36
8     1  8.40
9     2  9.31

backtesting_forecaster_one_step

model_selection.validation.backtesting_forecaster_one_step(
    forecaster,
    y,
    cv,
    metric,
    exog=None,
    interval=None,
    interval_method='bootstrapping',
    n_boot=250,
    use_in_sample_residuals=True,
    use_binned_residuals=True,
    random_state=123,
    return_predictors=False,
    n_jobs='auto',
    verbose=False,
    show_progress=True,
    suppress_warnings=False,
)

Backtesting of forecaster model using one-step-ahead predictions.

Parameters

Name Type Description Default
forecaster (ForecasterRecursive, ForecasterDirect, ForecasterEquivalentDate) Forecaster model. required
y pd.Series Training time series. required
cv OneStepAheadFold OneStepAheadFold object with the information needed to split the data into folds. required
metric str | Callable | list Metric used to quantify the goodness of fit of the model. required
exog pd.Series | pd.DataFrame Exogenous variable/s included as predictor/s. Defaults to None. None
interval float | list | tuple | str | object Specifies whether probabilistic predictions should be estimated. None
interval_method str Technique used to estimate prediction intervals. 'bootstrapping'
n_boot int Number of bootstrapping iterations. 250
use_in_sample_residuals bool Use residuals from training data. True
use_binned_residuals bool Use binned residuals for intervals. True
random_state int Seed for reproducibility. 123
return_predictors bool Return predictors used for each prediction. False
n_jobs int | str Number of jobs to run in parallel. 'auto'
verbose bool Print information about the process. False
show_progress bool Whether to show a progress bar. True
suppress_warnings bool Suppress spotforecast warnings. False

Returns

Name Type Description
tuple (pd.DataFrame, pd.DataFrame) - metric_values: Value(s) of the metric(s). - backtest_predictions: Value of predictions.

Notes

This function is designed for one-step-ahead backtesting, where predictions are made for the next time step using the most recent data. The function handles the fitting and prediction process for each fold defined in the OneStepAheadFold object, and calculates the specified metric(s) based on the true and predicted values. Depending on the interval and interval_method parameters, it can also generate probabilistic predictions.
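
To make the idea of one-step-ahead prediction concrete, the sketch below contrasts it with recursive multi-step prediction. It uses a toy model, y_hat[t] = phi * y[t-1], with a fixed, made-up coefficient phi. Both the model and the function names are assumptions chosen for illustration and say nothing about how the library fits or predicts internally.

```python
def one_step_ahead_predictions(y, phi, start):
    """Each prediction uses the observed previous value."""
    return [phi * y[t - 1] for t in range(start, len(y))]


def recursive_predictions(y, phi, start):
    """Each prediction after the first reuses the previous prediction."""
    predictions = []
    last_value = y[start - 1]  # only the last training value is observed
    for _ in range(start, len(y)):
        last_value = phi * last_value
        predictions.append(last_value)
    return predictions


y = [8.0, 4.0, 6.0, 2.0]  # made-up series
print(one_step_ahead_predictions(y, phi=0.5, start=1))  # [4.0, 2.0, 3.0]
print(recursive_predictions(y, phi=0.5, start=1))       # [4.0, 2.0, 1.0]
```

The two approaches agree while the inputs are the same and diverge once the recursive forecast has to rely on its own earlier output instead of an observed value.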

Examples

>>> import numpy as np
>>> import pandas as pd
>>> from sklearn.linear_model import LinearRegression
>>> from spotforecast2_safe.forecaster import ForecasterRecursive
>>> from spotforecast2_safe.model_selection.split_one_step import OneStepAheadFold
>>> from spotforecast2_safe.model_selection.validation import backtesting_forecaster_one_step
>>> # Create a reproducible random series, a forecaster and a one-step-ahead fold
>>> rng = np.random.default_rng(42)
>>> y = pd.Series(rng.standard_normal(100), name='y')
>>> forecaster = ForecasterRecursive(estimator=LinearRegression(), lags=5)
>>> cv = OneStepAheadFold(initial_train_size=20, window_size=5)
>>> # Perform backtesting
>>> metric_values, backtest_predictions = backtesting_forecaster_one_step(
...     forecaster=forecaster,
...     y=y,
...     cv=cv,
...     metric='mean_squared_error',
...     exog=None,
...     interval=0.95,
...     interval_method='bootstrapping',
...     n_boot=20,
...     use_in_sample_residuals=True,
...     use_binned_residuals=False,
...     random_state=42,
...     return_predictors=False,
...     n_jobs=1,
...     verbose=True,
...     show_progress=True,
...     suppress_warnings=False
... )
# Note: Bootstrapping with binned residuals is only reliable for a sufficiently long series with a wide spread of values.
# Because the data here are random, the example sets use_binned_residuals=False.
# TODO: Setting return_predictors=True requires ForecasterRecursive.create_predict_X().