forecaster.recursive._forecaster_recursive.ForecasterRecursive

forecaster.recursive._forecaster_recursive.ForecasterRecursive(
    estimator=None,
    lags=None,
    window_features=None,
    transformer_y=None,
    transformer_exog=None,
    weight_func=None,
    differentiation=None,
    fit_kwargs=None,
    binner_kwargs=None,
    forecaster_id=None,
    regressor=None,
)

Recursive autoregressive forecaster for scikit-learn compatible estimators.

This class turns any estimator compatible with the scikit-learn API into a recursive autoregressive (multi-step) forecaster. The forecaster learns to predict future values by using lagged values of the target variable and optional exogenous features. Predictions are made iteratively, where each step uses previous predictions as input for the next step (recursive strategy).
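
The recursive strategy described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the library's implementation: `recursive_forecast` and `linear_trend` are hypothetical names, and the one-step model is a hand-written rule rather than a fitted estimator.

```python
from typing import Callable, List, Sequence


def recursive_forecast(
    history: Sequence[float],
    predict_one: Callable[[List[float]], float],
    n_lags: int,
    steps: int,
) -> List[float]:
    """Forecast `steps` values, feeding each prediction back in as a lag."""
    window = list(history[-n_lags:])  # most recent observations, oldest first
    predictions = []
    for _ in range(steps):
        # Order the features as lag_1, lag_2, ... (most recent value first).
        lag_features = window[::-1]
        y_hat = predict_one(lag_features)
        predictions.append(y_hat)
        # Slide the window: drop the oldest value, append the new prediction.
        window = window[1:] + [y_hat]
    return predictions


# Hypothetical one-step model: y_t = 2 * y_{t-1} - y_{t-2} (continues a line).
def linear_trend(lags: List[float]) -> float:
    return 2 * lags[0] - lags[1]


preds = recursive_forecast([1.0, 2.0, 3.0, 4.0], linear_trend, n_lags=2, steps=3)
print(preds)  # [5.0, 6.0, 7.0]
```

Because every step after the first consumes earlier predictions, errors can accumulate along the horizon.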

Parameters

Name	Type	Description	Default
estimator	object	Scikit-learn compatible estimator for regression. If None, a default estimator will be initialized. Can also be passed via regressor parameter.	`None`
lags	Union[int, List[int], np.ndarray, range, None]	Lagged values of the target variable to use as predictors. Can be an integer (uses lags from 1 to lags), list of integers, numpy array, or range. At least one of lags or window_features must be provided. Defaults to None.	`None`
window_features	Union[object, List[object], None]	Window feature object, or list of such objects, used to compute features from the target variable. Each object must implement a transform_batch() method. At least one of lags or window_features must be provided. Defaults to None.	`None`
transformer_y	Optional[object]	Transformer object for the target variable. Must implement fit() and transform() methods. Applied before training and predictions. Defaults to None.	`None`
transformer_exog	Optional[object]	Transformer object for exogenous variables. Must implement fit() and transform() methods. Applied before training and predictions. Defaults to None.	`None`
weight_func	Optional[Callable]	Function to compute sample weights for training. Must accept an index and return an array of weights. Defaults to None.	`None`
differentiation	Optional[int]	Order of differencing to apply to the target variable. Must be a positive integer. Differencing is applied before creating lags. Defaults to None.	`None`
fit_kwargs	Optional[Dict[str, object]]	Dictionary of additional keyword arguments to pass to the estimator’s fit() method. Defaults to None.	`None`
binner_kwargs	Optional[Dict[str, object]]	Dictionary of keyword arguments for the QuantileBinner used in probabilistic predictions. If None, {'n_bins': 10, 'method': 'linear'} is used.	`None`
forecaster_id	Union[str, int, None]	Identifier for the forecaster instance. Can be a string or integer. Used for tracking and logging purposes. Defaults to None.	`None`
regressor	object	Alternative parameter name for estimator. If provided, used instead of estimator. Defaults to None.	`None`
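
As a rough illustration of what the `differentiation` parameter refers to, the sketch below applies differencing of a given order and maps predicted first differences back to the original scale. The helper names `difference` and `undo_first_difference` are hypothetical and the code is independent of the library, which may handle these steps differently internally.

```python
from typing import List, Sequence


def difference(values: Sequence[float], order: int = 1) -> List[float]:
    """Apply first differences `order` times; each pass shortens the series by one."""
    out = list(values)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def undo_first_difference(last_observed: float, diffs: Sequence[float]) -> List[float]:
    """Rebuild levels from first differences by cumulative summation."""
    levels = []
    current = last_observed
    for d in diffs:
        current += d
        levels.append(current)
    return levels


y = [10.0, 12.0, 15.0, 19.0]
d1 = difference(y, order=1)
d2 = difference(y, order=2)
print(d1)  # [2.0, 3.0, 4.0]
print(d2)  # [1.0, 1.0]
# Predicted differences of +5 and +6 map back to the original scale:
restored = undo_first_difference(y[-1], [5.0, 6.0])
print(restored)  # [24.0, 30.0]
```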

Attributes

Name	Type	Description
estimator		Fitted scikit-learn estimator.
lags		Lag indices used in the model.
lags_names		Names of lag features (e.g., ['lag_1', 'lag_2']).
window_features		List of window feature transformers.
window_features_names		Names of window features.
window_size		Maximum window size needed (max of lags and window features).
transformer_y		Transformer for target variable.
transformer_exog		Transformer for exogenous variables.
weight_func		Function for sample weighting.
differentiation		Order of differencing applied.
differentiator		TimeSeriesDifferentiator instance if differencing is used.
is_fitted		Boolean indicating if forecaster has been fitted.
fit_date		Timestamp of the last fit operation.
last_window_		Last window_size observations from training data.
index_type_		Type of index in training data (RangeIndex or DatetimeIndex).
index_freq_		Frequency of DatetimeIndex if applicable.
training_range_		First and last index values of training data.
series_name_in_		Name of the target series.
exog_in_		Boolean indicating if exogenous variables were used in training.
exog_names_in_		Names of exogenous variables.
exog_type_in_		Type of exogenous input (Series or DataFrame).
X_train_features_names_out_		Names of all training features.
in_sample_residuals_		Residuals from training set.
in_sample_residuals_by_bin_		Residuals grouped by bins for probabilistic prediction.
forecaster_id		Identifier for the forecaster instance.

Note

Either lags or window_features (or both) must be provided during initialization.
The forecaster uses a recursive strategy where each multi-step prediction depends on previous predictions within the same forecast horizon.
Exogenous variables must have the same index as the target variable and must be available for the entire prediction horizon.
The forecaster supports point predictions, prediction intervals, bootstrapping, quantile predictions, and probabilistic forecasts via conformal methods.

Examples

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

from spotforecast2_safe.forecaster.recursive import ForecasterRecursive

rng = np.random.default_rng(0)
y = pd.Series(rng.standard_normal(100), name='y')
forecaster = ForecasterRecursive(
    estimator=LinearRegression(),
    lags=10,
)
forecaster.fit(y)
predictions = forecaster.predict(steps=5)
print(predictions)

100   -0.124227
101   -0.040601
102    0.070276
103   -0.031902
104   -0.051919
Name: pred, dtype: float64

import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.preprocessing import StandardScaler

from spotforecast2_safe.forecaster.recursive import ForecasterRecursive
from spotforecast2_safe.preprocessing import RollingFeatures

rng = np.random.default_rng(1)
y = pd.Series(rng.standard_normal(100), name='y')
forecaster = ForecasterRecursive(
    estimator=RandomForestRegressor(n_estimators=100, random_state=1),
    lags=[1, 7, 30],
    window_features=[RollingFeatures(stats='mean', window_sizes=7)],
    transformer_y=StandardScaler(),
    differentiation=1,
)
forecaster.fit(y)
predictions = forecaster.predict(steps=10)
print(predictions)

╭───────────────────────────── DataTransformationWarning ──────────────────────────────╮
│ The output matrix is in the transformed scale due to the inclusion of                │
│ transformations or differentiation in the Forecaster. As a result, any predictions   │
│ generated using this matrix will also be in the transformed scale. Please refer to   │
│ the documentation for more details:                                                  │
│ https://skforecast.org/latest/user_guides/training-and-prediction-matrices.html      │
│                                                                                      │
│ Category : spotforecast2.exceptions.DataTransformationWarning                        │
│ Location :                                                                           │
│ /home/runner/work/spotforecast2-safe/spotforecast2-safe/src/spotforecast2_safe/forec │
│ aster/recursive/_forecaster_recursive.py:1468                                        │
│ Suppress : warnings.simplefilter('ignore', category=DataTransformationWarning)       │
╰──────────────────────────────────────────────────────────────────────────────────────╯

100    0.051764
101    1.180056
102    0.128981
103    1.600242
104    1.739395
105    1.804752
106    3.231367
107    3.990533
108    5.370820
109    5.845878
Name: pred, dtype: float64

import numpy as np
import pandas as pd
from sklearn.linear_model import Ridge

from spotforecast2_safe.forecaster.recursive import ForecasterRecursive

rng = np.random.default_rng(2)
y = pd.Series(rng.standard_normal(100), name='target')
exog = pd.DataFrame({'temp': rng.standard_normal(100)}, index=y.index)
forecaster = ForecasterRecursive(
    estimator=Ridge(),
    lags=7,
    forecaster_id='my_forecaster',
)
forecaster.fit(y, exog)
exog_future = pd.DataFrame(
    {'temp': rng.standard_normal(5)},
    index=pd.RangeIndex(start=100, stop=105),
)
predictions = forecaster.predict(steps=5, exog=exog_future)
print(predictions)

100    0.469714
101    0.285839
102    0.235736
103    0.033931
104    0.171666
Name: pred, dtype: float64

import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

from spotforecast2_safe.forecaster.recursive import ForecasterRecursive

rng = np.random.default_rng(3)
y = pd.Series(rng.standard_normal(100), name='y')
forecaster = ForecasterRecursive(
    estimator=GradientBoostingRegressor(random_state=3),
    lags=14,
    binner_kwargs={'n_bins': 15, 'method': 'linear'},
)
forecaster.fit(y, store_in_sample_residuals=True)
predictions = forecaster.predict(steps=5)
print(predictions)

100    0.034171
101    0.407376
102    0.001077
103   -0.138702
104   -0.114340
Name: pred, dtype: float64

Methods

Name	Description
create_predict_X	Create the predictors needed to predict `steps` ahead. As it is a recursive process, the predictors are created at each iteration of the prediction process.
create_sample_weights	Create weights for each observation according to the forecaster’s attribute `weight_func`.
create_train_X_y	Public method to create training predictors and target values.
fit	Fit the forecaster to the training data.
get_feature_importances	Return feature importances of the estimator stored in the forecaster.
get_params	Get parameters for this forecaster.
get_tags	Return the tags that characterize the behavior of the forecaster.
predict	Predict future values recursively for the specified number of steps.
predict_bootstrapping	Generate multiple forecasting predictions using a bootstrapping process.
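predict_dist	Fit a given probability distribution for each step. After generating multiple forecasting predictions through a bootstrapping process, each step is fitted to the given distribution.
predict_interval	Predict n steps ahead and estimate prediction intervals using either bootstrapping or conformal prediction methods.
predict_quantiles	Calculate the specified quantiles for each step. After generating multiple forecasting predictions through a bootstrapping process, each quantile is calculated for each step.
set_fit_kwargs	Set new values for the additional keyword arguments passed to the `fit` method of the estimator.
set_in_sample_residuals	Set in-sample residuals in case they were not calculated during the training process.
set_lags	Set new value to the attribute `lags`.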
set_out_sample_residuals	Set new values to the attribute `out_sample_residuals_`.
set_params	Set the parameters of this forecaster.
set_window_features	Set new value to the attribute `window_features`.
summary	Show forecaster information.

create_predict_X

forecaster.recursive._forecaster_recursive.ForecasterRecursive.create_predict_X(
    steps,
    last_window=None,
    exog=None,
    check_inputs=True,
)

Create the predictors needed to predict steps ahead. As it is a recursive process, the predictors are created at each iteration of the prediction process.

Parameters

Name	Type	Description	Default
steps	int \| str \| pd.Timestamp	Number of steps to predict. If an int, the number of steps to predict; if a str or pandas datetime, predictions run up to that date.	required
last_window	pd.Series \| pd.DataFrame \| None	Series values used to create the predictors (lags) needed in the first iteration of the prediction (t + 1). If `last_window = None`, the values stored in `self.last_window_` are used to calculate the initial predictors, and the predictions start right after training data. Defaults to None.	`None`
exog	pd.Series \| pd.DataFrame \| None	Exogenous variable/s included as predictor/s. Defaults to None.	`None`
check_inputs	bool	If `True`, the input is checked for possible warnings and errors with the `check_predict_input` function. This argument is created for internal use and is not recommended to be changed. Defaults to True.	`True`

Returns

Name	Type	Description
	pd.DataFrame	Pandas DataFrame with the predictors for each step. The index is the same as the prediction index.

Examples

import warnings
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from spotforecast2_safe.forecaster.recursive import ForecasterRecursive

warnings.simplefilter("ignore")
rng = np.random.default_rng(0)
y = pd.Series(
    np.sin(np.linspace(0, 4 * np.pi, 100)), name="y"
)
forecaster = ForecasterRecursive(
    estimator=LinearRegression(),
    lags=4,
)
forecaster.fit(y=y)
X_predict = forecaster.create_predict_X(steps=3)
print(X_predict)
assert isinstance(X_predict, pd.DataFrame)
assert X_predict.shape == (3, 4)

            lag_1         lag_2         lag_3     lag_4
100 -4.898587e-16 -1.265925e-01 -2.511480e-01 -0.371662
101  1.265925e-01 -4.898587e-16 -1.265925e-01 -0.251148
102  2.511480e-01  1.265925e-01 -4.898587e-16 -0.126592

create_sample_weights

forecaster.recursive._forecaster_recursive.ForecasterRecursive.create_sample_weights(
    X_train,
)

Create weights for each observation according to the forecaster’s attribute weight_func.

Parameters

Name	Type	Description	Default
X_train	pd.DataFrame	Dataframe created with the `create_train_X_y` method, first return.	required

Returns

Name	Type	Description
	np.ndarray	Weights to use in `fit` method.
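
A weighting function only needs to map the training index to one weight per observation. The sketch below shows one possible choice, an exponential decay that favours recent observations. The name `exponential_decay_weights` and the `half_life` argument are assumptions made for this illustration; a function passed as `weight_func` would take the index alone and would typically return a NumPy array rather than a list.

```python
from typing import List, Sequence


def exponential_decay_weights(index: Sequence[int], half_life: float = 10.0) -> List[float]:
    """Weight 1.0 for the newest observation, halving every `half_life` positions back."""
    n = len(index)
    return [0.5 ** ((n - 1 - i) / half_life) for i in range(n)]


weights = exponential_decay_weights(range(21), half_life=10.0)
print(weights[-1], weights[10], weights[0])  # 1.0 0.5 0.25
```

To fix `half_life` while keeping a single-argument signature, the function can be wrapped with `functools.partial`.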

Examples

import warnings
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from spotforecast2_safe.forecaster.recursive import ForecasterRecursive

warnings.simplefilter("ignore")
rng = np.random.default_rng(0)
y = pd.Series(
    np.sin(np.linspace(0, 4 * np.pi, 100)), name="y"
)

def linear_weight(index):
    return np.linspace(0.5, 1.0, len(index))

forecaster = ForecasterRecursive(
    estimator=LinearRegression(),
    lags=4,
    weight_func=linear_weight,
)
X_train, _ = forecaster.create_train_X_y(y=y)
weights = forecaster.create_sample_weights(X_train=X_train)
print(f"weights shape: {weights.shape}")
assert weights is not None
assert weights.shape == (len(X_train),)

weights shape: (96,)

create_train_X_y

forecaster.recursive._forecaster_recursive.ForecasterRecursive.create_train_X_y(
    y,
    exog=None,
)

Public method to create training predictors and target values.

This method is a public wrapper around the internal method _create_train_X_y, which generates the training predictors and target values based on the provided time series and exogenous variables. It ensures that the necessary transformations and feature engineering steps are applied to prepare the data for training the forecaster.

Parameters

Name	Type	Description	Default
y	pd.Series	Target series for training. Must be a pandas Series.	required
exog	Union[pd.Series, pd.DataFrame, None]	Optional exogenous variables for training. Can be a pandas Series or DataFrame. Must have the same index as `y` and cover the same time range. Defaults to None.	`None`

Returns

Name	Type	Description
	Tuple[pd.DataFrame, pd.Series]	Tuple containing: - X_train: DataFrame of training predictors including lags, window features, and exogenous variables (if provided). - y_train: Series of target values aligned with the predictors.
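
To make the shape of the training matrices concrete, the following sketch builds lag predictors and the aligned target from a plain list. It is a simplified stand-in written for this page (the name `make_lag_matrix` is hypothetical) and it ignores window features, exogenous variables, and transformations.

```python
from typing import List, Sequence, Tuple


def make_lag_matrix(
    y: Sequence[float], lags: Sequence[int]
) -> Tuple[List[List[float]], List[float]]:
    """Return (X, target) where each X row holds y[t - lag] for every lag."""
    max_lag = max(lags)
    X, target = [], []
    # The first `max_lag` observations have incomplete history and are dropped.
    for t in range(max_lag, len(y)):
        X.append([y[t - lag] for lag in lags])
        target.append(y[t])
    return X, target


X, target = make_lag_matrix([0, 1, 2, 3, 4, 5], lags=[1, 2, 3])
print(X)       # [[2, 1, 0], [3, 2, 1], [4, 3, 2]]
print(target)  # [3, 4, 5]
```

The number of training rows is the series length minus the largest lag, which is why a series of 100 values with `lags=4` yields 96 rows.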

Examples

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from spotforecast2_safe.forecaster.recursive import ForecasterRecursive
from spotforecast2_safe.preprocessing import RollingFeatures

rng = np.random.default_rng(0)
y = pd.Series(np.arange(30), name='y')
exog = pd.DataFrame({'temp': rng.standard_normal(30)}, index=y.index)
forecaster = ForecasterRecursive(
    estimator=LinearRegression(),
    lags=3,
    window_features=[RollingFeatures(stats='mean', window_sizes=3)],
)
X_train, y_train = forecaster.create_train_X_y(y=y, exog=exog)
print(isinstance(X_train, pd.DataFrame))
print(isinstance(y_train, pd.Series))

True
True

fit

forecaster.recursive._forecaster_recursive.ForecasterRecursive.fit(
    y,
    exog=None,
    store_last_window=True,
    store_in_sample_residuals=False,
    random_state=123,
    suppress_warnings=False,
)

Fit the forecaster to the training data.

Parameters

Name	Type	Description	Default
y	pd.Series	Target series for training. Must be a pandas Series.	required
exog	Union[pd.Series, pd.DataFrame, None]	Optional exogenous variables for training. Can be a pandas Series or DataFrame. Must have the same index as `y` and cover the same time range. Defaults to None.	`None`
store_last_window	bool	Whether to store the last window of the training series for use in prediction. Defaults to True.	`True`
store_in_sample_residuals	bool	Whether to store in-sample residuals after fitting, which can be used for certain probabilistic prediction methods. Defaults to False.	`False`
random_state	int	Random seed for reproducibility when sampling residuals if `store_in_sample_residuals` is True. Defaults to 123.	`123`
suppress_warnings	bool	Whether to suppress warnings during fitting, such as those related to insufficient data length for lags or window features. Defaults to False.	`False`

Returns

Name	Type	Description
	None	None

Examples

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from spotforecast2_safe.forecaster.recursive import ForecasterRecursive
from spotforecast2_safe.preprocessing import RollingFeatures

rng = np.random.default_rng(0)
y = pd.Series(np.arange(30), name='y')
exog = pd.DataFrame({'temp': rng.standard_normal(30)}, index=y.index)
forecaster = ForecasterRecursive(
    estimator=LinearRegression(),
    lags=3,
    window_features=[RollingFeatures(stats='mean', window_sizes=3)],
)
forecaster.fit(y=y, exog=exog, store_in_sample_residuals=True)
print(forecaster.is_fitted)

True

get_feature_importances

forecaster.recursive._forecaster_recursive.ForecasterRecursive.get_feature_importances(
    sort_importance=True,
)

Return feature importances of the estimator stored in the forecaster. Only valid when the estimator stores feature importances internally in the attribute feature_importances_ or coef_. Otherwise, returns None.

Parameters

Name	Type	Description	Default
sort_importance	bool	If `True`, sorts the feature importances in descending order.	`True`

Returns

Name	Type	Description
	pd.DataFrame	Feature importances associated with each predictor.

Raises

Name	Type	Description
	NotFittedError	If the forecaster is not fitted.
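
The lookup rule described for this method (use `feature_importances_` or `coef_` when present, otherwise return None) can be pictured as follows. This is a hedged sketch, not the library's code: `extract_importances`, `FakeTree`, and `FakeKNN` are hypothetical names, and the order in which the two attributes are checked is an assumption.

```python
from typing import Dict, Optional, Sequence


def extract_importances(
    estimator: object, feature_names: Sequence[str]
) -> Optional[Dict[str, float]]:
    """Map feature names to importances, or return None if neither attribute exists."""
    for attr in ("feature_importances_", "coef_"):
        values = getattr(estimator, attr, None)
        if values is not None:
            return dict(zip(feature_names, values))
    return None


class FakeTree:  # hypothetical stand-in for a fitted tree-based estimator
    feature_importances_ = [0.7, 0.3]


class FakeKNN:  # hypothetical estimator exposing neither attribute
    pass


tree_result = extract_importances(FakeTree(), ["lag_1", "lag_2"])
knn_result = extract_importances(FakeKNN(), ["lag_1", "lag_2"])
print(tree_result)  # {'lag_1': 0.7, 'lag_2': 0.3}
print(knn_result)   # None
```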

Examples

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from spotforecast2_safe.forecaster.recursive import ForecasterRecursive

forecaster = ForecasterRecursive(estimator=LinearRegression(), lags=3)
forecaster.fit(y=pd.Series(np.arange(20)))
print(forecaster.get_feature_importances())

  feature  importance
0   lag_1    0.333333
1   lag_2    0.333333
2   lag_3    0.333333

get_params

forecaster.recursive._forecaster_recursive.ForecasterRecursive.get_params(
    deep=True,
)

Get parameters for this forecaster.

Parameters

Name	Type	Description	Default
deep	bool	If True, will return the parameters for this forecaster and contained sub-objects that are estimators.	`True`

Returns

Name	Type	Description
params	Dict[str, object]	Dictionary of parameter names mapped to their values.

Examples

from sklearn.linear_model import LinearRegression
from spotforecast2_safe.forecaster.recursive import ForecasterRecursive

forecaster = ForecasterRecursive(estimator=LinearRegression(), lags=3)
params = forecaster.get_params()
print(params['lags'])
print(params['differentiation'])

[1 2 3]
None

get_tags

forecaster.recursive._forecaster_recursive.ForecasterRecursive.get_tags()

Return the tags that characterize the behavior of the forecaster.

Returns

Name	Type	Description
	dict[str, Any]	Dictionary with forecaster tags describing behavior and capabilities.

Examples

from sklearn.linear_model import Ridge
from spotforecast2_safe.forecaster.recursive import ForecasterRecursive

forecaster = ForecasterRecursive(estimator=Ridge(), lags=3)
tags = forecaster.get_tags()
print(tags['forecaster_task'])

regression

predict

forecaster.recursive._forecaster_recursive.ForecasterRecursive.predict(
    steps,
    last_window=None,
    exog=None,
    check_inputs=True,
)

Predict future values recursively for the specified number of steps.

Parameters

Name	Type	Description	Default
steps	int \| str \| pd.Timestamp	Number of future steps to predict. If a str or pandas datetime, predictions run up to that date.	required
last_window	Union[pd.Series, pd.DataFrame, None]	Optional last window of observed values to use for prediction. If None, uses the last window from training. Must be a pandas Series or DataFrame with the same structure as the training target series. Defaults to None.	`None`
exog	Union[pd.Series, pd.DataFrame, None]	Optional exogenous variables for prediction. Can be a pandas Series or DataFrame. Must have the same structure as the exogenous variables used in training. Defaults to None.	`None`
check_inputs	bool	Whether to perform input validation checks. Defaults to True.	`True`

Returns

Name	Type	Description
	pd.Series	Pandas Series of predicted values for the specified number of steps, indexed according to the prediction index constructed from the last window and the number of steps.

Examples

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from spotforecast2_safe.forecaster.recursive import ForecasterRecursive
from spotforecast2_safe.preprocessing import RollingFeatures

rng = np.random.default_rng(0)
y = pd.Series(np.arange(30), name='y')
exog = pd.DataFrame({'temp': rng.standard_normal(30)}, index=y.index)
forecaster = ForecasterRecursive(
    estimator=LinearRegression(),
    lags=3,
    window_features=[RollingFeatures(stats='mean', window_sizes=3)],
)
forecaster.fit(y=y, exog=exog)
last_window = y.iloc[-3:]
exog_future = pd.DataFrame(
    {'temp': rng.standard_normal(5)},
    index=pd.RangeIndex(start=30, stop=35),
)
predictions = forecaster.predict(
    steps=5, last_window=last_window, exog=exog_future, check_inputs=True
)
print(isinstance(predictions, pd.Series))

True

predict_bootstrapping

forecaster.recursive._forecaster_recursive.ForecasterRecursive.predict_bootstrapping(
    steps,
    last_window=None,
    exog=None,
    n_boot=250,
    use_in_sample_residuals=True,
    use_binned_residuals=True,
    random_state=123,
)

Generate multiple forecasting predictions using a bootstrapping process. By sampling from a collection of past observed errors (the residuals), each iteration of bootstrapping generates a different set of predictions. See the References section for more information.
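
A minimal sketch of residual bootstrapping for a recursive forecaster is shown below, assuming residuals are drawn uniformly at random (no binning). The names `bootstrap_forecasts` and `persistence` are hypothetical and the code is not taken from the library.

```python
import random
from typing import Callable, List, Sequence


def bootstrap_forecasts(
    history: Sequence[float],
    predict_one: Callable[[List[float]], float],
    residuals: Sequence[float],
    n_lags: int,
    steps: int,
    n_boot: int,
    seed: int = 123,
) -> List[List[float]]:
    """Return a steps x n_boot table of simulated forecast paths."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_boot):
        window = list(history[-n_lags:])
        path = []
        for _ in range(steps):
            # Point prediction plus a resampled past error.
            y_sim = predict_one(window[::-1]) + rng.choice(residuals)
            path.append(y_sim)
            # The noisy value, not the point prediction, feeds the next step.
            window = window[1:] + [y_sim]
        paths.append(path)
    # Transpose so that rows are steps and columns are bootstrap iterations.
    return [list(row) for row in zip(*paths)]


def persistence(lags: List[float]) -> float:  # hypothetical model: repeat last value
    return lags[0]


boot = bootstrap_forecasts(
    [1.0, 2.0, 3.0], persistence, [-0.5, 0.0, 0.5], n_lags=1, steps=3, n_boot=5
)
print(len(boot), len(boot[0]))  # 3 5
```

Each column is one plausible future path, so the spread across columns at a given step reflects the forecast uncertainty at that step.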

Parameters

Name	Type	Description	Default
steps	int \| str \| pd.Timestamp	Number of steps to predict. If an int, the number of steps to predict; if a str or pandas datetime, predictions run up to that date.	required
last_window	pd.Series \| pd.DataFrame \| None	Series values used to create the predictors (lags) needed in the first iteration of the prediction (t + 1). If `last_window = None`, the values stored in `self.last_window_` are used to calculate the initial predictors, and the predictions start right after training data. Defaults to None.	`None`
exog	pd.Series \| pd.DataFrame \| None	Exogenous variable/s included as predictor/s. Defaults to None.	`None`
n_boot	int	Number of bootstrapping iterations to perform when estimating prediction intervals. Defaults to 250.	`250`
use_in_sample_residuals	bool	If `True`, residuals from the training data are used as proxy of prediction error to create predictions. If `False`, out of sample residuals (calibration) are used. Out-of-sample residuals must be precomputed using Forecaster’s `set_out_sample_residuals()` method. Defaults to True.	`True`
use_binned_residuals	bool	If `True`, residuals are selected based on the predicted values (binned selection). If `False`, residuals are selected randomly. Defaults to True.	`True`
random_state	int	Seed for the random number generator to ensure reproducibility. Defaults to 123.	`123`

Returns

Name	Type	Description
	pd.DataFrame	Pandas DataFrame with predictions generated by bootstrapping. Shape: (steps, n_boot).

Raises

Name	Type	Description
	ValueError	If `steps` is not an integer or a valid date.
	ValueError	If `exog` is missing or has invalid shape.
	ValueError	If `n_boot` is not a positive integer.
	ValueError	If `use_in_sample_residuals=True` and `in_sample_residuals_` are not available.
	ValueError	If `use_in_sample_residuals=False` and `out_sample_residuals_` are not available.
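
The idea behind `use_binned_residuals` is that the error drawn for a prediction comes from residuals observed at similar predicted levels. The sketch below illustrates this with a single fixed cut point. The helper names are hypothetical, and the fixed `edges` are an assumption made for simplicity; the library derives its bins with a QuantileBinner.

```python
import bisect
import random
from typing import Dict, List, Sequence


def bin_residuals(
    predictions: Sequence[float], residuals: Sequence[float], edges: Sequence[float]
) -> Dict[int, List[float]]:
    """Group residuals by the bin that the matching prediction falls into."""
    groups: Dict[int, List[float]] = {i: [] for i in range(len(edges) + 1)}
    for pred, res in zip(predictions, residuals):
        groups[bisect.bisect_right(edges, pred)].append(res)
    return groups


def sample_binned_residual(
    pred: float, groups: Dict[int, List[float]], edges: Sequence[float], rng: random.Random
) -> float:
    """Draw a residual from the bin matching `pred` (the bin must not be empty)."""
    return rng.choice(groups[bisect.bisect_right(edges, pred)])


# Hypothetical in-sample data: errors grow with the predicted level.
train_preds = [1.0, 2.0, 3.0, 8.0, 9.0, 10.0]
train_resid = [0.1, -0.1, 0.2, 2.0, -2.5, 3.0]
edges = [5.0]  # one cut point gives two bins: below 5, and 5 or above

groups = bin_residuals(train_preds, train_resid, edges)
print(groups)  # {0: [0.1, -0.1, 0.2], 1: [2.0, -2.5, 3.0]}
rng = random.Random(0)
drawn = sample_binned_residual(9.5, groups, edges, rng)
print(abs(drawn) >= 2.0)  # True
```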

Examples

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from spotforecast2_safe.forecaster.recursive import ForecasterRecursive

rng = np.random.default_rng(123)
y = pd.Series(rng.standard_normal(size=100), name='y')
forecaster = ForecasterRecursive(estimator=LinearRegression(), lags=3)
_ = forecaster.fit(y=y, store_in_sample_residuals=True)
boot_preds = forecaster.predict_bootstrapping(steps=3, n_boot=5)
print(boot_preds.shape)

(3, 5)

References

[1] Forecasting: Principles and Practice (3rd ed.), Rob J Hyndman and George Athanasopoulos. https://otexts.com/fpp3/prediction-intervals.html

predict_dist

forecaster.recursive._forecaster_recursive.ForecasterRecursive.predict_dist(
    steps,
    distribution,
    last_window=None,
    exog=None,
    n_boot=250,
    use_in_sample_residuals=True,
    use_binned_residuals=True,
    random_state=123,
)

Fit a given probability distribution for each step. After generating multiple forecasting predictions through a bootstrapping process, each step is fitted to the given distribution.
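
For a normal distribution, fitting each step amounts to taking the mean and standard deviation of that step's bootstrap samples. The sketch below shows this special case with the standard library only. `fit_normal_per_step` is a hypothetical helper and the bootstrap table is made-up data; other distributions would need their own fitting routine.

```python
from statistics import fmean, pstdev
from typing import Dict, List, Sequence


def fit_normal_per_step(boot: Sequence[Sequence[float]]) -> List[Dict[str, float]]:
    """Estimate normal `loc` and `scale` for every step (row) of bootstrap samples."""
    # Maximum-likelihood estimates for a normal: sample mean and population std.
    return [{"loc": fmean(row), "scale": pstdev(row)} for row in boot]


# Hypothetical bootstrap output: 2 steps x 4 iterations.
boot_samples = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 2.0, 4.0, 4.0],
]
params = fit_normal_per_step(boot_samples)
print(params[1])  # {'loc': 3.0, 'scale': 1.0}
```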

Parameters

Name	Type	Description	Default
steps	int \| str \| pd.Timestamp	Number of steps to predict. If an int, the number of steps to predict; if a str or pandas datetime, predictions run up to that date.	required
distribution	object	A distribution object from scipy.stats with methods `_pdf` and `fit`. For example, scipy.stats.norm.	required
last_window	pd.Series \| pd.DataFrame \| None	Series values used to create the predictors (lags) needed in the first iteration of the prediction (t + 1). If `last_window = None`, the values stored in `self.last_window_` are used to calculate the initial predictors, and the predictions start right after training data.	`None`
exog	pd.Series \| pd.DataFrame \| None	Exogenous variable/s included as predictor/s.	`None`
n_boot	int	Number of bootstrapping iterations to perform when estimating prediction intervals.	`250`
use_in_sample_residuals	bool	If `True`, residuals from the training data are used as proxy of prediction error to create predictions. If `False`, out of sample residuals (calibration) are used. Out-of-sample residuals must be precomputed using Forecaster’s `set_out_sample_residuals()` method.	`True`
use_binned_residuals	bool	If `True`, residuals are selected based on the predicted values (binned selection). If `False`, residuals are selected randomly.	`True`
random_state	int	Seed for the random number generator to ensure reproducibility.	`123`

Returns

Name	Type	Description
	pd.DataFrame	Distribution parameters estimated for each step.

Examples

import warnings
import numpy as np
import pandas as pd
from scipy.stats import norm
from sklearn.linear_model import LinearRegression
from spotforecast2_safe.forecaster.recursive import ForecasterRecursive

warnings.simplefilter("ignore")
rng = np.random.default_rng(0)
y = pd.Series(
    np.sin(np.linspace(0, 4 * np.pi, 100)), name="y"
)
forecaster = ForecasterRecursive(
    estimator=LinearRegression(),
    lags=4,
)
forecaster.fit(y=y, store_in_sample_residuals=True)
dist_params = forecaster.predict_dist(
    steps=2,
    distribution=norm,
    n_boot=10,
    random_state=1234,
)
print(dist_params)
assert isinstance(dist_params, pd.DataFrame)
assert "loc" in dist_params.columns
assert "scale" in dist_params.columns
assert dist_params.shape[0] == 2

          loc         scale
100  0.126592  4.822606e-16
101  0.251148  9.213848e-16

predict_interval

forecaster.recursive._forecaster_recursive.ForecasterRecursive.predict_interval(
    steps,
    last_window=None,
    exog=None,
    method='bootstrapping',
    interval=[5, 95],
    n_boot=250,
    use_in_sample_residuals=True,
    use_binned_residuals=True,
    random_state=123,
)

Predict n steps ahead and estimate prediction intervals using either bootstrapping or conformal prediction methods. Refer to the References section for additional details on these methods.
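
For intuition on the conformal option, the sketch below follows the textbook split conformal recipe: take a quantile of the absolute residuals and add or subtract it from each point prediction. The function name `conformal_interval` and the data are hypothetical, and the finite-sample rank correction used here is an assumption that may differ from what the library computes.

```python
import math
from typing import List, Sequence, Tuple


def conformal_interval(
    predictions: Sequence[float], residuals: Sequence[float], coverage: float
) -> List[Tuple[float, float, float]]:
    """Return (pred, lower_bound, upper_bound) rows with a symmetric, constant-width band."""
    scores = sorted(abs(r) for r in residuals)
    n = len(scores)
    # Finite-sample corrected rank used in split conformal prediction.
    rank = min(n, math.ceil((n + 1) * coverage))
    half_width = scores[rank - 1]
    return [(p, p - half_width, p + half_width) for p in predictions]


# Hypothetical calibration residuals and point predictions.
calib_residuals = [0.25, -0.5, 0.75, -1.0, 1.25, -1.5, 1.75, -2.0, 2.25]
rows = conformal_interval([10.0, 11.0], calib_residuals, coverage=0.8)
print(rows)  # [(10.0, 8.0, 12.0), (11.0, 9.0, 13.0)]
```

Unlike bootstrapped intervals, the band produced this way is symmetric and has the same width at every step, which is why a symmetric `interval` is required for this method.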

Parameters

Name	Type	Description	Default
steps	int \| str \| pd.Timestamp	Number of steps to predict. If an int, the number of steps to predict; if a str or pandas datetime, predictions run up to that date.	required
last_window	pd.Series \| pd.DataFrame \| None	Series values used to create the predictors (lags) needed in the first iteration of the prediction (t + 1). If `last_window = None`, the values stored in `self.last_window_` are used to calculate the initial predictors, and the predictions start right after training data. Defaults to None.	`None`
exog	pd.Series \| pd.DataFrame \| None	Exogenous variable/s included as predictor/s. Defaults to None.	`None`
method	str	Technique used to estimate prediction intervals. Available options: 'bootstrapping', which generates prediction intervals by bootstrapping [1]; 'conformal', which uses the split conformal prediction method [2]. Defaults to 'bootstrapping'.	`'bootstrapping'`
interval	float \| list[float] \| tuple[float]	Confidence level of the prediction interval. Interpretation depends on the type and the method used. If a `float`, it is the nominal (expected) coverage, between 0 and 1; for instance, `interval=0.95` corresponds to the `[2.5, 97.5]` percentiles. If a `list` or `tuple`, it gives the exact percentiles to compute, which must be between 0 and 100 inclusive; for example, a 95% interval is `interval=[2.5, 97.5]`. With `method='conformal'`, the interval must be a float or a list/tuple defining a symmetric interval. Defaults to [5, 95].	`[5, 95]`
n_boot	int	Number of bootstrapping iterations to perform when estimating prediction intervals. Defaults to 250.	`250`
use_in_sample_residuals	bool	If `True`, residuals from the training data are used as proxy of prediction error to create predictions. If `False`, out of sample residuals (calibration) are used. Out-of-sample residuals must be precomputed using Forecaster’s `set_out_sample_residuals()` method. Defaults to True.	`True`
use_binned_residuals	bool	If `True`, residuals are selected based on the predicted values (binned selection). If `False`, residuals are selected randomly. Defaults to True.	`True`
random_state	int	Seed for the random number generator to ensure reproducibility. Defaults to 123.	`123`

Returns

Name	Type	Description
	pd.DataFrame	Pandas DataFrame with the values predicted by the forecaster and their estimated interval. Columns: pred (predictions), lower_bound (lower bound of the interval), upper_bound (upper bound of the interval).

Raises

Name	Type	Description
	ValueError	If `method` is not 'bootstrapping' or 'conformal'.
	ValueError	If `interval` is invalid or not compatible with the chosen method.
	ValueError	If inputs (`steps`, `exog`, etc.) are invalid.

Examples

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from spotforecast2_safe.forecaster.recursive import ForecasterRecursive

rng = np.random.default_rng(123)
y = pd.Series(rng.standard_normal(size=100), name='y')
forecaster = ForecasterRecursive(estimator=LinearRegression(), lags=3)
_ = forecaster.fit(y=y, store_in_sample_residuals=True)
intervals_boot = forecaster.predict_interval(
    steps=3, method='bootstrapping', interval=[5, 95]
)
print(intervals_boot.columns.tolist())

['pred', 'lower_bound', 'upper_bound']

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from spotforecast2_safe.forecaster.recursive import ForecasterRecursive

rng = np.random.default_rng(123)
y = pd.Series(rng.standard_normal(size=100), name='y')
forecaster = ForecasterRecursive(estimator=LinearRegression(), lags=3)
_ = forecaster.fit(y=y, store_in_sample_residuals=True)
intervals_conf = forecaster.predict_interval(
    steps=3, method='conformal', interval=0.95
)
print(intervals_conf.columns.tolist())

['pred', 'lower_bound', 'upper_bound']

References

[1] Rob J Hyndman and George Athanasopoulos. Forecasting: Principles and Practice (3rd ed). https://otexts.com/fpp3/prediction-intervals.html
[2] MAPIE - Model Agnostic Prediction Interval Estimator. https://mapie.readthedocs.io/en/stable/theoretical_description_regression.html#the-split-method

predict_quantiles

forecaster.recursive._forecaster_recursive.ForecasterRecursive.predict_quantiles(
    steps,
    last_window=None,
    exog=None,
    quantiles=[0.05, 0.5, 0.95],
    n_boot=250,
    use_in_sample_residuals=True,
    use_binned_residuals=True,
    random_state=123,
)

Calculate the specified quantiles for each step. After generating multiple forecasting predictions through a bootstrapping process, each quantile is calculated for each step.
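
As a rough illustration of the second stage described above, the sketch below computes per-step quantiles from a matrix of bootstrapped predictions with NumPy. The `boot_preds` array is simulated, and the `q_<quantile>` column naming is inferred from the example output in this section; this is not the library's internal code.

```python
import numpy as np

rng = np.random.default_rng(123)

steps, n_boot = 3, 250
# Simulated stand-in for bootstrapped forecasts: one row per step,
# one column per bootstrap iteration (hypothetical data).
boot_preds = rng.normal(
    loc=[[10.0], [11.0], [12.0]], scale=1.0, size=(steps, n_boot)
)

quantile_levels = [0.05, 0.5, 0.95]
# Quantiles are taken across bootstrap iterations (axis=1),
# independently for each step; transpose to get (steps, quantiles).
step_quantiles = np.quantile(boot_preds, q=quantile_levels, axis=1).T

column_names = [f"q_{q}" for q in quantile_levels]
print(column_names)
print(step_quantiles.shape)
```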

Parameters

Name	Type	Description	Default
steps	int \| str \| pd.Timestamp	Number of steps to predict. If `int`, the number of steps ahead to predict. If `str` or `pd.Timestamp`, predictions are made up to that date.	required
last_window	pd.Series \| pd.DataFrame \| None	Series values used to create the predictors (lags) needed in the first iteration of the prediction (t + 1). If `last_window = None`, the values stored in `self.last_window_` are used to calculate the initial predictors, and the predictions start right after the training data.	`None`
exog	pd.Series \| pd.DataFrame \| None	Exogenous variable/s included as predictor/s.	`None`
quantiles	list[float] \| tuple[float]	Sequence of quantiles to compute, each between 0 and 1 inclusive. For example, the quantiles 0.05, 0.5 and 0.95 are passed as `quantiles = [0.05, 0.5, 0.95]`.	`[0.05, 0.5, 0.95]`
n_boot	int	Number of bootstrapping iterations to perform when estimating quantiles.	`250`
use_in_sample_residuals	bool	If `True`, residuals from the training data are used as a proxy for the prediction error when creating predictions. If `False`, out-of-sample (calibration) residuals are used; these must be precomputed with the forecaster's `set_out_sample_residuals()` method.	`True`
use_binned_residuals	bool	If `True`, residuals are selected based on the predicted values (binned selection). If `False`, residuals are selected randomly.	`True`
random_state	int	Seed for the random number generator to ensure reproducibility.	`123`

Returns

Name	Type	Description
	pd.DataFrame	Quantiles predicted by the forecaster.

Examples

import warnings
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from spotforecast2_safe.forecaster.recursive import ForecasterRecursive

warnings.simplefilter("ignore")
y = pd.Series(
    np.sin(np.linspace(0, 4 * np.pi, 100)), name="y"
)
forecaster = ForecasterRecursive(
    estimator=LinearRegression(),
    lags=4,
)
forecaster.fit(y=y, store_in_sample_residuals=True)
quantiles = forecaster.predict_quantiles(
    steps=3,
    quantiles=[0.1, 0.5, 0.9],
    n_boot=10,
    random_state=1234,
)
print(quantiles)
assert isinstance(quantiles, pd.DataFrame)
assert list(quantiles.columns) == ["q_0.1", "q_0.5", "q_0.9"]
assert quantiles.shape == (3, 3)

        q_0.1     q_0.5     q_0.9
100  0.126592  0.126592  0.126592
101  0.251148  0.251148  0.251148
102  0.371662  0.371662  0.371662

All three quantiles coincide in this output because a uniformly sampled sine obeys an exact linear recurrence in its lags, so the linear model fits the training data almost perfectly and the bootstrapped residuals are effectively zero.

set_fit_kwargs

forecaster.recursive._forecaster_recursive.ForecasterRecursive.set_fit_kwargs(
    fit_kwargs,
)

Set new values for the additional keyword arguments passed to the fit method of the estimator.

Parameters

Name	Type	Description	Default
fit_kwargs	dict[str, object]	Dict of the form `{"argument": new_value}`.	required

Examples

import numpy as np
import pandas as pd
from lightgbm import LGBMRegressor
from spotforecast2_safe.forecaster.recursive import ForecasterRecursive

forecaster = ForecasterRecursive(
    estimator=LGBMRegressor(n_estimators=10, random_state=1234, verbose=-1),
    lags=4,
)
forecaster.set_fit_kwargs({"categorical_feature": "auto"})
print(forecaster.fit_kwargs)
assert "categorical_feature" in forecaster.fit_kwargs

{'categorical_feature': 'auto'}

set_in_sample_residuals

forecaster.recursive._forecaster_recursive.ForecasterRecursive.set_in_sample_residuals(
    y,
    exog=None,
    random_state=123,
)

Set in-sample residuals in case they were not calculated during the training process.

In-sample residuals are calculated as the difference between the true values and the predictions made by the forecaster using the training data. The following internal attributes are updated:

in_sample_residuals_: residuals stored in a numpy ndarray.
binner_intervals_: intervals used to bin the residuals are calculated using the quantiles of the predicted values.
in_sample_residuals_by_bin_: residuals are binned according to the predicted value they are associated with and stored in a dictionary, where the keys are the intervals of the predicted values and the values are the residuals associated with that range.

At most 10_000 residuals are stored in the attribute in_sample_residuals_. If more than 10_000 residuals are available, a random sample of 10_000 is stored. The number of residuals stored per bin is limited to 10_000 // self.binner.n_bins_.
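
The storage limits described above can be sketched as follows. This is a minimal illustration under stated assumptions: `n_bins` stands in for `self.binner.n_bins_`, the data are simulated, and the quantile-based binning with `np.digitize` is a simplification that may not match the forecaster's binner.

```python
import numpy as np

rng = np.random.default_rng(123)
max_residuals = 10_000  # cap stated in the text above
n_bins = 10             # hypothetical stand-in for self.binner.n_bins_

# Simulated predictions and residuals for a long training series.
y_pred = rng.normal(size=25_000)
residuals = rng.normal(scale=0.5, size=25_000)

# Bin edges from the quantiles of the predicted values.
edges = np.quantile(y_pred, np.linspace(0, 1, n_bins + 1))
# Use interior edges only, so every value falls in a bin 0..n_bins-1.
bin_ids = np.digitize(y_pred, edges[1:-1])

max_per_bin = max_residuals // n_bins
residuals_by_bin = {}
for b in range(n_bins):
    in_bin = residuals[bin_ids == b]
    if len(in_bin) > max_per_bin:
        # Keep a random sample when the bin exceeds its limit.
        in_bin = rng.choice(in_bin, size=max_per_bin, replace=False)
    residuals_by_bin[b] = in_bin

# Overall cap: random sample when more than max_residuals are available.
if len(residuals) > max_residuals:
    stored = rng.choice(residuals, size=max_residuals, replace=False)
else:
    stored = residuals

print(len(stored), {b: len(r) for b, r in residuals_by_bin.items()})
```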

Parameters

Name	Type	Description	Default
y	pd.Series	Target time series.	required
exog	pd.Series \| pd.DataFrame \| None	Exogenous variables.	`None`
random_state	int	Random state for reproducibility.	`123`

Returns

Name	Type	Description
	None	None

Raises

Name	Type	Description
	NotFittedError	If the forecaster is not fitted.
	IndexError	If the index range of `y` does not match the range used during training.
	ValueError	If the features generated from the provided data do not match those used during the training process.

Examples

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from spotforecast2_safe.forecaster.recursive import ForecasterRecursive

forecaster = ForecasterRecursive(estimator=LinearRegression(), lags=3)
forecaster.fit(y=pd.Series(np.arange(20)), store_in_sample_residuals=False)
forecaster.set_in_sample_residuals(y=pd.Series(np.arange(20)))
print(forecaster.in_sample_residuals_)

[0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
 0.00000000e+00 8.88178420e-16 0.00000000e+00 0.00000000e+00
 0.00000000e+00 1.77635684e-15 1.77635684e-15 0.00000000e+00
 0.00000000e+00 1.77635684e-15 0.00000000e+00 0.00000000e+00
 0.00000000e+00]

set_lags

forecaster.recursive._forecaster_recursive.ForecasterRecursive.set_lags(
    lags=None,
)

Set new value to the attribute lags. Attributes lags_names, max_lag and window_size are also updated.

Parameters

Name	Type	Description	Default
lags	Union[int, List[int], np.ndarray, range, None]	Lags used as predictors. Index starts at 1, so lag 1 is equal to t-1. - `int`: include lags from 1 to `lags` (included). - `list`, `1d numpy ndarray` or `range`: include only lags present in `lags`, all elements must be int. - `None`: no lags are included as predictors.	`None`
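
The accepted forms of `lags` can be illustrated with a small normalization routine. `normalize_lags` is a hypothetical helper written for this example, not part of the library; it only mirrors the rules listed in the table above, and the forecaster's own validation may be stricter or differ in detail.

```python
import numpy as np


def normalize_lags(lags):
    """Hypothetical helper: return (lags_array, max_lag) per the rules above."""
    if lags is None:
        # No lags are used as predictors.
        return None, None
    if isinstance(lags, (int, np.integer)):
        if lags < 1:
            raise ValueError("An integer `lags` must be >= 1.")
        # An int means "lags from 1 to `lags`, both included".
        lags = np.arange(1, lags + 1)
    else:
        lags = np.asarray(list(lags))
        if lags.ndim != 1 or not np.issubdtype(lags.dtype, np.integer):
            raise TypeError("`lags` must be 1-dimensional and contain only int.")
        if lags.min() < 1:
            raise ValueError("Lags start at 1 (lag 1 is t-1).")
    return lags, int(lags.max())


print(normalize_lags(4))
print(normalize_lags([1, 2, 6]))
print(normalize_lags(range(1, 4)))
```

When only lags are used, the largest lag determines how many past observations are needed to build one row of predictors, which is consistent with the `window_size` values shown in the example below.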

Examples

import numpy as np
from sklearn.linear_model import LinearRegression
from spotforecast2_safe.forecaster.recursive import ForecasterRecursive

forecaster = ForecasterRecursive(
    estimator=LinearRegression(),
    lags=4,
)
print(f"before: lags={forecaster.lags}, window_size={forecaster.window_size}")
forecaster.set_lags(lags=[1, 2, 6])
print(f"after:  lags={forecaster.lags}, window_size={forecaster.window_size}")
assert list(forecaster.lags) == [1, 2, 6]
assert forecaster.window_size == 6

before: lags=[1 2 3 4], window_size=4
after:  lags=[1 2 6], window_size=6

set_out_sample_residuals

forecaster.recursive._forecaster_recursive.ForecasterRecursive.set_out_sample_residuals(
    y_true,
    y_pred,
    append=False,
    random_state=123,
)

Set new values to the attribute out_sample_residuals_.

Out-of-sample residuals are meant to be calculated using observations that did not participate in the training process. y_true and y_pred are expected to be in the original scale of the time series. Residuals are calculated as y_true - y_pred, after applying the necessary transformations and differentiations if the forecaster includes them (self.transformer_y and self.differentiation). Two internal attributes are updated:

out_sample_residuals_: residuals stored in a numpy ndarray.
out_sample_residuals_by_bin_: residuals are binned according to the predicted value they are associated with and stored in a dictionary, where the keys are the intervals of the predicted values and the values are the residuals associated with that range. If a bin is empty, it is filled with a random sample of residuals from other bins. This is done to ensure that all bins have at least one residual and can be used in the prediction process.

At most 10_000 residuals are stored in the attribute out_sample_residuals_. If more than 10_000 residuals are available, a random sample of 10_000 is stored. The number of residuals stored per bin is limited to 10_000 // self.binner.n_bins_.
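
The binning and empty-bin filling described above can be sketched as follows. The bin `edges` are hypothetical (in the forecaster they would come from the binner fitted during training), the fill size of three residuals is an arbitrary choice for the example, and the sketch ignores `transformer_y` and `differentiation`.

```python
import numpy as np

rng = np.random.default_rng(123)

y_true = np.array([20, 21, 22, 23, 24])
y_pred = np.array([20.1, 20.9, 22.2, 22.8, 24.0])
residuals = y_true - y_pred

# Hypothetical bin edges over the predicted values.
edges = np.array([19.0, 21.0, 23.0, 25.0, 27.0])
n_bins = len(edges) - 1
# Map each prediction to a bin index, clipping values outside the edges.
bin_ids = np.clip(np.digitize(y_pred, edges) - 1, 0, n_bins - 1)

residuals_by_bin = {b: residuals[bin_ids == b] for b in range(n_bins)}

# Fill empty bins with a random sample drawn from the non-empty bins,
# so that every bin can be used at prediction time.
pooled = np.concatenate([r for r in residuals_by_bin.values() if len(r) > 0])
for b, r in list(residuals_by_bin.items()):
    if len(r) == 0:
        residuals_by_bin[b] = rng.choice(
            pooled, size=min(len(pooled), 3), replace=False
        )

print({b: len(r) for b, r in residuals_by_bin.items()})
```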

Parameters

Name	Type	Description	Default
y_true	np.ndarray \| pd.Series	True values of the time series in the original scale.	required
y_pred	np.ndarray \| pd.Series	Predicted values of the time series in the original scale.	required
append	bool	If `True`, new residuals are added to the ones already stored in the forecaster. If, after appending the new residuals, the limit of `10_000 // self.binner.n_bins_` values per bin is reached, a random sample of residuals is stored.	`False`
random_state	int	Random state for reproducibility.	`123`

Returns

Name	Type	Description
	None	None

Raises

Name	Type	Description
	NotFittedError	If the forecaster is not fitted.
	TypeError	If `y_true` or `y_pred` are not `numpy ndarray` or `pandas Series`.
	ValueError	If `y_true` and `y_pred` have different length or index (if Series).

Examples

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from spotforecast2_safe.forecaster.recursive import ForecasterRecursive

forecaster = ForecasterRecursive(estimator=LinearRegression(), lags=3)
forecaster.fit(y=pd.Series(np.arange(20)), store_in_sample_residuals=False)
y_true = np.array([20, 21, 22, 23, 24])
y_pred = np.array([20.1, 20.9, 22.2, 22.8, 24.0])
forecaster.set_out_sample_residuals(y_true=y_true, y_pred=y_pred)
print(forecaster.out_sample_residuals_)

[-0.1  0.1 -0.2  0.2  0. ]

set_params

forecaster.recursive._forecaster_recursive.ForecasterRecursive.set_params(
    params=None,
    **kwargs,
)

Set the parameters of this forecaster.

Parameters

Name	Type	Description	Default
params	Dict[str, object]	Optional dictionary of parameter names mapped to their new values. If provided, these parameters are set first.	`None`
**kwargs	object	Dictionary of parameter names mapped to their new values. Parameters can be for the forecaster itself or for the contained estimator (using the `estimator__` prefix).	`{}`
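
One way to picture how `params` and `**kwargs` combine, and how the `estimator__` prefix separates the two targets, is the sketch below. `split_params` is a hypothetical helper written for this example and is not the library's implementation; it assumes that keyword arguments override entries in `params` because `params` is applied first.

```python
def split_params(params=None, **kwargs):
    """Hypothetical helper: separate forecaster and estimator parameters."""
    # `params` is applied first, so keyword arguments take precedence.
    merged = {**(params or {}), **kwargs}
    prefix = "estimator__"
    estimator_params = {
        key[len(prefix):]: value
        for key, value in merged.items()
        if key.startswith(prefix)
    }
    forecaster_params = {
        key: value
        for key, value in merged.items()
        if not key.startswith(prefix)
    }
    return forecaster_params, estimator_params


forecaster_params, estimator_params = split_params(
    {"lags": 3, "estimator__alpha": 1.0},
    estimator__alpha=0.5,
)
print(forecaster_params)
print(estimator_params)
```

In this picture, the estimator-level dictionary is what would be forwarded to the scikit-learn estimator's own `set_params` method.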

Returns

Name	Type	Description
self	'ForecasterRecursive'	The forecaster instance with updated parameters.

Examples

from sklearn.linear_model import LinearRegression
from spotforecast2_safe.forecaster.recursive import ForecasterRecursive

forecaster = ForecasterRecursive(estimator=LinearRegression(), lags=3)
forecaster.set_params(estimator__fit_intercept=False)
print(forecaster.estimator.get_params()["fit_intercept"])

False

set_window_features

forecaster.recursive._forecaster_recursive.ForecasterRecursive.set_window_features(
    window_features=None,
)

Set new value to the attribute window_features.

Attributes max_size_window_features, window_features_names, window_features_class_names and window_size are also updated.

Parameters

Name	Type	Description	Default
window_features	object \| list[object] \| None	Instance or list of instances used to create window features. Window features are created from the original time series and are included as predictors.	`None`

Returns

Name	Type	Description
	None	None

Examples

from sklearn.linear_model import LinearRegression
from spotforecast2_safe.forecaster.recursive import ForecasterRecursive
from spotforecast2_safe.preprocessing import RollingFeatures

forecaster = ForecasterRecursive(estimator=LinearRegression(), lags=3)
rolling = RollingFeatures(stats=['mean', 'std'], window_sizes=[3, 5])
forecaster.set_window_features(window_features=rolling)
print(forecaster.window_features_names)
print(forecaster.window_size)

['roll_mean_3', 'roll_std_3', 'roll_mean_5', 'roll_std_5']
5

summary

forecaster.recursive._forecaster_recursive.ForecasterRecursive.summary()

Show forecaster information.

Returns

Name	Type	Description
	None	None

Examples

from sklearn.linear_model import Ridge
from spotforecast2_safe.forecaster.recursive import ForecasterRecursive

forecaster = ForecasterRecursive(estimator=Ridge(), lags=3)
forecaster.summary()

=================== 
ForecasterRecursive 
=================== 
Estimator: Ridge 
Lags: [1 2 3] 
Window features: None 
Window size: 3 
Series name: None 
Exogenous included: False 
Exogenous names: None 
Transformer for y: None 
Transformer for exog: None 
Weight function included: False 
Differentiation order: None 
Training range: None 
Training index type: None 
Training index frequency: None 
Estimator parameters: {'alpha': 1.0, 'copy_X': True, 'fit_intercept': True, 'max_iter': None, 'positive': False, 'random_state': None, 'solver': 'auto', 'tol': 0.0001} 
fit_kwargs: {} 
Creation date: 2026-06-15 22:53:03 
Last fit date: None 
spotforecast version: 22.10.1 
Python version: 3.13.13 
Forecaster id: None