Compute rolling window statistics over time series data.
This transformer computes rolling statistics (mean, std, min, max, sum, median) over windows of specified sizes from a time series. The class follows the scikit-learn transformer API with fit() and transform() methods, making it compatible with scikit-learn pipelines. It also provides transform_batch() for pandas Series input.
stats: Rolling statistics to compute. Can be a single string ('mean', 'std', 'min', 'max', 'sum', 'median'), a list of statistic names, or a list of callable functions. Multiple statistics can be computed simultaneously.
features_names: Custom names for output features. If None, names are auto-generated from statistic names and window sizes (e.g., 'roll_mean_7', 'roll_std_14'). Defaults to None.
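The auto-generated naming pattern can be sketched in a few lines. This is an illustration only, not the library's code: `default_feature_names` is a hypothetical helper, and it assumes that each statistic is paired with one window size and that a single integer window size is shared by all statistics.

```python
def default_feature_names(stats, window_sizes):
    """Hypothetical helper: build 'roll_<stat>_<window>' names."""
    # Accept a single statistic name as well as a list of names.
    if isinstance(stats, str):
        stats = [stats]
    # Assumption: one integer window size applies to every statistic.
    if isinstance(window_sizes, int):
        window_sizes = [window_sizes] * len(stats)
    if len(stats) != len(window_sizes):
        raise ValueError("stats and window_sizes must have the same length")
    return [f"roll_{stat}_{window}" for stat, window in zip(stats, window_sizes)]


names_paired = default_feature_names(["mean", "std"], [7, 14])
names_shared = default_feature_names(["mean", "max"], 5)
print(names_paired)
print(names_shared)
```

The sketch handles string statistic names only; how callables are named is not covered here.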
Attributes
stats: Statistics specification as provided during initialization.
window_sizes: List of window sizes for rolling computation.
features_names: List of output feature names.
stats_funcs: List of compiled/numba-optimized statistical functions.
Note
Output contains NaN values for positions where the rolling window cannot be fully computed (first window_size-1 positions).
Statistics are computed using numba-optimized JIT functions for performance.
The transformer returns numpy arrays from transform() and pandas DataFrames from transform_batch() to maintain index alignment.
Supports custom user-defined functions in the stats parameter.
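To make the NaN padding concrete, here is a minimal NumPy sketch of a full-length rolling mean. It is not the library's numba implementation; `rolling_mean_with_nan` is a hypothetical function that only mirrors the documented behaviour, where the first window_size-1 positions have no complete window.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def rolling_mean_with_nan(y, window):
    """Hypothetical helper: rolling mean padded with NaN at the start."""
    y = np.asarray(y, dtype=float)
    out = np.full(y.shape[0], np.nan)
    if y.shape[0] >= window:
        # Each row of the view is one complete window of the series.
        out[window - 1:] = sliding_window_view(y, window).mean(axis=1)
    return out


result = rolling_mean_with_nan(np.arange(10, dtype=float), window=5)
print(result)
```

With a window of 5, the first four entries stay NaN and the fifth is the mean of [0, 1, 2, 3, 4].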
Examples
import numpy as np
from spotforecast2_safe.preprocessing.rolling import RollingFeatures

# Single statistic and window size: transform() returns the last
# window's statistic as a 1D array of shape (n_features,).
y = np.arange(10, dtype=float)
rf = RollingFeatures(stats='mean', window_sizes=3)
features = rf.fit(y).transform(y)
print("transform output:", features)
assert features.shape == (1,)
# Mean of last window [7, 8, 9] = 8.0
assert float(features[0]) == 8.0
import numpy as np
import pandas as pd
from spotforecast2_safe.preprocessing.rolling import RollingFeatures

# Use transform_batch() with a pandas Series to preserve the index.
dates = pd.date_range('2024-01-01', periods=10, freq='D')
y_series = pd.Series(np.arange(10, dtype=float), index=dates)
rf = RollingFeatures(stats=['mean', 'max'], window_sizes=5)
features_df = rf.transform_batch(y_series)
print(features_df.head())
assert features_df.shape == (10, 2)
assert features_df.index.equals(y_series.index)
            roll_mean_5  roll_max_5
2024-01-01          NaN         NaN
2024-01-02          NaN         NaN
2024-01-03          NaN         NaN
2024-01-04          NaN         NaN
2024-01-05          2.0         4.0
np.ndarray: Array of rolling statistics.
- If X is 1D: shape (n_features,), holding the statistics over the last window of the series.
- If X is 2D: shape (X.shape[1], n_features), used for vectorized bootstrap.
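The two return shapes can be illustrated with a short NumPy sketch. It is written under assumptions rather than taken from the library: `last_window_stats` is a hypothetical function, it uses a single window size with plain NumPy reductions, and it treats each column of a 2D input as one bootstrap series.

```python
import numpy as np


def last_window_stats(X, window, funcs=(np.mean, np.max)):
    """Hypothetical helper: statistics over the last `window` rows of X."""
    X = np.asarray(X, dtype=float)
    tail = X[-window:]  # last window along the time axis
    if X.ndim == 1:
        # One value per statistic: shape (n_features,)
        return np.array([f(tail) for f in funcs])
    # One row per column of X: shape (X.shape[1], n_features)
    return np.column_stack([f(tail, axis=0) for f in funcs])


y = np.arange(10, dtype=float)
out_1d = last_window_stats(y, window=3)

X = np.column_stack([y, y + 10, y + 100])  # three series, shape (10, 3)
out_2d = last_window_stats(X, window=3)

print(out_1d.shape, out_2d.shape)
```

Reducing along axis 0 handles all columns in one call per statistic, which is the kind of vectorization that makes the 2D form convenient for bootstrap samples.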
Examples
import numpy as np
from spotforecast2_safe.preprocessing.rolling import RollingFeatures

# 1D input: returns a 1D array of shape (n_features,) with
# the statistic computed over the last window of the series.
y = np.arange(10, dtype=float)
rf = RollingFeatures(stats='mean', window_sizes=3)
out = rf.transform(y)
print("transform output:", out, "shape:", out.shape)
# Mean of last window [7, 8, 9] = 8.0
assert out.shape == (1,)
assert float(out[0]) == 8.0
            roll_mean_3  roll_std_3  roll_mean_5  roll_std_5
2024-01-01          NaN         NaN          NaN         NaN
2024-01-02          NaN         NaN          NaN         NaN
2024-01-03          1.0    0.816497          NaN         NaN
2024-01-04          2.0    0.816497          NaN         NaN
2024-01-05          3.0    0.816497          2.0    1.414214
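As a cross-check, the table above can be reproduced with plain pandas, without RollingFeatures. This is a reference sketch only. It assumes the same ten-day series used in the earlier examples and a population standard deviation (ddof=0), which is what the printed values 0.816497 and 1.414214 correspond to; `reference_df` is a name introduced here for illustration.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("2024-01-01", periods=10, freq="D")
y_series = pd.Series(np.arange(10, dtype=float), index=dates)

frames = {}
for window in (3, 5):
    roller = y_series.rolling(window)
    frames[f"roll_mean_{window}"] = roller.mean()
    # Assumption: ddof=0 (population std), inferred from the values above.
    frames[f"roll_std_{window}"] = roller.std(ddof=0)

reference_df = pd.DataFrame(frames)
print(reference_df.head())
```

Comparing a transformer's batch output against a reference like this is a quick way to confirm both the column order and the degrees-of-freedom convention.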