preprocessing._rolling

Classes

Name Description
RollingFeatures Compute rolling window statistics over time series data.

RollingFeatures

preprocessing._rolling.RollingFeatures(stats, window_sizes, features_names=None)

Compute rolling window statistics over time series data.

This transformer computes rolling statistics (mean, std, min, max, sum, median) over windows of specified sizes from a time series. The class follows the scikit-learn transformer API with fit() and transform() methods, making it compatible with scikit-learn pipelines. It also provides transform_batch() for pandas Series input.

Parameters

Name Type Description Default
stats str | List[str] | List[Any] Rolling statistics to compute. Can be a single string (‘mean’, ‘std’, ‘min’, ‘max’, ‘sum’, ‘median’), a list of statistic names, or a list of callables. Several statistics can be requested at once. required
window_sizes int | List[int] Window size(s) for the rolling computation. Can be a single integer or a list of integers. Every statistic is computed for every window size. required
features_names List[str] | None Custom names for output features. If None, names are auto-generated from statistic names and window sizes (e.g., ‘roll_mean_7’, ‘roll_std_14’). None
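
The default naming pattern can be sketched as follows. This is an illustration inferred from the example names in the table ('roll_mean_7', 'roll_std_14'), not the library's implementation: `default_feature_names` is a hypothetical helper, and the window-major ordering is an assumption.

```python
from typing import List, Union


def default_feature_names(
    stats: Union[str, List[str]],
    window_sizes: Union[int, List[int]],
) -> List[str]:
    """Hypothetical helper: build names like 'roll_mean_7'."""
    # Accept scalars as well as lists, as the parameters table describes.
    stats_list = [stats] if isinstance(stats, str) else list(stats)
    windows = [window_sizes] if isinstance(window_sizes, int) else list(window_sizes)
    # Assumed ordering: all statistics for the first window, then the next window.
    return [f"roll_{stat}_{w}" for w in windows for stat in stats_list]


names = default_feature_names(["mean", "std"], [7, 14])
print(names)  # ['roll_mean_7', 'roll_std_7', 'roll_mean_14', 'roll_std_14']
```

The number of generated names is always the number of statistics multiplied by the number of window sizes, which is also the number of output columns.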

Attributes

Name Type Description
stats Statistics specification as provided during initialization.
window_sizes List of window sizes for rolling computation.
features_names List of output feature names.
stats_funcs List of compiled/numba-optimized statistical functions.

Note

  • Output contains NaN values for positions where the rolling window cannot be fully computed (first window_size-1 positions).
  • Statistics are computed with Numba JIT-compiled functions for performance.
  • transform() returns a NumPy array; transform_batch() returns a pandas DataFrame that keeps the index of the input Series.
  • Supports custom user-defined functions in the stats parameter.
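
The NaN convention described in the first note can be reproduced with plain NumPy. The sketch below is not the library's Numba implementation: `rolling_stat` is a hypothetical helper that assumes a trailing window ending at the current position and a statistic that accepts an `axis` argument.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def rolling_stat(y, window, func=np.mean):
    """Hypothetical helper: trailing-window statistic with NaN warm-up."""
    y = np.asarray(y, dtype=float)
    out = np.full(y.shape[0], np.nan)
    if window <= y.shape[0]:
        # Each row of `windows` holds one complete window of `window` values.
        windows = sliding_window_view(y, window)
        # The first complete window ends at position window - 1.
        out[window - 1:] = func(windows, axis=1)
    return out


y = np.arange(1.0, 11.0)  # 1.0, 2.0, ..., 10.0
roll_mean_3 = rolling_stat(y, window=3)
print(roll_mean_3[:4])  # the first window - 1 = 2 entries are NaN
```

Passing a different reducer, for example `rolling_stat(y, 3, np.max)`, gives a rough idea of how a user-defined statistic could plug into the same loop. The exact signature that RollingFeatures expects from custom callables is not specified here.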

Examples

Create a transformer with a single statistic and window size:

>>> import numpy as np
>>> from spotforecast2.preprocessing import RollingFeatures
>>> y = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0])
>>> rf = RollingFeatures(stats='mean', window_sizes=3)
>>> _ = rf.fit(y)  # fit() returns self; assign it so no repr is echoed
>>> features = rf.transform(y)
>>> features.shape
(10, 1)
>>> features[:4]  # First 2 values are NaN (window_size - 1)
array([[nan],
       [nan],
       [2.],
       [3.]])

Create a transformer with multiple statistics and window sizes:

>>> rf = RollingFeatures(
...     stats=['mean', 'std', 'min', 'max'],
...     window_sizes=[3, 7]
... )
>>> _ = rf.fit(y)
>>> features = rf.transform(y)
>>> features.shape  # 4 stats × 2 window sizes
(10, 8)
>>> rf.features_names
['roll_mean_3', 'roll_std_3', 'roll_min_3', 'roll_max_3',
 'roll_mean_7', 'roll_std_7', 'roll_min_7', 'roll_max_7']

Use with pandas Series to preserve index:

>>> import pandas as pd
>>> dates = pd.date_range('2024-01-01', periods=10, freq='D')
>>> y_series = pd.Series(y, index=dates)
>>> rf = RollingFeatures(stats=['mean', 'max'], window_sizes=5)
>>> features_df = rf.transform_batch(y_series)
>>> features_df.shape
(10, 2)
>>> features_df.index.equals(y_series.index)
True

Use with custom feature names:

>>> rf = RollingFeatures(
...     stats='mean',
...     window_sizes=[7, 14, 30],
...     features_names=['ma_7', 'ma_14', 'ma_30']
... )
>>> _ = rf.fit(y)
>>> rf.features_names
['ma_7', 'ma_14', 'ma_30']

Methods

Name Description
fit Fit the rolling features transformer (no-op).
transform Compute rolling window statistics from time series data.
transform_batch Compute rolling features from a pandas Series with index preservation.

fit

preprocessing._rolling.RollingFeatures.fit(X, y=None)

Fit the rolling features transformer (no-op).

This transformer does not learn any parameters from the data. The method exists only for scikit-learn compatibility.
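
A minimal sketch of this stateless pattern follows. `StatelessTransformer` is a hypothetical class written for illustration, not the library's code. It only shows why a no-op fit() that returns self still allows calls to be chained.

```python
class StatelessTransformer:
    """Hypothetical transformer whose fit() learns nothing."""

    def fit(self, X, y=None):
        # Nothing is estimated from X; returning self enables chaining.
        return self

    def transform(self, X):
        # Placeholder transformation: double every value.
        return [2 * value for value in X]


transformer = StatelessTransformer()
result = transformer.fit([1, 2, 3]).transform([1, 2, 3])
print(result)  # [2, 4, 6]
```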

Parameters

Name Type Description Default
X Any Time series data (not used for fitting). required
y Any Target values (ignored). None

Returns

Name Type Description
self RollingFeatures Returns the fitted transformer.

transform

preprocessing._rolling.RollingFeatures.transform(X)

Compute rolling window statistics from time series data.

For each statistic and window size combination, computes the rolling statistic across the input time series. The output contains NaN values for the initial positions where the window cannot be fully computed.
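
Because the warm-up rows contain NaN, code that cannot handle missing values usually needs those rows removed first. The sketch below uses a hand-built array as a stand-in for a transform() result with window sizes 2 and 3; the values are illustrative only.

```python
import numpy as np

# Stand-in for a transform() result: columns are rolling means with
# window sizes 2 and 3 over y = [1, 2, 3, 4, 5].
features = np.array([
    [np.nan, np.nan],
    [1.5, np.nan],
    [2.5, 2.0],
    [3.5, 3.0],
    [4.5, 4.0],
])

# Keep only rows in which every window is fully populated.
complete = ~np.isnan(features).any(axis=1)
clean = features[complete]
print(clean.shape)  # (3, 2): the largest window (3) costs 3 - 1 = 2 rows
```

The same boolean mask can be applied to the target array so that features and targets stay aligned row by row.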

Parameters

Name Type Description Default
X np.ndarray Time series data as a 1D NumPy array or array-like. required

Returns

Name Type Description
np.ndarray Array of shape (len(X), len(features_names)) containing the computed rolling statistics. Each column corresponds to a feature in features_names. Early positions contain NaN values before the window is fully populated.

transform_batch

preprocessing._rolling.RollingFeatures.transform_batch(X)

Compute rolling features from a pandas Series with index preservation.

Transforms a pandas Series into a DataFrame of rolling statistics while preserving the original index. Useful for maintaining time alignment with the input data.
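
Conceptually, index preservation amounts to wrapping the array result in a DataFrame that reuses the input index. The sketch below illustrates the idea with pandas' built-in rolling mean as a stand-in for the statistics. `rolling_frame` is a hypothetical helper, not the library's implementation.

```python
import numpy as np
import pandas as pd


def rolling_frame(series: pd.Series, window: int) -> pd.DataFrame:
    """Hypothetical helper: one rolling-mean column, input index reused."""
    values = series.rolling(window).mean().to_numpy()
    return pd.DataFrame(
        values.reshape(-1, 1),
        index=series.index,  # same index as the input Series
        columns=[f"roll_mean_{window}"],
    )


dates = pd.date_range("2024-01-01", periods=6, freq="D")
y_series = pd.Series(np.arange(1.0, 7.0), index=dates)
frame = rolling_frame(y_series, window=3)
print(frame.index.equals(y_series.index))  # True
```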

Parameters

Name Type Description Default
X pd.Series Time series data as a pandas Series. The index is preserved in the output. required

Returns

Name Type Description
pd.DataFrame DataFrame with shape (len(X), len(features_names)) where columns are feature names and the index matches the input Series. Contains NaN values at the beginning where windows are incomplete.

Note

Prefer this method over transform() when working with time-indexed data: it keeps the temporal index, so the features stay aligned with the original series in forecasting workflows.
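
One way to use the preserved index is to join the features back onto the original series and drop the incomplete rows. The sketch below builds a stand-in feature frame with pandas' own rolling mean so that it runs without the library; the column names 'y' and 'roll_mean_3' are illustrative.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("2024-01-01", periods=6, freq="D")
y_series = pd.Series(np.arange(1.0, 7.0), index=dates, name="y")

# Stand-in for the DataFrame that transform_batch() would return.
features_df = y_series.rolling(3).mean().to_frame("roll_mean_3")

# Rows align on the shared DatetimeIndex; incomplete windows are dropped.
dataset = pd.concat([y_series, features_df], axis=1).dropna()
print(dataset.shape)  # (4, 2)
```

With a plain NumPy result from transform(), the same join would require rebuilding the index by hand.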