preprocessing._rolling
Classes
| Name | Description |
|---|---|
| RollingFeatures | Compute rolling window statistics over time series data. |
RollingFeatures
preprocessing._rolling.RollingFeatures(stats, window_sizes, features_names=None)
Compute rolling window statistics over time series data.
This transformer computes rolling statistics (mean, std, min, max, sum, median) over windows of specified sizes from a time series. The class follows the scikit-learn transformer API with fit() and transform() methods, making it compatible with scikit-learn pipelines. It also provides transform_batch() for pandas Series input.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| stats | str \| List[str] \| List[Any] | Rolling statistics to compute. Can be a single string ('mean', 'std', 'min', 'max', 'sum', 'median'), a list of statistic names, or a list of callable functions. Multiple statistics can be computed simultaneously. | required |
| window_sizes | int \| List[int] | Window size(s) for rolling computation. Can be a single integer or a list of integers. Every window size is applied to every statistic. | required |
| features_names | List[str] \| None | Custom names for output features. If None, names are auto-generated from statistic names and window sizes (e.g., 'roll_mean_7', 'roll_std_14'). Defaults to None. | None |
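The auto-generated names can be reproduced with a short sketch. The helper `default_feature_names` is hypothetical (it is not part of the library), and the window-major ordering is an assumption inferred from the multi-statistic example on this page:

```python
from typing import List


def default_feature_names(stats: List[str], window_sizes: List[int]) -> List[str]:
    # Hypothetical helper, not the library's code. Assumes window-major order:
    # all statistics for the first window, then all for the next window.
    return [f"roll_{stat}_{w}" for w in window_sizes for stat in stats]


names = default_feature_names(["mean", "std", "min", "max"], [3, 7])
print(names)
```

Under this assumption the output has len(stats) × len(window_sizes) entries, one per output column.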
Attributes
| Name | Type | Description |
|---|---|---|
| stats | | Statistics specification as provided during initialization. |
| window_sizes | | List of window sizes for rolling computation. |
| features_names | | List of output feature names. |
| stats_funcs | | List of compiled/numba-optimized statistical functions. |
Note
- Output contains NaN values for positions where the rolling window cannot be fully computed (first window_size-1 positions).
- Statistics are computed using numba-optimized JIT functions for performance.
- The transformer returns numpy arrays from transform() and pandas DataFrames from transform_batch() to maintain index alignment.
- Supports custom user-defined functions in the stats parameter.
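The NaN padding described above can be illustrated with a dependency-free sketch. It is not the library's numba implementation. `rolling_stat` is a hypothetical helper that assumes a trailing window including the current position, which is what the first example on this page suggests:

```python
import math
from statistics import mean
from typing import Callable, List, Sequence


def rolling_stat(
    values: Sequence[float],
    window: int,
    func: Callable[[Sequence[float]], float],
) -> List[float]:
    """Apply func to each full trailing window; pad the start with NaN."""
    out = [math.nan] * len(values)
    for end in range(window, len(values) + 1):
        # The window covers positions end - window .. end - 1 (inclusive).
        out[end - 1] = func(values[end - window:end])
    return out


y = [float(v) for v in range(1, 11)]
rolled_mean = rolling_stat(y, 3, mean)
# A user-defined statistic (here: the range) plugs in the same way.
rolled_range = rolling_stat(y, 3, lambda w: max(w) - min(w))
print(rolled_mean[:4])
```

With a window of 3, only the first two positions stay NaN, because the third position is the first one with a complete window.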
Examples
Create a transformer with single statistic and window size:
>>> import numpy as np
>>> from spotforecast2.preprocessing import RollingFeatures
>>> y = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0])
>>> rf = RollingFeatures(stats='mean', window_sizes=3)
>>> rf.fit(y)
>>> features = rf.transform(y)
>>> features.shape
(10, 1)
>>> features[:4]  # First 2 values (window_size - 1) are NaN
array([[nan],
       [nan],
       [2.],
       [3.]])
Create a transformer with multiple statistics and window sizes:
>>> rf = RollingFeatures(
... stats=['mean', 'std', 'min', 'max'],
... window_sizes=[3, 7]
... )
>>> rf.fit(y)
>>> features = rf.transform(y)
>>> features.shape  # 4 stats × 2 window sizes
(10, 8)
>>> rf.features_names
['roll_mean_3', 'roll_std_3', 'roll_min_3', 'roll_max_3',
 'roll_mean_7', 'roll_std_7', 'roll_min_7', 'roll_max_7']
Use with pandas Series to preserve index:
>>> import pandas as pd
>>> dates = pd.date_range('2024-01-01', periods=10, freq='D')
>>> y_series = pd.Series(y, index=dates)
>>> rf = RollingFeatures(stats=['mean', 'max'], window_sizes=5)
>>> features_df = rf.transform_batch(y_series)
>>> features_df.shape
(10, 2)
>>> features_df.index.equals(y_series.index)
True
Use with custom feature names:
>>> rf = RollingFeatures(
... stats='mean',
... window_sizes=[7, 14, 30],
... features_names=['ma_7', 'ma_14', 'ma_30']
... )
>>> rf.fit(y)
>>> rf.features_names
['ma_7', 'ma_14', 'ma_30']
Methods
| Name | Description |
|---|---|
| fit | Fit the rolling features transformer (no-op). |
| transform | Compute rolling window statistics from time series data. |
| transform_batch | Compute rolling features from a pandas Series with index preservation. |
fit
preprocessing._rolling.RollingFeatures.fit(X, y=None)
Fit the rolling features transformer (no-op).
This transformer does not learn any parameters from the data. Method exists for scikit-learn compatibility.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| X | Any | Time series data (not used for fitting). | required |
| y | Any | Target values (ignored). Defaults to None. | None |
Returns
| Name | Type | Description |
|---|---|---|
| self | RollingFeatures | Returns the fitted transformer. |
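The no-op fit pattern can be sketched with a toy class. `TrailingSum` is hypothetical and is not the library's implementation. It only illustrates why returning self permits chained calls and why transform() works without a prior fit():

```python
import math
from typing import List, Optional, Sequence


class TrailingSum:
    """Hypothetical stateless transformer used only for illustration."""

    def __init__(self, window: int) -> None:
        self.window = window

    def fit(
        self, X: Sequence[float], y: Optional[Sequence[float]] = None
    ) -> "TrailingSum":
        # Nothing is learned from X; returning self allows fit(...).transform(...).
        return self

    def transform(self, X: Sequence[float]) -> List[float]:
        out = [math.nan] * len(X)
        for end in range(self.window, len(X) + 1):
            out[end - 1] = sum(X[end - self.window:end])
        return out


data = [1.0, 2.0, 3.0]
ts = TrailingSum(window=2)
chained = ts.fit(data).transform(data)
unfitted = TrailingSum(window=2).transform(data)
print(chained)
```

Because no state is stored in fit(), the chained and unfitted calls give the same values.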
transform
preprocessing._rolling.RollingFeatures.transform(X)
Compute rolling window statistics from time series data.
For each statistic and window size combination, computes the rolling statistic across the input time series. The output contains NaN values for the initial positions where the window cannot be fully computed.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| X | np.ndarray | Time series data as 1D numpy array or array-like. | required |
Returns
| Name | Type | Description |
|---|---|---|
| | np.ndarray | Array of shape (len(X), len(features_names)) containing the computed rolling statistics. Each column corresponds to a feature in features_names. Early positions contain NaN values before the window is fully populated. |
transform_batch
preprocessing._rolling.RollingFeatures.transform_batch(X)
Compute rolling features from a pandas Series with index preservation.
Transforms a pandas Series into a DataFrame of rolling statistics while preserving the original index. Useful for maintaining time alignment with the input data.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| X | pd.Series | Time series data as pandas Series. The index is preserved in output. | required |
Returns
| Name | Type | Description |
|---|---|---|
| | pd.DataFrame | DataFrame with shape (len(X), len(features_names)) where columns are feature names and index matches the input Series. Contains NaN values at the beginning where windows are incomplete. |
Note
This method is preferred over transform() when working with time-indexed data, as it preserves the temporal index and is compatible with forecasting workflows.
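For comparison, the same index-aligned layout can be built with pandas' own Series.rolling API. This sketch does not call RollingFeatures. The column names follow the auto-generated naming shown on this page, and whether every statistic matches the transformer's output exactly (for example, the degrees of freedom used for 'std') is not stated here and should be checked separately:

```python
import numpy as np
import pandas as pd

dates = pd.date_range("2024-01-01", periods=10, freq="D")
y_series = pd.Series(np.arange(1.0, 11.0), index=dates)

window = 5
# Each rolling result keeps the Series index, so the DataFrame stays aligned.
reference = pd.DataFrame(
    {
        f"roll_mean_{window}": y_series.rolling(window).mean(),
        f"roll_max_{window}": y_series.rolling(window).max(),
    }
)
print(reference.head(6))
```

The first window - 1 rows are NaN in both columns, and the index equals that of the input Series.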