preprocessing.exog_providers.ExogFeatureProvider

preprocessing.exog_providers.ExogFeatureProvider()

Contract for a pluggable exogenous-feature source.

A provider maps the hourly target index to a numeric feature frame on that exact index. Subclasses set name (a short identifier used in logs and as the default column name) and implement build.

Implementations should load their backing data lazily inside build and raise ExogProviderError when the data is missing or cannot cover the requested range. Signalling every failure through this one exception type keeps the fail-safe policy in one place instead of repeating it in each provider.

Examples

import pandas as pd
from spotforecast2_safe.preprocessing.exog_providers import (
    ExogFeatureProvider,
    ExogProviderError,
)

class ConstantProvider(ExogFeatureProvider):
    name = "constant"

    def build(self, index: pd.DatetimeIndex) -> pd.DataFrame:
        return pd.DataFrame({"constant": 1.0}, index=index).astype("float32")

idx = pd.date_range("2023-06-01", periods=6, freq="h", tz="UTC")
p = ConstantProvider()
out = p.build(idx)
print(p.name, out.shape, out.dtypes["constant"].name)
assert out.shape == (6, 1)
assert not out.isna().any().any()
# prints: constant (6, 1) float32
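
The guidance on lazy loading and on raising ExogProviderError can be sketched as follows. This is a minimal illustration, not the library's implementation: the two base classes are local stand-ins so the snippet runs without the package installed, and LazySeriesProvider, load_hourly_temperature and load_missing are hypothetical names introduced only for this example.

```python
from typing import Callable

import pandas as pd


# Local stand-ins (assumption) so the sketch is self-contained. In real code,
# import both names from spotforecast2_safe.preprocessing.exog_providers.
class ExogProviderError(Exception):
    pass


class ExogFeatureProvider:
    name = "base"

    def build(self, index: pd.DatetimeIndex) -> pd.DataFrame:
        raise NotImplementedError


class LazySeriesProvider(ExogFeatureProvider):
    """Hypothetical provider that defers loading until build is called."""

    name = "temperature"

    def __init__(self, loader: Callable[[], pd.Series]) -> None:
        self._loader = loader  # nothing is loaded at construction time

    def build(self, index: pd.DatetimeIndex) -> pd.DataFrame:
        try:
            series = self._loader()  # lazy load, only when features are requested
        except (OSError, KeyError) as exc:
            raise ExogProviderError(f"{self.name}: backing data unavailable") from exc
        aligned = series.reindex(index)  # timestamps without data become NaN
        missing = int(aligned.isna().sum())
        if missing:
            raise ExogProviderError(
                f"{self.name}: {missing} of {len(index)} timestamps not covered"
            )
        return aligned.to_frame(self.name).astype("float32")


def load_hourly_temperature() -> pd.Series:
    # Hypothetical loader; a real one might read a file or query a service.
    idx = pd.date_range("2023-06-01", periods=6, freq="h", tz="UTC")
    return pd.Series([15.0, 15.5, 16.0, 17.0, 18.5, 19.0], index=idx)


def load_missing() -> pd.Series:
    raise FileNotFoundError("temperature data not found")  # simulate missing data


provider = LazySeriesProvider(load_hourly_temperature)
covered = provider.build(pd.date_range("2023-06-01", periods=4, freq="h", tz="UTC"))

# Both failure modes surface as the same exception type.
errors = []
for p, periods in [(provider, 8), (LazySeriesProvider(load_missing), 4)]:
    try:
        p.build(pd.date_range("2023-06-01", periods=periods, freq="h", tz="UTC"))
    except ExogProviderError as exc:
        errors.append(str(exc))

print(covered.shape, errors)
```

Because both a missing source and an uncovered range raise ExogProviderError, a caller needs only one except clause to decide how to degrade.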

Methods

Name Description
build Return features aligned to index.

build

preprocessing.exog_providers.ExogFeatureProvider.build(index)

Return features aligned to index.

Parameters

Name Type Description Default
index pd.DatetimeIndex Hourly DatetimeIndex (typically tz-aware UTC) covering the full training-plus-forecast window. required

Returns

Name Type Description
pd.DataFrame Numeric columns indexed exactly by index, NaN-free within the validated window (the full index unless a provider_window was set at construction).

Raises

Name Type Description
ExogProviderError If the provider cannot cover index.

Examples

import pandas as pd
from spotforecast2_safe.preprocessing.exog_providers import (
    ExogFeatureProvider,
)

class LinearProvider(ExogFeatureProvider):
    name = "linear"

    def build(self, index: pd.DatetimeIndex) -> pd.DataFrame:
        vals = range(len(index))
        return pd.DataFrame({"linear": list(vals)}, index=index).astype("float32")

idx = pd.date_range("2023-06-01", periods=4, freq="h", tz="UTC")
out = LinearProvider().build(idx)
print(out.shape, out["linear"].tolist())
assert out.shape == (4, 1)
# prints: (4, 1) [0.0, 1.0, 2.0, 3.0]
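
The return contract described under Returns (same index, numeric columns, no NaN) can be checked with a small helper when writing a new provider. The sketch below is an illustration only: check_build_output is a hypothetical function, not part of the package, and it assumes the validated window is the full index (no provider_window).

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def check_build_output(out: pd.DataFrame, index: pd.DatetimeIndex) -> list[str]:
    """Return a list of contract violations; an empty list means none found.

    Hypothetical helper. Assumes the whole index must be NaN-free.
    """
    problems = []
    if not out.index.equals(index):
        problems.append("index mismatch")
    non_numeric = [c for c in out.columns if not is_numeric_dtype(out[c])]
    if non_numeric:
        problems.append(f"non-numeric columns: {non_numeric}")
    if out.isna().any().any():
        problems.append("NaN values present")
    return problems


idx = pd.date_range("2023-06-01", periods=4, freq="h", tz="UTC")
good = pd.DataFrame({"linear": [0.0, 1.0, 2.0, 3.0]}, index=idx).astype("float32")
bad = good.iloc[:3].reindex(idx)  # the last hour has no data and becomes NaN

good_problems = check_build_output(good, idx)
bad_problems = check_build_output(bad, idx)
print(good_problems)  # []
print(bad_problems)   # ['NaN values present']
```

A provider that fails such a check would, under the contract above, be expected to raise ExogProviderError rather than return the incomplete frame.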