preprocessing.exog_providers.EntsoeForecastLoadProvider

preprocessing.exog_providers.EntsoeForecastLoadProvider(
    data_home=None,
    max_gap=0,
    max_tail_gap=0,
    provider_window=None,
)

ENTSO-E day-ahead Forecasted Load as an exogenous near-oracle prior.

Wraps spotforecast2_safe.data.fetch_data.load_timeseries_forecast, which reads the Forecasted Load column already merged into interim/energy_load.csv. The day-ahead forecast is published on D-1 (the day before delivery), so it is genuinely available at forecast time (leakage-clean, CR-3).
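The leakage argument can be illustrated with plain pandas. The sketch below is not part of the library: the helper name available_at_origin is hypothetical, and the rule that values for delivery day D are known by 00:00 UTC of D at the latest is an assumption made for illustration, since this page does not state the exact publication hour.

```python
import pandas as pd


def available_at_origin(
    target_index: pd.DatetimeIndex, origin: pd.Timestamp
) -> pd.Series:
    """Flag target hours whose day-ahead value is already known at `origin`.

    Hypothetical rule (assumption): the value for any hour of delivery
    day D is published by 00:00 UTC of D at the latest, i.e. during D-1.
    """
    published_by = target_index.normalize()  # midnight at the start of day D
    return pd.Series(published_by <= origin, index=target_index, name="available")


# Forecast issued at midnight for the next 48 hours.
origin = pd.Timestamp("2023-01-02 00:00", tz="UTC")
targets = pd.date_range("2023-01-02", periods=48, freq="h", tz="UTC")
mask = available_at_origin(targets, origin)

# Under the assumed rule, only the first delivery day is covered.
n_available = int(mask.sum())
```

Under this assumed rule, the first 24 target hours are flagged as available and the following 24 are not, which is the sense in which the column is safe to use as a feature for the day ahead.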

Parameters

Name Type Description Default
data_home DataHome Root data directory forwarded to the loader. None resolves via get_data_home(). None
max_gap int Maximum contiguous missing-value run healed by _align_to_index. See _align_to_index for full semantics. Defaults to 0. 0
max_tail_gap int Extended healing budget for the trailing-edge NaN run. See _align_to_index. Defaults to 0. 0
provider_window Optional[pd.DatetimeIndex] Validation index passed to _align_to_index as validate_index. See _align_to_index. Defaults to None. None
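To give an intuition for a gap budget such as max_gap, here is a minimal pandas sketch. It is not the implementation of _align_to_index, whose exact semantics are not shown on this page. The helper heal_short_gaps is hypothetical, and it assumes that a NaN run is either filled completely (by forward fill) when its length is within the budget or left untouched otherwise. The separate trailing-edge budget (max_tail_gap) is not modelled.

```python
import numpy as np
import pandas as pd


def heal_short_gaps(series: pd.Series, max_gap: int) -> pd.Series:
    """Forward-fill NaN runs of length <= max_gap; leave longer runs as NaN.

    Hypothetical helper for illustration only.
    """
    is_nan = series.isna()
    # Give every contiguous run of NaN / non-NaN values its own label.
    run_id = is_nan.ne(is_nan.shift(fill_value=False)).cumsum()
    # Length of the NaN run each element belongs to (0 for non-NaN runs).
    run_len = is_nan.groupby(run_id).transform("sum")
    fillable = is_nan & (run_len <= max_gap)
    # Take the forward-filled value only where the run fits the budget.
    return series.where(~fillable, series.ffill())


demo = pd.Series([1.0, np.nan, 3.0, np.nan, np.nan, np.nan, 7.0])
healed = heal_short_gaps(demo, max_gap=1)
```

With max_gap=1 the single-element gap is filled and the three-element gap is kept as NaN. With max_gap=0 the helper returns the series unchanged, which matches the idea of a default that heals nothing.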

Examples

import os
import shutil
import tempfile

import pandas as pd

from spotforecast2_safe.preprocessing.exog_providers import (
    EntsoeForecastLoadProvider,
)

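# Use a throwaway directory as the data home for this example.
tmp = tempfile.mkdtemp()
os.environ["SPOTFORECAST2_DATA"] = tmp
interim = os.path.join(tmp, "interim")
os.makedirs(interim, exist_ok=True)

# Write a minimal interim/energy_load.csv with a constant Forecasted Load.
idx = pd.date_range("2023-01-01", periods=48, freq="h", tz="UTC")
pd.DataFrame(
    {"Actual Load": 100.0, "Forecasted Load": 99.0}, index=idx
).rename_axis("Time (UTC)").to_csv(os.path.join(interim, "energy_load.csv"))

# Build the exogenous frame for the same hourly index.
out = EntsoeForecastLoadProvider().build(idx)
print(out.columns.tolist(), out.shape)

# Remove the temporary directory and the environment variable.
shutil.rmtree(tmp)
del os.environ["SPOTFORECAST2_DATA"]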
['entsoe_forecasted_load'] (48, 1)

Methods

Name Description
build Return the day-ahead Forecasted Load aligned to index.

build

preprocessing.exog_providers.EntsoeForecastLoadProvider.build(index)

Return the day-ahead Forecasted Load aligned to index.

Parameters

Name Type Description Default
index pd.DatetimeIndex Hourly DatetimeIndex (tz-aware UTC) for the forecast window. required

Returns

Name Type Description
pd.DataFrame Single column entsoe_forecasted_load, dtype float32.
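A downstream caller may want to check the documented return contract before using the frame. The sketch below is a hypothetical helper, not part of the library. It assumes that "aligned to index" means the returned frame carries exactly the requested index, and it is exercised on a stand-in frame built with pandas so that it runs without the package or any data files.

```python
import numpy as np
import pandas as pd


def check_exog_frame(out: pd.DataFrame, index: pd.DatetimeIndex) -> None:
    """Raise AssertionError if `out` deviates from the documented contract."""
    # Exactly one column with the documented name.
    assert list(out.columns) == ["entsoe_forecasted_load"]
    # Assumption: alignment means the index equals the requested one.
    assert out.index.equals(index)
    # Documented dtype.
    assert out["entsoe_forecasted_load"].dtype == np.float32


# Stand-in for the result of build(idx); not produced by the provider.
idx = pd.date_range("2023-06-01", periods=24, freq="h", tz="UTC")
stand_in = pd.DataFrame(
    {"entsoe_forecasted_load": np.full(len(idx), 98.0, dtype=np.float32)},
    index=idx,
)
check_exog_frame(stand_in, idx)
```

In real use the stand-in would be replaced by the frame returned from build(idx).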

Raises

Name Type Description
ExogProviderError If the interim CSV is missing or the Forecasted Load column is absent.

Examples

import os
import shutil
import tempfile

import pandas as pd

from spotforecast2_safe.preprocessing.exog_providers import (
    EntsoeForecastLoadProvider,
)

# Use a throwaway directory as the data home for this example.
tmp = tempfile.mkdtemp()
os.environ["SPOTFORECAST2_DATA"] = tmp
os.makedirs(os.path.join(tmp, "interim"), exist_ok=True)

# Write a minimal interim/energy_load.csv covering one day.
idx = pd.date_range("2023-06-01", periods=24, freq="h", tz="UTC")
pd.DataFrame(
    {"Actual Load": 100.0, "Forecasted Load": 98.0}, index=idx
).rename_axis("Time (UTC)").to_csv(
    os.path.join(tmp, "interim", "energy_load.csv")
)

# Build the frame and inspect column name, shape, and dtype.
out = EntsoeForecastLoadProvider().build(idx)
print(out.columns.tolist(), out.shape, out.dtypes.iloc[0].name)
assert out.shape == (24, 1)

# Remove the temporary directory and the environment variable.
shutil.rmtree(tmp)
del os.environ["SPOTFORECAST2_DATA"]
['entsoe_forecasted_load'] (24, 1) float32