preprocessing.exog_providers.EntsoeNetLoadProvider

preprocessing.exog_providers.EntsoeNetLoadProvider(
    data_home=None,
    max_gap=0,
    max_tail_gap=0,
    provider_window=None,
)

ENTSO-E day-ahead net load = Forecasted Load − (wind + solar) forecast.

Combines the day-ahead Forecasted Load with the day-ahead renewable (wind + solar) forecast to form the net-load prior against which the residual is often modelled. Both inputs are day-ahead forecasts, so the feature is leakage-clean: it uses no information that would be unavailable at forecast time. Raises ExogProviderError if either input is unavailable.
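The subtraction itself can be sketched in plain pandas, without the provider or any files on disk. This is only an illustration of the arithmetic described above, not the provider's implementation: the column names `Solar` and `Wind Onshore` are borrowed from the examples on this page, and the values are made up.

```python
import pandas as pd

# Three hourly, tz-aware UTC timestamps (illustrative values only).
idx = pd.date_range("2023-06-01", periods=3, freq="h", tz="UTC")

# Day-ahead forecasted load and day-ahead renewable forecast.
forecast_load = pd.Series([90.0, 95.0, 100.0], index=idx)
renewables = pd.DataFrame(
    {"Solar": [3.0, 4.0, 0.0], "Wind Onshore": [5.0, 6.0, 10.0]},
    index=idx,
)

# Net load = Forecasted Load - (wind + solar), returned as one float32 column.
net_load = (
    (forecast_load - renewables.sum(axis=1))
    .astype("float32")
    .to_frame("entsoe_net_load")
)
print(net_load["entsoe_net_load"].tolist())  # [82.0, 85.0, 90.0]
```

Summing the renewable columns row-wise before subtracting keeps the sketch independent of how many renewable technologies are present.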

Parameters

Name Type Description Default
data_home DataHome Root data directory forwarded to the loaders. None
max_gap int Maximum length of a contiguous missing-value run healed by _align_to_index. See _align_to_index for full semantics. 0
max_tail_gap int Extended healing budget for the trailing-edge NaN run. See _align_to_index. 0
provider_window Optional[pd.DatetimeIndex] Validation index passed to _align_to_index as validate_index. See _align_to_index. None
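To give a feel for what a gap budget such as max_gap means, the sketch below fills only those NaN runs whose length is at most max_gap and leaves longer runs untouched. It is a hypothetical helper, not _align_to_index: the name `heal_short_gaps`, the choice of linear interpolation, and the handling of edge runs are all assumptions made for illustration, and the real semantics (including max_tail_gap) are documented with _align_to_index.

```python
import numpy as np
import pandas as pd


def heal_short_gaps(series: pd.Series, max_gap: int) -> pd.Series:
    """Fill interior NaN runs of length <= max_gap (hypothetical helper)."""
    if max_gap <= 0:
        # A budget of 0 heals nothing.
        return series.copy()
    is_nan = series.isna()
    # Give every contiguous run an id: it increments whenever NaN-ness flips.
    run_id = is_nan.ne(is_nan.shift(fill_value=False)).cumsum()
    # Number of NaNs in the run each element belongs to (0 for valid runs).
    run_len = is_nan.groupby(run_id).transform("sum")
    healable = is_nan & (run_len <= max_gap)
    # Assumption: linear interpolation between the surrounding valid values.
    filled = series.interpolate(method="linear", limit_area="inside")
    # Take the interpolated value only where the run fits the budget.
    return series.where(~healable, filled)


s = pd.Series([1.0, np.nan, 3.0, np.nan, np.nan, np.nan, 7.0])
healed = heal_short_gaps(s, max_gap=1)
print(healed.isna().tolist())
# [False, False, False, True, True, True, False]
```

With max_gap=1 the single missing value is filled (to 2.0), while the run of three stays missing, so a downstream check can still reject the window.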

Examples

import os
import shutil
import tempfile

import pandas as pd

from spotforecast2_safe.preprocessing.exog_providers import (
    EntsoeNetLoadProvider,
)

tmp = tempfile.mkdtemp()
os.environ["SPOTFORECAST2_DATA"] = tmp
os.makedirs(os.path.join(tmp, "interim"), exist_ok=True)
idx = pd.date_range("2023-06-01", periods=24, freq="h", tz="UTC")
pd.DataFrame(
    {"Actual Load": 100.0, "Forecasted Load": 90.0}, index=idx
).rename_axis("Time (UTC)").to_csv(
    os.path.join(tmp, "interim", "energy_load.csv")
)
pd.DataFrame(
    {"Solar": 3.0, "Wind Onshore": 5.0}, index=idx
).rename_axis("Time (UTC)").to_csv(
    os.path.join(tmp, "interim", "renewable_forecast.csv")
)

provider = EntsoeNetLoadProvider()
out = provider.build(idx)
print(out.columns.tolist(), out.shape, float(out.iloc[0, 0]))
assert out.shape == (24, 1)
assert abs(float(out.iloc[0, 0]) - 82.0) < 0.1  # 90 - (3 + 5)

shutil.rmtree(tmp)
del os.environ["SPOTFORECAST2_DATA"]
['entsoe_net_load'] (24, 1) 82.0

Methods

Name Description
build Return the day-ahead net load (Forecasted Load minus renewables).

build

preprocessing.exog_providers.EntsoeNetLoadProvider.build(index)

Return the day-ahead net load (Forecasted Load minus renewables).

Parameters

Name Type Description Default
index pd.DatetimeIndex Hourly DatetimeIndex (tz-aware UTC) for the forecast window. required

Returns

Name Type Description
pd.DataFrame Single column entsoe_net_load, dtype float32.

Raises

Name Type Description
ExogProviderError If either energy_load.csv or renewable_forecast.csv is missing.

Examples

import os
import shutil
import tempfile

import pandas as pd

from spotforecast2_safe.preprocessing.exog_providers import (
    EntsoeNetLoadProvider,
)

tmp = tempfile.mkdtemp()
os.environ["SPOTFORECAST2_DATA"] = tmp
os.makedirs(os.path.join(tmp, "interim"), exist_ok=True)
idx = pd.date_range("2023-06-01", periods=12, freq="h", tz="UTC")
pd.DataFrame(
    {"Actual Load": 100.0, "Forecasted Load": 80.0}, index=idx
).rename_axis("Time (UTC)").to_csv(
    os.path.join(tmp, "interim", "energy_load.csv")
)
pd.DataFrame(
    {"Solar": 2.0, "Wind Onshore": 6.0}, index=idx
).rename_axis("Time (UTC)").to_csv(
    os.path.join(tmp, "interim", "renewable_forecast.csv")
)

out = EntsoeNetLoadProvider().build(idx)
print(out.columns.tolist(), out.shape, float(out.iloc[0, 0]))
assert out.shape == (12, 1)
assert abs(float(out.iloc[0, 0]) - 72.0) < 0.1  # 80 - (2 + 6)

shutil.rmtree(tmp)
del os.environ["SPOTFORECAST2_DATA"]
['entsoe_net_load'] (12, 1) 72.0