data.fetch_data.load_renewable_forecast

data.fetch_data.load_renewable_forecast(data_home=None, on_missing='raise')

Load the ENTSO-E day-ahead wind/solar generation forecast.

Reads interim/renewable_forecast.csv (written by spotforecast2_safe.downloader.entsoe.download_renewable_forecast) and returns every renewable generation-forecast column it contains (for Germany typically "Solar", "Wind Onshore" and "Wind Offshore") on a regular hourly UTC grid. Each column independently passes through the fail-safe _apply_on_missing() contract, so missing values are rejected by default rather than silently imputed.
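
To see why a regular hourly grid matters, the following minimal sketch uses plain pandas on hypothetical toy data. It is not the loader's actual implementation, only an illustration of how reindexing onto an hourly UTC grid turns a silently missing row into an explicit NaN that a missing-value policy can then act on.

```python
import pandas as pd

# Toy forecast with the 02:00 row missing (hypothetical data, not ENTSO-E output).
raw_index = pd.to_datetime(
    ["2023-01-01 00:00", "2023-01-01 01:00", "2023-01-01 03:00"], utc=True
)
raw = pd.DataFrame(
    {"Solar": [0.0, 0.0, 5.0], "Wind Onshore": [10.0, 11.0, 12.0]},
    index=raw_index,
)

# Full hourly grid between the first and last timestamp (timezone is inherited).
grid = pd.date_range(raw.index.min(), raw.index.max(), freq="h")

# Reindexing exposes the absent hour as a NaN row in every column.
regular = raw.reindex(grid)
gaps = regular.index[regular.isna().any(axis=1).to_numpy()]

print(len(regular), list(gaps))
```

After reindexing, the frame has four rows instead of three, and the 02:00 timestamp appears in `gaps`.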

The day-ahead renewable forecast is a near-oracle, leakage-clean prior: it is published on D-1, so it is genuinely available at forecast time (CR-3). Use the day-ahead forecast as a model input, never the realised generation, which is only known after the fact.

Parameters

Name | Type | Description | Default
data_home | Optional[Union[str, Path]] | Root data directory. If None, resolved via get_data_home(). | None
on_missing | OnMissing | How to handle NaN rows. 'raise' (default) fails fast with the gap timestamps; 'ffill_bfill' forward/back-fills; 'passthrough' returns the raw NaNs so an explicit downstream provider can decide. | 'raise'
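
The three on_missing policies can be sketched on a single column as follows. The helper apply_on_missing_sketch is hypothetical and written only for illustration; it is not the library's _apply_on_missing(), whose exact signature and error message are not shown on this page.

```python
import pandas as pd


def apply_on_missing_sketch(series, on_missing="raise"):
    """Hypothetical stand-in for the per-column policy; not the library's code."""
    if on_missing == "passthrough":
        # Hand the NaNs to the caller unchanged.
        return series
    if on_missing == "ffill_bfill":
        # Forward-fill first, then back-fill any leading gap.
        return series.ffill().bfill()
    if on_missing == "raise":
        gaps = series.index[series.isna().to_numpy()]
        if len(gaps) > 0:
            raise ValueError(
                f"{series.name}: {len(gaps)} missing value(s), first at {gaps[0]}"
            )
        return series
    raise ValueError(f"unknown on_missing policy: {on_missing!r}")


idx = pd.date_range("2023-01-01", periods=4, freq="h", tz="UTC")
solar = pd.Series([1.0, None, 3.0, None], index=idx, name="Solar")

filled = apply_on_missing_sketch(solar, "ffill_bfill")
untouched = apply_on_missing_sketch(solar, "passthrough")

try:
    apply_on_missing_sketch(solar, "raise")
    raised = False
except ValueError as exc:
    raised = True
    print(exc)
```

Defaulting to 'raise' means a gap in the forecast stops the pipeline instead of being filled without the caller's knowledge; filling or passing NaNs through has to be requested explicitly.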

Returns

Name | Type | Description
 | pd.DataFrame | Hourly UTC-indexed day-ahead renewable forecast columns.

Raises

Name | Type | Description
 | FileNotFoundError | If interim/renewable_forecast.csv is absent.
 | ValueError | If on_missing='raise' and any column has NaNs.

Examples

import os
import shutil
import tempfile

import pandas as pd

from spotforecast2_safe.data.fetch_data import load_renewable_forecast

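# Point the data home at a throwaway directory for this example.
tmp = tempfile.mkdtemp()
os.environ["SPOTFORECAST2_DATA"] = tmp
interim = os.path.join(tmp, "interim")
os.makedirs(interim, exist_ok=True)

# Write a gap-free, 48-hour synthetic forecast where the loader expects it.
idx = pd.date_range("2023-01-01", periods=48, freq="h", tz="UTC")
pd.DataFrame(
    {"Solar": 1.0, "Wind Onshore": 2.0}, index=idx
).rename_axis("Time (UTC)").to_csv(
    os.path.join(interim, "renewable_forecast.csv")
)

# data_home=None, so the directory is resolved via get_data_home().
df = load_renewable_forecast()
print(sorted(df.columns), len(df))

# Clean up the temporary directory and the environment variable.
shutil.rmtree(tmp)
del os.environ["SPOTFORECAST2_DATA"]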
['Solar', 'Wind Onshore'] 48