weather.features.get_weather_features

weather.features.get_weather_features(
    data,
    start,
    cov_end,
    forecast_horizon,
    latitude=51.5136,
    longitude=7.4653,
    timezone='UTC',
    freq='h',
    window_periods=None,
    window_functions=None,
    fallback_on_failure=True,
    cache_home=None,
    verbose=False,
    locations=None,
    location_weights=None,
    derived_features=None,
    hdh_base=DEFAULT_HDH_BASE_C,
    cdh_base=DEFAULT_CDH_BASE_C,
    wind_speed_unit='kmh',
)

Fetch weather data and compute rolling-window features.

Downloads weather observations/forecasts for the requested period, aligns them to a regular freq grid, and applies WindowFeatures to produce rolling-mean, -max, and -min features over configurable windows.
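The rolling step can be pictured with plain pandas. This is a minimal sketch, not the library's WindowFeatures implementation: the helper `rolling_features`, the synthetic temperature values, and the column naming scheme are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic hourly temperature series (illustrative values, not real weather data).
idx = pd.date_range("2020-06-01", periods=72, freq="h", tz="UTC")
hours = np.arange(72)
raw = pd.DataFrame(
    {"temperature_2m": 15.0 + 5.0 * np.sin(hours * 2 * np.pi / 24)},
    index=idx,
)


def rolling_features(df, windows=("1D", "7D"), functions=("mean", "max", "min")):
    """Hypothetical helper: one output column per (raw column, window, function)."""
    out = {}
    for col in df.columns:
        for window in windows:
            # Time-based rolling window over the DatetimeIndex.
            roller = df[col].rolling(window, min_periods=1)
            for func in functions:
                # The column naming below is an assumption, not the library's scheme.
                out[f"{col}_{func}_{window}"] = getattr(roller, func)()
    return pd.DataFrame(out, index=df.index)


features = rolling_features(raw)
print(features.shape)
```

With two windows and three functions, each raw column yields six rolled columns. The example output at the end of this page (15 raw columns, 105 feature columns) is consistent with 15 raw plus 90 rolled columns, although that breakdown is inferred from the numbers and not stated by the source.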

Parameters

Name Type Description Default
data pd.DataFrame Reference time series DataFrame used only for validation (shape / temporal coverage checks via curate_weather()). required
start Union[str, pd.Timestamp] Start of the feature window. String values are parsed with utc=True. required
cov_end Union[str, pd.Timestamp] Inclusive end of the feature window. It must extend past the last timestamp of data by the full forecast horizon. String values are parsed with utc=True. required
forecast_horizon int Number of forecast steps; passed to curate_weather() for validation. required
latitude float Latitude of the target location in decimal degrees. Defaults to 51.5136 (Dortmund, Germany). 51.5136
longitude float Longitude of the target location in decimal degrees. Defaults to 7.4653 (Dortmund, Germany). 7.4653
timezone str Timezone label applied to the generated index. Defaults to "UTC". 'UTC'
freq str Pandas-compatible frequency string for the output index. Defaults to "h" (hourly). 'h'
window_periods Optional[List[str]] Rolling window sizes passed to WindowFeatures. Defaults to ["1D", "7D"]. None
window_functions Optional[List[str]] Aggregation functions applied over each window. Defaults to ["mean", "max", "min"]. None
fallback_on_failure bool If True, use locally cached fallback data when the weather API is unavailable. Defaults to True. True
cache_home Optional[Union[str, Path]] Optional path to a cache directory. When provided, fetched weather data is cached in <cache_home>/weather_cache.parquet. When None (default), no caching is performed. None
verbose bool If True, print progress messages to stdout. Defaults to False. False
locations Optional[Sequence[Tuple[float, float]]] Optional sequence of (latitude, longitude) pairs for a population-weighted multi-city weather index. When None (default) the single latitude/longitude point is used, preserving prior behaviour exactly. When given, each location is fetched and the raw frames are combined via population_weighted_average using location_weights. See spotforecast2_safe.weather.locations. None
location_weights Optional[Sequence[float]] Non-negative weight per entry in locations (e.g. city population). Required when locations is given; normalised internally. None
derived_features Optional[Sequence[str]] Optional subset of {"hdh", "cdh", "apparent_temperature", "dew_point"}. When given, those columns are derived from the (weighted) weather and rolled up alongside the raw fields. None (default) adds nothing. See add_derived_weather_features. None
hdh_base float Heating base temperature (°C) for hdh. Defaults to 15.0. DEFAULT_HDH_BASE_C
cdh_base float Cooling base temperature (°C) for cdh. Defaults to 22.0. DEFAULT_CDH_BASE_C
wind_speed_unit str Unit of the fetched wind_speed_10m column, used when deriving apparent_temperature. Either "ms" or "kmh". Defaults to "kmh" (the Open-Meteo default). 'kmh'
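The locations / location_weights and hdh_base / cdh_base parameters can be illustrated with a small sketch. It assumes the common definitions (weights normalised to sum to one; heating degree hours as max(base - T, 0) and cooling degree hours as max(T - base, 0) per hour). The source does not show the formulas used by population_weighted_average or add_derived_weather_features, so the helper names and the arithmetic below are hypothetical stand-ins, and the temperatures are invented.

```python
import pandas as pd

idx = pd.date_range("2020-06-01", periods=4, freq="h", tz="UTC")
# Illustrative temperatures for two locations (not real observations).
city_a = pd.DataFrame({"temperature_2m": [10.0, 12.0, 20.0, 26.0]}, index=idx)
city_b = pd.DataFrame({"temperature_2m": [14.0, 16.0, 24.0, 30.0]}, index=idx)


def weighted_average(frames, weights):
    """Hypothetical stand-in: combine frames using weights normalised to sum to 1."""
    if len(frames) != len(weights):
        raise ValueError("one weight per location is required")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    combined = frames[0] * (weights[0] / total)
    for frame, weight in zip(frames[1:], weights[1:]):
        combined = combined + frame * (weight / total)
    return combined


def degree_hours(temperature, hdh_base=15.0, cdh_base=22.0):
    """Hypothetical stand-in using the common degree-hour definitions."""
    return pd.DataFrame(
        {
            "hdh": (hdh_base - temperature).clip(lower=0.0),
            "cdh": (temperature - cdh_base).clip(lower=0.0),
        }
    )


# Weights 3:1 (e.g. relative populations) become 0.75 and 0.25 after normalisation.
combined = weighted_average([city_a, city_b], [3.0, 1.0])
derived = degree_hours(combined["temperature_2m"])
print(combined["temperature_2m"].tolist())
print(derived["hdh"].tolist(), derived["cdh"].tolist())
```

Passing raw populations as weights is enough here because only their ratios matter after normalisation, which matches the "normalised internally" note for location_weights.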

Returns

Name Type Description
(weather_features, weather_aligned) Tuple[pd.DataFrame, pd.DataFrame] A two-element tuple:
weather_features pd.DataFrame DataFrame with rolling-window weather features aligned to the [start, cov_end] index.
weather_aligned pd.DataFrame Raw weather DataFrame reindexed to the same [start, cov_end] hourly grid (forward-filled).
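The alignment described for weather_aligned (reindexing onto a regular grid and forward-filling) can be sketched with plain pandas. This is an illustration under assumed data, not the function's actual code: the raw frame and its gap are invented, and the library may apply additional filling steps that are not documented here.

```python
import pandas as pd

# Illustrative raw frame with a missing hour (03:00) that ends before cov_end.
raw_idx = pd.to_datetime(
    [
        "2020-06-01 00:00",
        "2020-06-01 01:00",
        "2020-06-01 02:00",
        "2020-06-01 04:00",
    ],
    utc=True,
)
raw = pd.DataFrame({"temperature_2m": [10.0, 11.0, 12.0, 14.0]}, index=raw_idx)

start = pd.Timestamp("2020-06-01 00:00", tz="UTC")
cov_end = pd.Timestamp("2020-06-01 05:00", tz="UTC")

# Regular hourly grid that includes both start and cov_end.
grid = pd.date_range(start=start, end=cov_end, freq="h")

# Reindex onto the grid, then carry the last known value forward into gaps.
aligned = raw.reindex(grid).ffill()
print(aligned["temperature_2m"].tolist())
```

Because both endpoints are included, a window from 2020-06-01 to 2020-06-03 at hourly frequency has 49 rows, which matches the shapes printed in the example on this page.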

Raises

Name Type Description
ValueError If no numeric weather columns are found, or if missing values cannot be filled after fetching.

Examples

import tempfile

import pandas as pd

from spotforecast2_safe.weather import get_weather_features

# Build a minimal synthetic reference DataFrame whose row count is
# consistent with the requested weather window so curate_weather
# validation passes without warnings.
forecast_horizon = 2
start = pd.Timestamp("2020-06-01", tz="UTC")
cov_end = pd.Timestamp("2020-06-03", tz="UTC")
data_end = cov_end - pd.Timedelta(hours=forecast_horizon)
data_idx = pd.date_range(start=start, end=data_end, freq="h", tz="UTC")
data = pd.DataFrame({"load": range(len(data_idx))}, index=data_idx)

cache_home = tempfile.mkdtemp()
weather_features, weather_aligned = get_weather_features(
    data=data,
    start=start,
    cov_end=cov_end,
    forecast_horizon=forecast_horizon,
    cache_home=cache_home,
    verbose=False,
)
print("weather_features shape:", weather_features.shape)
print("weather_aligned shape:", weather_aligned.shape)
print("weather_aligned columns (first 3):", list(weather_aligned.columns)[:3])
assert weather_features.shape[0] > 0
assert weather_aligned.shape[0] > 0
assert "temperature_2m" in weather_aligned.columns
# Rolling-window transformer adds more columns than raw aligned data
assert weather_features.shape[1] > weather_aligned.shape[1]

weather_features shape: (49, 105)
weather_aligned shape: (49, 15)
weather_aligned columns (first 3): ['temperature_2m', 'relative_humidity_2m', 'precipitation']