weather.client.WeatherService

weather.client.WeatherService(
    latitude,
    longitude,
    cache_path=None,
    use_forecast=True,
)

High-level service for weather data generation.

Extends WeatherClient with caching, hybrid fetching (archive+forecast), and fallback strategies.
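
The hybrid fetching mentioned above can be pictured as splitting the requested range at the current time, so that the past part is served from archive data and the future part from forecast data. The following is only a minimal sketch of that idea under stated assumptions: `split_range` is a hypothetical helper, not part of the library, and the real service may choose its cut-over point differently.

```python
from datetime import datetime, timezone


def split_range(start, end, now):
    """Split [start, end] at `now` into an archive part and a forecast part.

    Hypothetical helper for illustration only. Returns (archive, forecast);
    each element is a (start, end) tuple, or None if that part is empty.
    """
    if end < start:
        raise ValueError("end must not precede start")
    # Everything strictly before `now` would come from the archive source.
    archive = (start, min(end, now)) if start < now else None
    # Everything strictly after `now` would come from the forecast source.
    forecast = (max(start, now), end) if end > now else None
    return archive, forecast


now = datetime(2024, 1, 10, tzinfo=timezone.utc)
archive, forecast = split_range(
    datetime(2024, 1, 8, tzinfo=timezone.utc),
    datetime(2024, 1, 12, tzinfo=timezone.utc),
    now,
)
print("archive: ", archive)
print("forecast:", forecast)
```

Passing `now` explicitly keeps the sketch deterministic; a range that lies entirely in the past yields `forecast is None`, which roughly corresponds to a request that needs no forecast data.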

Parameters

Name Type Description Default
latitude float Latitude of the location. required
longitude float Longitude of the location. required
cache_path Path | None Optional path to a cache file for fetched data. If provided, the service tries to load from the cache before fetching and saves newly fetched data to this path. If None, caching is disabled. None
use_forecast bool Whether to use forecast data for future dates (default True). True
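
The load-before-fetch behaviour described for cache_path can be sketched as follows. This is an illustration under assumptions, not the service's actual code: `load_or_fetch` and `fake_fetch` are hypothetical names, and JSON is used instead of the service's own cache format to keep the sketch free of extra dependencies.

```python
import json
import tempfile
from pathlib import Path


def load_or_fetch(cache_path, fetch):
    """Return cached records if the cache file exists, else fetch and save.

    Hypothetical helper for illustration only.
    """
    if cache_path is not None and cache_path.exists():
        return json.loads(cache_path.read_text())
    records = fetch()
    if cache_path is not None:
        # Persist the freshly fetched data for the next call.
        cache_path.write_text(json.dumps(records))
    return records


calls = []


def fake_fetch():
    """Stand-in for a network request; records how often it is invoked."""
    calls.append(1)
    return [{"time": "2024-01-01T00:00:00Z", "temperature_2m": 1.0}]


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "weather_cache.json"
    first = load_or_fetch(path, fake_fetch)   # cache miss: fetches and saves
    second = load_or_fetch(path, fake_fetch)  # cache hit: no fetch

print(f"fetch calls: {len(calls)}")
```

With `cache_path=None` the helper fetches on every call, which mirrors the "no caching" default in the table above.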

Examples

# Requires a live connection to Open-Meteo APIs.
from pathlib import Path
import pandas as pd
from spotforecast2_safe.weather import WeatherService
client = WeatherService(latitude=52.52, longitude=13.405, cache_path=Path("weather_cache.parquet"))
start = pd.Timestamp("2023-01-01", tz="UTC")
end = pd.Timestamp("2023-01-07", tz="UTC")
df = client.get_dataframe(start=start, end=end, fill_missing=False)
print(df.head())
print(df.tail())

Construction and configuration do not require a network call:

from pathlib import Path
from spotforecast2_safe.weather.client import WeatherService

svc = WeatherService(
    latitude=51.03,
    longitude=7.57,
    cache_path=None,
    use_forecast=True,
)
assert svc.latitude == 51.03
assert svc.use_forecast is True
assert svc.cache_path is None
print(f"WeatherService at ({svc.latitude}, {svc.longitude}), use_forecast={svc.use_forecast}")

WeatherService at (51.03, 7.57), use_forecast=True

Methods

Name Description
get_dataframe Get weather DataFrame for a specified range using best available methods.

get_dataframe

weather.client.WeatherService.get_dataframe(
    start,
    end,
    timezone='UTC',
    freq='h',
    fallback_on_failure=True,
    fill_missing=False,
)

Get weather DataFrame for a specified range using best available methods.

Refactored from spotpredict.create_weather_df. Since the 1.0 major release, gaps that remain after the fetch are rejected by default, so that synthesised values are never passed to downstream consumers as if they were measurements. Pass fill_missing=True to opt into the legacy forward/back-fill behavior.

Parameters

Name Type Description Default
start str | pd.Timestamp Start date for the data. required
end str | pd.Timestamp End date for the data. required
timezone str Timezone for the data (default "UTC"). 'UTC'
freq str Frequency for the data (default "h"). 'h'
fallback_on_failure bool Whether to use fallback data on failure (default True). True
fill_missing bool Whether to forward- and back-fill remaining NaN gaps after fetch/resample (default False). When False (the fail-safe default), any remaining NaN raises ValueError with the gap timestamps. False

Raises

Name Type Description
ValueError If fill_missing=False and the merged frame still contains NaNs after resample.
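
To see why NaNs can remain "after resample", consider a frame in which one hourly observation is missing entirely. The sketch below uses only public pandas calls and does not reproduce the service's internal logic; the column name `temperature_2m` is borrowed from the examples on this page, and the data are synthetic.

```python
import pandas as pd

# Hourly readings with the 02:00 observation missing entirely.
idx = pd.to_datetime(
    ["2024-01-01 00:00", "2024-01-01 01:00", "2024-01-01 03:00"], utc=True
)
raw = pd.DataFrame({"temperature_2m": [1.0, 2.0, 4.0]}, index=idx)

# Resampling onto a regular hourly grid inserts a NaN row at 02:00.
hourly = raw.resample("h").mean()
gaps = hourly.index[hourly.isna().any(axis=1)]
print("gap timestamps:", list(gaps))

# Legacy-style imputation: forward-fill, then back-fill any leading gaps.
filled = hourly.ffill().bfill()
print(filled)
```

The filled value at 02:00 is a copy of the 01:00 reading rather than a measurement, which is the kind of synthesised value the fill_missing=False default is meant to keep out of downstream data.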

Examples

# Requires a live connection to Open-Meteo APIs.
import pandas as pd
from spotforecast2_safe.weather import WeatherService
client = WeatherService(latitude=51.0267, longitude=7.5693)
start = pd.Timestamp.now(tz="UTC") - pd.Timedelta(days=7)
end = pd.Timestamp.now(tz="UTC")
df = client.get_dataframe(start=start, end=end, fill_missing=False)
print(df.head())
print(df.tail())

The fill_missing=False default rejects frames with gaps. The ValueError path can be exercised offline by calling the private helper _finalize_df directly on a synthetic frame that contains a NaN row:

import pandas as pd
from spotforecast2_safe.weather.client import WeatherService

svc = WeatherService(latitude=51.03, longitude=7.57)
idx = pd.date_range("2024-01-01", periods=4, freq="h", tz="UTC")
df = pd.DataFrame({"temperature_2m": [1.0, float("nan"), 3.0, 4.0]}, index=idx)

try:
    svc._finalize_df(df, freq="h", fill_missing=False)
except ValueError as exc:
    print(f"ValueError raised as expected: {exc}")

# With fill_missing=True the gap is imputed silently
filled = svc._finalize_df(df.copy(), freq="h", fill_missing=True)
assert not filled.isna().any().any()
print("fill_missing=True: no NaNs remain")

ValueError raised as expected: 1 missing row(s) in weather frame after resample at freq='h'. First gaps: [2024-01-01 01:00:00+00:00]. Pass fill_missing=True to opt into legacy ffill/bfill imputation.
fill_missing=True: no NaNs remain