calendar.holiday.get_holiday_features

calendar.holiday.get_holiday_features(
    data,
    start,
    cov_end,
    forecast_horizon,
    tz='UTC',
    freq='h',
    country_code='DE',
    state='NW',
)

Build public-holiday indicators and align them to a regular time grid.

Generates holiday indicators via create_holiday_df(), validates coverage with curate_holidays(), and reindexes the result onto the full [start, cov_end] grid at the requested freq. Timestamps absent from the holiday frame are filled with fill_value=0, so non-holiday timestamps are always zero.
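The reindexing step can be illustrated with plain pandas. This is a minimal sketch, not the library's implementation: the `sparse` frame is a hypothetical stand-in for whatever create_holiday_df() returns, and it assumes that frame holds rows only for holiday timestamps.

```python
import pandas as pd

# Hypothetical sparse holiday frame (assumption): only holiday timestamps present.
sparse = pd.DataFrame(
    {"is_holiday": [1, 1]},
    index=pd.DatetimeIndex(["2024-01-01 00:00", "2024-01-01 01:00"], tz="UTC"),
)

# Full regular grid over the closed interval [start, cov_end].
start = pd.Timestamp("2024-01-01 00:00", tz="UTC")
cov_end = pd.Timestamp("2024-01-01 03:00", tz="UTC")
grid = pd.date_range(start=start, end=cov_end, freq="h")

# Timestamps missing from the sparse frame are filled with 0.
aligned = sparse.reindex(grid, fill_value=0)
print(aligned["is_holiday"].tolist())  # [1, 1, 0, 0]
```

Because fill_value=0 is an integer, the column keeps its integer dtype instead of being upcast to float by NaN fills.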

Parameters

Name Type Description Default
data pd.DataFrame Reference time series DataFrame used for temporal coverage validation inside curate_holidays(). required
start Union[str, pd.Timestamp] Start timestamp. String values are parsed with utc=True. required
cov_end Union[str, pd.Timestamp] Inclusive end timestamp (should cover the full forecast horizon). String values are parsed with utc=True. required
forecast_horizon int Number of forecast steps ahead; passed to curate_holidays(). required
tz str Timezone applied to the generated index and passed to create_holiday_df(). Defaults to "UTC". 'UTC'
freq str Pandas-compatible frequency string for the output index. Defaults to "h" (hourly). 'h'
country_code str ISO 3166-1 alpha-2 country code. Defaults to "DE" (Germany). 'DE'
state str Sub-national state/region code. Defaults to "NW" (North Rhine-Westphalia). 'NW'
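The effect of parsing string bounds with utc=True can be checked directly in pandas. The sketch below uses pd.to_datetime and pd.date_range on their own; whether get_holiday_features calls exactly these functions internally is an assumption.

```python
import pandas as pd

# A naive string is interpreted as UTC when utc=True.
start = pd.to_datetime("2024-01-01", utc=True)

# A string carrying an offset is converted to UTC, not merely relabelled.
shifted = pd.to_datetime("2024-01-01T01:00:00+01:00", utc=True)

# An inclusive end three days later yields 72 hourly steps.
cov_end = pd.to_datetime("2024-01-03 23:00", utc=True)
grid = pd.date_range(start, cov_end, freq="h")

print(start == shifted)  # True
print(len(grid))         # 72
```

Since pd.date_range includes both endpoints by default, cov_end must be the last timestamp you want in the output, not one step past it.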

Returns

Name Type Description
pd.DataFrame DataFrame with a single integer column is_holiday. The index is a tz-aware DatetimeIndex with the requested freq.

Examples

import pandas as pd
from spotforecast2_safe.calendar import get_holiday_features

# Build a minimal synthetic reference DataFrame.
# curate_holidays requires: holiday_df.shape[0] == data.shape[0] + forecast_horizon.
# With n_data=48 rows and forecast_horizon=24, we need 72 hourly steps total,
# so cov_end = start + 71 h (inclusive date_range).
forecast_horizon = 24
n_data = 48
data = pd.DataFrame(
    {"load": range(n_data)},
    index=pd.date_range("2024-01-01", periods=n_data, freq="h", tz="UTC"),
)
start = data.index[0]
cov_end = start + pd.Timedelta(hours=(n_data + forecast_horizon - 1))

hf = get_holiday_features(
    data=data,
    start=start,
    cov_end=cov_end,
    forecast_horizon=forecast_horizon,
    country_code="DE",
    state="NW",
)
print("shape:", hf.shape)
print("columns:", hf.columns.tolist())
# New Year's Day (2024-01-01) is a public holiday in Germany.
print("Jan 1 00:00 is_holiday:", hf.loc["2024-01-01 00:00:00+00:00", "is_holiday"])
assert hf.shape == (n_data + forecast_horizon, 1)
assert hf.loc["2024-01-01 00:00:00+00:00", "is_holiday"] == 1

shape: (72, 1)
columns: ['is_holiday']
Jan 1 00:00 is_holiday: 1