calendar.holiday.create_school_holiday_df

calendar.holiday.create_school_holiday_df(
    start,
    end,
    tz='UTC',
    freq='h',
    country_code='DE',
    state='NW',
)

Create a DataFrame with a binary school-holiday indicator for a German state.

Builds a tz-aware time grid over [start, end] at freq and marks every timestamp that falls within a school-holiday period of the requested Bundesland as 1; all others are 0. Both edges of each interval are inclusive.
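The marking logic described above can be sketched with plain pandas. This is a minimal illustration, not the library's implementation: the helper name mark_school_holidays and the hard-coded periods list are hypothetical, and the sketch assumes that a timestamp counts as a holiday when its calendar date, read in the grid's timezone, falls inside an inclusive date interval.

```python
import pandas as pd

# Hypothetical holiday periods as (first_day, last_day), both inclusive.
# The real function takes its periods from the OpenHolidays data instead.
periods = [("2024-07-08", "2024-08-20")]


def mark_school_holidays(start, end, periods, tz="UTC", freq="h"):
    """Return a 0/1 indicator on a tz-aware grid (illustrative sketch only)."""
    # Build the tz-aware grid over [start, end]; date_range includes both ends.
    index = pd.date_range(start=start, end=end, freq=freq, tz=tz)

    # Assumption: compare on the calendar date in the grid's timezone.
    # tz_localize(None) keeps the wall-clock time and drops the tz info.
    dates = index.tz_localize(None).normalize()

    flag = pd.Series(0, index=index, dtype="int64")
    for first, last in periods:
        in_period = (dates >= pd.Timestamp(first)) & (dates <= pd.Timestamp(last))
        flag.loc[in_period] = 1

    return flag.to_frame("is_school_holiday")


df = mark_school_holidays("2024-07-06", "2024-07-10", periods, freq="D")
print(df)
```

Comparing on normalized dates is one way to make the last day of a period inclusive for sub-daily grids; whether the library resolves day boundaries in UTC or in local time is not stated here, so treat that choice as an assumption of the sketch.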

Data source: OpenHolidays API (https://openholidaysapi.org), ODbL-1.0. Coverage: 2022-01-01 to 2027-12-31 for all 16 German Bundesländer.

Only country_code="DE" is supported. A request whose span extends beyond the covered range at either edge raises ValueError; missing periods are neither filled nor extrapolated.
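The two failure conditions can be pictured as a simple pre-check. The sketch below is an assumption about the shape of such a check, not the library's code: check_request and the two constants are hypothetical names, and the bounds are copied from the coverage statement on this page.

```python
from datetime import date

# Coverage bounds as stated in this page (assumed to be inclusive).
COVERAGE_START = date(2022, 1, 1)
COVERAGE_END = date(2027, 12, 31)


def check_request(start: date, end: date, country_code: str = "DE") -> None:
    """Hypothetical validation helper; raises ValueError on unsupported input."""
    if country_code != "DE":
        raise ValueError(
            f"Unsupported country_code {country_code!r}; only 'DE' is available."
        )
    # Reject the request if either edge leaves the covered range.
    if start < COVERAGE_START or end > COVERAGE_END:
        raise ValueError(
            f"Requested span {start} to {end} is outside the covered range "
            f"{COVERAGE_START} to {COVERAGE_END}."
        )


# A span fully inside the covered range passes silently.
check_request(date(2024, 7, 6), date(2024, 7, 10))
```

Callers that build long training grids may want to run a check like this up front, or wrap the call in try/except ValueError, so that an out-of-range request fails before any downstream feature engineering starts.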

Parameters

Name Type Description Default
start str | pd.Timestamp Start date/datetime of the requested grid. required
end str | pd.Timestamp End date/datetime of the requested grid (inclusive). required
tz str Timezone for the resulting index. Ignored when start or end is already a tz-aware pd.Timestamp. 'UTC'
freq str Pandas-compatible frequency string. Defaults to "h" (hourly). 'h'
country_code str Must be "DE" (Germany). Any other value raises ValueError. 'DE'
state str ISO 3166-2 subdivision short code for the Bundesland, e.g. "NW" (North Rhine-Westphalia), "BY" (Bavaria). Defaults to "NW". 'NW'

Returns

Name Type Description
pd.DataFrame Single integer column is_school_holiday (values in {0, 1}; no NaNs) with a tz-aware DatetimeIndex at freq.

Raises

Name Type Description
ValueError If country_code is not "DE", or if the requested span extends beyond the dataset validity range at either edge.

Examples

from spotforecast2_safe.calendar import create_school_holiday_df

# NW Sommerferien 2024: 2024-07-08 → 2024-08-20 (inclusive).
# Day before (2024-07-07) must be 0; first day (2024-07-08) must be 1.
df = create_school_holiday_df(
    "2024-07-06", "2024-07-10", freq="D", state="NW"
)
print(df)
assert df.loc["2024-07-07", "is_school_holiday"] == 0
assert df.loc["2024-07-08", "is_school_holiday"] == 1
assert df.loc["2024-07-09", "is_school_holiday"] == 1

                           is_school_holiday
2024-07-06 00:00:00+00:00                  0
2024-07-07 00:00:00+00:00                  0
2024-07-08 00:00:00+00:00                  1
2024-07-09 00:00:00+00:00                  1
2024-07-10 00:00:00+00:00                  1