API Overview & Getting Started

This page provides a high-level introduction to spotforecast2-safe’s public API and key concepts. For detailed API documentation, see the API Reference.

Main Entry Points

The spotforecast2-safe library organizes functionality into six major modules:

  • Data: Fetching and managing time series, weather, and holiday data
  • Preprocessing: Feature engineering, data curation, and transformation
  • Processing: Utilities for handling timestamps and temporal conversions
  • Forecaster: Recursive forecasting models (ForecasterRecursive, ForecasterEquivalentDate)
  • Utils: CPE generation, configuration, validation, and helper functions
  • Weather: Climate data integration

Quick Start

1. Import Core Components

from spotforecast2_safe.preprocessing import (
    ExogBuilder,
    RollingFeatures,
)
from spotforecast2_safe.manager.models import (
    ForecasterRecursiveXGB,
    ForecasterRecursiveLGBM,
    ForecasterRecursiveModel,
)
from spotforecast2_safe.forecaster.recursive import ForecasterRecursive
from spotforecast2_safe.utils import generate_holiday

2. Load & Prepare Data

import pandas as pd
import numpy as np

# Create sample time series data (index with explicit freq for forecasting)
np.random.seed(0)
dates = pd.date_range('2020-01-01', periods=100, freq='D')
values = np.random.randn(100).cumsum()
df = pd.DataFrame({'value': values}, index=dates)

3. Define Forecasting Period

# Define train/test split dates for temporal validation
# (chosen to fall inside the 100-day sample series created above)
train_end = '2020-03-20'
test_start = '2020-03-21'
test_end = '2020-04-09'

4. Create Rolling Features

from sklearn.linear_model import Ridge
from spotforecast2_safe.forecaster.recursive import ForecasterRecursive

# Split series; RollingFeatures is passed as a window_feature to the forecaster
y = df['value']
y_train = y.iloc[:80]
rolling_features = RollingFeatures(stats='mean', window_sizes=7)
print(f"RollingFeatures: stats={rolling_features.stats}, window_sizes={rolling_features.window_sizes}")
RollingFeatures: stats=['mean'], window_sizes=[7]
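
To make the statistic concrete, the sketch below computes a 7-step trailing mean in plain Python. It only illustrates what a rolling-mean window feature represents. The helper name trailing_mean is hypothetical, and how RollingFeatures aligns windows and treats the warm-up period internally is an assumption here, not something taken from the library.

```python
from statistics import fmean


def trailing_mean(values, window):
    """Mean of the `window` observations that precede each position.

    Positions without a full window of history yield None.
    (Hypothetical helper for illustration only.)
    """
    out = []
    for i in range(len(values)):
        if i < window:
            out.append(None)  # not enough history yet
        else:
            out.append(fmean(values[i - window:i]))
    return out


series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
features = trailing_mean(series, window=7)
print(features)
```

The first seven positions have no complete window, which is why a forecaster needs at least as many observations as its largest window before it can build a training row.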

5. Train Recursive Forecaster

# Build forecaster with rolling-mean window features and fit on training data
forecaster = ForecasterRecursive(
    estimator=Ridge(),
    lags=7,
    window_features=[rolling_features],
)
forecaster.fit(y_train)
y_pred = forecaster.predict(steps=7)
print(y_pred.round(3))
2020-03-21   -1.393
2020-03-22   -0.756
2020-03-23   -0.798
2020-03-24   -1.347
2020-03-25   -2.012
2020-03-26   -2.435
2020-03-27   -2.488
Freq: D, Name: pred, dtype: float64

Key Concepts

Period Management

The Period dataclass encodes cyclical temporal features (e.g. hour-of-day, day-of-week):

from spotforecast2_safe.data import Period

# Encode hour-of-day as a cyclical feature
period = Period(name="hour", n_periods=24, column="hour", input_range=(0, 23))
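
As general background on why cyclical features need a dedicated encoding, the following sketch maps hour-of-day onto a circle with sine and cosine, so that 23:00 and 00:00 end up close together. This is a common technique chosen for illustration and is not necessarily the encoding that Period applies; encode_cyclical is a hypothetical helper.

```python
import math


def encode_cyclical(value, low, high):
    """Map an integer in [low, high] to a (sin, cos) point on the unit circle.

    Hypothetical helper; the inclusive range mirrors input_range=(0, 23).
    """
    n_positions = high - low + 1
    angle = 2.0 * math.pi * (value - low) / n_positions
    return math.sin(angle), math.cos(angle)


h0 = encode_cyclical(0, 0, 23)
h12 = encode_cyclical(12, 0, 23)
h23 = encode_cyclical(23, 0, 23)

# Midnight and 23:00 are neighbours on the circle; midnight and noon are opposite
print(f"distance 23h to 0h: {math.dist(h23, h0):.3f}")
print(f"distance 12h to 0h: {math.dist(h12, h0):.3f}")
```

With a raw integer column the model would see 23 and 0 as maximally far apart, which is the problem any cyclical encoding is meant to avoid.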

For train/test splits, use plain date strings with pandas boolean indexing:

train_end = '2023-06-30'
test_start = '2023-07-01'

See Period API for details.

Recursive Forecasting

Recursive forecasting predicts multiple steps ahead by feeding model predictions back as inputs. The main classes are:

  • ForecasterRecursive: Base class for recursive forecasters
  • ForecasterRecursiveLGBM: LightGBM implementation (recommended for most use cases)
  • ForecasterRecursiveXGB: XGBoost implementation

ForecasterRecursive can also wrap a LightGBM regressor directly:

from lightgbm import LGBMRegressor

forecaster_lgbm = ForecasterRecursive(
    estimator=LGBMRegressor(n_jobs=1, verbose=-1, random_state=42),
    lags=7,
)
forecaster_lgbm.fit(y_train)
forecast = forecaster_lgbm.predict(steps=7)
print(forecast.round(3))
IgnoredArgumentWarning
The number of bins has been reduced from 10 to 9 due to duplicated edges caused by repeated predicted values.
Category : spotforecast2.exceptions.IgnoredArgumentWarning
Location : /home/runner/work/spotforecast2-safe/spotforecast2-safe/src/spotforecast2_safe/preprocessing/_binner.py:231
Suppress : warnings.simplefilter('ignore', category=IgnoredArgumentWarning)
2020-03-21   -0.401
2020-03-22   -0.401
2020-03-23   -0.401
2020-03-24   -0.401
2020-03-25   -0.401
2020-03-26   -0.401
2020-03-27   -0.401
Freq: D, Name: pred, dtype: float64
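
The feedback loop that gives recursive forecasting its name can be sketched without any library. The code below is a minimal illustration under stated assumptions: recursive_forecast and mean_of_lags are hypothetical names, and the one-step "model" is simply the mean of the lag window rather than a fitted estimator.

```python
def recursive_forecast(history, predict_one, n_lags, steps):
    """Forecast `steps` values, feeding each prediction back in as the newest lag."""
    window = list(history[-n_lags:])
    predictions = []
    for _ in range(steps):
        y_hat = predict_one(window)
        predictions.append(y_hat)
        # Drop the oldest lag and append the prediction as if it were observed
        window = window[1:] + [y_hat]
    return predictions


def mean_of_lags(lags):
    """Toy one-step model: predict the average of the lag window."""
    return sum(lags) / len(lags)


history = [1.0, 2.0, 3.0]
preds = recursive_forecast(history, mean_of_lags, n_lags=3, steps=3)
print([round(p, 4) for p in preds])
```

Because later steps depend on earlier predictions rather than on observed values, errors can accumulate over the forecast horizon.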

See ForecasterRecursive Guide for detailed examples.

Feature Engineering

The ExogBuilder class constructs exogenous (external) features:

builder = ExogBuilder(
    periods=[Period(name="hour", n_periods=24, column="hour", input_range=(0, 23))],
    country_code="DE",
)
exog = builder.build(
    start_date=pd.Timestamp('2023-01-01', tz='UTC'),
    end_date=pd.Timestamp('2023-01-03', tz='UTC'),
)
print(f"Exogenous features: {exog.shape[1]} columns, {exog.shape[0]} rows")
print(exog.columns[:5].tolist())
Exogenous features: 26 columns, 49 rows
['hour_0', 'hour_1', 'hour_2', 'hour_3', 'hour_4']

Holiday Integration

Generate holiday calendars for demand forecasting:

from spotforecast2_safe.utils import create_holiday_df

holidays = create_holiday_df(
    start='2023-01-01',
    end='2023-12-31',
    country_code='DE',
)

See Holiday Generation API for details.

Model Persistence (Saving/Loading)

Save trained models for production deployment:

import tempfile
from pathlib import Path
import spotforecast2_safe.manager.persistence as persistence

with tempfile.NamedTemporaryFile(suffix='.pkl', delete=False) as tmp:
    model_path = Path(tmp.name)

persistence.dump(forecaster, model_path)
loaded_forecaster = persistence.load(model_path)
print(f"Saved and loaded: {type(loaded_forecaster).__name__}")
model_path.unlink()
Saved and loaded: ForecasterRecursive

See Model Persistence Guide for details.

Safety-Critical Properties

All spotforecast2-safe operations maintain these critical properties:

Determinism

The same input always produces identical output (bit-level reproducible):

pred1 = forecaster.predict(steps=7)
pred2 = forecaster.predict(steps=7)
assert (pred1 == pred2).all()
print("Determinism verified: identical predictions on repeated calls.")
Determinism verified: identical predictions on repeated calls.
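
For readers who want to check reproducibility at the byte level rather than with ==, one possible approach is to compare the packed IEEE 754 representation of two runs. The sketch below uses only the standard library; toy_predict is a hypothetical stand-in for a fitted forecaster, and float_bits is an illustrative helper, not part of spotforecast2-safe.

```python
import struct


def float_bits(values):
    """Pack floats as little-endian IEEE 754 doubles for byte-wise comparison."""
    return struct.pack(f"<{len(values)}d", *values)


def toy_predict(inputs, coef=0.5, intercept=1.0):
    """Hypothetical deterministic model: a fixed linear map."""
    return [coef * x + intercept for x in inputs]


run1 = toy_predict([1.0, 2.0, 3.0])
run2 = toy_predict([1.0, 2.0, 3.0])
identical = float_bits(run1) == float_bits(run2)

# Nearly equal is not bit-identical: 0.1 + 0.2 differs from 0.3 in its last bits
close_but_different = float_bits([0.1 + 0.2]) != float_bits([0.3])
print(identical, close_but_different)
```

A byte-wise comparison is stricter than a tolerance-based check such as math.isclose, which is the distinction the phrase "bit-level reproducible" refers to.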

Fail-Safe Operation

Invalid data raises an explicit error instead of failing silently:

# Missing values raise an explicit error — no silent NaN propagation
y_with_nans = pd.Series([1.0, np.nan, 3.0, 4.0, 5.0],
                        index=pd.date_range('2020-01-01', periods=5, freq='D'))
try:
    bad_forecaster = ForecasterRecursive(estimator=Ridge(), lags=2)
    bad_forecaster.fit(y_with_nans)
except ValueError as e:
    print(f"ValueError: {e}")
ValueError: `y` has missing values.

Auditability

All transformations are traceable with clear, white-box code. The source code is visible via:

  • Docstrings (in editor)
  • Automatic API documentation (Quarto)
  • GitHub repository

Complete Example: End-to-End Forecasting

import pandas as pd
from importlib.resources import files
from sklearn.linear_model import Ridge
from spotforecast2_safe.forecaster.recursive import ForecasterRecursive
from spotforecast2_safe.preprocessing import RollingFeatures
from spotforecast2_safe.utils import create_holiday_df

# 1. Load demo02.csv (column A, 1970–1972, resampled to regular hourly grid)
path = files('spotforecast2_safe.datasets.csv').joinpath('demo02.csv')
df = pd.read_csv(path, index_col='DateTime', parse_dates=True)
df = df.sort_index().loc['1970-01-01':'1972-12-31'].resample('h').mean()
y = df['A'].ffill()
print(f"Series: {len(y)} hourly obs  ({y.index[0]} – {y.index[-1]})")

# 2. Holiday calendar as exogenous feature (German public holidays)
holidays = create_holiday_df(start='1970-01-01', end='1972-12-31', country_code='DE')
holidays.index = holidays.index.tz_localize(None)   # align to tz-naive index
exog = holidays.reindex(y.index).fillna(0)          # missing slots → non-holiday

# 3. Train / test split  (2 years train, 1 year test)
train_end = '1971-12-31 23:00'
y_train     = y[y.index    <= train_end]
y_test      = y[y.index    >  train_end]
exog_train  = exog[exog.index <= train_end]
exog_test   = exog[exog.index >  train_end]
print(f"Train: {len(y_train)} obs — Test: {len(y_test)} obs")

# 4. Build forecaster: lag 1 h / 1 day / 1 week + 24-h rolling mean + holiday exog
forecaster = ForecasterRecursive(
    estimator=Ridge(),
    lags=[1, 24, 168],
    window_features=[RollingFeatures(stats='mean', window_sizes=24)],
)
forecaster.fit(y_train, exog=exog_train)

# 5. Forecast the first 24 hours of the test set
exog_future = exog_test.iloc[:24]
forecast = forecaster.predict(steps=24, exog=exog_future)

# 6. Evaluate: first 10 hours as a comparison table ('mae' holds the absolute error per hour)
results = pd.DataFrame({
    'y_test':   y_test.iloc[:10].values,
    'forecast': forecast.iloc[:10].values,
    'mae':      (y_test.iloc[:10] - forecast.iloc[:10]).abs().values,
}, index=forecast.iloc[:10].index)
results.round(4)
Series: 26304 hourly obs  (1970-01-01 00:00:00 – 1972-12-31 23:00:00)
Train: 17520 obs — Test: 8784 obs
                      y_test  forecast     mae
1972-01-01 00:00:00   0.1110    0.0202  0.0907
1972-01-01 01:00:00   0.1237    0.0204  0.1033
1972-01-01 02:00:00   0.0782    0.0236  0.0546
1972-01-01 03:00:00  -0.0859    0.0198  0.1057
1972-01-01 04:00:00  -0.0826    0.0152  0.0977
1972-01-01 05:00:00  -0.0982    0.0108  0.1090
1972-01-01 06:00:00  -0.0891    0.0087  0.0978
1972-01-01 07:00:00   0.0794    0.0081  0.0712
1972-01-01 08:00:00   0.1457    0.0084  0.1373
1972-01-01 09:00:00  -0.1351    0.0090  0.1441
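
The mae column above lists one absolute error per hour. To condense the ten rows into a single mean absolute error, the values can be averaged, as in this small sketch. The numbers are copied from the rounded table, so the result is approximate and covers only these ten hours, not the full 24-step forecast.

```python
from statistics import fmean

# Copied from the y_test and forecast columns of the table (rounded to 4 decimals)
y_test = [0.1110, 0.1237, 0.0782, -0.0859, -0.0826,
          -0.0982, -0.0891, 0.0794, 0.1457, -0.1351]
forecast = [0.0202, 0.0204, 0.0236, 0.0198, 0.0152,
            0.0108, 0.0087, 0.0081, 0.0084, 0.0090]

# Absolute error per hour, then the mean over the ten hours
abs_errors = [abs(actual - predicted) for actual, predicted in zip(y_test, forecast)]
mae = fmean(abs_errors)
print(f"MAE over the first 10 hours: {mae:.3f}")
```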

Documentation Organization

The complete documentation is organized as follows:

Next Steps

  1. Quick Start: Follow the Quick Start example above
  2. Learn Core Concepts: Read about Period Management and Recursive Forecasting
  3. Explore Examples: Check out ForecasterRecursive Guide
  4. API Reference: Dive into specific modules in API Documentation
  5. Contribute: See Contributing Guide to contribute improvements

Troubleshooting

For common issues and solutions:

  • Data validation errors: Ensure all input data is clean (no NaNs or Infs)
  • Import errors: Verify that the package is installed (for example, by running uv sync)
  • Version compatibility: Check that you are using Python 3.13 or later
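
For the data validation case, a quick pre-check can locate offending values before they reach a forecaster. The following is a minimal standard-library sketch; find_non_finite and require_clean are hypothetical helpers, not functions provided by spotforecast2-safe.

```python
import math


def find_non_finite(values):
    """Return the positions of every NaN or infinite entry."""
    return [i for i, v in enumerate(values) if not math.isfinite(v)]


def require_clean(values):
    """Raise ValueError if any entry is NaN or infinite; otherwise return the input."""
    bad_positions = find_non_finite(values)
    if bad_positions:
        raise ValueError(f"non-finite values at positions {bad_positions}")
    return values


raw = [1.0, float("nan"), 3.0, float("inf"), 5.0]
print(find_non_finite(raw))

try:
    require_clean(raw)
except ValueError as err:
    print(f"ValueError: {err}")
```

Reporting the positions makes it easier to decide whether to drop, interpolate, or re-fetch the affected observations.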

Before reporting a new issue, search the publicly available archives:

  • Issues Archive: Browse all reported bugs, feature requests, and their resolutions. Use the search feature to find similar problems.
  • Discussions Archive: Search community questions and answers for help with common tasks.

See Also