API Overview & Getting Started
This page provides a high-level introduction to spotforecast2-safe’s public API and key concepts. For detailed API documentation, see the API Reference.
Main Entry Points
The spotforecast2-safe library organizes functionality into six major modules:
- Data: Fetching and managing time series, weather, and holiday data
- Preprocessing: Feature engineering, data curation, and transformation
- Processing: Utilities for handling timestamps and temporal conversions
- Forecaster: Recursive forecasting models (ForecasterRecursive, ForecasterEquivalentDate)
- Utils: CPE generation, configuration, validation, and helper functions
- Weather: Climate data integration
Quick Start
1. Import Core Components
from spotforecast2_safe.preprocessing import (
    ExogBuilder,
    RollingFeatures,
)
from spotforecast2_safe.manager.models import (
    ForecasterRecursiveXGB,
    ForecasterRecursiveLGBM,
    ForecasterRecursiveModel,
)
from spotforecast2_safe.forecaster.recursive import ForecasterRecursive
from spotforecast2_safe.utils import generate_holiday
2. Load & Prepare Data
import pandas as pd
import numpy as np
# Create sample time series data (index with explicit freq for forecasting)
np.random.seed(0)
dates = pd.date_range('2020-01-01', periods=100, freq='D')
values = np.random.randn(100).cumsum()
df = pd.DataFrame({'value': values}, index=dates)
3. Define Forecasting Period
# Define train/test split dates for temporal validation
# (matched to the 100-day sample series: first 80 days train, last 20 days test)
train_end = '2020-03-20'
test_start = '2020-03-21'
test_end = '2020-04-09'
4. Create Rolling Features
from sklearn.linear_model import Ridge
from spotforecast2_safe.forecaster.recursive import ForecasterRecursive
# Split series; RollingFeatures is passed as a window_feature to the forecaster
y = df['value']
y_train = y.iloc[:80]
rolling_features = RollingFeatures(stats='mean', window_sizes=7)
print(f"RollingFeatures: stats={rolling_features.stats}, window_sizes={rolling_features.window_sizes}")

RollingFeatures: stats=['mean'], window_sizes=[7]
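To make the idea of a rolling-mean window feature concrete, here is a minimal pandas sketch. It assumes that the feature for a time step is the mean of the 7 observations strictly before it, which this page does not confirm for RollingFeatures. The helper name rolling_mean_feature is hypothetical and is not part of spotforecast2-safe.

```python
import numpy as np
import pandas as pd


def rolling_mean_feature(y: pd.Series, window: int) -> pd.Series:
    """Mean of the `window` observations strictly before each time step.

    Hypothetical helper for illustration; not the library's implementation.
    """
    # shift(1) excludes the current value, so the feature never sees the target
    return y.shift(1).rolling(window).mean()


# Same sample series as in step 2
np.random.seed(0)
dates = pd.date_range('2020-01-01', periods=100, freq='D')
y = pd.Series(np.random.randn(100).cumsum(), index=dates, name='value')

feature = rolling_mean_feature(y, window=7)
# The first 7 entries are NaN: a full window of past values is not yet available
print(feature.iloc[5:10].round(3))
```

Under this assumption, the first usable training row is the eighth observation, because earlier rows lack a complete window.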
5. Train Recursive Forecaster
# Build forecaster with rolling-mean window features and fit on training data
forecaster = ForecasterRecursive(
estimator=Ridge(),
lags=7,
window_features=[rolling_features],
)
forecaster.fit(y_train)
y_pred = forecaster.predict(steps=7)
print(y_pred.round(3))

2020-03-21 -1.393
2020-03-22 -0.756
2020-03-23 -0.798
2020-03-24 -1.347
2020-03-25 -2.012
2020-03-26 -2.435
2020-03-27 -2.488
Freq: D, Name: pred, dtype: float64
Key Concepts
Period Management
The Period dataclass encodes cyclical temporal features (e.g. hour-of-day, day-of-week):
from spotforecast2_safe.data import Period
# Encode hour-of-day as a cyclical feature
period = Period(name="hour", n_periods=24, column="hour", input_range=(0, 23))
For train/test splits, use plain date strings with pandas boolean indexing:
train_end = '2023-06-30'
test_start = '2023-07-01'
See Period API for details.
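As a small illustration of the boolean-indexing split described above, the following sketch applies the two date strings to a stand-in daily series. The series itself is hypothetical; only the split pattern matters.

```python
import pandas as pd

# Hypothetical daily series covering 2023 (values are just a running counter)
index = pd.date_range('2023-01-01', '2023-12-31', freq='D')
y = pd.Series(range(len(index)), index=index, name='value')

train_end = '2023-06-30'
test_start = '2023-07-01'

# Comparing a DatetimeIndex with a date string yields a boolean mask
y_train = y[y.index <= train_end]
y_test = y[y.index >= test_start]

print(f"Train: {len(y_train)} obs, Test: {len(y_test)} obs")
```

For sub-daily data, a bare date string is interpreted as midnight of that day, so the cut-off should include the time (for example '2023-06-30 23:00' for hourly data).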
Recursive Forecasting
Recursive forecasting predicts multiple steps ahead by feeding model predictions back as inputs. The main classes are:
- ForecasterRecursive: Base class for recursive forecasters
- ForecasterRecursiveLGBM: LightGBM implementation (recommended for most use cases)
- ForecasterRecursiveXGB: XGBoost implementation
from lightgbm import LGBMRegressor
forecaster_lgbm = ForecasterRecursive(
estimator=LGBMRegressor(n_jobs=1, verbose=-1, random_state=42),
lags=7,
)
forecaster_lgbm.fit(y_train)
forecast = forecaster_lgbm.predict(steps=7)
print(forecast.round(3))

IgnoredArgumentWarning
The number of bins has been reduced from 10 to 9 due to duplicated edges caused by repeated predicted values.
Category : spotforecast2.exceptions.IgnoredArgumentWarning
Location : /home/runner/work/spotforecast2-safe/spotforecast2-safe/src/spotforecast2_safe/preprocessing/_binner.py:231
Suppress : warnings.simplefilter('ignore', category=IgnoredArgumentWarning)
2020-03-21 -0.401
2020-03-22 -0.401
2020-03-23 -0.401
2020-03-24 -0.401
2020-03-25 -0.401
2020-03-26 -0.401
2020-03-27 -0.401
Freq: D, Name: pred, dtype: float64
See ForecasterRecursive Guide for detailed examples.
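The feedback loop behind recursive forecasting can be sketched in a few lines of plain Python. This is an illustration of the general technique, not the library's implementation: the function recursive_forecast and the toy model mean_of_lags are hypothetical names introduced here.

```python
from typing import Callable, List, Sequence


def recursive_forecast(
    history: Sequence[float],
    one_step_model: Callable[[List[float]], float],
    lags: int,
    steps: int,
) -> List[float]:
    """Forecast `steps` values by feeding each prediction back as an input."""
    window = list(history[-lags:])  # most recent `lags` observations
    predictions = []
    for _ in range(steps):
        y_hat = one_step_model(window)
        predictions.append(y_hat)
        # Drop the oldest value and append the prediction as the newest input
        window = window[1:] + [y_hat]
    return predictions


def mean_of_lags(window: List[float]) -> float:
    """Toy one-step model (an assumption for this sketch): average of the lags."""
    return sum(window) / len(window)


history = [1.0, 2.0, 3.0, 4.0]
forecast = recursive_forecast(history, mean_of_lags, lags=2, steps=3)
print(forecast)  # [3.5, 3.75, 3.625]
```

Because every step after the first consumes earlier predictions, errors can accumulate over the horizon, which is one reason to evaluate multi-step forecasts rather than only one-step accuracy.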
Feature Engineering
The ExogBuilder class constructs exogenous (external) features:
builder = ExogBuilder(
periods=[Period(name="hour", n_periods=24, column="hour", input_range=(0, 23))],
country_code="DE",
)
exog = builder.build(
start_date=pd.Timestamp('2023-01-01', tz='UTC'),
end_date=pd.Timestamp('2023-01-03', tz='UTC'),
)
print(f"Exogenous features: {exog.shape[1]} columns, {exog.shape[0]} rows")
print(exog.columns[:5].tolist())

Exogenous features: 26 columns, 49 rows
['hour_0', 'hour_1', 'hour_2', 'hour_3', 'hour_4']
Holiday Integration
Generate holiday calendars for demand forecasting:
from spotforecast2_safe.utils import create_holiday_df
holidays = create_holiday_df(
start='2023-01-01',
end='2023-12-31',
country_code='DE',
)
See Holiday Generation API for details.
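To show what a holiday feature looks like once aligned to a forecasting index, here is a library-independent pandas sketch. The hand-written date list and the column name holiday are assumptions for illustration; the actual output format of create_holiday_df may differ.

```python
import pandas as pd

# Assumption: a short hand-written list of German public holidays in 2023
# (New Year's Day, Good Friday, Easter Monday)
holiday_dates = pd.to_datetime(['2023-01-01', '2023-04-07', '2023-04-10'])

# Hourly index covering two days: one holiday, one regular day
hourly_index = pd.date_range('2023-01-01', '2023-01-02 23:00', freq='h')

# normalize() maps each timestamp to midnight of its day, so every hour of a
# holiday receives the flag 1 and every other hour receives 0
flags = hourly_index.normalize().isin(holiday_dates).astype(int)
is_holiday = pd.Series(flags, index=hourly_index, name='holiday')

print(is_holiday.groupby(is_holiday.index.date).sum())
```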
Model Persistence (Saving/Loading)
Save trained models for production deployment:
import tempfile
from pathlib import Path
import spotforecast2_safe.manager.persistence as persistence
with tempfile.NamedTemporaryFile(suffix='.pkl', delete=False) as tmp:
    model_path = Path(tmp.name)
persistence.dump(forecaster, model_path)
loaded_forecaster = persistence.load(model_path)
print(f"Saved and loaded: {type(loaded_forecaster).__name__}")
model_path.unlink()

Saved and loaded: ForecasterRecursive
See Model Persistence Guide for details.
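As a complementary, library-independent sketch, a deployment could record a SHA-256 checksum when a model file is written and verify it before loading. Everything below is an assumption for illustration: the helper sha256_of_file is hypothetical, and a plain dictionary serialized with pickle stands in for a trained forecaster saved with the persistence module.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path


def sha256_of_file(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


# Stand-in for a trained model (a real forecaster would be saved with the
# library's own persistence functions)
model = {"lags": 7, "coef": [0.5, 0.25]}

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "model.pkl"
    model_path.write_bytes(pickle.dumps(model))

    # Record the checksum at save time
    expected_digest = sha256_of_file(model_path)

    # Verify the checksum before loading
    if sha256_of_file(model_path) != expected_digest:
        raise ValueError("Model file changed since it was saved")
    loaded = pickle.loads(model_path.read_bytes())

print(loaded == model)
```

Pickle-based files should only be loaded from trusted sources, which is why an integrity check of this kind is worth considering.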
Safety-Critical Properties
All spotforecast2-safe operations maintain these critical properties:
Determinism
Same input always produces identical output (bit-level reproducible):
pred1 = forecaster.predict(steps=7)
pred2 = forecaster.predict(steps=7)
assert (pred1 == pred2).all()
print("Determinism verified: identical predictions on repeated calls.")

Determinism verified: identical predictions on repeated calls.
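An element-wise comparison works within one session. To compare predictions across runs, one option is to store a fingerprint of the raw float bytes. The sketch below is one possible approach, not a feature of spotforecast2-safe: prediction_fingerprint is a hypothetical helper, and the prediction values are stand-ins.

```python
import hashlib

import numpy as np
import pandas as pd


def prediction_fingerprint(pred: pd.Series) -> str:
    """SHA-256 over the raw float64 bytes of the values (hypothetical helper)."""
    values = np.ascontiguousarray(pred.to_numpy(dtype=np.float64))
    return hashlib.sha256(values.tobytes()).hexdigest()


# Stand-in predictions; in practice these would come from forecaster.predict(...)
index = pd.date_range('2020-03-21', periods=3, freq='D')
pred1 = pd.Series([-1.393, -0.756, -0.798], index=index, name='pred')
pred2 = pred1.copy()

assert prediction_fingerprint(pred1) == prediction_fingerprint(pred2)

# Changing a single value by the smallest representable step alters the fingerprint
pred3 = pred1.copy()
pred3.iloc[0] = np.nextafter(pred3.iloc[0], 0.0)
assert prediction_fingerprint(pred1) != prediction_fingerprint(pred3)
print("Fingerprints match for identical predictions.")
```

The fingerprint covers only the values, not the index, and it depends on the platform's byte order, so it is best compared between runs on the same kind of machine.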
Fail-Safe Operation
Invalid data raises explicit errors instead of silent failures:
# Missing values raise an explicit error — no silent NaN propagation
y_with_nans = pd.Series([1.0, np.nan, 3.0, 4.0, 5.0],
                        index=pd.date_range('2020-01-01', periods=5, freq='D'))
try:
    bad_forecaster = ForecasterRecursive(estimator=Ridge(), lags=2)
    bad_forecaster.fit(y_with_nans)
except ValueError as e:
    print(f"ValueError: {e}")

ValueError: `y` has missing values.
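The same fail-fast idea can be applied before data reaches a forecaster. The following sketch checks a series for NaN and infinite values up front. The function check_series is a hypothetical helper written for this page, not part of the library, and its error message is illustrative.

```python
import numpy as np
import pandas as pd


def check_series(y: pd.Series) -> None:
    """Raise ValueError if `y` contains NaN or infinite values (hypothetical helper)."""
    values = y.to_numpy(dtype=float)
    n_nan = int(np.isnan(values).sum())
    n_inf = int(np.isinf(values).sum())
    if n_nan or n_inf:
        raise ValueError(f"`y` has {n_nan} missing and {n_inf} infinite values.")


index = pd.date_range('2020-01-01', periods=5, freq='D')
clean = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0], index=index)
dirty = pd.Series([1.0, np.nan, 3.0, np.inf, 5.0], index=index)

check_series(clean)  # passes silently
try:
    check_series(dirty)
except ValueError as e:
    print(f"ValueError: {e}")
```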
Auditability
All transformations are traceable with clear, white-box code. The source code is visible via:
- Docstrings (in editor)
- Automatically generated API documentation (Quarto)
- GitHub repository
Complete Example: End-to-End Forecasting
import pandas as pd
from importlib.resources import files
from sklearn.linear_model import Ridge
from spotforecast2_safe.forecaster.recursive import ForecasterRecursive
from spotforecast2_safe.preprocessing import RollingFeatures
from spotforecast2_safe.utils import create_holiday_df
# 1. Load demo02.csv (column A, 1970–1972, resampled to regular hourly grid)
path = files('spotforecast2_safe.datasets.csv').joinpath('demo02.csv')
df = pd.read_csv(path, index_col='DateTime', parse_dates=True)
df = df.sort_index().loc['1970-01-01':'1972-12-31'].resample('h').mean()
y = df['A'].ffill()
print(f"Series: {len(y)} hourly obs ({y.index[0]} – {y.index[-1]})")
# 2. Holiday calendar as exogenous feature (German public holidays)
holidays = create_holiday_df(start='1970-01-01', end='1972-12-31', country_code='DE')
holidays.index = holidays.index.tz_localize(None) # align to tz-naive index
exog = holidays.reindex(y.index).fillna(0) # missing slots → non-holiday
# 3. Train / test split (2 years train, 1 year test)
train_end = '1971-12-31 23:00'
y_train = y[y.index <= train_end]
y_test = y[y.index > train_end]
exog_train = exog[exog.index <= train_end]
exog_test = exog[exog.index > train_end]
print(f"Train: {len(y_train)} obs — Test: {len(y_test)} obs")
# 4. Build forecaster: lag 1 h / 1 day / 1 week + 24-h rolling mean + holiday exog
forecaster = ForecasterRecursive(
estimator=Ridge(),
lags=[1, 24, 168],
window_features=[RollingFeatures(stats='mean', window_sizes=24)],
)
forecaster.fit(y_train, exog=exog_train)
# 5. Forecast the first 24 hours of the test set
exog_future = exog_test.iloc[:24]
forecast = forecaster.predict(steps=24, exog=exog_future)
# 6. Evaluate: first 10 hours as a comparison table
results = pd.DataFrame({
'y_test': y_test.iloc[:10].values,
'forecast': forecast.iloc[:10].values,
'mae': (y_test.iloc[:10] - forecast.iloc[:10]).abs().values,
}, index=forecast.iloc[:10].index)
results.round(4)

Series: 26304 hourly obs (1970-01-01 00:00:00 – 1972-12-31 23:00:00)
Train: 17520 obs — Test: 8784 obs
| | y_test | forecast | mae |
|---|---|---|---|
| 1972-01-01 00:00:00 | 0.1110 | 0.0202 | 0.0907 |
| 1972-01-01 01:00:00 | 0.1237 | 0.0204 | 0.1033 |
| 1972-01-01 02:00:00 | 0.0782 | 0.0236 | 0.0546 |
| 1972-01-01 03:00:00 | -0.0859 | 0.0198 | 0.1057 |
| 1972-01-01 04:00:00 | -0.0826 | 0.0152 | 0.0977 |
| 1972-01-01 05:00:00 | -0.0982 | 0.0108 | 0.1090 |
| 1972-01-01 06:00:00 | -0.0891 | 0.0087 | 0.0978 |
| 1972-01-01 07:00:00 | 0.0794 | 0.0081 | 0.0712 |
| 1972-01-01 08:00:00 | 0.1457 | 0.0084 | 0.1373 |
| 1972-01-01 09:00:00 | -0.1351 | 0.0090 | 0.1441 |
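The mae column in the table holds one absolute error per hour. To summarize these rows in a single number, the errors can be averaged. The sketch below does this with the rounded values from the table, so the results are approximate and cover only these 10 hours, not the full test year.

```python
import math

# First 10 hours of the comparison table above (rounded to 4 decimals)
y_test = [0.1110, 0.1237, 0.0782, -0.0859, -0.0826,
          -0.0982, -0.0891, 0.0794, 0.1457, -0.1351]
forecast = [0.0202, 0.0204, 0.0236, 0.0198, 0.0152,
            0.0108, 0.0087, 0.0081, 0.0084, 0.0090]

errors = [actual - predicted for actual, predicted in zip(y_test, forecast)]

# Mean absolute error: average magnitude of the errors
mae = sum(abs(e) for e in errors) / len(errors)
# Root mean squared error: penalizes large errors more strongly
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))

print(f"MAE:  {mae:.4f}")
print(f"RMSE: {rmse:.4f}")
```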
Documentation Organization
The complete documentation is organized as follows:
- Home (this page): High-level overview
- API Reference: Detailed API documentation by module
  - Data Module: Data fetching and Period management
  - Preprocessing Module: Feature engineering and forecasters
  - Processing Module: Utilities for timestamps and conversions
  - Utils Module: Helper functions and CPE generation
  - Weather Module: Climate data integration
  - Exceptions: Error types and documentation
- Guides: Practical examples and workflows
  - ForecasterRecursive Guide: Advanced forecasting techniques
  - Model Persistence: Production deployment
- Safety & Compliance: Documentation for auditors and compliance
  - Model/Method Card: Compliance and safety design
Next Steps
- Quick Start: Follow the Quick Start example above
- Learn Core Concepts: Read about Period Management and Recursive Forecasting
- Explore Examples: Check out ForecasterRecursive Guide
- API Reference: Dive into specific modules in API Documentation
- Contribute: See Contributing Guide to contribute improvements
Troubleshooting
For common issues and solutions:
- Data validation errors: Ensure all input data is clean (no NaNs or Infs)
- Import errors: Verify the package is installed with uv sync
- Version compatibility: Check you’re using Python 3.13 or later
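The import and version checks from the list above can be gathered into a short diagnostic script, which is also useful to attach to an issue report. This is a minimal sketch using only the standard library; environment_report is a hypothetical helper name.

```python
import importlib.util
import sys

# Minimum version taken from the troubleshooting list above
MIN_PYTHON = (3, 13)


def environment_report() -> dict:
    """Collect basic facts needed to diagnose import and version problems."""
    return {
        "python_version": ".".join(map(str, sys.version_info[:3])),
        "python_ok": sys.version_info[:2] >= MIN_PYTHON,
        # find_spec returns None when a top-level package is not installed
        "package_installed": importlib.util.find_spec("spotforecast2_safe") is not None,
    }


for key, value in environment_report().items():
    print(f"{key}: {value}")
```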
Before reporting a new issue, search the publicly available archives:
- Issues Archive: Browse all reported bugs, feature requests, and their resolutions. Use the search feature to find similar problems.
- Discussions Archive: Search community questions and answers for help with common tasks.
See Also
- Complete API Reference
- Model/Method Card
- Security Policy - Vulnerability reporting and security best practices
- GitHub Repository