data.demo_loader

Demo data loader for safety-critical forecasting tasks.

This module provides flexible data loading functions for ground truth validation in forecasting demonstrations and production workflows.

Functions

| Name | Description |
| --- | --- |
| load_actual_combined | Load ground truth and compute combined actual series with validation. |

load_actual_combined

data.demo_loader.load_actual_combined(
    config,
    columns,
    forecast_horizon=None,
    weights=None,
    data_path=None,
)

Load ground truth and compute combined actual series with validation.

This function loads a CSV file of ground truth data, checks that the required columns are present, keeps the first forecast_horizon rows, and combines the selected columns into a single series as a weighted sum (each column multiplied by its weight, then added row by row).
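The load, slice, and aggregate steps can be sketched with plain pandas. This is a minimal illustration and not the library's implementation: combine_actuals is a hypothetical name, column validation is left out for brevity, and the weighted sum is assumed from the first example's output (1.0 + 2.0 = 3.0 for weights [1.0, 1.0]).

```python
import io

import pandas as pd


def combine_actuals(csv_source, columns, forecast_horizon, weights):
    """Hypothetical stand-in for the load, slice, and aggregate steps."""
    # Load the CSV, treating the first column as a datetime index.
    df = pd.read_csv(csv_source, index_col=0, parse_dates=True)

    # Keep the requested columns and the first `forecast_horizon` rows.
    subset = df[columns].iloc[:forecast_horizon]

    # Scale each column by its weight, then add across columns per row.
    return subset.mul(weights, axis=1).sum(axis=1)


csv_text = (
    "timestamp,col1,col2\n"
    "2020-01-01 00:00,1.0,2.0\n"
    "2020-01-01 01:00,3.0,4.0\n"
    "2020-01-01 02:00,5.0,6.0\n"
)
combined = combine_actuals(io.StringIO(csv_text), ["col1", "col2"], 2, [1.0, 1.0])
print(combined.tolist())  # [3.0, 7.0]
```

The data mirrors the first example in the Examples section, so the first value agrees with the 3.0 reported there.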

Parameters

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| config | DemoConfig | Configuration object containing default paths and parameters. | required |
| columns | List[str] | List of column names to extract from the ground truth data. | required |
| forecast_horizon | Optional[int] | Number of time steps to extract. If None, uses config.forecast_horizon. Must be positive. | None |
| weights | Optional[List[float]] | Weights for aggregating columns. If None, uses config.weights. Length must match number of columns. | None |
| data_path | Optional[Path] | Path to the ground truth CSV file. If None, uses config.data_path. File must exist. | None |
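The "If None, uses config..." fallback shared by the three optional parameters can be illustrated as follows. This is a sketch under stated assumptions: SketchConfig is a hypothetical stand-in for DemoConfig, resolve_arguments is an invented helper, and the default values are placeholders rather than the library's actual defaults.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import List, Optional, Tuple


@dataclass
class SketchConfig:
    """Hypothetical stand-in for DemoConfig (not the library's class)."""

    data_path: Path = Path("ground_truth.csv")  # placeholder default
    forecast_horizon: int = 24  # placeholder default
    weights: List[float] = field(default_factory=lambda: [1.0, 1.0])


def resolve_arguments(
    config: SketchConfig,
    forecast_horizon: Optional[int] = None,
    weights: Optional[List[float]] = None,
    data_path: Optional[Path] = None,
) -> Tuple[int, List[float], Path]:
    """Return the effective (horizon, weights, path) for one call."""
    # Compare against None explicitly so that falsy values such as 0 or []
    # count as caller input instead of being silently replaced.
    horizon = config.forecast_horizon if forecast_horizon is None else forecast_horizon
    chosen_weights = config.weights if weights is None else weights
    path = config.data_path if data_path is None else data_path
    return horizon, chosen_weights, path


config = SketchConfig()
horizon, chosen_weights, path = resolve_arguments(config, forecast_horizon=5)
print(horizon, chosen_weights, path.name)  # 5 [1.0, 1.0] ground_truth.csv
```

Only forecast_horizon is overridden here, so the weights and path still come from the config object.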

Returns

| Name | Type | Description |
| --- | --- | --- |
| | pd.Series | Aggregated time series combining all columns with specified weights. |

Raises

| Name | Type | Description |
| --- | --- | --- |
| | FileNotFoundError | If the ground truth file does not exist. |
| | ValueError | If required columns are missing from the data or if weights length doesn’t match columns. |
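The examples further down trigger the missing-file and missing-column errors but not the weights-length mismatch. The sketch below shows how such checks could be arranged. check_inputs is a hypothetical helper, and both the order of the checks and the message wording are assumptions, not the library's behaviour.

```python
import tempfile
from pathlib import Path
from typing import List, Sequence


def check_inputs(
    data_path: Path,
    available_columns: Sequence[str],
    columns: List[str],
    weights: List[float],
) -> None:
    """Hypothetical pre-flight checks; raises on the first problem found."""
    if not data_path.exists():
        raise FileNotFoundError(f"Ground truth file not found: {data_path}")
    missing = [name for name in columns if name not in available_columns]
    if missing:
        raise ValueError(f"Missing columns: {missing}")
    if len(weights) != len(columns):
        raise ValueError(f"Got {len(weights)} weights for {len(columns)} columns")


with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = Path(tmp_dir) / "truth.csv"
    csv_path.write_text("timestamp,col1,col2\n")
    try:
        # Two columns but only one weight: the length check should fire.
        check_inputs(csv_path, ["col1", "col2"], ["col1", "col2"], [1.0])
    except ValueError as exc:
        message = str(exc)

print(message)  # Got 1 weights for 2 columns
```

Because both failure modes surface as ValueError, callers who need to tell them apart would have to inspect the message.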

Examples

Basic usage with explicit forecast horizon and weights:

import tempfile
from pathlib import Path

import pandas as pd

from spotforecast2_safe.data.demo_data import DemoConfig
from spotforecast2_safe.data.demo_loader import load_actual_combined

idx = pd.date_range("2020-01-01", periods=3, freq="h", name="timestamp")
sample_df = pd.DataFrame({"col1": [1.0, 3.0, 5.0], "col2": [2.0, 4.0, 6.0]}, index=idx)

with tempfile.NamedTemporaryFile(mode="w", suffix=".csv", delete=False) as f:
    sample_df.to_csv(f.name)
    temp_path = Path(f.name)

config = DemoConfig(data_path=temp_path)
result = load_actual_combined(
    config, columns=["col1", "col2"], forecast_horizon=2, weights=[1.0, 1.0]
)
print(f"Result length: {len(result)}")
print(f"First value: {result.iloc[0]:.1f}")
temp_path.unlink()
Result length: 2
First value: 3.0

Override forecast horizon and weights:

idx = pd.date_range("2020-01-01", periods=10, freq="h", name="timestamp")
sample_df = pd.DataFrame(
    {"col1": [float(i) for i in range(10)], "col2": [float(i * 2) for i in range(10)]},
    index=idx,
)

with tempfile.NamedTemporaryFile(mode="w", suffix=".csv", delete=False) as f:
    sample_df.to_csv(f.name)
    temp_path = Path(f.name)

config = DemoConfig(data_path=temp_path, forecast_horizon=24)
result = load_actual_combined(
    config, columns=["col1", "col2"], forecast_horizon=5, weights=[1.0, 0.5]
)
print(f"Custom horizon length: {len(result)}")
temp_path.unlink()
Custom horizon length: 5

Override data_path while keeping the rest of the config:

idx = pd.date_range("2020-01-01", periods=1, freq="h", name="timestamp")
sample_df = pd.DataFrame({"X": [100.0], "Y": [200.0]}, index=idx)

with tempfile.NamedTemporaryFile(mode="w", suffix=".csv", delete=False) as f:
    sample_df.to_csv(f.name)
    custom_path = Path(f.name)

config = DemoConfig()  # default data_path
result = load_actual_combined(
    config,
    columns=["X", "Y"],
    data_path=custom_path,
    forecast_horizon=1,
    weights=[0.5, 0.5],
)
print(f"Custom path result: {result.iloc[0]:.1f}")
custom_path.unlink()
Custom path result: 150.0

Error handling — missing file:

config = DemoConfig(data_path=Path("/nonexistent/file.csv"))
try:
    load_actual_combined(
        config, columns=["A"], forecast_horizon=1, weights=[1.0]
    )
except FileNotFoundError:
    print("File not found as expected")
File not found as expected

Error handling — missing columns:

idx = pd.date_range("2020-01-01", periods=1, freq="h", name="timestamp")
sample_df = pd.DataFrame({"col1": [1.0]}, index=idx)

with tempfile.NamedTemporaryFile(mode="w", suffix=".csv", delete=False) as f:
    sample_df.to_csv(f.name)
    temp_path = Path(f.name)

config = DemoConfig(data_path=temp_path)
try:
    load_actual_combined(
        config,
        columns=["col1", "col2"],
        forecast_horizon=1,
        weights=[1.0, 1.0],
    )
except ValueError:
    print("Missing columns detected")
temp_path.unlink()
Missing columns detected

Production usage with horizon and weights drawn from the config:

idx = pd.date_range("2020-01-01", periods=30, freq="h", name="timestamp")
sample_df = pd.DataFrame(
    {
        "load": [100 + i for i in range(30)],
        "solar": [50 + i for i in range(30)],
        "wind": [25 + i for i in range(30)],
    },
    index=idx,
)

with tempfile.NamedTemporaryFile(mode="w", suffix=".csv", delete=False) as f:
    sample_df.to_csv(f.name)
    temp_path = Path(f.name)

config = DemoConfig(
    data_path=temp_path, forecast_horizon=24, weights=[1.0, -0.5, -0.5]
)
result = load_actual_combined(config, columns=["load", "solar", "wind"])
print(f"Production forecast length: {len(result)}")
print(f"Result is pandas Series: {isinstance(result, pd.Series)}")
temp_path.unlink()
Production forecast length: 24
Result is pandas Series: True