manager.datasets.demo_loader

Demo data loader for safety-critical forecasting tasks.

This module provides flexible data loading functions for ground truth validation in forecasting demonstrations and production workflows.

Functions

Name Description
load_actual_combined Load ground truth and compute combined actual series with validation.

load_actual_combined

manager.datasets.demo_loader.load_actual_combined(
    config,
    columns,
    forecast_horizon=None,
    weights=None,
    data_path=None,
)

Load ground truth and compute combined actual series with validation.

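This function loads a CSV file containing ground truth data, validates that the required columns are present, keeps the first forecast_horizon rows, and aggregates the selected columns as a weighted sum: each column is multiplied by its weight and the products are summed row-wise. The weights are not normalized and may be negative.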
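The row selection and aggregation described above can be sketched with plain pandas. This is a minimal illustration that assumes the function takes the first rows and computes a row-wise weighted sum, as the doctest outputs in the Examples section suggest; the DataFrame below is made-up data, and the actual implementation may differ.

```python
import pandas as pd

# Made-up ground truth table standing in for the loaded CSV.
df = pd.DataFrame(
    {"col1": [1.0, 3.0, 5.0], "col2": [2.0, 4.0, 6.0]},
    index=pd.to_datetime(
        ["2020-01-01 00:00:00", "2020-01-01 01:00:00", "2020-01-01 02:00:00"]
    ),
)

columns = ["col1", "col2"]
weights = [1.0, 1.0]
forecast_horizon = 2

# Keep only the requested columns and the first `forecast_horizon` rows.
subset = df[columns].iloc[:forecast_horizon]

# Multiply each column by its weight, then sum across columns per row.
combined = subset.mul(weights, axis="columns").sum(axis="columns")

print(combined.iloc[0])  # 1.0 * 1.0 + 2.0 * 1.0 = 3.0
```

With weights of [1.0, 1.0] the first combined value is 3.0, which matches Example 1 in the Examples section.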

Parameters

Name Type Description Default
config DemoConfig Configuration object containing default paths and parameters. required
columns List[str] List of column names to extract from the ground truth data. required
forecast_horizon Optional[int] Number of time steps to extract. If None, uses config.forecast_horizon. Must be positive. None
weights Optional[List[float]] Weights for aggregating columns. If None, uses config.weights. Length must match number of columns. None
data_path Optional[Path] Path to the ground truth CSV file. If None, uses config.data_path. File must exist. None
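The fallback rule in the table (an argument left as None is taken from the config) can be sketched as follows. SketchConfig and resolve are hypothetical names introduced only for illustration, and the default values shown are placeholders; DemoConfig's real fields and defaults are not documented here.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional, Tuple


@dataclass
class SketchConfig:
    """Hypothetical stand-in for DemoConfig; defaults are placeholders."""

    data_path: Path = Path("ground_truth.csv")
    forecast_horizon: int = 24
    weights: Optional[List[float]] = None


def resolve(
    config: SketchConfig,
    columns: List[str],
    forecast_horizon: Optional[int] = None,
    weights: Optional[List[float]] = None,
    data_path: Optional[Path] = None,
) -> Tuple[int, List[float], Path]:
    """Pick each explicit argument if given, otherwise the config value."""
    horizon = config.forecast_horizon if forecast_horizon is None else forecast_horizon
    chosen_weights = config.weights if weights is None else weights
    path = config.data_path if data_path is None else data_path

    # Constraints stated in the parameter table.
    if horizon <= 0:
        raise ValueError("forecast_horizon must be positive")
    if chosen_weights is None or len(chosen_weights) != len(columns):
        raise ValueError("weights length must match number of columns")
    return horizon, chosen_weights, path


cfg = SketchConfig(weights=[1.0, -0.5])
defaults = resolve(cfg, ["load", "solar"])
overridden = resolve(cfg, ["load", "solar"], forecast_horizon=5, weights=[1.0, 1.0])
```

Comparing against None rather than relying on truthiness keeps an explicit empty list or zero from silently falling back to the config value.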

Returns

Name Type Description
pd.Series Aggregated time series combining all columns with specified weights.

Raises

Name Type Description
FileNotFoundError If the ground truth file does not exist.
ValueError If required columns are missing from the data or if weights length doesn’t match columns.
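The checks behind these exceptions might look like the sketch below. check_inputs and outcome are hypothetical helpers written for illustration; the order of the checks and the error messages are assumptions, not the library's actual behavior.

```python
import tempfile
from pathlib import Path
from typing import List

import pandas as pd


def check_inputs(path: Path, columns: List[str], weights: List[float]) -> None:
    """Hypothetical pre-flight checks mirroring the documented exceptions."""
    if not path.is_file():
        raise FileNotFoundError(f"Ground truth file not found: {path}")
    # Read only the header row to learn which columns are available.
    available = pd.read_csv(path, nrows=0).columns
    missing = [c for c in columns if c not in available]
    if missing:
        raise ValueError(f"Missing columns: {missing}")
    if len(weights) != len(columns):
        raise ValueError("weights length must match number of columns")


def outcome(path: Path, columns: List[str], weights: List[float]) -> str:
    """Return the name of the raised exception, or 'ok' if all checks pass."""
    try:
        check_inputs(path, columns, weights)
    except (FileNotFoundError, ValueError) as exc:
        return type(exc).__name__
    return "ok"


with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "truth.csv"
    csv_path.write_text("timestamp,col1\n2020-01-01 00:00:00,1.0\n")
    results = [
        outcome(Path(tmp) / "absent.csv", ["col1"], [1.0]),  # file missing
        outcome(csv_path, ["col1", "col2"], [1.0, 1.0]),     # col2 missing
        outcome(csv_path, ["col1"], [1.0, 2.0]),             # length mismatch
        outcome(csv_path, ["col1"], [1.0]),                  # all checks pass
    ]

print(results)
```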

Examples

>>> import tempfile
>>> import pandas as pd
>>> from pathlib import Path
>>> from spotforecast2_safe.manager.datasets.demo_data import DemoConfig
>>> from spotforecast2_safe.manager.datasets.demo_loader import load_actual_combined
>>>
>>> # Example 1: Basic usage with an explicit horizon and weights
>>> with tempfile.NamedTemporaryFile(mode='w', suffix='.csv', delete=False) as f:
...     _ = f.write('timestamp,col1,col2\n')
...     _ = f.write('2020-01-01 00:00:00,1.0,2.0\n')
...     _ = f.write('2020-01-01 01:00:00,3.0,4.0\n')
...     _ = f.write('2020-01-01 02:00:00,5.0,6.0\n')
...     temp_path = Path(f.name)
>>> config = DemoConfig(data_path=temp_path)
>>> result = load_actual_combined(config, columns=['col1', 'col2'],
...                               forecast_horizon=2, weights=[1.0, 1.0])
>>> print(f"Result length: {len(result)}")
Result length: 2
>>> print(f"First value: {result.iloc[0]:.1f}")
First value: 3.0
>>> temp_path.unlink()  # Clean up
>>>
>>> # Example 2: Override forecast_horizon
>>> with tempfile.NamedTemporaryFile(mode='w', suffix='.csv', delete=False) as f:
...     _ = f.write('timestamp,col1,col2\n')
...     for i in range(10):
...         _ = f.write(f'2020-01-01 {i:02d}:00:00,{i}.0,{i*2}.0\n')
...     temp_path = Path(f.name)
>>> config = DemoConfig(data_path=temp_path, forecast_horizon=24)
>>> result = load_actual_combined(config, columns=['col1', 'col2'],
...                               forecast_horizon=5, weights=[1.0, 0.5])
>>> print(f"Custom horizon length: {len(result)}")
Custom horizon length: 5
>>> temp_path.unlink()
>>>
>>> # Example 3: Override weights for custom aggregation
>>> with tempfile.NamedTemporaryFile(mode='w', suffix='.csv', delete=False) as f:
...     _ = f.write('timestamp,A,B,C\n')
...     _ = f.write('2020-01-01 00:00:00,10.0,5.0,2.0\n')
...     _ = f.write('2020-01-01 01:00:00,20.0,10.0,4.0\n')
...     temp_path = Path(f.name)
>>> config = DemoConfig(data_path=temp_path)
>>> result = load_actual_combined(config, columns=['A', 'B', 'C'],
...                               forecast_horizon=2,
...                               weights=[1.0, -1.0, 1.0])
>>> print(f"Weighted result shape: {result.shape}")
Weighted result shape: (2,)
>>> temp_path.unlink()
>>>
>>> # Example 4: Override data_path
>>> with tempfile.NamedTemporaryFile(mode='w', suffix='.csv', delete=False) as f:
...     _ = f.write('timestamp,X,Y\n')
...     _ = f.write('2020-01-01 00:00:00,100.0,200.0\n')
...     custom_path = Path(f.name)
>>> config = DemoConfig()  # Uses default path
>>> result = load_actual_combined(config, columns=['X', 'Y'],
...                               data_path=custom_path,
...                               forecast_horizon=1,
...                               weights=[0.5, 0.5])
>>> print(f"Custom path result: {result.iloc[0]:.1f}")
Custom path result: 150.0
>>> custom_path.unlink()
>>>
>>> # Example 5: Error handling - missing file
>>> config = DemoConfig(data_path=Path('/nonexistent/file.csv'))
>>> try:
...     result = load_actual_combined(config, columns=['A'],
...                                   forecast_horizon=1, weights=[1.0])
... except FileNotFoundError as e:
...     print("File not found as expected")
File not found as expected
>>>
>>> # Example 6: Error handling - missing columns
>>> with tempfile.NamedTemporaryFile(mode='w', suffix='.csv', delete=False) as f:
...     _ = f.write('timestamp,col1\n')
...     _ = f.write('2020-01-01 00:00:00,1.0\n')
...     temp_path = Path(f.name)
>>> config = DemoConfig(data_path=temp_path)
>>> try:
...     result = load_actual_combined(config, columns=['col1', 'col2'],
...                                   forecast_horizon=1, weights=[1.0, 1.0])
... except ValueError as e:
...     print("Missing columns detected")
Missing columns detected
>>> temp_path.unlink()
>>>
>>> # Example 7: Production usage with all defaults from config
>>> with tempfile.NamedTemporaryFile(mode='w', suffix='.csv', delete=False) as f:
...     _ = f.write('timestamp,load,solar,wind\n')
...     for i in range(30):
...         _ = f.write(f'2020-01-{1 + i // 24:02d} {i % 24:02d}:00:00,{100+i},{50+i},{25+i}\n')
...     temp_path = Path(f.name)
>>> config = DemoConfig(
...     data_path=temp_path,
...     forecast_horizon=24,
...     weights=[1.0, -0.5, -0.5]
... )
>>> result = load_actual_combined(config, columns=['load', 'solar', 'wind'])
>>> print(f"Production forecast length: {len(result)}")
Production forecast length: 24
>>> print(f"Result is pandas Series: {isinstance(result, pd.Series)}")
Result is pandas Series: True
>>> temp_path.unlink()