import pandas as pd
import numpy as np
from types import SimpleNamespace
from spotforecast2.manager.plotter import plot_with_outliers
# Create synthetic data
dates = pd.date_range("2023-01-01", periods=100, freq="h", tz="UTC")
data = pd.DataFrame({
"target1": np.random.rand(100) * 100,
"target2": np.random.rand(100) * 50,
}, index=dates)
# Introduce outliers
data.loc[dates[10], "target1"] = 300 # Outlier in target1
data.loc[dates[20], "target2"] = 150 # Outlier in target2
df_pipeline = data.copy()
df_pipeline.loc[[dates[10], dates[20]], ["target1", "target2"]] = np.nan
# Config with bounds
config = SimpleNamespace(
targets=["target1", "target2"],
bounds=[(-10, 200), (0, 100)],
)
plot_with_outliers(df_pipeline, data, config)
manager.plotter
Module for generating interactive prediction plots.
This module provides the PredictionFigure class and make_plot function to visualize time series forecasting results, including actual values, predictions, and performance metrics.
Classes
| Name | Description |
|---|---|
| PredictionFigure | Encapsulates the generation of an interactive Plotly figure for predictions. |
PredictionFigure
manager.plotter.PredictionFigure(
prediction_package,
title='Energy Demand Prediction',
)
Encapsulates the generation of an interactive Plotly figure for predictions.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| prediction_package | Dict[str, Any] | A dictionary containing prediction data and metrics. Expected keys include: `train_actual`: pd.Series; `future_actual`: pd.Series; `train_pred`: pd.Series; `future_pred`: pd.Series; `future_forecast`: pd.Series (e.g., benchmark/ENTSOE); `test_actual`: pd.Series (optional external ground truth for the forecast period, e.g. from data_test.csv; absent in genuine-future mode when no ground truth is available yet); `metrics_train`: Dict[str, float]; `metrics_future`: Dict[str, float]; `metrics_future_one_day`: Dict[str, float]; `metrics_forecast`: Dict[str, float]; `metrics_forecast_one_day`: Dict[str, float] | required |
| title | str | Figure title shown at the top of the plot. | 'Energy Demand Prediction' |
Methods
| Name | Description |
|---|---|
| make_plot | Generate the Plotly figure with traces and annotations. |
make_plot
manager.plotter.PredictionFigure.make_plot()
Generate the Plotly figure with traces and annotations.
Traces added (always):
- Total system load — actual (training window, clipped to visible range)
- Total system load — model prediction (training + forecast, clipped)
- Actual (last week) — time-shifted actual for seasonality context

Traces added (when data is available):
- Benchmark Forecast (e.g., ENTSOE) — if `future_forecast` key present
- Actual (test / ground truth) — if `test_actual` key present
The X-axis is fixed to [end_training − 1 day, future_pred.max() + 1 h] so the full forecast window is always visible, including in genuine-future mode where future_actual is an empty Series. Only the data slice in that window is serialised into the Plotly JSON, keeping HTML output small.
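The window clipping described above can be sketched with plain pandas. This is a minimal illustration of the slicing logic, not the actual internals of make_plot; all variable names here are hypothetical stand-ins:

```python
import pandas as pd
import numpy as np

# Hypothetical stand-ins for the series held by the prediction package:
# 72 h of training predictions followed by 24 h of future predictions.
idx = pd.date_range("2023-01-01", periods=96, freq="h", tz="UTC")
train_pred = pd.Series(np.random.rand(72), index=idx[:72])
future_pred = pd.Series(np.random.rand(24), index=idx[72:])

end_training = train_pred.index.max()
# Fixed X-axis window: [end_training - 1 day, last forecast timestamp + 1 h]
x_start = end_training - pd.Timedelta(days=1)
x_end = future_pred.index.max() + pd.Timedelta(hours=1)

# Only the slice inside the window needs to be serialised into the figure.
visible_train = train_pred.loc[x_start:]
print(len(visible_train))  # 25 hourly points: the last day of training, inclusive
```

Slicing a DatetimeIndex with `.loc[x_start:]` is inclusive of both endpoints, which is why one day of hourly data yields 25 points rather than 24.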
Examples
>>> import pandas as pd
>>> import numpy as np
>>> from spotforecast2.manager.plotter import PredictionFigure
>>> dates = pd.date_range("2023-01-01", periods=100, freq="h", tz="UTC")
>>> train_end = dates[70]
>>> y = pd.Series(np.random.rand(100) * 100, index=dates, name="load")
>>> p = y + np.random.normal(0, 5, 100)
>>> pkg = {
... "train_actual": y.loc[:train_end],
... "future_actual": y.loc[train_end:],
... "train_pred": p.loc[:train_end],
... "future_pred": p.loc[train_end:],
... "metrics_train": {"mae": 5.0, "mape": 0.1},
... "metrics_future": {"mae": 6.0, "mape": 0.12},
... "metrics_future_one_day": {"mae": 4.5, "mape": 0.08},
... }
>>> fig = PredictionFigure(pkg).make_plot()
>>> isinstance(fig.data, tuple)
True
Functions
| Name | Description |
|---|---|
| make_plot | Generate and optionally save an interactive prediction plot. |
| plot_actual_vs_predicted | Plot actual vs predicted combined values for model comparison. |
| plot_with_outliers | Interactive time series plot with outliers and optional bounds. |
make_plot
manager.plotter.make_plot(
prediction_package,
output_path=None,
title='Energy Demand Prediction',
)
Generate and optionally save an interactive prediction plot.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| prediction_package | Dict[str, Any] | Dictionary of results (actuals, preds, metrics). | required |
| output_path | Optional[Union[str, Path]] | Path to save the HTML file. If None, it defaults to ‘index.html’ in the package’s data home directory. | None |
| title | str | Figure title shown at the top of the plot. | 'Energy Demand Prediction' |
Returns
| Name | Type | Description |
|---|---|---|
| | go.Figure | The generated Plotly Figure object. |
Examples
>>> from spotforecast2.manager.plotter import make_plot
>>> # fig = make_plot(results)
plot_actual_vs_predicted
manager.plotter.plot_actual_vs_predicted(
actual_combined,
baseline_combined,
covariates_combined,
custom_lgbm_combined,
html_path=None,
)
Plot actual vs predicted combined values for model comparison.
This function creates an interactive Plotly figure comparing ground truth with predictions from three different forecasting models: baseline, covariate-enhanced, and custom LightGBM. The plot includes interactive hover information and can be saved as a standalone HTML file.
Safety-Critical Features
- Interactive visualization for model validation
- Supports HTML export for audit trails
- Shows all models simultaneously for easy comparison
- Uses consistent color scheme and line styles
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| actual_combined | pd.Series | Ground truth combined series with datetime index. | required |
| baseline_combined | pd.Series | Baseline combined prediction series. Must have same index as actual_combined. | required |
| covariates_combined | pd.Series | Covariate-enhanced combined prediction series. Must have same index as actual_combined. | required |
| custom_lgbm_combined | pd.Series | Custom LightGBM (optimized params) combined prediction series. Must have same index as actual_combined. | required |
| html_path | Optional[str] | If set, save the plot as a single self-contained HTML file to this path. If None, displays plot interactively only. | None |
Returns
| Name | Type | Description |
|---|---|---|
| | None | Displays plot and optionally saves to HTML file. |
Raises
| Name | Type | Description |
|---|---|---|
| ValueError | If series indices don’t align or are empty. |
Examples
>>> import pandas as pd
>>> import tempfile
>>> from pathlib import Path
>>> from spotforecast2.manager.plotter import plot_actual_vs_predicted
>>>
>>> # Example 1: Create synthetic data for testing
>>> index = pd.date_range('2020-01-01', periods=24, freq='h')
>>> actual = pd.Series(range(100, 124), index=index, name='actual')
>>> baseline = pd.Series(range(101, 125), index=index, name='baseline')
>>> covariates = pd.Series(range(99, 123), index=index, name='covariates')
>>> custom = pd.Series(range(100, 124), index=index, name='custom')
>>>
>>> # Verify data properties
>>> print(f"Data length: {len(actual)}")
Data length: 24
>>> print(f"Index type: {type(actual.index).__name__}")
Index type: DatetimeIndex
>>>
>>> # Example 2: Comparing models with different accuracies
>>> import numpy as np
>>> np.random.seed(42)
>>> index = pd.date_range('2020-01-01 00:00:00', periods=48, freq='h')
>>> actual = pd.Series(
... 100 + 10 * np.sin(np.arange(48) * 2 * np.pi / 24),
... index=index
... )
>>> baseline = actual + np.random.normal(0, 2, 48)
>>> covariates = actual + np.random.normal(0, 1, 48)
>>> custom = actual + np.random.normal(0, 0.5, 48)
>>>
>>> # Verify series properties before plotting
>>> print(f"Actual range: [{actual.min():.1f}, {actual.max():.1f}]")
Actual range: [90.0, 110.0]
>>> print(f"All indices aligned: {(actual.index == baseline.index).all()}")
All indices aligned: True
>>>
>>> # Example 3: Production workflow with actual forecast data
>>> index = pd.date_range('2020-01-01', periods=24, freq='h')
>>> ground_truth = pd.Series([100 + i for i in range(24)], index=index)
>>> model1_pred = pd.Series([101 + i for i in range(24)], index=index)
>>> model2_pred = pd.Series([99 + i for i in range(24)], index=index)
>>> model3_pred = pd.Series([100 + i for i in range(24)], index=index)
>>>
>>> # Calculate errors
>>> mae_baseline = abs(ground_truth - model1_pred).mean()
>>> mae_covariates = abs(ground_truth - model2_pred).mean()
>>> mae_custom = abs(ground_truth - model3_pred).mean()
>>> print(f"Baseline MAE: {mae_baseline:.2f}")
Baseline MAE: 1.00
>>> print(f"Covariates MAE: {mae_covariates:.2f}")
Covariates MAE: 1.00
>>> print(f"Custom MAE: {mae_custom:.2f}")
Custom MAE: 0.00
>>>
>>> # Example 4: Verify data alignment before plotting
>>> index1 = pd.date_range('2020-01-01', periods=24, freq='h')
>>> index2 = pd.date_range('2020-01-02', periods=24, freq='h')
>>> series1 = pd.Series(range(24), index=index1)
>>> series2 = pd.Series(range(24), index=index2)
>>>
>>> # Check alignment
>>> indices_match = (series1.index == series2.index).all()
>>> print(f"Indices aligned: {indices_match}")
Indices aligned: False
>>>
>>> # Reindex to align
>>> series2_aligned = series2.reindex(series1.index)
>>> print(f"After reindex: {(series1.index == series2_aligned.index).all()}")
After reindex: True
>>>
>>> # Example 5: Verify all series have correct properties
>>> index = pd.date_range('2020-01-01', periods=10, freq='h')
>>> actual = pd.Series(range(10), index=index)
>>> pred1 = pd.Series(range(1, 11), index=index)
>>> pred2 = pd.Series(range(10), index=index)
>>> pred3 = pd.Series(range(10), index=index)
>>>
>>> # Safety checks
>>> assert isinstance(actual.index, pd.DatetimeIndex), "Index must be DatetimeIndex"
>>> assert len(actual) == len(pred1) == len(pred2) == len(pred3), "All series must have same length"
>>> assert (actual.index == pred1.index).all(), "Indices must align"
>>> print("All safety checks passed")
All safety checks passed
>>>
>>> # Example 6: Calculate metrics for model comparison
>>> index = pd.date_range('2020-01-01', periods=100, freq='h')
>>> actual = pd.Series(100 + np.random.randn(100) * 5, index=index)
>>> pred1 = actual + np.random.randn(100) * 2
>>> pred2 = actual + np.random.randn(100) * 1.5
>>> pred3 = actual + np.random.randn(100) * 1
>>>
>>> # Calculate MAE for each model
>>> mae1 = abs(actual - pred1).mean()
>>> mae2 = abs(actual - pred2).mean()
>>> mae3 = abs(actual - pred3).mean()
>>> print(f"Model 1 MAE: {mae1:.2f}")
Model 1 MAE: ...
>>> print(f"Model 2 MAE: {mae2:.2f}")
Model 2 MAE: ...
>>> print(f"Model 3 MAE: {mae3:.2f}")
Model 3 MAE: ...
plot_with_outliers
manager.plotter.plot_with_outliers(df_pipeline, df_pipeline_original, config)
Interactive time series plot with outliers and optional bounds.
This function generates an interactive Plotly figure that visualizes the time series data from the pipeline, highlighting any detected outliers. Regular data points are shown in light grey, while outliers are marked in red. When config.bounds is set, two horizontal reference lines in lightblue are added per plot — one for the lower bound and one for the upper bound — to indicate the acceptable value range for that target.
The plot title includes the percentage of outliers detected for each target variable.
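The outlier percentage shown in the title can be derived from the two frames alone: a point counts as an outlier if it has a value in the original frame but is NaN after the pipeline. The exact formula used internally is an assumption; this is a minimal sketch:

```python
import pandas as pd
import numpy as np

dates = pd.date_range("2023-01-01", periods=100, freq="h", tz="UTC")
original = pd.DataFrame({"target1": np.random.rand(100)}, index=dates)
pipeline = original.copy()
pipeline.iloc[[10, 20], 0] = np.nan  # two values removed as outliers

# Outlier: present in the original frame but NaN after the pipeline.
outlier_mask = pipeline["target1"].isna() & original["target1"].notna()
pct = 100 * outlier_mask.sum() / len(original)
print(f"target1: {pct:.1f}% outliers")  # -> target1: 2.0% outliers
```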
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| df_pipeline | pd.DataFrame | The processed DataFrame from the pipeline, which may contain NaN values where outliers have been detected and removed. | required |
| df_pipeline_original | pd.DataFrame | The original DataFrame before outlier removal. | required |
| config | Any | Configuration object containing targets (list of column names) and optionally bounds (list of (lower, upper) tuples, one per target, in the same order as targets). | required |
Returns
| Name | Type | Description |
|---|---|---|
| | None | Displays one interactive Plotly figure per target variable. |
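The df_pipeline argument is typically the original frame with out-of-bounds values masked to NaN. A sketch of that masking step, assuming config.bounds pairs up with config.targets by position as described above:

```python
import pandas as pd
import numpy as np
from types import SimpleNamespace

dates = pd.date_range("2023-01-01", periods=5, freq="h", tz="UTC")
original = pd.DataFrame(
    {"target1": [10.0, 300.0, 20.0, -50.0, 30.0]}, index=dates
)
config = SimpleNamespace(targets=["target1"], bounds=[(-10, 200)])

pipeline = original.copy()
for target, (lower, upper) in zip(config.targets, config.bounds):
    out_of_bounds = (pipeline[target] < lower) | (pipeline[target] > upper)
    pipeline.loc[out_of_bounds, target] = np.nan  # mark outliers as removed

print(int(pipeline["target1"].isna().sum()))  # -> 2 (300.0 and -50.0 masked)
```

The masked frame and the untouched original are then what plot_with_outliers compares to highlight the removed points in red.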