manager.predictor.build_prediction_package

manager.predictor.build_prediction_package(
    forecaster,
    target,
    y_train,
    predict_size,
    exog_train=None,
    exog_future=None,
    df_test=None,
    index_name='DateTime',
)

Build a prediction package for downstream plotting/reporting consumers.

Computes in-sample predictions with the fitted estimator and generates an out-of-sample forecast of `predict_size` steps. If `df_test` is supplied and contains rows for the forecast horizon, the matching ground-truth values are added to the package as `test_actual` and the future metrics (MAE, MAPE) are computed against them.
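
The in-sample window is shorter than `y_train` because a lag-based forecaster needs `lags` past observations before it can build its first feature row. The following pure-Python sketch illustrates that alignment. `make_lag_matrix` is a hypothetical helper written for this page, not part of skforecast or this package, and the column order (most recent lag first) is an assumption.

```python
from typing import List, Tuple


def make_lag_matrix(
    y: List[float], lags: int
) -> Tuple[List[List[float]], List[float], List[int]]:
    """Build lagged feature rows for a series (illustrative helper, not library code).

    The row for position t holds y[t-1], ..., y[t-lags] as features and
    y[t] as the target, so the first `lags` positions produce no row.
    """
    if lags < 1 or lags >= len(y):
        raise ValueError("lags must satisfy 1 <= lags < len(y)")
    X, targets, positions = [], [], []
    for t in range(lags, len(y)):
        X.append([y[t - k] for k in range(1, lags + 1)])
        targets.append(y[t])
        positions.append(t)
    return X, targets, positions


y = [float(v) for v in range(10)]
X, targets, positions = make_lag_matrix(y, lags=3)
print(len(X))        # 7 rows: 10 observations minus 3 lags
print(positions[0])  # 3: the first position with a full lag window
print(X[0])          # [2.0, 1.0, 0.0]
```

Under this scheme, a series of 200 observations with `lags=24` would yield 176 aligned in-sample rows.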

Parameters

Name	Type	Description	Default
forecaster	Any	A fitted skforecast `ForecasterRecursive` (or compatible) instance exposing `create_train_X_y` and `predict` methods and an `estimator` attribute.	required
target	str	Column name of the target series. Used to look up ground-truth values inside `df_test` when provided.	required
y_train	pd.Series	Training time series indexed by a timezone-aware `DatetimeIndex`.	required
predict_size	int	Number of future steps to forecast.	required
exog_train	Optional[pd.DataFrame]	Exogenous feature DataFrame aligned with `y_train`. Pass `None` when no exogenous features are used.	`None`
exog_future	Optional[pd.DataFrame]	Exogenous feature DataFrame covering the forecast horizon. Pass `None` when no exogenous features are used.	`None`
df_test	Optional[pd.DataFrame]	Optional test DataFrame. It must contain a timestamp column whose name is given by `index_name` and a value column whose name is given by `target`. When supplied, the ground-truth slice matching the forecast horizon is injected into the returned package and future metrics are computed.	`None`
index_name	str	Name of the timestamp column in `df_test` to use as the index when aligning ground-truth values. Defaults to `"DateTime"` for backward compatibility; callers using a different timestamp column (e.g. ENTSO-E’s `"Time (UTC)"`) should pass the matching name — typically `config.index_name`.	`'DateTime'`
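
As a rough sketch of what aligning `df_test` on `index_name` could look like: index the frame by the named timestamp column, then keep the target values whose timestamps fall on the forecast index. `extract_test_actual` is a hypothetical helper, not the function's actual implementation, and the sketch assumes both sides already share a timezone. How `build_prediction_package` itself reconciles tz-naive and tz-aware timestamps is not specified on this page.

```python
import pandas as pd


def extract_test_actual(df_test, target, forecast_index, index_name="DateTime"):
    """Return target values from df_test at the forecast timestamps (hypothetical helper)."""
    missing = [c for c in (index_name, target) if c not in df_test.columns]
    if missing:
        raise KeyError(f"df_test is missing columns: {missing}")
    series = df_test.set_index(index_name)[target]
    # Keep only timestamps present in both the test data and the forecast horizon.
    common = forecast_index.intersection(series.index)
    return series.loc[common]


# Assumed setup: a 4-hour horizon and a test frame using a non-default timestamp column.
forecast_index = pd.date_range("2024-01-09 08:00", periods=4, freq="h", tz="UTC")
df_test = pd.DataFrame({
    "Time (UTC)": pd.date_range("2024-01-09 06:00", periods=6, freq="h", tz="UTC"),
    "load": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})

test_actual = extract_test_actual(
    df_test, target="load", forecast_index=forecast_index, index_name="Time (UTC)"
)
print(test_actual.tolist())  # [3.0, 4.0, 5.0, 6.0]
```

Rows outside the horizon (here 06:00 and 07:00) are dropped, which is why the test frame may safely cover more than the forecast window in this sketch.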

Returns

Name	Type	Description
	Dict[str, Any]	A dictionary with the following keys:
- train_actual (`pd.Series`): observed training values aligned to the in-sample prediction index (lags consumed from the start).
- train_pred (`pd.Series`): in-sample fitted values from the underlying estimator.
- future_actual (`pd.Series`): always an empty `float64` Series; the field exists for interface compatibility with downstream plotting consumers.
- future_pred (`pd.Series`): `predict_size`-step-ahead forecast.
- metrics_train (`dict`): `{"mae": float, "mape": float}` computed on the aligned in-sample window.
- metrics_future (`dict`): `{"mae": float, "mape": float}` computed against test ground truth, or `{}` when unavailable.
- metrics_future_one_day (`dict`): reserved for downstream one-day metrics; always `{}`.
- validation_passed (`bool`): always `True`; field reserved for downstream safety checks.
- test_actual (`pd.Series`, optional): present only when `df_test` contains matching rows for the forecast horizon.
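
The metric dictionaries hold the mean absolute error (MAE) and the mean absolute percentage error (MAPE). As a worked illustration of the two formulas, here is a pure-Python sketch. `mae_mape` is a hypothetical helper, and expressing MAPE as a fraction rather than a percentage is an assumption, since the scaling used by `build_prediction_package` is not stated on this page.

```python
from typing import Dict, Sequence


def mae_mape(actual: Sequence[float], predicted: Sequence[float]) -> Dict[str, float]:
    """Compute MAE and MAPE for two equal-length sequences (illustrative helper)."""
    if len(actual) == 0 or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")
    if any(a == 0 for a in actual):
        # MAPE divides by the actual value, so it is undefined at zero.
        raise ValueError("MAPE is undefined when an actual value is zero")
    abs_errors = [abs(a - p) for a, p in zip(actual, predicted)]
    mae = sum(abs_errors) / len(abs_errors)
    # Assumption: MAPE is reported as a fraction (0.05 means 5 %).
    mape = sum(e / abs(a) for e, a in zip(abs_errors, actual)) / len(actual)
    return {"mae": mae, "mape": mape}


metrics = mae_mape(actual=[100.0, 200.0, 400.0], predicted=[110.0, 180.0, 400.0])
print(f"MAE:  {metrics['mae']:.4f}")   # MAE:  10.0000
print(f"MAPE: {metrics['mape']:.4f}")  # MAPE: 0.0667
```

Because `metrics_future` can be `{}` and `test_actual` can be absent, consumers may prefer `pkg["metrics_future"].get("mae")` and `pkg.get("test_actual")` over direct indexing.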

Examples

import numpy as np
import pandas as pd
from sklearn.linear_model import Ridge
from spotforecast2_safe.forecaster.recursive import ForecasterRecursive
from spotforecast2_safe.manager.predictor import build_prediction_package

rng = np.random.default_rng(42)
idx = pd.date_range("2024-01-01", periods=200, freq="h", tz="UTC")
y_train = pd.Series(rng.normal(100, 10, 200), index=idx, name="load")

forecaster = ForecasterRecursive(estimator=Ridge(), lags=24)
forecaster.fit(y=y_train)

pkg = build_prediction_package(
    forecaster=forecaster,
    target="load",
    y_train=y_train,
    predict_size=24,
)
print(f"Keys: {sorted(pkg.keys())}")
print(f"Future predictions: {len(pkg['future_pred'])} steps")
print(f"Validation passed: {pkg['validation_passed']}")

Keys: ['future_actual', 'future_pred', 'metrics_future', 'metrics_future_one_day', 'metrics_train', 'train_actual', 'train_pred', 'validation_passed']
Future predictions: 24 steps
Validation passed: True

import numpy as np
import pandas as pd
from sklearn.linear_model import Ridge
from spotforecast2_safe.forecaster.recursive import ForecasterRecursive
from spotforecast2_safe.manager.predictor import build_prediction_package

rng = np.random.default_rng(0)
idx = pd.date_range("2024-01-01", periods=200, freq="h", tz="UTC")
y_train = pd.Series(rng.normal(50, 5, 200), index=idx, name="power")

forecaster = ForecasterRecursive(estimator=Ridge(), lags=24)
forecaster.fit(y=y_train)

# Test DataFrame covering the 24-hour forecast horizon, which starts one hour
# after the last training timestamp (2024-01-09 07:00)
df_test = pd.DataFrame({
    "DateTime": pd.date_range("2024-01-09 08:00", periods=24, freq="h"),
    "power": rng.normal(50, 5, 24),
})

pkg = build_prediction_package(
    forecaster=forecaster,
    target="power",
    y_train=y_train,
    predict_size=24,
    df_test=df_test,
)
print(f"test_actual present: {'test_actual' in pkg}")
print(f"metrics_future keys: {list(pkg['metrics_future'].keys())}")
print(f"MAE on future: {pkg['metrics_future']['mae']:.4f}")

test_actual present: True
metrics_future keys: ['mae', 'mape']
MAE on future: 4.9177