Shared base for all multi-target forecasting pipeline tasks.
BaseTask encapsulates the data-preparation pipeline (steps 1–7) and all helper methods shared across the five task modes (lazy, optuna, spotoptim, predict, clean). Subclasses implement the run method with task-specific training, tuning, or prediction logic.
Pre-loaded input DataFrame with training data. The DataFrame must contain a datetime column matching index_name plus at least one numeric target column.
Pre-loaded input DataFrame with test data (ground truth for the forecast horizon). The DataFrame must contain a datetime column matching index_name plus at least one numeric target column. Optional.
Number of days in each validation fold. The total validation window spans val_days * number_folds days: each fold is a contiguous, non-overlapping block of val_days days, and the folds follow one another immediately after the training window.
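The fold layout described above can be sketched with plain date arithmetic. The configuration values below (val_days=7, number_folds=3, and the end_train date) are illustrative only, not defaults of the library:

```python
from datetime import date, timedelta

# Hypothetical configuration values for illustration only.
val_days = 7
number_folds = 3
end_train = date(2024, 1, 31)  # last day of the training window

# Total validation window spans val_days * number_folds days.
total_val_days = val_days * number_folds

# Folds are contiguous, non-overlapping blocks placed right after training.
folds = []
for k in range(number_folds):
    start = end_train + timedelta(days=1 + k * val_days)
    end = start + timedelta(days=val_days - 1)
    folds.append((start, end))

for start, end in folds:
    print(start, "->", end)
```

With these values the validation window is 21 days, split into 2024-02-01..02-07, 02-08..02-14, and 02-15..02-21.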
Whether to automatically save fitted models to disk after each training run. Defaults to True so that saved models are immediately available for PredictTask without any manual call to save_models().
Aggregate per-target prediction packages into a weighted forecast.
Delegates to the module-level agg_predictor function. Available as an instance method so that subclasses can override the aggregation strategy when needed.
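The exact signature of agg_predictor is not shown here; the following is a minimal sketch of the weighted-average idea behind it, with hypothetical prediction packages and weights:

```python
import numpy as np

# Hypothetical per-target prediction packages; the real packages produced
# by the pipeline carry more fields (e.g. the fitted "forecaster").
packages = [
    {"pred": np.array([10.0, 12.0, 11.0]), "weight": 0.7},
    {"pred": np.array([14.0, 13.0, 15.0]), "weight": 0.3},
]

weights = np.array([p["weight"] for p in packages])
preds = np.stack([p["pred"] for p in packages])

# Weighted forecast: weights normalised to sum to 1, then a dot product
# combines the stacked predictions into one aggregate series.
forecast = (weights / weights.sum()) @ preds
print(forecast)
```

Overriding the instance method lets a subclass swap this strategy (e.g. median aggregation) without touching the module-level function.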
Constructs the cross-validation splitter used by all tuning tasks (OptunaTask, SpotOptimTask).
Internally uses sklearn.model_selection.TimeSeriesSplit to compute split boundaries that respect temporal ordering and avoid data leakage between folds. Classical cross-validation techniques such as KFold assume i.i.d. samples and therefore yield unreliable estimates on time series data; TimeSeriesSplit instead ensures that every test fold consists only of observations that come after the corresponding training observations.
The validation boundary is determined by config.end_train_ts minus config.delta_val. When config.train_size is set, the sklearn splitter uses a sliding fixed-size training window (max_train_size); otherwise an expanding window is used so that each subsequent fold sees more historical data.
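The expanding vs. sliding distinction maps directly onto sklearn's max_train_size parameter. A small sketch on a 12-point series (the n_splits and test_size values here are illustrative, not the pipeline's derived settings):

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

y = np.arange(12)  # stand-in for a daily training series

# Expanding window: each fold trains on all earlier observations.
expanding = TimeSeriesSplit(n_splits=3, test_size=2)

# Sliding window: max_train_size caps the training window, mirroring
# what the splitter does when config.train_size is set.
sliding = TimeSeriesSplit(n_splits=3, test_size=2, max_train_size=4)

for train_idx, test_idx in expanding.split(y):
    print("expanding", train_idx, test_idx)
for train_idx, test_idx in sliding.split(y):
    print("sliding  ", train_idx, test_idx)
```

In the expanding case the training indices grow from [0..5] to [0..9]; in the sliding case each fold keeps only the 4 most recent training points.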
Training time series for the current target. Used both to determine the validation boundary and as the sequence passed to sklearn.model_selection.TimeSeriesSplit.split to derive initial_train_size.
Load the most recent fitted models from the cache directory.
Scans <cache_home>/models/<data_frame_name>/ for .joblib files matching the current data_frame_name. Optionally filters by task_name, target, and max_age_days.
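The scan-and-filter step can be sketched with the standard library alone. The helper name find_recent_models and its parameters are hypothetical; the real method also filters by task_name and target, which is omitted here:

```python
import time
from pathlib import Path


def find_recent_models(cache_home, data_frame_name, max_age_days=None):
    """Hypothetical sketch: list .joblib snapshots newest-first,
    optionally discarding files older than max_age_days."""
    model_dir = Path(cache_home) / "models" / data_frame_name
    candidates = sorted(
        model_dir.glob("*.joblib"),
        key=lambda p: p.stat().st_mtime,
        reverse=True,
    )
    if max_age_days is not None:
        cutoff = time.time() - max_age_days * 86400
        candidates = [p for p in candidates if p.stat().st_mtime >= cutoff]
    return candidates
```

Sorting by modification time newest-first means "most recent" is simply the head of the list.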
Load the most recent tuning results for a target from cache.
Scans <cache_home>/tuning_results/ for files matching the current data_frame_name and target. Optionally filters by task_name and discards results older than max_age_days.
>>> from spotforecast2.manager.multitask import LazyTask
>>> task = LazyTask(data_frame_name="demo10")
>>> # Save first so there is something to load
>>> task.save_tuning_results(
...     target="target_0",
...     task_name="optuna",
...     best_params={"n_estimators": 100},
...     best_lags=24,
... )
>>> result = task.load_tuning_results(target="target_0")
>>> print(result["best_params"])
{'n_estimators': 100}
log_summary
manager.multitask.BaseTask.log_summary()
Log a summary of the current pipeline configuration.
plot_with_outliers
manager.multitask.BaseTask.plot_with_outliers()
Visualise original vs. cleaned data with outlier markers.
Save fitted forecaster models to the cache directory.
Each model is serialised with joblib (compress=3) into <cache_home>/models/<data_frame_name>/ using a datetime-stamped filename so that multiple snapshots can coexist.
If forecasters is None the method collects fitted models from self.results[task_name], where each prediction package is expected to contain a "forecaster" key.
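The snapshot layout can be sketched as follows. The helper name save_snapshot and the exact filename pattern are assumptions for illustration; only the directory layout, the datetime stamp, and joblib's compress=3 come from the description above:

```python
from datetime import datetime
from pathlib import Path

import joblib


def save_snapshot(model, cache_home, data_frame_name, task_name, target):
    """Hypothetical sketch: one compressed joblib file per model,
    datetime-stamped so multiple snapshots can coexist."""
    model_dir = Path(cache_home) / "models" / data_frame_name
    model_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    path = model_dir / f"{task_name}_{target}_{stamp}.joblib"
    joblib.dump(model, path, compress=3)
    return path
```

Stamping the filename rather than overwriting a fixed name is what lets load_models pick the most recent snapshot while older ones remain available.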