manager.multitask.MultiTask(
    task='lazy',
    dataframe=None,
    data_test=None,
    data_frame_name='default',
    cache_home=None,
    agg_weights=None,
    index_name='DateTime',
    number_folds=10,
    predict_size=24,
    bounds=None,
    contamination=0.03,
    imputation_method='weighted',
    use_exogenous_features=True,
    n_trials_optuna=15,
    n_trials_spotoptim=10,
    n_initial_spotoptim=5,
    auto_save_models=True,
    train_days=365 * 2,
    val_days=7 * 2,
    log_level=logging.INFO,
    verbose=False,
    dry_run=False,
    show_progress=False,
    **config_overrides,
)
Orchestrates a multi-target time-series forecasting pipeline.
Data must be provided as a pandas DataFrame via dataframe. A test dataset can optionally be provided via data_test.
The typical usage flow is:
Instantiate with configuration arguments.
Call method prepare_data to load, resample, and validate data.
Call method detect_outliers to apply hard bounds and IsolationForest.
Call method impute to fill gaps.
Call method build_exogenous_features to construct weather / calendar / day-night / holiday covariates.
Call method run (or individual run_task_* methods) to train, predict, and aggregate.
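The call order above can be illustrated with a stand-in class. PipelineSketch below is hypothetical and is not part of spotforecast2; it only mirrors the method names listed above and records the order in which they are called. The zero-argument method signatures are an assumption, since this page does not document them.

```python
class PipelineSketch:
    """Hypothetical stand-in that mirrors the MultiTask method names."""

    def __init__(self, task="lazy"):
        self.task = task
        self.calls = []  # records the order of the pipeline steps

    def prepare_data(self):
        self.calls.append("prepare_data")

    def detect_outliers(self):
        self.calls.append("detect_outliers")

    def impute(self):
        self.calls.append("impute")

    def build_exogenous_features(self):
        self.calls.append("build_exogenous_features")

    def run(self):
        self.calls.append(f"run:{self.task}")
        return list(self.calls)


pipeline = PipelineSketch(task="lazy")
pipeline.prepare_data()
pipeline.detect_outliers()
pipeline.impute()
pipeline.build_exogenous_features()
steps = pipeline.run()
print(steps)
```

With a real MultiTask instance, the same five calls would be made in the same order, each with whatever arguments the corresponding method accepts.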
Parameters
task
str
Pipeline task mode: one of "lazy", "optuna", "spotoptim", "predict", or "clean". Defaults to "lazy".
'lazy'
dataframe
Optional[pd.DataFrame]
Pre-loaded input DataFrame with training data. The DataFrame must contain a datetime column matching index_name plus at least one numeric target column. Optional for the "clean" task, but required for all other tasks.
None
data_test
Optional[pd.DataFrame]
Pre-loaded input DataFrame with test data. The DataFrame must contain a datetime column matching index_name plus at least one numeric target column. Optional.
None
cache_home
Optional[Path]
Cache directory path.
None
agg_weights
Optional[List[float]]
Per-target aggregation weights.
None
index_name
str
Datetime column name in the raw CSV / DataFrame.
'DateTime'
number_folds
int
Number of validation folds.
10
predict_size
int
Forecast horizon in hours.
24
bounds
Optional[List[tuple]]
Per-column hard outlier bounds (lower, upper).
None
contamination
float
IsolationForest contamination fraction.
0.03
imputation_method
str
Gap-filling strategy.
'weighted'
use_exogenous_features
bool
Whether to build exogenous features.
True
n_trials_optuna
int
Number of Optuna Bayesian-search trials.
15
n_trials_spotoptim
int
Number of SpotOptim surrogate-search trials.
10
n_initial_spotoptim
int
Initial random evaluations for SpotOptim.
5
auto_save_models
bool
Whether to automatically save fitted models to disk after each training run. Defaults to True so that saved models are immediately available for the predict task without any manual call to save_models.
True
train_days
int
Length of the training window in days. Controls TRAIN_SIZE and config.train_size. Defaults to 365 * 2 (two years).
365 * 2
val_days
int
Length of each validation fold in days. The total validation span is val_days * number_folds. Controls DELTA_VAL and config.delta_val. Defaults to 7 * 2 (two weeks).
7 * 2
log_level
int
Logging level for the pipeline logger.
logging.INFO
dry_run
bool
If True, do not clean cache or save models. Useful for testing and debugging.
False
config_overrides
Any
Extra keyword arguments forwarded to ConfigMulti.
{}
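As a quick check of how the window parameters combine, the sketch below computes the spans implied by the default values in the signature. The variable names mirror the constructor arguments; the derived names total_validation_days and horizon_days are introduced here for illustration only, and the computation is plain arithmetic rather than a library call.

```python
train_days = 365 * 2   # default training window, in days
val_days = 7 * 2       # default length of one validation fold, in days
number_folds = 10      # default number of validation folds
predict_size = 24      # default forecast horizon, in hours

# Total validation span, as described for val_days.
total_validation_days = val_days * number_folds

# Forecast horizon expressed in days for comparison.
horizon_days = predict_size / 24

print(train_days)             # 730
print(total_validation_days)  # 140
print(horizon_days)           # 1.0
```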
Examples
import pandas as pd
from spotforecast2.manager.multitask import MultiTask
from spotforecast2_safe.data.fetch_data import fetch_data, get_package_data_home
data_home = get_package_data_home()
df = fetch_data(filename=str(data_home / "demo10.csv"))
mt = MultiTask(dataframe=df, predict_size=24)
print(f"DataFrame stored: {mt._dataframe is not None}")
print(f"Task: {mt.TASK}")
DataFrame stored: True
Task: lazy
Methods
run
manager.multitask.MultiTask.run(task=None, show=True, **kwargs)
Run the task specified by task (or self.TASK).
Parameters
task
Optional[str]
Override the task mode. None uses self.TASK.
None
show
bool
If True, display prediction figures.
True
Returns
Dict[str, Any]
Aggregated prediction package. Per-target results are stored on self.results[<task_key>].
Raises
ValueError
If task is not one of "lazy", "optuna", "spotoptim", "predict", "clean".
RuntimeError
If method prepare_data has not been called (for training and prediction tasks).
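The error contract above can be sketched as a small validation helper. The function resolve_task and its data_prepared flag are hypothetical and only restate the documented rules: an unknown task name raises ValueError, and every task except "clean" raises RuntimeError when data has not been prepared. How MultiTask tracks this internally is not shown on this page.

```python
VALID_TASKS = ("lazy", "optuna", "spotoptim", "predict", "clean")


def resolve_task(task, default_task, data_prepared):
    """Return the task name to run, or raise as the documented contract describes."""
    # None falls back to the task configured on the instance (self.TASK).
    chosen = default_task if task is None else task
    if chosen not in VALID_TASKS:
        raise ValueError(f"Unknown task {chosen!r}; expected one of {VALID_TASKS}")
    # Training and prediction tasks need prepared data; "clean" does not.
    if chosen != "clean" and not data_prepared:
        raise RuntimeError("prepare_data must be called before running this task")
    return chosen


print(resolve_task(None, "lazy", data_prepared=True))
print(resolve_task("clean", "lazy", data_prepared=False))
```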
run_task_clean
manager.multitask.MultiTask.run_task_clean(
    show=True,
    dry_run=False,
    cache_home=None,
)
Remove all cached data from the pipeline cache directory.
Does not require prepare_data() to be called first.
Parameters
show
bool
Accepted for API consistency. Not used by the clean task.
True
dry_run
bool
If True, report what would be deleted without actually removing anything.
False
cache_home
Optional[Path]
Override the directory to clean. None uses the cache directory configured on this instance.
None
Returns
Dict[str, Any]
Dict with keys status, cache_dir, and deleted_items.
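To make the dry_run behaviour concrete, here is a minimal sketch of a cache cleaner that returns a dict with the same three keys. The function clean_cache, the file names, and the status strings "dry_run" and "cleaned" are assumptions for illustration; the actual values returned by run_task_clean are not documented here.

```python
import shutil
import tempfile
from pathlib import Path


def clean_cache(cache_dir, dry_run=False):
    """Delete everything inside cache_dir, or only list it when dry_run is True."""
    cache_dir = Path(cache_dir)
    items = sorted(cache_dir.iterdir()) if cache_dir.exists() else []
    if not dry_run:
        for item in items:
            if item.is_dir():
                shutil.rmtree(item)  # remove sub-directories recursively
            else:
                item.unlink()        # remove plain files
    return {
        "status": "dry_run" if dry_run else "cleaned",  # assumed status values
        "cache_dir": str(cache_dir),
        "deleted_items": [item.name for item in items],
    }


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "model_a.joblib").write_text("placeholder")
    (Path(tmp) / "features").mkdir()

    preview = clean_cache(tmp, dry_run=True)
    remaining_after_preview = len(list(Path(tmp).iterdir()))

    result = clean_cache(tmp, dry_run=False)
    remaining_after_clean = len(list(Path(tmp).iterdir()))

print(preview["deleted_items"], remaining_after_preview, remaining_after_clean)
```

The dry run reports the same items as the real run but leaves the directory untouched, which is the property the dry_run flag describes.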
run_task_lazy
manager.multitask.MultiTask.run_task_lazy(show=True)
Lazy fitting with default LightGBM parameters.
Parameters
show
bool
If True, display prediction figures.
True
Returns
Dict[str, Any]
Aggregated prediction package. Per-target results in self.results["lazy"].
run_task_optuna
manager.multitask.MultiTask.run_task_optuna(
    search_space=None,
    show=True,
    show_progress=False,
)
Optuna Bayesian hyperparameter tuning.
Parameters
search_space
Optional[Callable]
Callable with signature (trial) -> dict that defines the search space.
None
show
bool
If True, display prediction figures.
True
Returns
Dict[str, Any]
Aggregated prediction package. Per-target results in self.results["optuna"].
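A search_space callable of the form (trial) -> dict might look like the sketch below. It uses the Optuna trial methods suggest_int and suggest_float; the chosen LightGBM parameter names and ranges are assumptions, and this page does not state which keys the pipeline expects. MidpointTrial is a hypothetical stand-in for an Optuna trial so that the example runs without Optuna installed.

```python
def lightgbm_search_space(trial):
    """Map a trial object to a dict of candidate hyperparameters."""
    return {
        "n_estimators": trial.suggest_int("n_estimators", 100, 1000),
        "num_leaves": trial.suggest_int("num_leaves", 8, 128),
        "learning_rate": trial.suggest_float("learning_rate", 1e-3, 0.3, log=True),
    }


class MidpointTrial:
    """Hypothetical stand-in for an Optuna trial; always picks the range midpoint."""

    def suggest_int(self, name, low, high):
        return (low + high) // 2

    def suggest_float(self, name, low, high, log=False):
        # Geometric midpoint for log-scaled ranges, arithmetic midpoint otherwise.
        return (low * high) ** 0.5 if log else (low + high) / 2


params = lightgbm_search_space(MidpointTrial())
print(sorted(params))
```

Following the signature above, such a callable would be passed as run_task_optuna(search_space=lightgbm_search_space).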
run_task_predict
manager.multitask.MultiTask.run_task_predict(
    show=True,
    task_name=None,
    max_age_days=None,
)
Predict-only using previously saved models.
Loads fitted models from the cache directory and produces predictions without any training. Raises RuntimeError if no saved models are found.
Parameters
show
bool
If True, display prediction figures.
True
task_name
Optional[str]
Restrict model loading to a specific source task ("lazy", "optuna", or "spotoptim"). None loads the most recent model regardless of source.
None
max_age_days
Optional[float]
Maximum age in days for saved models. None accepts any age.
None
Returns
Dict[str, Any]
Aggregated prediction package. Per-target results in self.results["predict"].
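The max_age_days filter can be pictured as a simple age check on each saved model. The helper is_fresh below is hypothetical; measuring age from a save timestamp in epoch seconds and treating the limit as inclusive are both assumptions, since the page only states that None accepts any age.

```python
import time

SECONDS_PER_DAY = 86_400


def is_fresh(saved_at, max_age_days=None, now=None):
    """Return True if a model saved at saved_at (epoch seconds) is young enough."""
    if max_age_days is None:
        return True  # None accepts any age
    now = time.time() if now is None else now
    age_days = (now - saved_at) / SECONDS_PER_DAY
    return age_days <= max_age_days  # inclusive limit is an assumption


now = 10 * SECONDS_PER_DAY
print(is_fresh(saved_at=0, max_age_days=7, now=now))                    # 10 days old
print(is_fresh(saved_at=5 * SECONDS_PER_DAY, max_age_days=7, now=now))  # 5 days old
print(is_fresh(saved_at=0, max_age_days=None, now=now))                 # no limit
```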
run_task_spotoptim
manager.multitask.MultiTask.run_task_spotoptim(search_space=None, show=True)
SpotOptim surrogate-model Bayesian tuning.
Parameters
search_space
Optional[Dict[str, Any]]
Dictionary defining the SpotOptim search space.
None
show
bool
If True, display prediction figures.
True
Returns
Dict[str, Any]
Aggregated prediction package. Per-target results in self.results["spotoptim"].