manager.multitask.run

manager.multitask.run(
    dataframe=None,
    task='lazy',
    cache_home=None,
    bounds=None,
    agg_weights=None,
    project_name='test_project',
    n_trials_optuna=10,
    train_days=3 * 365,
    val_days=31,
    imputation_method='weighted',
    show_progress=False,
    plot_with_outliers=False,
    show=False,
    verbose=False,
    log_level=40,
    **kwargs,
)

Run the MultiTask forecasting pipeline and return predictions.

Wraps the standard pipeline sequence into a single call. For the "clean" task, only the cache directory is wiped and an empty DataFrame is returned. For all other tasks, the full sequence

prepare_data → detect_outliers → impute →
build_exogenous_features → run

is executed and the aggregated future predictions are returned as a DataFrame.
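The control flow described above can be pictured roughly as follows. This is a minimal sketch, not the package's implementation: `ToyPipeline` and `run_sketch` are hypothetical stand-ins, the step methods only record the order in which they are called, and an empty list stands in for the empty DataFrame.

```python
from typing import List


class ToyPipeline:
    """Hypothetical stand-in for MultiTask; each step only records its name."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def clean_cache(self) -> None:
        self.calls.append("clean_cache")

    def prepare_data(self) -> None:
        self.calls.append("prepare_data")

    def detect_outliers(self) -> None:
        self.calls.append("detect_outliers")

    def impute(self) -> None:
        self.calls.append("impute")

    def build_exogenous_features(self) -> None:
        self.calls.append("build_exogenous_features")

    def run(self, task: str) -> List[float]:
        self.calls.append(f"run[{task}]")
        return [0.0, 0.0, 0.0]  # placeholder values, not real predictions


def run_sketch(pipeline: ToyPipeline, task: str = "lazy") -> List[float]:
    # "clean" short-circuits: wipe the cache and return an empty result.
    if task == "clean":
        pipeline.clean_cache()
        return []
    # Every other task walks through the full sequence in a fixed order.
    pipeline.prepare_data()
    pipeline.detect_outliers()
    pipeline.impute()
    pipeline.build_exogenous_features()
    return pipeline.run(task)


lazy_pipeline = ToyPipeline()
lazy_result = run_sketch(lazy_pipeline, task="lazy")
print(lazy_pipeline.calls)

clean_pipeline = ToyPipeline()
clean_result = run_sketch(clean_pipeline, task="clean")
print(clean_pipeline.calls, clean_result)
```

The point of the sketch is only the ordering and the early return for "clean"; what each step actually does is defined by MultiTask.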

Parameters

Name Type Description Default
dataframe pd.DataFrame Input time-series data. Must contain a datetime column matching the configured index_name and at least one numeric target column. Required for every task except "clean". Defaults to None. None
task str Pipeline mode — one of "lazy", "optuna", "spotoptim", "predict", or "clean". Defaults to "lazy". 'lazy'
cache_home Optional[str] Path to the cache directory. If None, the package default cache location returned by spotforecast2_safe’s get_cache_home() is used. None
bounds Optional[List[Tuple[float, float]]] Per-column hard outlier bounds as a list of (lower, upper) tuples, one per target column. None uses the package defaults. None
agg_weights Optional[List[float]] Per-column weights for the final aggregation step as a list of floats, one per target column. None uses the package defaults. None
project_name str Identifier used for cache-directory and model-file naming. Defaults to "test_project". 'test_project'
train_days Optional[int] Number of days in the training window. Defaults to 3 * 365 = 1095 days (three years). 3 * 365
val_days Optional[int] Number of days in the validation window. Defaults to 31; if None is passed, 31 days are used as well. 31
imputation_method str Method used for imputation of detected outliers. Passed to the imputation_method argument of MultiTask. Options are "weighted" or "linear". Defaults to "weighted". 'weighted'
show_progress bool Whether to print progress messages during pipeline execution. Defaults to False. False
plot_with_outliers bool Whether to generate a visualization of the data with outliers highlighted. Defaults to False. False
show bool Whether to display prediction figures after running each task. Defaults to False. False
verbose bool Verbosity flag. Defaults to False. False
log_level int Logging level. Default is 40 (ERROR). Other common values include 0 (NOTSET), 10 (DEBUG), 20 (INFO), 30 (WARNING), 50 (CRITICAL). 40
**kwargs Any Additional keyword arguments forwarded verbatim to MultiTask, such as predict_size in the examples below. {}
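The numeric values listed for log_level coincide with the level constants of Python's standard logging module. Assuming the parameter is interpreted as a standard logging level, a named constant such as `logging.ERROR` can be written instead of the literal 40. The snippet below only inspects the standard library and does not call the package.

```python
import logging

# Map the level names mentioned in the parameter table to their integer values.
level_names = ("NOTSET", "DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")
levels = {name: getattr(logging, name) for name in level_names}

for name, value in levels.items():
    print(f"{value:>2}  {name}")

# The documented default of 40 corresponds to logging.ERROR.
default_level = logging.ERROR
default_name = logging.getLevelName(default_level)
```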

Returns

Name Type Description
DataFrame pd.DataFrame DataFrame whose index is the forecast horizon timestamps and whose single column "forecast" contains the aggregated predicted values. For the "clean" task an empty DataFrame is returned.
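To make the shape of the result concrete, the sketch below combines per-target predictions into a single series using one weight per target column. It assumes the aggregation is a plain weighted sum, which this page does not confirm; `aggregate_sketch`, the column names, and the numbers are all hypothetical, and plain Python lists stand in for DataFrame columns.

```python
from typing import Dict, List, Sequence


def aggregate_sketch(
    predictions: Dict[str, List[float]], weights: Sequence[float]
) -> List[float]:
    """Weighted sum across target columns (assumed aggregation rule)."""
    columns = list(predictions)
    if len(weights) != len(columns):
        raise ValueError("expected one weight per target column")
    horizon = len(predictions[columns[0]])
    # For each forecast step, sum weight * prediction over all columns.
    return [
        sum(w * predictions[col][step] for col, w in zip(columns, weights))
        for step in range(horizon)
    ]


# Hypothetical per-target predictions over a three-step horizon.
per_target = {
    "target_a": [1.0, 2.0, 3.0],
    "target_b": [10.0, 20.0, 30.0],
}
forecast = aggregate_sketch(per_target, weights=[1.0, -1.0])
print(forecast)  # [-9.0, -18.0, -27.0]
```

Under this assumption, a negative weight subtracts a column from the aggregate, which is one way a combined forecast can take negative values.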

Raises

Name Type Description
ValueError If task is not one of the supported task names.
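A check of this kind might look like the sketch below. The task names are taken from the parameter description above; `validate_task`, `SUPPORTED_TASKS`, and the error message wording are hypothetical and need not match the package's code.

```python
SUPPORTED_TASKS = ("lazy", "optuna", "spotoptim", "predict", "clean")


def validate_task(task: str) -> str:
    """Return the task unchanged, or raise ValueError for an unknown name."""
    if task not in SUPPORTED_TASKS:
        raise ValueError(
            f"Unknown task {task!r}; expected one of {SUPPORTED_TASKS}"
        )
    return task


# A caller can guard against typos in the task name like this.
try:
    validate_task("train")
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```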

Examples

Run the pipeline using cached or default model parameters ("lazy" task):

from spotforecast2.manager.multitask.runner import run
from spotforecast2_safe.data.fetch_data import fetch_data, get_package_data_home
import warnings
warnings.filterwarnings("ignore")

data_home = get_package_data_home()
df = fetch_data(filename=str(data_home / "demo02.csv"))

forecast = run(
    df,
    task="lazy",
    project_name="demo02",
    train_days=365,
    predict_size=24,
    imputation_method="linear",
)
print(forecast)
                             forecast
1975-06-18 19:00:00+00:00  -89.884356
1975-06-18 20:00:00+00:00   31.167150
1975-06-18 21:00:00+00:00  -87.979310
1975-06-18 22:00:00+00:00  -74.535661
1975-06-18 23:00:00+00:00   -0.950101
1975-06-19 00:00:00+00:00   18.045331
1975-06-19 01:00:00+00:00   22.515805
1975-06-19 02:00:00+00:00  -23.375324
1975-06-19 03:00:00+00:00  -72.343383
1975-06-19 04:00:00+00:00 -121.724357
1975-06-19 05:00:00+00:00 -109.145628
1975-06-19 06:00:00+00:00 -121.964624
1975-06-19 07:00:00+00:00 -137.573909
1975-06-19 08:00:00+00:00 -129.502406
1975-06-19 09:00:00+00:00 -142.469535
1975-06-19 10:00:00+00:00 -119.326536
1975-06-19 11:00:00+00:00 -135.700305
1975-06-19 12:00:00+00:00 -169.127924
1975-06-19 13:00:00+00:00 -143.357856
1975-06-19 14:00:00+00:00 -105.624973
1975-06-19 15:00:00+00:00 -151.880904
1975-06-19 16:00:00+00:00 -132.496487
1975-06-19 17:00:00+00:00  -87.029015
1975-06-19 18:00:00+00:00  -90.311355

Tune hyperparameters via Optuna Bayesian search ("optuna" task):

from spotforecast2.manager.multitask.runner import run
from spotforecast2_safe.data.fetch_data import fetch_data, get_package_data_home
import warnings
warnings.filterwarnings("ignore")

data_home = get_package_data_home()
df = fetch_data(filename=str(data_home / "demo02.csv"))

forecast = run(
    df,
    task="optuna",
    project_name="demo02",
    n_trials_optuna=5,
    predict_size=24,
    train_days=365,
    val_days=7,
    imputation_method="linear"
)
print(forecast)
                             forecast
1975-06-18 19:00:00+00:00  -83.823922
1975-06-18 20:00:00+00:00   14.564727
1975-06-18 21:00:00+00:00 -124.670035
1975-06-18 22:00:00+00:00  -88.091592
1975-06-18 23:00:00+00:00  -96.395296
1975-06-19 00:00:00+00:00 -124.266838
1975-06-19 01:00:00+00:00 -121.308235
1975-06-19 02:00:00+00:00   15.789655
1975-06-19 03:00:00+00:00  -30.926066
1975-06-19 04:00:00+00:00  -64.735700
1975-06-19 05:00:00+00:00  -91.799719
1975-06-19 06:00:00+00:00 -118.489573
1975-06-19 07:00:00+00:00 -156.223891
1975-06-19 08:00:00+00:00 -141.867139
1975-06-19 09:00:00+00:00 -100.769773
1975-06-19 10:00:00+00:00  -81.010745
1975-06-19 11:00:00+00:00 -150.707992
1975-06-19 12:00:00+00:00 -171.736224
1975-06-19 13:00:00+00:00 -132.933374
1975-06-19 14:00:00+00:00 -122.523714
1975-06-19 15:00:00+00:00 -166.821392
1975-06-19 16:00:00+00:00 -144.400132
1975-06-19 17:00:00+00:00 -120.908573
1975-06-19 18:00:00+00:00 -138.971944

Load previously saved models and predict without retraining ("predict" task). A prior training run ("lazy" or "optuna") must have saved models to the cache first:

from spotforecast2.manager.multitask.runner import run
from spotforecast2_safe.data.fetch_data import fetch_data, get_package_data_home
import warnings
warnings.filterwarnings("ignore")

data_home = get_package_data_home()
df = fetch_data(filename=str(data_home / "demo02.csv"))

forecast = run(
    df,
    task="predict",
    project_name="demo02",
    predict_size=24,
    imputation_method="linear",
)
print(forecast)
                             forecast
1975-06-18 19:00:00+00:00  -83.823922
1975-06-18 20:00:00+00:00   14.564727
1975-06-18 21:00:00+00:00 -124.670035
1975-06-18 22:00:00+00:00  -88.091592
1975-06-18 23:00:00+00:00  -96.395296
1975-06-19 00:00:00+00:00 -124.266838
1975-06-19 01:00:00+00:00 -121.308235
1975-06-19 02:00:00+00:00   15.789655
1975-06-19 03:00:00+00:00  -30.926066
1975-06-19 04:00:00+00:00  -64.735700
1975-06-19 05:00:00+00:00  -91.799719
1975-06-19 06:00:00+00:00 -118.489573
1975-06-19 07:00:00+00:00 -156.223891
1975-06-19 08:00:00+00:00 -141.867139
1975-06-19 09:00:00+00:00 -100.769773
1975-06-19 10:00:00+00:00  -81.010745
1975-06-19 11:00:00+00:00 -150.707992
1975-06-19 12:00:00+00:00 -171.736224
1975-06-19 13:00:00+00:00 -132.933374
1975-06-19 14:00:00+00:00 -122.523714
1975-06-19 15:00:00+00:00 -166.821392
1975-06-19 16:00:00+00:00 -144.400132
1975-06-19 17:00:00+00:00 -120.908573
1975-06-19 18:00:00+00:00 -138.971944

Remove all cached models and artefacts for a project ("clean" task). Returns an empty DataFrame:

from spotforecast2.manager.multitask.runner import run

result = run(task="clean", project_name="demo02")
print(result.empty)
[clean] Cache removed successfully: /home/runner/.spotforecast2_cache
True