multitask.MultiTask

multitask.MultiTask(
    config=None,
    *,
    task='lazy',
    dataframe=None,
    data_test=None,
    cache_home=None,
    dry_run=False,
    show_progress=False,
    log_level=logging.INFO,
    **overrides,
)

Orchestrates a multi-target time-series forecasting pipeline.

Training data is supplied as a pandas DataFrame via dataframe; it is required for every task except "clean". A test dataset can optionally be supplied via data_test.

The typical usage flow is:

  1. Instantiate with a config, or omit it to auto-construct ConfigMulti().
  2. Call prepare_data to load, resample, and validate data.
  3. Call detect_outliers to apply hard bounds and IsolationForest.
  4. Call impute to fill gaps.
  5. Call build_exogenous_features to construct weather / calendar / day-night / holiday covariates.
  6. Call run (or one of the individual run_task_* methods) to train, predict, and aggregate.
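
The flow above can be sketched as a small helper. This is an illustrative sketch, not code from the library: run_full_flow and _RecordingStub are hypothetical names, and it assumes the four preparation methods can be called without arguments (the examples on this page do so only for prepare_data and impute). The stub records calls so the sketch runs without the package installed.

```python
def run_full_flow(mt, task="lazy", show=False):
    """Call the documented pipeline steps on `mt` in the documented order.

    `mt` is expected to be a MultiTask-like object. Assumption: each
    preparation method works with its default arguments.
    """
    mt.prepare_data()              # step 2: load, resample, validate
    mt.detect_outliers()           # step 3: hard bounds and IsolationForest
    mt.impute()                    # step 4: fill gaps
    mt.build_exogenous_features()  # step 5: build covariates
    return mt.run(task=task, show=show)  # step 6: train, predict, aggregate


class _RecordingStub:
    """Hypothetical stand-in that records which methods were called."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Only reached for unknown attributes, i.e. the pipeline methods.
        def _record(*args, **kwargs):
            self.calls.append(name)
            return {"task": kwargs.get("task")}
        return _record


stub = _RecordingStub()
result = run_full_flow(stub, task="lazy")
print(stub.calls)
```

With a real MultiTask instance, the same helper would be called as run_full_flow(mt, task="lazy"), provided the assumption about default arguments holds.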

Parameters

Name Type Description Default
config Optional[PipelineConfig] A PipelineConfig-conforming object (e.g. ConfigMulti or ConfigEntsoe). When None, a fresh ConfigMulti() is constructed with default fields. None
task str Pipeline task mode — "lazy", "defaults", "optuna", "spotoptim", "predict", or "clean". Defaults to "lazy". 'lazy'
dataframe Optional[pd.DataFrame] Pre-loaded input DataFrame with training data. The DataFrame must contain a datetime column matching config.index_name plus at least one numeric target column. Optional for the "clean" task, required for all others. None
data_test Optional[pd.DataFrame] Pre-loaded input DataFrame with test data. Optional. None
cache_home Optional[Path] Cache directory override. When not None, replaces config.cache_home for this task instance. None
dry_run bool If True, do not clean cache or save models. False
show_progress bool Whether to print progress messages during pipeline execution. False
log_level int Logging level for the pipeline logger. logging.INFO
**overrides Any Forwarded to config.set_params(**overrides) — a convenience for one-line tweaks without building a fresh config. Mutates the caller’s config object. {}
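
Because **overrides mutates the caller's config object, a config shared between several instances changes for all of them. The sketch below illustrates the effect and one way to avoid it. DemoConfig and make_task are hypothetical stand-ins, not part of the library; only the forwarding to set_params(**overrides) is taken from the description above.

```python
import copy
from dataclasses import dataclass


@dataclass
class DemoConfig:
    """Hypothetical stand-in for a PipelineConfig-conforming object."""

    predict_size: int = 24

    def set_params(self, **params):
        # Update known fields in place; reject unknown names.
        for name, value in params.items():
            if not hasattr(self, name):
                raise ValueError(f"unknown parameter: {name}")
            setattr(self, name, value)
        return self


def make_task(config, **overrides):
    """Hypothetical constructor stand-in: forwards overrides to the config."""
    config.set_params(**overrides)
    return config


shared = DemoConfig()
make_task(shared, predict_size=12)
print(shared.predict_size)  # 12: the caller's object was mutated

original = DemoConfig()
make_task(copy.deepcopy(original), predict_size=12)
print(original.predict_size)  # 24: only the copy was changed
```

Passing a deep copy keeps the original config intact when the same object is reused elsewhere.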

Examples

import pandas as pd
from spotforecast2.multitask import MultiTask
from spotforecast2_safe.configurator.config_multi import ConfigMulti
from spotforecast2_safe.data.fetch_data import fetch_data, get_package_data_home

data_home = get_package_data_home()
df = fetch_data(filename=str(data_home / "demo10.csv"))

mt = MultiTask(ConfigMulti(predict_size=24), dataframe=df)
print(f"DataFrame stored: {mt._dataframe is not None}")
print(f"Task: {mt.TASK}")
DataFrame stored: True
Task: lazy

Methods

Name Description
run Run the task specified by task (or self.TASK).
run_task_clean Remove all cached data from the pipeline cache directory.
run_task_defaults Defaults fitting — no tuning, no cached params.
run_task_lazy Lazy fitting with default LightGBM parameters.
run_task_optuna Optuna Bayesian hyperparameter tuning.
run_task_predict Predict-only using previously saved models.
run_task_spotoptim SpotOptim surrogate-model Bayesian tuning.

run

multitask.MultiTask.run(task=None, show=True, **kwargs)

Run the task specified by task (or self.TASK).

Parameters

Name Type Description Default
task Optional[str] Override the task mode. None uses self.TASK. None
show bool If True, display prediction figures. True

Returns

Name Type Description
Dict[str, Any] Aggregated prediction package. Per-target results are stored on self.results[<task_key>].

Raises

Name Type Description
ValueError If task is not one of "lazy", "defaults", "optuna", "spotoptim", "predict", "clean".
RuntimeError If prepare_data has not been called (training and prediction tasks only).
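
The task resolution and validation described above can be sketched as follows. This is a guess at the shape of the behaviour, not the library's source: dispatch and DemoTask are hypothetical, and the only facts taken from this page are the six valid task names, the fallback to self.TASK, and the run_task_* naming.

```python
VALID_TASKS = ("lazy", "defaults", "optuna", "spotoptim", "predict", "clean")


def dispatch(obj, task=None, **kwargs):
    """Resolve the task name and call the matching run_task_* method."""
    name = obj.TASK if task is None else task
    if name not in VALID_TASKS:
        raise ValueError(f"task must be one of {VALID_TASKS}, got {name!r}")
    return getattr(obj, f"run_task_{name}")(**kwargs)


class DemoTask:
    """Hypothetical stand-in exposing TASK and one run_task_* method."""

    TASK = "lazy"

    def run_task_lazy(self, show=True):
        return {"ran": "lazy", "show": show}


outcome = dispatch(DemoTask(), show=False)
print(outcome)

try:
    dispatch(DemoTask(), task="tune")
except ValueError as exc:
    rejected = str(exc)
    print("rejected:", rejected)
```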

Examples

import warnings
import tempfile
warnings.filterwarnings("ignore")
from spotforecast2_safe.data.fetch_data import fetch_data, get_package_data_home
from spotforecast2_safe.configurator.config_multi import ConfigMulti
from spotforecast2.multitask import MultiTask

data_home = get_package_data_home()
df = fetch_data(filename=str(data_home / "demo10.csv")).iloc[:500]

config = ConfigMulti(
    predict_size=12,
    targets=["A"],
    lags_consider=[1, 2, 3],
    window_size=4,
    number_folds=2,
    use_exogenous_features=False,
    use_outlier_detection=False,
    auto_save_models=False,
    verbose=False,
)
config.cache_home = tempfile.mkdtemp()

# run() dispatches to run_task_lazy when task="lazy".
mt = MultiTask(config, task="lazy", dataframe=df, show_progress=False)
mt.prepare_data()
mt.impute()
result = mt.run(task="lazy", show=False)
print("Result keys:", list(result.keys())[:4])
assert "future_pred" in result
WeightFunction: all sample weights for the requested index are zero (the window falls entirely within gap-penalty zones). Returning None so ForecasterRecursive uses uniform weighting.
Result keys: ['train_actual', 'train_pred', 'future_actual', 'future_pred']

run_task_clean

multitask.MultiTask.run_task_clean(show=True, dry_run=False, cache_home=None)

Remove all cached data from the pipeline cache directory.

Does not require prepare_data() to be called first.

Parameters

Name Type Description Default
show bool Accepted for API consistency. Not used by the clean task. True
dry_run bool If True, report what would be deleted without actually removing anything. False
cache_home Optional[Path] Override the directory to clean. None uses the cache directory configured on this instance. None

Returns

Name Type Description
Dict[str, Any] Dict with keys status, cache_dir, and deleted_items.

Raises

Name Type Description
RuntimeError If the cache directory cannot be removed.
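
The dry-run contract can be illustrated with a self-contained sketch built on the standard library. clean_cache is a hypothetical function, not the library's implementation. The "dry_run" status and the three result keys come from this page; the "cleaned" status label and the exact content of deleted_items are assumptions.

```python
import shutil
import tempfile
from pathlib import Path


def clean_cache(cache_dir, dry_run=False):
    """Delete cache_dir, or only report its entries when dry_run is True."""
    cache_dir = Path(cache_dir)
    items = sorted(p.name for p in cache_dir.iterdir()) if cache_dir.exists() else []
    if dry_run:
        status = "dry_run"  # value shown in the example output on this page
    else:
        try:
            if cache_dir.exists():
                shutil.rmtree(cache_dir)
        except OSError as exc:
            # Mirror the documented RuntimeError on removal failure.
            raise RuntimeError(f"could not remove {cache_dir}") from exc
        status = "cleaned"  # assumed label; the real value is not documented here
    return {"status": status, "cache_dir": str(cache_dir), "deleted_items": items}


demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "logging").mkdir()

report = clean_cache(demo_dir, dry_run=True)
still_there = demo_dir.exists()  # dry run leaves the directory in place
print(report["status"], report["deleted_items"], still_there)

final = clean_cache(demo_dir)
print(final["status"], demo_dir.exists())
```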

Examples

import warnings
import tempfile
warnings.filterwarnings("ignore")
from spotforecast2_safe.data.fetch_data import fetch_data, get_package_data_home
from spotforecast2_safe.configurator.config_multi import ConfigMulti
from spotforecast2.multitask import MultiTask

data_home = get_package_data_home()
df = fetch_data(filename=str(data_home / "demo10.csv")).iloc[:500]
cache_dir = tempfile.mkdtemp()

config = ConfigMulti(
    predict_size=12,
    targets=["A"],
    lags_consider=[1, 2, 3],
    window_size=4,
    number_folds=2,
    use_exogenous_features=False,
    use_outlier_detection=False,
    auto_save_models=False,
    verbose=False,
)
config.cache_home = cache_dir

# dry_run=True reports what would be removed without deleting.
mt = MultiTask(config, task="clean", dataframe=df, show_progress=False)
result = mt.run_task_clean(dry_run=True)
print("status:", result["status"])
assert result["status"] == "dry_run"
[clean] Dry run — would delete: /tmp/tmpy8toxilu
  Would remove: logging
status: dry_run

run_task_defaults

multitask.MultiTask.run_task_defaults(show=True)

Defaults fitting — no tuning, no cached params.

Distinct from run_task_lazy only in that it never consults the tuning-result cache. Use this for deterministic baselines and for ENTSO-E “Approach 2: Training without Tuning”.

Parameters

Name Type Description Default
show bool If True, display prediction figures. True

Returns

Name Type Description
Dict[str, Any] Aggregated prediction package. Per-target results in self.results["defaults"].

Examples

import warnings
import tempfile
warnings.filterwarnings("ignore")
from spotforecast2_safe.data.fetch_data import fetch_data, get_package_data_home
from spotforecast2_safe.configurator.config_multi import ConfigMulti
from spotforecast2.multitask import MultiTask

data_home = get_package_data_home()
df = fetch_data(filename=str(data_home / "demo10.csv")).iloc[:500]

config = ConfigMulti(
    predict_size=12,
    targets=["A"],
    lags_consider=[1, 2, 3],
    window_size=4,
    number_folds=2,
    use_exogenous_features=False,
    use_outlier_detection=False,
    auto_save_models=False,
    verbose=False,
)
config.cache_home = tempfile.mkdtemp()

mt = MultiTask(config, task="defaults", dataframe=df, show_progress=False)
mt.prepare_data()
mt.impute()
result = mt.run_task_defaults(show=False)
print("Result keys:", list(result.keys())[:4])
assert "future_pred" in result
WeightFunction: all sample weights for the requested index are zero (the window falls entirely within gap-penalty zones). Returning None so ForecasterRecursive uses uniform weighting.
Result keys: ['train_actual', 'train_pred', 'future_actual', 'future_pred']

run_task_lazy

multitask.MultiTask.run_task_lazy(show=True)

Lazy fitting with default LightGBM parameters.

Parameters

Name Type Description Default
show bool If True, display prediction figures. True

Returns

Name Type Description
Dict[str, Any] Aggregated prediction package. Per-target results in self.results["lazy"].

Examples

import warnings
import tempfile
warnings.filterwarnings("ignore")
from spotforecast2_safe.data.fetch_data import fetch_data, get_package_data_home
from spotforecast2_safe.configurator.config_multi import ConfigMulti
from spotforecast2.multitask import MultiTask

data_home = get_package_data_home()
df = fetch_data(filename=str(data_home / "demo10.csv")).iloc[:500]

config = ConfigMulti(
    predict_size=12,
    targets=["A"],
    lags_consider=[1, 2, 3],
    window_size=4,
    number_folds=2,
    use_exogenous_features=False,
    use_outlier_detection=False,
    auto_save_models=False,
    verbose=False,
)
config.cache_home = tempfile.mkdtemp()

mt = MultiTask(config, task="lazy", dataframe=df, show_progress=False)
mt.prepare_data()
mt.impute()
result = mt.run_task_lazy(show=False)
print("Result keys:", list(result.keys())[:4])
assert "future_pred" in result
WeightFunction: all sample weights for the requested index are zero (the window falls entirely within gap-penalty zones). Returning None so ForecasterRecursive uses uniform weighting.
Result keys: ['train_actual', 'train_pred', 'future_actual', 'future_pred']

run_task_optuna

multitask.MultiTask.run_task_optuna(
    search_space=None,
    show=True,
    show_progress=False,
)

Optuna Bayesian hyperparameter tuning.

Parameters

Name Type Description Default
search_space Optional[Callable] Callable (trial) -> dict. None
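show bool If True, display prediction figures. True
show_progress bool Not described in the source; presumably toggles progress messages during tuning, like the constructor parameter of the same name. False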
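
A search_space callable of the form (trial) -> dict might look like the sketch below. It uses the Optuna trial methods suggest_int and suggest_float; the parameter names and ranges are illustrative assumptions, and the key format the pipeline actually expects is not documented here, so check it before use.

```python
def search_space(trial):
    """Map an Optuna trial to a dict of candidate hyperparameters.

    Parameter names and ranges are illustrative assumptions, not
    values taken from the library.
    """
    return {
        "num_leaves": trial.suggest_int("num_leaves", 8, 256),
        "max_depth": trial.suggest_int("max_depth", 3, 12),
        # log=True samples the learning rate on a logarithmic scale.
        "learning_rate": trial.suggest_float("learning_rate", 1e-3, 0.3, log=True),
        "n_estimators": trial.suggest_int("n_estimators", 20, 500),
    }
```

Under these assumptions it would be passed as mt.run_task_optuna(search_space=search_space, show=False).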

Returns

Name Type Description
Dict[str, Any] Aggregated prediction package. Per-target results in self.results["optuna"].

Examples

import warnings
import tempfile
warnings.filterwarnings("ignore")
from spotforecast2_safe.data.fetch_data import fetch_data, get_package_data_home
from spotforecast2_safe.configurator.config_multi import ConfigMulti
from spotforecast2.multitask import MultiTask

data_home = get_package_data_home()
df = fetch_data(filename=str(data_home / "demo10.csv")).iloc[:500]

config = ConfigMulti(
    predict_size=12,
    targets=["A"],
    lags_consider=[1, 2, 3],
    window_size=4,
    number_folds=2,
    use_exogenous_features=False,
    use_outlier_detection=False,
    auto_save_models=False,
    verbose=False,
    n_trials_optuna=2,
)
config.cache_home = tempfile.mkdtemp()

mt = MultiTask(config, task="optuna", dataframe=df, show_progress=False)
mt.prepare_data()
mt.impute()
result = mt.run_task_optuna(show=False)
print("Result keys:", list(result.keys())[:4])
assert "future_pred" in result
WeightFunction: all sample weights for the requested index are zero (the window falls entirely within gap-penalty zones). Returning None so ForecasterRecursive uses uniform weighting.
WeightFunction: all sample weights for the requested index are zero (the window falls entirely within gap-penalty zones). Returning None so ForecasterRecursive uses uniform weighting.
WeightFunction: all sample weights for the requested index are zero (the window falls entirely within gap-penalty zones). Returning None so ForecasterRecursive uses uniform weighting.
WeightFunction: all sample weights for the requested index are zero (the window falls entirely within gap-penalty zones). Returning None so ForecasterRecursive uses uniform weighting.
Result keys: ['train_actual', 'train_pred', 'future_actual', 'future_pred']

run_task_predict

multitask.MultiTask.run_task_predict(
    show=True,
    task_name=None,
    max_age_days=None,
)

Predict-only using previously saved models.

Loads fitted models from the cache directory and produces predictions without any training. Raises RuntimeError if no saved models are found.

Parameters

Name Type Description Default
show bool If True, display prediction figures. True
task_name Optional[str] Restrict model loading to a specific source task ("lazy", "optuna", or "spotoptim"). None loads the most recent model regardless of source. None
max_age_days Optional[float] Maximum age in days for saved models. None accepts any age. None

Returns

Name Type Description
Dict[str, Any] Aggregated prediction package. Per-target results in self.results["predict"].

Raises

Name Type Description
RuntimeError If no saved models are found.
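
One way to use the documented RuntimeError is to fall back to training when the cache holds no models. The helper below is a hypothetical usage pattern, not part of the library, and _NoModelsStub exists only so the sketch runs on its own. Note that a RuntimeError may also signal that prepare_data was not called, so a real caller should prepare the data before relying on this fallback.

```python
def predict_or_train(mt, show=False):
    """Try predict-only first; fall back to lazy fitting if no models exist."""
    try:
        return mt.run_task_predict(show=show), "predict"
    except RuntimeError:
        # Documented failure mode: no saved models were found in the cache.
        return mt.run_task_lazy(show=show), "lazy"


class _NoModelsStub:
    """Hypothetical stand-in whose cache holds no saved models."""

    def run_task_predict(self, show=True):
        raise RuntimeError("no saved models found")

    def run_task_lazy(self, show=True):
        return {"future_pred": [0.0]}


result, source = predict_or_train(_NoModelsStub())
print(source)
```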

Examples

import warnings
import tempfile
warnings.filterwarnings("ignore")
from spotforecast2_safe.data.fetch_data import fetch_data, get_package_data_home
from spotforecast2_safe.configurator.config_multi import ConfigMulti
from spotforecast2.multitask import MultiTask

data_home = get_package_data_home()
df = fetch_data(filename=str(data_home / "demo10.csv")).iloc[:500]
cache_dir = tempfile.mkdtemp()

# First train and save a model with the lazy task.
config_train = ConfigMulti(
    predict_size=12,
    targets=["A"],
    lags_consider=[1, 2, 3],
    window_size=4,
    number_folds=2,
    use_exogenous_features=False,
    use_outlier_detection=False,
    auto_save_models=True,
    verbose=False,
)
config_train.cache_home = cache_dir
mt_train = MultiTask(config_train, task="lazy", dataframe=df, show_progress=False)
mt_train.prepare_data()
mt_train.impute()
mt_train.run_task_lazy(show=False)

# Then load and predict without re-training.
config_pred = ConfigMulti(
    predict_size=12,
    targets=["A"],
    lags_consider=[1, 2, 3],
    window_size=4,
    number_folds=2,
    use_exogenous_features=False,
    use_outlier_detection=False,
    auto_save_models=False,
    verbose=False,
)
config_pred.cache_home = cache_dir
mt_pred = MultiTask(config_pred, task="predict", dataframe=df, show_progress=False)
mt_pred.prepare_data()
mt_pred.impute()
result = mt_pred.run_task_predict(show=False, task_name="lazy")
print("Result keys:", list(result.keys())[:4])
assert "future_pred" in result
WeightFunction: all sample weights for the requested index are zero (the window falls entirely within gap-penalty zones). Returning None so ForecasterRecursive uses uniform weighting.
Result keys: ['train_actual', 'train_pred', 'future_actual', 'future_pred']

run_task_spotoptim

multitask.MultiTask.run_task_spotoptim(search_space=None, show=True)

SpotOptim surrogate-model Bayesian tuning.

Parameters

Name Type Description Default
search_space Optional[Dict[str, Any]] Dictionary defining the SpotOptim search space. None
show bool If True, display prediction figures. True

Returns

Name Type Description
Dict[str, Any] Aggregated prediction package. Per-target results in self.results["spotoptim"].

Examples

import warnings
import tempfile
warnings.filterwarnings("ignore")
from spotforecast2_safe.data.fetch_data import fetch_data, get_package_data_home
from spotforecast2_safe.configurator.config_multi import ConfigMulti
from spotforecast2.multitask import MultiTask

data_home = get_package_data_home()
df = fetch_data(filename=str(data_home / "demo10.csv")).iloc[:500]

config = ConfigMulti(
    predict_size=12,
    targets=["A"],
    lags_consider=[1, 2, 3],
    window_size=4,
    number_folds=2,
    use_exogenous_features=False,
    use_outlier_detection=False,
    auto_save_models=False,
    verbose=False,
    n_trials_spotoptim=2,
    n_initial_spotoptim=1,
)
config.cache_home = tempfile.mkdtemp()

mt = MultiTask(config, task="spotoptim", dataframe=df, show_progress=False)
mt.prepare_data()
mt.impute()
result = mt.run_task_spotoptim(show=False)
print("Result keys:", list(result.keys())[:4])
assert "future_pred" in result
WeightFunction: all sample weights for the requested index are zero (the window falls entirely within gap-penalty zones). Returning None so ForecasterRecursive uses uniform weighting.
WeightFunction: all sample weights for the requested index are zero (the window falls entirely within gap-penalty zones). Returning None so ForecasterRecursive uses uniform weighting.
WeightFunction: all sample weights for the requested index are zero (the window falls entirely within gap-penalty zones). Returning None so ForecasterRecursive uses uniform weighting.
WeightFunction: all sample weights for the requested index are zero (the window falls entirely within gap-penalty zones). Returning None so ForecasterRecursive uses uniform weighting.
`Forecaster` refitted using the best-found lags and parameters, and the whole data set: 
  Lags: [ 1  2 23 24 47 48] 
  Parameters: {'estimator__num_leaves': 212, 'estimator__max_depth': 8, 'estimator__learning_rate': 0.022237942587898952, 'estimator__n_estimators': 23, 'estimator__bagging_fraction': 0.8966718932567622, 'estimator__feature_fraction': 0.6759904724131447, 'estimator__reg_alpha': 80.46568296843319, 'estimator__reg_lambda': 37.13270777695332}
  Backtesting metric: 24013.098401317144
Result keys: ['train_actual', 'train_pred', 'future_actual', 'future_pred']