manager.configurator.config_multi.ConfigMulti

manager.configurator.config_multi.ConfigMulti(
    country_code='DE',
    periods=None,
    lags_consider=None,
    train_size=None,
    end_train_default='2025-12-31 00:00+00:00',
    delta_val=None,
    predict_size=24,
    refit_size=7,
    random_state=314159,
    n_hyperparameters_trials=20,
    data_filename='interim/energy_load.csv',
    targets=None,
    use_outlier_detection=True,
    contamination=0.01,
    imputation_method='weighted',
    window_size=72,
    use_exogenous_features=True,
    latitude=51.5136,
    longitude=7.4653,
    timezone='UTC',
    state='NW',
    include_weather_windows=False,
    include_holiday_features=False,
    include_poly_features=False,
    index_name='DateTime',
    start_download=None,
    end_download=None,
    data_start=None,
    data_end=None,
    cov_start=None,
    cov_end=None,
    bounds=None,
    verbose=False,
    cache_home=None,
    end_train_ts=None,
    start_train_ts=None,
    n_trials_optuna=15,
    n_trials_spotoptim=10,
    n_initial_spotoptim=5,
    task='lazy',
    agg_weights=None,
)

Configuration for the multi-input forecasting pipeline.

This class manages all configuration parameters for the multi-input task, including training/prediction intervals, data sources, and feature engineering specifications. Every parameter can be set at initialization; any parameter left unset keeps its default.

country_code serves as the single ISO country code used for both API queries (exposed via the API_COUNTRY_CODE property for backward compatibility) and holiday feature generation.
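The alias relationship described above can be pictured with a read-only property. This is a minimal sketch, not the library's implementation: CountryConfigSketch is a hypothetical stand-in class, and only the names country_code and API_COUNTRY_CODE come from this page.

```python
class CountryConfigSketch:
    """Hypothetical stand-in for illustration; not the real ConfigMulti."""

    def __init__(self, country_code: str = "DE") -> None:
        self.country_code = country_code

    @property
    def API_COUNTRY_CODE(self) -> str:
        # Read-only alias: always mirrors the current country_code.
        return self.country_code


cfg = CountryConfigSketch()
cfg.country_code = "FR"
alias_value = cfg.API_COUNTRY_CODE  # follows the change to country_code

# A property without a setter rejects direct assignment.
try:
    cfg.API_COUNTRY_CODE = "IT"
    alias_is_read_only = False
except AttributeError:
    alias_is_read_only = True

print(alias_value, alias_is_read_only)
```

With this pattern there is a single stored value, so the API code and the holiday code cannot drift apart.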

Parameters

Name Type Description Default
country_code str ISO 3166-1 alpha-2 country code (e.g. "DE"). Used for both API queries and holiday feature generation. 'DE'
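periods Optional[List[Period]] List of Period objects defining cyclical feature encodings. When None, a default set is used (five periods in the example below, the first named "daily"). None
lags_consider Optional[List[int]] List of lag values to consider for feature selection. None
train_size Optional[pd.Timedelta] Time window for training data. When None, a default window is used (pd.Timedelta(days=3 * 365) in the example below). None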
end_train_default str Default end date for training period (ISO format with timezone). '2025-12-31 00:00+00:00'
delta_val Optional[pd.Timedelta] Validation window size. None
predict_size int Number of hours to predict ahead. 24
refit_size int Number of days between model refits. 7
random_state int Random seed for reproducibility. 314159
n_hyperparameters_trials int Number of trials for hyperparameter optimization. 20
data_filename str Path to the interim merged data file. 'interim/energy_load.csv'
targets Optional[List[str]] List of target column names to train models for. When None (default), no targets are pre-selected; set this attribute after loading the dataset (e.g. config.targets = df.columns.tolist()). Replaces standalone TARGETS and target_columns variables in pipeline scripts, providing a single source of truth for the active target set. None
use_outlier_detection bool If True, apply IsolationForest-based outlier removal. True
contamination float Proportion of outliers for IsolationForest (0 < contamination < 0.5). 0.01
imputation_method str Gap-filling strategy — "weighted" (n2n-style rolling weights) or "linear" (linear interpolation). 'weighted'
window_size int Rolling window size in hours for gap detection (weighted imputation). 72
use_exogenous_features bool If True, build weather/calendar/day-night/holiday features. True
latitude float Latitude of the target location in decimal degrees. 51.5136
longitude float Longitude of the target location in decimal degrees. 7.4653
timezone str IANA timezone string for the target location (e.g. "Europe/Berlin"). 'UTC'
state str ISO 3166-2 subdivision code for regional holidays (e.g. "NW"). 'NW'
include_weather_windows bool If True, include rolling weather-window features. False
include_holiday_features bool If True, include public-holiday indicator features. False
include_poly_features bool If True, include polynomial interaction features. False
index_name str Name assigned to the datetime column when the index is reset. Defaults to "DateTime". 'DateTime'
start_download Optional[str] Start of the download/data range as a string (format "YYYYMMDDHHMM"). Derived from the loaded dataset; None until set. None
end_download Optional[str] End of the download/data range as a string (format "YYYYMMDDHHMM"). Derived from the loaded dataset; None until set. None
data_start Optional[pd.Timestamp] First timestamp of the pipeline data range. Derived from the loaded dataset via get_start_end(); None until set. None
data_end Optional[pd.Timestamp] Last timestamp of the pipeline data range. Derived from the loaded dataset via get_start_end(); None until set. None
cov_start Optional[pd.Timestamp] Start of the covariate range (same as data_start). Derived from the loaded dataset via get_start_end(); None until set. None
cov_end Optional[pd.Timestamp] End of the covariate range (extends data_end by predict_size hours). Derived via get_start_end(); None until set. None
bounds Optional[List[tuple]] Per-column outlier bounds as a list of (lower, upper) tuples, one entry per target column. None until set. None
verbose bool If True, enable verbose output for pipeline steps. Defaults to False. False
cache_home Optional[Any] Path to the cache directory. None means the library default (~/spotforecast2_cache/) is used. None
end_train_ts Optional[pd.Timestamp] End of the training window. Derived from end_train_default after data loading; None until set. None
start_train_ts Optional[pd.Timestamp] Start of the training window. Derived as end_train_ts - train_size after data loading; None until set. None
n_trials_optuna int Number of Optuna Bayesian-search trials for hyperparameter optimization (task 3, the "optuna" task). Defaults to 15. 15
n_trials_spotoptim int Number of SpotOptim surrogate-search trials (task 4, the "spotoptim" task). Defaults to 10. 10
n_initial_spotoptim int Number of initial random evaluations for SpotOptim (task 4, the "spotoptim" task). Defaults to 5. 5
task str Active prediction task — one of "lazy", "training", "optuna", or "spotoptim". Defaults to "lazy". 'lazy'
agg_weights Optional[List[float]] Per-target aggregation weights used when combining individual target forecasts into a single weighted sum. The list must contain one weight per entry in targets (in the same order). Positive values add the target’s contribution; negative values invert it. Slice the list to agg_weights[:len(targets)] when only a subset of targets is active. Defaults to None (no weights pre-defined; set after loading the dataset). None
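Several parameters in the table are described as derived values (cov_end, start_train_ts, start_download). The arithmetic behind those descriptions can be sketched as follows. This uses only the standard-library datetime module as a stand-in for pd.Timestamp and pd.Timedelta; the data_start and data_end values are illustrative, and the real get_start_end() may compute these differently.

```python
from datetime import datetime, timedelta, timezone

predict_size = 24                     # hours (documented default)
train_size = timedelta(days=3 * 365)  # window shown in the class example
data_start = datetime(2022, 1, 1, tzinfo=timezone.utc)  # illustrative
data_end = datetime(2024, 12, 31, tzinfo=timezone.utc)  # illustrative

# end_train_default is an ISO string with a timezone offset.
end_train_ts = datetime.fromisoformat("2025-12-31 00:00+00:00")

# Covariate range: starts with the data, extends past it by the horizon.
cov_start = data_start
cov_end = data_end + timedelta(hours=predict_size)

# Training window start: end of training minus the training window.
start_train_ts = end_train_ts - train_size

# Download range strings use the "YYYYMMDDHHMM" format.
start_download = data_start.strftime("%Y%m%d%H%M")

print(cov_end, start_train_ts, start_download)
```

Because 2024 is a leap year, a window of 3 * 365 days ending on 2025-12-31 starts on 2023-01-01 rather than 2022-12-31.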

Attributes

Name Type Description
API_COUNTRY_CODE str Read-only property — returns country_code. Preserved for backward compatibility with ForecasterRecursiveModel.
country_code str ISO country code for API queries and holiday generation.
periods List[Period] Cyclical feature encoding specifications.
lags_consider List[int] Lag values for autoregressive features.
train_size pd.Timedelta Training data window.
end_train_default str Default training end date.
delta_val pd.Timedelta Validation window.
predict_size int Prediction horizon in hours.
refit_size int Refit interval in days.
random_state int Random seed.
n_hyperparameters_trials int Hyperparameter tuning trials.
targets Optional[List[str]] Active target column names. None until explicitly set from the loaded dataset.
use_outlier_detection bool IsolationForest outlier removal toggle.
contamination float IsolationForest contamination fraction.
imputation_method str Gap-filling strategy ("weighted" or "linear").
window_size int Rolling window size for weighted imputation.
use_exogenous_features bool Exogenous feature construction toggle.
latitude float Location latitude.
longitude float Location longitude.
timezone str IANA timezone string.
state str Subdivision code for regional holidays.
include_weather_windows bool Weather-window feature toggle.
include_holiday_features bool Holiday feature toggle.
include_poly_features bool Polynomial feature toggle.
index_name str Datetime column name used when resetting the index.
start_download Optional[str] Start of the data download range.
end_download Optional[str] End of the data download range.
data_start Optional[pd.Timestamp] First timestamp of the pipeline data.
data_end Optional[pd.Timestamp] Last timestamp of the pipeline data.
cov_start Optional[pd.Timestamp] Start of the covariate date range.
cov_end Optional[pd.Timestamp] End of the covariate date range.
bounds Optional[List[tuple]] Per-column outlier bounds (lower, upper).
verbose bool Verbose output toggle.
cache_home Optional[Any] Path to the cache directory.
end_train_ts Optional[pd.Timestamp] End of the training window.
start_train_ts Optional[pd.Timestamp] Start of the training window.
n_trials_optuna int Number of Optuna hyperparameter-search trials.
n_trials_spotoptim int Number of SpotOptim search trials.
n_initial_spotoptim int Number of initial SpotOptim evaluations.
task str Active prediction task ("lazy", "training", "optuna", or "spotoptim").
agg_weights Optional[List[float]] Per-target aggregation weights. One weight per entry in targets; positive values add, negative values invert the target’s contribution. None until set.
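The agg_weights description implies a per-step weighted sum over the target forecasts. Below is a minimal sketch of that combination in plain Python. The forecast values and the weight list are made up for illustration, and the library's own aggregation routine is not shown on this page and may differ.

```python
targets = ["A", "B", "C"]

# Hypothetical weights; one more entry than there are active targets.
agg_weights = [1.0, 1.0, -1.0, 0.5]

# Hypothetical two-step forecasts per target.
forecasts = {
    "A": [10.0, 12.0],
    "B": [5.0, 4.0],
    "C": [2.0, 3.0],
}

# Slice so that weights and targets line up one-to-one, in order.
weights = agg_weights[: len(targets)]

n_steps = len(forecasts[targets[0]])
combined = [
    sum(w * forecasts[name][step] for w, name in zip(weights, targets))
    for step in range(n_steps)
]

print(combined)  # A and B are added, C is subtracted (negative weight)
```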

Notes

The default period configurations use specific n_periods to balance resolution and smoothing:

- Daily: n_periods=12 (24h) provides ~2h resolution, smoothing hourly noise and halving dimensionality.
- Weekly: n_periods typically matches range (1:1) to distinguish day-of-week patterns.
- Yearly: n_periods=12 (365d) provides ~1 month resolution, capturing broad seasonal trends without overfitting.

See docs/PERIOD_CONFIGURATION_RATIONALE.md for a detailed analysis.
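The resolution figures quoted in these notes follow from dividing each cycle length by n_periods. A small worked example, assuming that simple ratio is what "resolution" means here; the helper name and the weekly range of 7 days are assumptions for illustration, not values taken from the library.

```python
def resolution(cycle_length: float, n_periods: int) -> float:
    """Width of one encoding bin, in the same unit as cycle_length."""
    return cycle_length / n_periods


daily_hours = resolution(24, 12)   # 24 h cycle, 12 bins
weekly_days = resolution(7, 7)     # assumed 7-day range, 1:1 mapping
yearly_days = resolution(365, 12)  # 365 d cycle, 12 bins

print(f"daily:  {daily_hours:.1f} h per bin")
print(f"weekly: {weekly_days:.1f} d per bin")
print(f"yearly: {yearly_days:.1f} d per bin")
```

The yearly value of roughly 30.4 days is what the notes round to "~1 month".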

Examples

import pandas as pd
from spotforecast2_safe.manager.configurator.config_multi import ConfigMulti
config = ConfigMulti()
print(f"country_code: {config.country_code}")
print(f"API_COUNTRY_CODE: {config.API_COUNTRY_CODE}")
print(f"Predict size: {config.predict_size}")
print(f"Random state: {config.random_state}")
print(f"Targets (default): {config.targets}")
print(f"agg_weights (default): {config.agg_weights}")
print(f"index_name: {config.index_name}")
print(f"start_download: {config.start_download}")
print(f"end_download: {config.end_download}")
print(f"data_start: {config.data_start}")
print(f"data_end: {config.data_end}")
print(f"cov_start: {config.cov_start}")
print(f"cov_end: {config.cov_end}")
print(f"bounds: {config.bounds}")

# Set targets and derived ranges after loading data
config.targets = ["A", "B", "C"]
config.start_download = "202401010000"
config.end_download = "202412312300"
config.data_start = pd.Timestamp("2022-01-01", tz="UTC")
config.data_end = pd.Timestamp("2024-12-31", tz="UTC")
config.cov_start = pd.Timestamp("2022-01-01", tz="UTC")
config.cov_end = pd.Timestamp("2025-01-01", tz="UTC")
config.bounds = [(-2500, 4500), (-10, 3000)]
print(f"Targets (after setting): {config.targets}")
print(f"start_download: {config.start_download}")
print(f"data_start: {config.data_start}")
print(f"bounds: {config.bounds}")

# Create custom configuration — country_code serves both API and holiday purposes
custom_config = ConfigMulti(
    country_code='FR',
    predict_size=48,
    random_state=42,
    targets=["A", "B"],
    index_name="DateTime",
)
print(f"country_code: {custom_config.country_code}")
print(f"API_COUNTRY_CODE: {custom_config.API_COUNTRY_CODE}")
print(f"Predict size: {custom_config.predict_size}")
print(f"Random state: {custom_config.random_state}")
print(f"Targets: {custom_config.targets}")

# Verify training window
print(f"Training window: {config.train_size == pd.Timedelta(days=3 * 365)}")

# Check default periods
print(f"Number of periods: {len(config.periods)}")
print(f"First period name: {config.periods[0].name}")

country_code: DE
API_COUNTRY_CODE: DE
Predict size: 24
Random state: 314159
Targets (default): None
agg_weights (default): None
index_name: DateTime
start_download: None
end_download: None
data_start: None
data_end: None
cov_start: None
cov_end: None
bounds: None
Targets (after setting): ['A', 'B', 'C']
start_download: 202401010000
data_start: 2022-01-01 00:00:00+00:00
bounds: [(-2500, 4500), (-10, 3000)]
country_code: FR
API_COUNTRY_CODE: FR
Predict size: 48
Random state: 42
Targets: ['A', 'B']
Training window: True
Number of periods: 5
First period name: daily

Methods

Name Description
get_params Get parameters for this configuration object.
set_params Set the parameters of this configuration object.

get_params

manager.configurator.config_multi.ConfigMulti.get_params(deep=True)

Get parameters for this configuration object.

Parameters

Name Type Description Default
deep bool If True, return the parameters of this configuration and of any contained sub-objects that are estimators. True

Returns

Name Type Description
params Dict[str, object] Dictionary of parameter names mapped to their values.

Examples

from spotforecast2_safe.manager.configurator.config_multi import ConfigMulti
config = ConfigMulti(country_code="FR")
p = config.get_params()
print(f"country_code: {p['country_code']}")
print(f"Predict size: {p['predict_size']}")
print(f"Random state: {p['random_state']}")
print(f"index_name: {p['index_name']}")
print(f"data_start: {p['data_start']}")
print(f"data_end: {p['data_end']}")
print(f"cov_start: {p['cov_start']}")
print(f"cov_end: {p['cov_end']}")
print(f"bounds: {p['bounds']}")
print(f"agg_weights: {p['agg_weights']}")

country_code: FR
Predict size: 24
Random state: 314159
index_name: DateTime
data_start: None
data_end: None
cov_start: None
cov_end: None
bounds: None
agg_weights: None

set_params

manager.configurator.config_multi.ConfigMulti.set_params(params=None, **kwargs)

Set the parameters of this configuration object.

Parameters

Name Type Description Default
params Dict[str, object] Optional dictionary of parameter names mapped to their new values. None
**kwargs object Additional parameter names mapped to their new values. Nested Period objects can be configured with the periods__<name>__<param> notation (e.g. periods__daily__n_periods=24). {}

Returns

Name Type Description
ConfigMulti ConfigMulti The configuration instance with updated parameters (supports method chaining).
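The periods__<name>__<param> notation can be understood as a routing rule: split the key on double underscores, find the period with the matching name, and set the attribute on it. The sketch below shows one way such routing could work. PeriodSketch and ConfigSketch are hypothetical stand-ins, and the real ConfigMulti.set_params may validate or dispatch differently.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class PeriodSketch:
    """Hypothetical stand-in for a Period object."""
    name: str
    n_periods: int


def _default_periods() -> List[PeriodSketch]:
    return [PeriodSketch("daily", 12), PeriodSketch("weekly", 7)]


@dataclass
class ConfigSketch:
    """Hypothetical stand-in; not the real ConfigMulti."""
    predict_size: int = 24
    periods: List[PeriodSketch] = field(default_factory=_default_periods)

    def set_params(
        self, params: Optional[Dict[str, Any]] = None, **kwargs: Any
    ) -> "ConfigSketch":
        # Keyword arguments take precedence over the params dictionary.
        merged = {**(params or {}), **kwargs}
        for key, value in merged.items():
            if key.startswith("periods__"):
                # "periods__daily__n_periods" -> ("daily", "n_periods")
                _, period_name, attr = key.split("__", 2)
                period = next(p for p in self.periods if p.name == period_name)
                setattr(period, attr, value)
            elif hasattr(self, key):
                setattr(self, key, value)
            else:
                raise ValueError(f"Unknown parameter: {key}")
        return self  # returning self enables method chaining


cfg = ConfigSketch().set_params({"predict_size": 48}, periods__daily__n_periods=24)
daily = next(p for p in cfg.periods if p.name == "daily")
print(cfg.predict_size, daily.n_periods)
```

Returning the instance from set_params is what makes the chained call on the constructor possible.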

Examples

from spotforecast2_safe.manager.configurator.config_multi import ConfigMulti
config = ConfigMulti()
_ = config.set_params(country_code="FR", predict_size=48)
print(f"country_code: {config.country_code}")
print(f"API_COUNTRY_CODE: {config.API_COUNTRY_CODE}")
print(f"Predict size: {config.predict_size}")
print(f"Random state: {config.random_state}")

# Set derived download range after loading data
_ = config.set_params(start_download="202401010000", end_download="202412312300")
print(f"start_download: {config.start_download}")

# Deep parameter setting
_ = config.set_params(periods__daily__n_periods=24)
print(next(p.n_periods for p in config.periods if p.name == "daily"))

country_code: FR
API_COUNTRY_CODE: FR
Predict size: 48
Random state: 314159
start_download: 202401010000
24