| country_code |
str |
ISO 3166-1 alpha-2 country code (e.g. "DE"). Used for both API queries and holiday feature generation. |
'DE' |
| periods |
Optional[List[Period]] |
List of Period objects defining cyclical feature encodings. |
default_periods() |
| lags_consider |
Optional[List[int]] |
List of lag values to consider for feature selection. |
list(range(1, 24)) |
| train_size |
Optional[pd.Timedelta] |
Time window for training data. |
pd.Timedelta(days=3 * 365) |
| end_train_default |
str |
Default end date for training period (ISO format with timezone). |
'2025-12-31 00:00+00:00' |
| delta_val |
Optional[pd.Timedelta] |
Validation window size. |
pd.Timedelta(hours=24 * 7 * 10) |
| predict_size |
int |
Number of hours to predict ahead. |
24 |
| cv_block_size |
int | None |
Width of the cross-validation test block in hours. None (default) means cross-validation uses predict_size. Set a fixed value (e.g. 24) to decouple the cross-validation horizon from a render-dependent live predict_size. |
None |
| refit_size |
int |
Number of days between model refits. |
7 |
| random_state |
int |
Random seed for reproducibility. |
314159 |
| n_hyperparameters_trials |
int |
Number of trials for hyperparameter optimization. |
20 |
| data_filename |
str |
Path to the interim merged data file. |
'interim/energy_load.csv' |
| targets |
Optional[List[str]] |
List of target column names to train models for. When None (default), no targets are pre-selected; set this attribute after loading the dataset (e.g. config.targets = df.columns.tolist()). Replaces standalone TARGETS and target_columns variables in pipeline scripts, providing a single source of truth for the active target set. |
None |
| use_outlier_detection |
bool |
If True, apply IsolationForest-based outlier removal. |
True |
| contamination |
float |
Proportion of outliers for IsolationForest (0 < contamination < 0.5). |
0.01 |
| imputation_method |
str |
Gap-filling strategy: "weighted" (n2n-style rolling weights) or "linear" (linear interpolation). |
'weighted' |
| window_size |
int |
Rolling window size in hours for gap detection (weighted imputation). |
72 |
| use_exogenous_features |
bool |
If True, build weather/calendar/day-night/holiday features. |
True |
| latitude |
float |
Latitude of the target location in decimal degrees. |
51.5136 |
| longitude |
float |
Longitude of the target location in decimal degrees. |
7.4653 |
| timezone |
str |
IANA timezone string for the target location (e.g. "Europe/Berlin"). |
'UTC' |
| state |
str |
ISO 3166-2 subdivision code for regional holidays (e.g. "NW"). |
'NW' |
| include_weather_windows |
bool |
If True, include rolling weather-window features. |
False |
| include_holiday_features |
bool |
If True, include public-holiday indicator features. |
False |
| include_holiday_adjacency_features |
bool |
If True, include Brückentag and before/after-holiday indicators (is_brueckentag, is_before_holiday, is_after_holiday). Defaults to False. |
False |
| include_ephemeris_features |
bool |
If True, include solar-elevation and daylight-duration features. Defaults to False. |
False |
| include_day_type_features |
bool |
If True, include working-day and day-type class features (is_workday, day_type). Defaults to False. |
False |
| include_school_holiday_features |
bool |
If True, append the is_school_holiday binary indicator from the bundled OpenHolidays API dataset (ODbL-1.0). Coverage runs from 2022-01-01 to 2027-12-31 for all 16 German Bundesländer. Only country_code="DE" is supported. Defaults to False. |
False |
| poly_features_degree |
int |
Polynomial-interaction degree. 1 (default) generates no interactions; 2 adds pairwise bilinear terms; 3 or higher adds higher-order terms. |
1 |
| max_poly_features |
int |
Cap on polynomial interaction columns; only the top max_poly_features ranked by mutual information with the target are kept (<= 0 disables). Defaults to 10. |
10 |
| poly_mi_n_jobs |
Optional[int] |
Parallel jobs for the mutual-information ranking that enforces max_poly_features. -1 (default) uses all cores; None runs single-threaded. Parallelism does not change the selection. |
-1 |
| poly_mi_sample_size |
Optional[int] |
Row cap for the mutual-information ranking that enforces max_poly_features; longer series are scored on a reproducible random subsample of this size (seeded by random_state), which can change which borderline columns make the top max_poly_features. None scores every row (the pre-15.8 behaviour). Defaults to 4000. |
4000 |
| index_name |
str |
Name assigned to the datetime column when the index is reset. Defaults to "DateTime". |
'DateTime' |
| bounds |
Optional[List[tuple]] |
Per-column outlier bounds as a list of (lower, upper) tuples, one entry per target column. None until set. |
None |
| verbose |
bool |
If True, enable verbose output for pipeline steps. Defaults to False. |
False |
| cache_home |
Optional[Any] |
Path to the cache directory. None means the library default (~/spotforecast2_cache/) is used. |
None |
| n_trials_optuna |
int |
Number of Optuna Bayesian-search trials for hyperparameter optimization (task 3). Defaults to 15. |
15 |
| n_trials_spotoptim |
int |
Number of SpotOptim surrogate-search trials (task 4). Defaults to 10. |
10 |
| n_initial_spotoptim |
int |
Number of initial random evaluations for SpotOptim (task 4). Defaults to 5. |
5 |
| max_time_spotoptim |
Optional[float] |
Wall-clock budget for the SpotOptim search in minutes (task 4). The search stops when either n_trials_spotoptim evaluations or this time limit is reached, whichever comes first. None (the default) disables the limit. |
None |
| warm_start_lags |
Optional[List[int]] |
Lag set the SpotOptim task injects as a search-space candidate and uses to seed the optimizer’s first evaluation. Defaults to DEFAULT_WARM_START_LAGS ([1, 2, 3, 23, 24, 25, 47, 48, 167, 168, 169, 336]). None or an empty list disables the warm start. |
list(DEFAULT_WARM_START_LAGS) |
| task |
str |
Active prediction task: one of "lazy", "training", "optuna", or "spotoptim". Defaults to "lazy". |
'lazy' |
| agg_weights |
Optional[List[float]] |
Per-target aggregation weights used when combining individual target forecasts into a single weighted sum. The list must contain one weight per entry in targets (in the same order). Positive values add the target’s contribution; negative values invert it. Slice the list to agg_weights[:len(targets)] when only a subset of targets is active. Defaults to None (no weights pre-defined; set after loading the dataset). |
None |
| auto_save_models |
bool |
Whether BaseTask._run_strategy should persist fitted forecasters to <cache_home>/models/ after every training run. Defaults to True so that saved models are immediately available for PredictTask without an explicit save_models() call. |
True |
| data_frame_name |
str |
Identifier for the active dataset. Used by BaseTask to name cache subdirectories, model files, and the per-dataset log file. Defaults to "default". |
'default' |
| on_weather_failure |
Literal['raise', 'skip'] |
Policy for handling Open-Meteo fetch failures inside BaseTask.build_exogenous_features. "raise" (default) aborts the pipeline with a WeatherFetchError and preserves the safety-critical fail-safe semantics. "skip" logs a warning and continues with empty weather features so the rest of the pipeline can run without the Open-Meteo dependency. |
'raise' |
| exog_max_gap_hours |
int |
Maximum length, in hours, of a contiguous run of missing exogenous-provider values that is healed; a longer run causes the provider to be rejected. Interior gaps are time-interpolated; leading/trailing edge gaps are back-/forward-filled. 0 (default) keeps the strict fail-safe (any gap raises). Healed runs are logged with count and span. Only already-published day-ahead vintages are involved, so healing is leakage-clean (CR-3). |
0 |
| exog_max_tail_gap_hours |
int |
Extended healing budget, in hours, applied exclusively to the trailing-edge NaN run (the run containing the last index timestamp). The effective tail budget is max(exog_max_gap_hours, exog_max_tail_gap_hours). The canonical use case is the ENTSO-E day-ahead publication frontier: the last published vintage is zero-order-held forward to the forecast horizon without touching interior gaps (CR-3-clean). When exog_max_tail_gap_hours <= exog_max_gap_hours the parameter is inert (the interior budget already covers the tail) and a warning is logged. Defaults to 0. |
0 |
| exog_provider_window |
Literal['full', 'train'] |
Span the exogenous providers are validated against. "full" (default) requires coverage of the entire data_start→cov_end request, matching prior behaviour. "train" validates only the consumed window [start_train_ts, cov_end], tolerating missing values before the training window. Honoured by the MultiTask pipeline; the forecaster-wrapper path currently always validates the full span. |
'full' |
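
Two rules stated in the table are arithmetic and can be illustrated in isolation: the effective tail budget `max(exog_max_gap_hours, exog_max_tail_gap_hours)` and the `agg_weights` weighted sum over `targets`. The following is a minimal sketch of those two rules only. The helper names (`effective_tail_budget`, `tail_budget_is_inert`, `weighted_sum`) and the target names `"load"` and `"solar"` are hypothetical and are not part of the library; the library's own implementation may differ.

```python
from typing import Dict, List, Sequence


def effective_tail_budget(exog_max_gap_hours: int, exog_max_tail_gap_hours: int) -> int:
    """Healing budget (hours) for the trailing-edge gap: the larger of the two settings."""
    return max(exog_max_gap_hours, exog_max_tail_gap_hours)


def tail_budget_is_inert(exog_max_gap_hours: int, exog_max_tail_gap_hours: int) -> bool:
    """True when the tail setting adds nothing beyond the interior budget."""
    return exog_max_tail_gap_hours <= exog_max_gap_hours


def weighted_sum(
    forecasts: Dict[str, Sequence[float]],
    targets: Sequence[str],
    agg_weights: Sequence[float],
) -> List[float]:
    """Combine per-target forecasts into one series (hypothetical helper).

    Weights are sliced to len(targets), so a longer weight list can be
    reused when only a subset of targets is active.
    """
    weights = list(agg_weights[: len(targets)])
    if len(weights) != len(targets):
        raise ValueError("agg_weights must provide one weight per target")
    horizon = len(forecasts[targets[0]])
    return [
        sum(w * forecasts[t][h] for w, t in zip(weights, targets))
        for h in range(horizon)
    ]


# Illustrative values only; these are not taken from any real dataset.
example_forecasts = {"load": [10.0, 12.0], "solar": [3.0, 4.0]}
combined = weighted_sum(example_forecasts, ["load", "solar"], [1.0, -1.0, 0.5])
tail_hours = effective_tail_budget(exog_max_gap_hours=6, exog_max_tail_gap_hours=48)
```

Here the negative weight inverts the second target's contribution, and the third weight is ignored because only two targets are active. With an interior budget of 6 hours and a tail budget of 48 hours, the trailing-edge run may be healed up to 48 hours while interior gaps remain limited to 6.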