End-to-end baseline forecasting using the equivalent date method.
This function implements a complete forecasting pipeline that:
1. Loads and validates target data
2. Detects and removes outliers
3. Imputes missing values
4. Splits into train/validation/test sets
5. Trains or loads equivalent date forecasters
6. Generates multi-step ahead predictions
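To give an intuition for what an equivalent date forecaster does in steps 5 and 6, here is a minimal sketch of the general idea: each future step is predicted with the value observed at the same position one or more whole offsets earlier (for example, the same weekday one week ago). The function name `equivalent_date_forecast` and its parameters are hypothetical and are not part of this package's API; the actual forecasters may aggregate several offsets or handle dates differently.

```python
import math


def equivalent_date_forecast(history, offset, steps):
    """Predict `steps` values by copying the observation `offset` positions back.

    Hypothetical helper for illustration only. When the horizon exceeds
    `offset`, the offset is applied as many times as needed so that every
    prediction is taken from observed history.
    """
    n = len(history)
    if offset < 1 or offset > n:
        raise ValueError("offset must be between 1 and len(history)")
    predictions = []
    for h in range(1, steps + 1):
        k = math.ceil(h / offset)        # number of whole offsets to step back
        idx = n + (h - 1) - k * offset   # position of the equivalent observation
        predictions.append(history[idx])
    return predictions


# Series with a period of 3; forecast 4 steps ahead using offset=3.
demo_predictions = equivalent_date_forecast([1, 2, 3] * 4, offset=3, steps=4)
```

Because the method only looks up past values, it needs no fitting beyond storing the history, which is why it serves well as a baseline.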
Models are persisted to disk following scikit-learn conventions using joblib. By default, models are retrained (force_train=True). Set force_train=False to reuse existing cached models.
Trained models are saved to disk using joblib for fast reuse.
When force_train=False, existing models are loaded and prediction proceeds without retraining. This significantly speeds up prediction for repeated calls with the same configuration.
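The load-or-train behaviour described above can be sketched as follows. This is an illustrative pattern under stated assumptions, not the function's actual implementation: `load_or_train`, `train_fn`, and the file name are hypothetical, and a plain dictionary stands in for a fitted forecaster. Only `joblib.dump` and `joblib.load` are real library calls.

```python
import tempfile
from pathlib import Path

import joblib


def load_or_train(model_path, train_fn, force_train=True):
    """Return (model, trained_now).

    Hypothetical helper: reuses the file at `model_path` when
    force_train is False and the file exists; otherwise trains and saves.
    """
    model_path = Path(model_path)
    if not force_train and model_path.exists():
        return joblib.load(model_path), False
    model = train_fn()
    model_path.parent.mkdir(parents=True, exist_ok=True)  # create model_dir if missing
    joblib.dump(model, model_path)
    return model, True


def train_stand_in():
    # Stand-in for real training; returns a small picklable object.
    return {"offset": 7}


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "models" / "forecaster_demo.joblib"
    _, first_trained = load_or_train(path, train_stand_in, force_train=True)
    cached_model, second_trained = load_or_train(path, train_stand_in, force_train=False)
```

In this sketch, a missing file triggers training even when `force_train=False`; whether the real function falls back in the same way is an assumption.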
The model_dir directory is created automatically if it doesn’t exist.
The default model_dir is derived from get_cache_home(), which respects the SPOTFORECAST2_CACHE environment variable.
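As a rough illustration of how an environment variable can override a cache location, consider the sketch below. It does not reproduce `get_cache_home()`: the helper names, the fallback directory `~/.spotforecast2_cache`, and the `forecasters` subdirectory are all assumptions chosen for the example. Only the variable name `SPOTFORECAST2_CACHE` comes from the notes above.

```python
import os
import tempfile
from pathlib import Path


def resolve_cache_home(environ=None):
    """Hypothetical stand-in for get_cache_home(): env var first, then a default."""
    environ = os.environ if environ is None else environ
    override = environ.get("SPOTFORECAST2_CACHE")
    if override:
        return Path(override).expanduser()
    return Path.home() / ".spotforecast2_cache"  # assumed fallback location


def ensure_model_dir(model_dir=None, environ=None):
    """Return the model directory, creating it (and its parents) if needed."""
    if model_dir is None:
        # "forecasters" is an assumed subdirectory name.
        model_dir = resolve_cache_home(environ) / "forecasters"
    path = Path(model_dir)
    path.mkdir(parents=True, exist_ok=True)
    return path


with tempfile.TemporaryDirectory() as tmp:
    demo_dir = ensure_model_dir(environ={"SPOTFORECAST2_CACHE": tmp})
    demo_dir_created = demo_dir.is_dir()
    demo_dir_name = demo_dir.name
```

Passing the environment as a mapping keeps the sketch easy to test without modifying the real process environment.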
Performance Notes
First run: Full training (~2-5 minutes depending on data size)
Subsequent runs (force_train=False): Model loading only (~1-2 seconds)
Force retrain (force_train=True): Full training again (~2-5 minutes)