Introduction to Hyperparameter Tuning with SpotOptim
SpotOptim provides an advanced surrogate-model-based optimization engine integrated directly into spotforecast2. It acts as a powerful, drop-in alternative to `bayesian_search_forecaster`, leveraging surrogate models (such as Kriging/Gaussian processes or random forests) rather than Optuna's TPE sampler, and often discovers better hyperparameter configurations with fewer evaluations.
This guide walks you through the core function `spotoptim_search_forecaster()`, from a simple baseline example up to an advanced configuration, explaining every available argument.
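To build intuition for why surrogate-guided search can beat purely random search, the sketch below shows the general idea in a self-contained toy: a few random evaluations seed a Gaussian-process surrogate, which then proposes the next points to evaluate. This is a conceptual illustration only, not SpotOptim's actual implementation; the objective function, search interval, and pure-exploitation proposal rule are all invented for the example.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def objective(x):
    # Toy objective to minimize (assumed for illustration); the true optimum is x = 2.0.
    return (x - 2.0) ** 2 + 0.5

rng = np.random.default_rng(42)

# 1. Initial design: a few random points in the search interval [0, 5]
#    (this mirrors the role of `n_initial` random trials).
X = rng.uniform(0.0, 5.0, size=(3, 1))
y = objective(X).ravel()

# 2. Surrogate loop (this mirrors the remaining `n_trials - n_initial` trials):
#    fit a GP on all evaluations so far, then evaluate the candidate the
#    surrogate predicts to be best (pure exploitation, for brevity).
for _ in range(5):
    gp = GaussianProcessRegressor(alpha=1e-6, normalize_y=True).fit(X, y)
    candidates = np.linspace(0.0, 5.0, 200).reshape(-1, 1)
    best = candidates[np.argmin(gp.predict(candidates))]
    X = np.vstack([X, best.reshape(1, 1)])
    y = np.append(y, objective(best)[0])

print(f"Best x found: {X[np.argmin(y)][0]:.3f}")
```

Because each new evaluation refines the surrogate where the optimum is suspected to lie, the budget is spent far more efficiently than with uniform random sampling, which is the core advantage the SpotOptim engine exploits.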
Core Arguments Overview
When calling spotoptim_search_forecaster(), you can comprehensively control the optimization environment. Table 1 shows a breakdown of every available argument:
Table 1: Available arguments for spotoptim_search_forecaster().
| Argument | Type | Description |
|----------|------|-------------|
| `forecaster` | object | The base forecaster model to be tuned (e.g., `ForecasterRecursive`). |
| `y` | `pd.Series` | The target training time series. Must have a datetime or numeric index. |
| `cv` | `TimeSeriesFold` \| `OneStepAheadFold` | The cross-validation strategy used to evaluate configurations during the search. |
| `search_space` | `ParameterSet` \| `dict` | The boundaries and categories of the hyperparameters to explore. |
| `metric` | `str` \| `Callable` \| `list` | The metric(s) used to evaluate forecaster performance; the first metric determines the sort order of the results. |
| `exog` | `pd.Series` \| `pd.DataFrame` \| `None` | Optional exogenous variables to include during modeling. |
| `n_trials` | `int` | The total number of evaluations to perform throughout the entire search. |
| `n_initial` | `int` | The number of random initial design points before the surrogate model takes over. |
| `random_state` | `int` | RNG seed ensuring reproducible designs and sampling. |
| `return_best` | `bool` | If `True`, the passed forecaster is automatically refit on the entire `y` using the best found configuration. |
| `n_jobs` | `int` \| `str` | Number of parallel jobs for backtesting. Use `"auto"` for automatic CPU detection. |
| `verbose` | `bool` | Print detailed SpotOptim engine optimization logs to stdout. |
| `show_progress` | `bool` | Display a tqdm progress bar for evaluation progression. |
| `suppress_warnings` | `bool` | Temporarily suppress spotforecast2 warnings during rapid evaluation cycles. |
| `output_file` | `str` \| `None` | Path to save the trial results incrementally as a TSV file. |
| `kwargs_spotoptim` | `dict` \| `None` | Advanced dictionary passed directly to the underlying SpotOptim engine (e.g., custom surrogate configurations, acquisition functions). |
Simple Tuning Example
In this first example, we want to tune a basic Ridge regression forecaster. We will evaluate configurations using a simple cross-validation fold, passing the search_space as a standard dictionary.
```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Ridge
from spotforecast2_safe.forecaster.recursive import ForecasterRecursive
from spotforecast2.model_selection import (
    TimeSeriesFold,
    spotoptim_search_forecaster,
)

# 1. Generate realistic target data
np.random.seed(42)
y = pd.Series(
    np.random.randn(200).cumsum(),
    index=pd.date_range("2022-01-01", periods=200, freq="h"),
    name="load",
)

# 2. Define forecaster and validation
forecaster = ForecasterRecursive(estimator=Ridge(), lags=5)
cv = TimeSeriesFold(
    steps=5,
    initial_train_size=150,
    refit=False,
)

# 3. Define a simple search space using a dictionary
search_space = {"alpha": (0.01, 10.0)}

# 4. Execute the SpotOptim search
results, optimizer = spotoptim_search_forecaster(
    forecaster=forecaster,
    y=y,
    cv=cv,
    search_space=search_space,
    metric="mean_absolute_error",
    n_trials=5,        # 5 total trials
    n_initial=3,       # 3 random, 2 surrogate-guided
    random_state=42,
    return_best=True,  # Automatically refits the `forecaster` object
    show_progress=False,
)

print(results.head(3))
print(f"Best found alpha: {results.loc[0, 'alpha']}")
```
```
`Forecaster` refitted using the best-found lags and parameters, and the whole data set:
  Lags: [1 2 3 4 5]
  Parameters: {'alpha': 6.947241955342156}
  Backtesting metric: 1.019885673261824

              lags                          params  mean_absolute_error  \
0  [1, 2, 3, 4, 5]    {'alpha': 6.947241955342156}             1.019886
1  [1, 2, 3, 4, 5]   {'alpha': 3.6364143967777034}             1.019886
2  [1, 2, 3, 4, 5]  {'alpha': 0.42094695964921375}             1.019886

      alpha
0  6.947242
1  3.636414
2  0.420947
Best found alpha: 6.947241955342156
```
Because `return_best=True`, the `forecaster` object has already been refit on the full series and is ready to make predictions immediately, using the best alpha found (`results.loc[0, "alpha"]`).
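The returned `results` object is a pandas DataFrame sorted by the first metric, so the best trial is always row 0. The frame below is a hand-built stand-in mirroring the structure of the output above (the real one comes from `spotoptim_search_forecaster()`); it only illustrates how to look up the winning configuration:

```python
import pandas as pd

# Hypothetical stand-in for the `results` DataFrame shown above.
results = pd.DataFrame(
    {
        "lags": [[1, 2, 3, 4, 5]] * 3,
        "params": [
            {"alpha": 6.947241955342156},
            {"alpha": 3.6364143967777034},
            {"alpha": 0.42094695964921375},
        ],
        "mean_absolute_error": [1.019886, 1.019886, 1.019886],
        "alpha": [6.947242, 3.636414, 0.420947],
    }
)

# Rows are sorted by the first metric, so row 0 holds the best trial.
best_alpha = results.loc[0, "alpha"]
best_params = results.loc[0, "params"]
print(best_alpha, best_params)
```

The `params` column keeps the full dictionary per trial, which is convenient when tuning several hyperparameters at once, while per-parameter columns like `alpha` make it easy to plot or filter individual dimensions.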
Advanced Tuning Example
Now, let’s explore an advanced configuration. We’ll use a `ParameterSet` object (from the spotoptim library) for more granular control over numerical types and transformations, and pass engine-level options directly to the SpotOptim engine via `kwargs_spotoptim`.
This advanced setup illustrates how `spotoptim_search_forecaster` acts as a flexible orchestration bridge, letting you explore your search space efficiently while capturing comprehensive performance metadata.