model_selection.spotoptim_search

Hyperparameter search functions for forecasters using SpotOptim.

This module provides an alternative to Bayesian (Optuna-based) search by leveraging the SpotOptim surrogate-model-based optimizer. It follows the same interface as spotforecast2.model_selection.bayesian_search_forecaster(), so the two can be used interchangeably.

Functions

Name Description
array_to_params Convert a SpotOptim parameter array back to a dict.
build_warm_start_x0 Build a single warm-start point x0 for SpotOptim.
convert_search_space Convert search space into SpotOptim compatible format.
parse_lags_from_strings Parse a lags representation back to a Python object.
spotoptim_objective SpotOptim objective function to evaluate hyperparameter sets.
spotoptim_search Core implementation of the SpotOptim search logic.
spotoptim_search_forecaster Hyperparameter optimisation for a Forecaster using SpotOptim.

array_to_params

model_selection.spotoptim_search.array_to_params(
    params_array,
    var_name,
    var_type,
    bounds,
)

Convert a SpotOptim parameter array back to a dict.

Each element of params_array is mapped to the corresponding name / type / bounds entry, converting to the correct Python type.

Parameters

Name Type Description Default
params_array np.ndarray 1-D array of raw parameter values from SpotOptim. required
var_name list Parameter names (same order as params_array). required
var_type list Parameter types ("int", "float", "factor"). required
bounds list Parameter bounds. required

Returns

Name Type Description
Dict[str, Any] Dictionary mapping parameter names to typed values.

Examples

import numpy as np
from spotforecast2.model_selection.spotoptim_search import (
    array_to_params,
)

result = array_to_params(
    np.array([100.0, 0.05]),
    var_name=["n_estimators", "lr"],
    var_type=["int", "float"],
    bounds=[(50, 200), (0.01, 0.3)],
)
print(result)
assert result == {"n_estimators": 100, "lr": 0.05}
{'n_estimators': 100, 'lr': 0.05}

import numpy as np
from spotforecast2.model_selection.spotoptim_search import array_to_params

params_array = np.array([0.05, 5.0, 2.0])
var_name = ["alpha", "max_depth", "model"]
var_type = ["float", "int", "factor"]
bounds = [(0.01, 10.0), (2, 8), ["Ridge", "Lasso", "ElasticNet"]]

params_dict = array_to_params(params_array, var_name, var_type, bounds)

for k, v in params_dict.items():
    print(f"{k}: {v} (type: {type(v).__name__})")
alpha: 0.05 (type: float)
max_depth: 5 (type: int)
model: ElasticNet (type: str)
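
The mapping described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the helper name sketch_array_to_params is hypothetical, and it assumes that "int" values are rounded to the nearest integer and that a "factor" value is the index of the chosen category in its bounds entry (which matches the example output above).

```python
from typing import Any

import numpy as np


def sketch_array_to_params(params_array, var_name, var_type, bounds) -> dict[str, Any]:
    """Hypothetical re-implementation, for illustration only."""
    params: dict[str, Any] = {}
    for value, name, kind, bound in zip(params_array, var_name, var_type, bounds):
        if kind == "int":
            # Assumption: integer dimensions are rounded to the nearest integer.
            params[name] = int(round(float(value)))
        elif kind == "factor":
            # Assumption: the raw value is the index of the selected category.
            params[name] = bound[int(round(float(value)))]
        else:
            # Float dimensions are passed through as Python floats.
            params[name] = float(value)
    return params


decoded = sketch_array_to_params(
    np.array([0.05, 5.0, 2.0]),
    ["alpha", "max_depth", "model"],
    ["float", "int", "factor"],
    [(0.01, 10.0), (2, 8), ["Ridge", "Lasso", "ElasticNet"]],
)
print(decoded)
```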

build_warm_start_x0

model_selection.spotoptim_search.build_warm_start_x0(
    search_space,
    forecaster,
    lags_seed,
)

Build a single warm-start point x0 for SpotOptim.

The returned point seeds the optimizer’s first evaluation with the lag configuration lags_seed combined with the current estimator hyperparameters of forecaster. It is expressed in natural scale, because SpotOptim applies any var_trans (e.g. log10) itself when it validates x0. The point is inserted as the first point of the initial design, so the seeded configuration is always evaluated.

Parameters

Name Type Description Default
search_space ParameterSet | dict[str, Any] Search space that already contains str(list(lags_seed)) as a candidate in its "lags" factor (see SpotOptimStrategy.prepare_forecaster). required
forecaster object The pre-tuning forecaster; its estimator supplies the starting values for the numeric hyperparameter dimensions. required
lags_seed Any The lag configuration to seed (e.g. config.warm_start_lags). required

Returns

Name Type Description
np.ndarray | None A 1-D float array of length len(var_name), or None when the search space has no "lags" factor or the seed is not a candidate.

Examples

import numpy as np
from sklearn.linear_model import Ridge
from spotforecast2_safe.forecaster.recursive import ForecasterRecursive
from spotforecast2.model_selection.spotoptim_search import (
    build_warm_start_x0,
)

forecaster = ForecasterRecursive(estimator=Ridge(alpha=2.0), lags=3)
search_space = {"alpha": (0.1, 10.0), "lags": ["[1, 2, 24]", "24"]}
x0 = build_warm_start_x0(search_space, forecaster, [1, 2, 24])
print(x0)
assert x0[0] == 2.0      # Ridge alpha, clipped into (0.1, 10.0)
assert x0[1] == 0.0      # index of "[1, 2, 24]" in the lags candidates
[2. 0.]
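
The behaviour shown above (clipping the current hyperparameter value into its bounds, encoding the seeded lags as a candidate index, and returning None when the seed is not a candidate) can be sketched as follows. The helper sketch_warm_start_x0 and its current_params argument are hypothetical: current_params stands in for the hyperparameters read from the forecaster's estimator, and the sketch only handles numeric (low, high) dimensions plus the "lags" factor.

```python
import numpy as np


def sketch_warm_start_x0(search_space, current_params, lags_seed):
    """Hypothetical sketch of assembling a warm-start point in natural scale."""
    candidates = search_space.get("lags")
    seed_label = str(list(lags_seed))
    # No "lags" factor, or the seed is not among the candidates: no warm start.
    if not isinstance(candidates, list) or seed_label not in candidates:
        return None

    x0 = []
    for name, spec in search_space.items():
        if name == "lags":
            # Factor dimensions are encoded by the index of the candidate.
            x0.append(float(candidates.index(seed_label)))
        else:
            # Assumption: numeric dimensions are given as (low, high, ...) tuples.
            low, high = spec[0], spec[1]
            x0.append(float(np.clip(current_params[name], low, high)))
    return np.array(x0)


space = {"alpha": (0.1, 10.0), "lags": ["[1, 2, 24]", "24"]}
x0_inside = sketch_warm_start_x0(space, {"alpha": 2.0}, [1, 2, 24])
x0_clipped = sketch_warm_start_x0(space, {"alpha": 50.0}, [1, 2, 24])
x0_missing = sketch_warm_start_x0(space, {"alpha": 2.0}, [1, 2, 3])
print(x0_inside, x0_clipped, x0_missing)
```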

convert_search_space

model_selection.spotoptim_search.convert_search_space(search_space)

Convert search space into SpotOptim compatible format.

Parameters

Name Type Description Default
search_space ParameterSet | dict[str, Any] Search space as a SpotOptim ParameterSet or a dictionary. required

Returns

Name Type Description
tuple[list[Any], list[str], list[str], list[Callable | None]] Tuple (bounds, var_type, var_name, var_trans) containing:
- bounds: List of parameter bounds or categories.
- var_type: List of variable types (‘float’, ‘int’, or ‘factor’).
- var_name: List of variable names.
- var_trans: List of transformation functions (e.g., log10) or None.

Examples

from spotoptim.hyperparameters import ParameterSet
from spotforecast2.model_selection.spotoptim_search import (
    convert_search_space,
)

ps = ParameterSet()
_ = ps.add_float("alpha", 0.01, 10.0)
b, t, n, tr = convert_search_space(ps)
print(b)
assert b == [(0.01, 10.0)]
print(t)
assert t == ["float"]
[(0.01, 10.0)]
['float']

from spotforecast2.model_selection.spotoptim_search import convert_search_space

search_space = {
    "learning_rate": (0.001, 0.1, "log10"),
    "max_depth": (2, 10),
    "model_type": ["RandomForest", "XGBoost"],
}

bounds, vt, vn, vtr = convert_search_space(search_space)

for name, typ, bound, trans in zip(vn, vt, bounds, vtr):
    print(f"{name} ({typ}): {bound} | transform: {trans}")
learning_rate (float): (0.001, 0.1) | transform: log10
max_depth (int): (2, 10) | transform: None
model_type (factor): ['RandomForest', 'XGBoost'] | transform: None
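
The dict conventions visible in the example above can be illustrated with a small stand-alone sketch. It is an assumption-laden reading of that output, not the library's code: the helper sketch_convert_search_space is hypothetical, a list is treated as a factor, a tuple of two ints as an int range, any other tuple as a float range, and an optional third tuple element is kept as the transform name (here a plain string).

```python
def sketch_convert_search_space(search_space):
    """Hypothetical sketch of classifying a dict-based search space."""
    bounds, var_type, var_name, var_trans = [], [], [], []
    for name, spec in search_space.items():
        var_name.append(name)
        if isinstance(spec, list):
            # A list of categories becomes a factor dimension.
            bounds.append(spec)
            var_type.append("factor")
            var_trans.append(None)
            continue
        low, high = spec[0], spec[1]
        # Assumption: an optional third element names the transform.
        var_trans.append(spec[2] if len(spec) == 3 else None)
        both_int = isinstance(low, int) and isinstance(high, int)
        var_type.append("int" if both_int else "float")
        bounds.append((low, high))
    return bounds, var_type, var_name, var_trans


space = {
    "learning_rate": (0.001, 0.1, "log10"),
    "max_depth": (2, 10),
    "model_type": ["RandomForest", "XGBoost"],
}
sk_bounds, sk_types, sk_names, sk_trans = sketch_convert_search_space(space)
for row in zip(sk_names, sk_types, sk_bounds, sk_trans):
    print(row)
```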

parse_lags_from_strings

model_selection.spotoptim_search.parse_lags_from_strings(lags_str)

Parse a lags representation back to a Python object.

Handles three input scenarios:

1. Already an integer or list: returned as is.
2. Single integer as string: "24" → 24
3. List representation: "[1, 2, 3]" → [1, 2, 3]

Parameters

Name Type Description Default
lags_str str | int | list Lag specification (string, int, or list). required

Returns

Name Type Description
int | list Either an integer or a list of integers representing lags.

Examples

from spotforecast2.model_selection.spotoptim_search import (
    parse_lags_from_strings,
)

result_int = parse_lags_from_strings(24)
print(result_int)
assert result_int == 24

result_list = parse_lags_from_strings("[1, 2, 3]")
print(result_list)
assert result_list == [1, 2, 3]

result_passthrough = parse_lags_from_strings([4, 8, 12])
print(result_passthrough)
assert result_passthrough == [4, 8, 12]
24
[1, 2, 3]
[4, 8, 12]
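
The three scenarios can be reproduced with the standard library alone. The following is a minimal sketch under the assumption that string inputs are valid Python literals; the helper name sketch_parse_lags is hypothetical and the real function may parse its input differently.

```python
import ast


def sketch_parse_lags(lags):
    """Hypothetical sketch: turn a lags representation into an int or a list."""
    # Scenario 1: ints and lists are returned unchanged.
    if isinstance(lags, (int, list)):
        return lags
    # Scenarios 2 and 3: safely evaluate the string as a Python literal.
    parsed = ast.literal_eval(lags.strip())
    if isinstance(parsed, int):
        return parsed
    return [int(v) for v in parsed]


print(sketch_parse_lags("24"))
print(sketch_parse_lags("[1, 2, 3]"))
print(sketch_parse_lags([4, 8, 12]))
```

ast.literal_eval only accepts literals, so it avoids executing arbitrary code from the search-space strings.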

spotoptim_objective

model_selection.spotoptim_search.spotoptim_objective(
    X,
    forecaster_search,
    cv_name,
    cv,
    metric,
    y,
    exog,
    n_jobs,
    verbose,
    show_progress,
    suppress_warnings,
    var_name,
    var_type,
    bounds,
    all_metric_values,
    all_lags,
    all_params,
    n_trials=None,
)

SpotOptim objective function to evaluate hyperparameter sets.

Evaluates a given array of hyperparameter configurations X and returns an array of the primary metric errors.

Parameters

Name Type Description Default
X np.ndarray 2D array of hyperparameters from SpotOptim. required
forecaster_search object The forecaster to evaluate. required
cv_name str Type of cross-validation (“TimeSeriesFold” or “OneStepAheadFold”). required
cv TimeSeriesFold | OneStepAheadFold Cross-validation configuration. required
metric list[Callable] List of metrics to compute. required
y pd.Series Target time series. required
exog pd.Series | pd.DataFrame | None Exogenous variables. required
n_jobs int Number of parallel jobs. required
verbose bool Verbosity level flag. required
show_progress bool Show progress bar flag. required
suppress_warnings bool Suppress warnings flag. required
var_name list Parameter names. required
var_type list Parameter types. required
bounds list Parameter bounds. required
all_metric_values list[list[float]] List to record all metric results. required
all_lags list List to record all evaluated lag configurations. required
all_params list[dict] List to record all evaluated parameters. required
n_trials int | None Total number of candidate configurations in the SpotOptim budget. Used only to build the coarse-grained “config k/N” label shown as a prefix on each per-fold progress bar (when show_progress is True). When None, the label omits the total and reads “config k”. Does not affect the optimisation. None

Returns

Name Type Description
np.ndarray 1D array of results for the primary metric.

Examples

# Demonstrate the call structure of spotoptim_objective.
import numpy as np
import pandas as pd
from spotforecast2_safe.splitter import TimeSeriesFold
from spotforecast2.model_selection.spotoptim_search import spotoptim_objective

# Mock forecaster for documentation
class MockForecaster:
    def set_params(self, **kwargs): pass
    def set_lags(self, lags): pass

# Provide dummy data and configuration
X = np.array([[0.05], [0.1]])
cv = TimeSeriesFold(initial_train_size=10, steps=2)
metric = [lambda y_true, y_pred: np.mean(np.abs(y_true - y_pred))]

# Track results
metric_vals, lags, params = [], [], []

# When evaluated for real, the mock objects would produce metrics.
# Here we just show the call structure.
print("Ready to evaluate hyperparameters.")
Ready to evaluate hyperparameters.
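
To make the contract concrete (a 2D array of configurations in, a 1D array of primary-metric values out, with results recorded in caller-owned lists), here is a stand-alone sketch. It does not call spotoptim_objective: sketch_objective and toy_evaluate are hypothetical, toy_evaluate replaces the backtesting step, and the sketch assumes the primary metric is the first entry of each metric list.

```python
import numpy as np


def sketch_objective(X, evaluate, var_name, all_metric_values, all_params):
    """Hypothetical sketch of a batch objective with side-effect logging."""
    X = np.atleast_2d(X)
    scores = np.empty(X.shape[0])
    for i, row in enumerate(X):
        # Decode one configuration (one row of X) into named parameters.
        params = dict(zip(var_name, row.tolist()))
        metric_values = evaluate(params)
        # Record every evaluation in the lists supplied by the caller.
        all_params.append(params)
        all_metric_values.append(metric_values)
        # Assumption: the first metric is the one the optimizer minimises.
        scores[i] = metric_values[0]
    return scores


def toy_evaluate(params):
    # Stand-in for backtesting: two error values with a minimum at alpha = 1.
    return [(params["alpha"] - 1.0) ** 2, abs(params["alpha"] - 1.0)]


metric_log, param_log = [], []
scores = sketch_objective(
    np.array([[0.5], [1.0]]), toy_evaluate, ["alpha"], metric_log, param_log
)
print(scores)
```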

spotoptim_search_forecaster

model_selection.spotoptim_search.spotoptim_search_forecaster(
    forecaster,
    y,
    cv,
    search_space,
    metric,
    exog=None,
    n_trials=10,
    n_initial=5,
    random_state=123,
    return_best=True,
    n_jobs='auto',
    verbose=False,
    show_progress=True,
    suppress_warnings=False,
    output_file=None,
    kwargs_spotoptim=None,
)

Hyperparameter optimisation for a Forecaster using SpotOptim.

Drop-in alternative to bayesian_search_forecaster() that uses the SpotOptim surrogate-model-based optimizer instead of Optuna’s TPE sampler.

Parameters

Name Type Description Default
forecaster object Forecaster model (e.g. ForecasterRecursive). required
y pd.Series Training time series. Must have a datetime or numeric index. required
cv TimeSeriesFold | OneStepAheadFold Cross-validation strategy — TimeSeriesFold or OneStepAheadFold. required
search_space ParameterSet | Dict[str, Any] Hyperparameter search space. Either a ParameterSet or a plain dict (see examples below). required
metric str | Callable | list[str | Callable] Metric name, callable, or list thereof. required
exog pd.Series | pd.DataFrame | None Optional exogenous variable(s). None
n_trials int Total evaluations (initial + sequential). 10
n_initial int Random initial points before surrogate kicks in. 5
random_state int RNG seed. 123
return_best bool Re-fit forecaster with best params after search. True
n_jobs int | str Parallel jobs for backtesting ("auto" or int). 'auto'
verbose bool Print optimisation progress. False
show_progress bool Show progress bar during backtesting/validation. True
suppress_warnings bool Suppress spotforecast warnings. False
output_file str | None Save results as TSV to this path. None
kwargs_spotoptim dict | None Extra kwargs passed to SpotOptim(). None

Returns

Name Type Description
tuple[pd.DataFrame, object] (results, optimizer) where results is a sorted DataFrame and optimizer is the SpotOptim instance.

Raises

Name Type Description
ValueError If exog length ≠ y length and return_best is True.
TypeError If cv is not TimeSeriesFold or OneStepAheadFold.

Examples

# 1 — Dict-based search space (no ParameterSet needed):
import numpy as np
import pandas as pd
from sklearn.linear_model import Ridge
from spotforecast2_safe.forecaster.recursive import ForecasterRecursive
from spotforecast2_safe.splitter import TimeSeriesFold
from spotforecast2.model_selection import spotoptim_search_forecaster

np.random.seed(42)
y = pd.Series(
    np.random.randn(200).cumsum(),
    index=pd.date_range("2022-01-01", periods=200, freq="h"),
    name="load",
)

forecaster = ForecasterRecursive(estimator=Ridge(), lags=5)
cv = TimeSeriesFold(
    steps=5,
    initial_train_size=150,
    refit=False,
)

search_space = {"alpha": (0.01, 10.0)}

results, optimizer = spotoptim_search_forecaster(
    forecaster=forecaster,
    y=y,
    cv=cv,
    search_space=search_space,
    metric="mean_absolute_error",
    n_trials=5,
    n_initial=3,
    random_state=42,
    return_best=False,
    verbose=False,
    show_progress=False,
)

print(f"Is DataFrame: {isinstance(results, pd.DataFrame)}")
print(f"Contains 'alpha': {'alpha' in results.columns}")
Is DataFrame: True
Contains 'alpha': True

# 2 — ParameterSet-based search space:
import numpy as np
import pandas as pd
from sklearn.linear_model import Ridge
from spotoptim.hyperparameters import ParameterSet
from spotforecast2_safe.forecaster.recursive import ForecasterRecursive
from spotforecast2_safe.splitter import TimeSeriesFold
from spotforecast2.model_selection import spotoptim_search_forecaster

np.random.seed(42)
y = pd.Series(
    np.random.randn(200).cumsum(),
    index=pd.date_range("2022-01-01", periods=200, freq="h"),
    name="load",
)
cv = TimeSeriesFold(steps=5, initial_train_size=150, refit=False)

ps = ParameterSet()
_ = ps.add_float("alpha", low=0.01, high=10.0)

results2, _ = spotoptim_search_forecaster(
    forecaster=ForecasterRecursive(estimator=Ridge(), lags=5),
    y=y,
    cv=cv,
    search_space=ps,
    metric="mean_absolute_error",
    n_trials=5,
    n_initial=3,
    return_best=False,
    verbose=False,
    show_progress=False,
)

print(f"Number of configurations evaluated: {len(results2)}")
Number of configurations evaluated: 5
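
Because n_trials counts the total number of evaluations (initial + sequential), the sequential budget is what remains after the initial design. The arithmetic can be sketched as follows; split_budget is a hypothetical helper, and flooring the sequential part at zero is an assumption of this sketch rather than documented behaviour.

```python
def split_budget(n_trials: int, n_initial: int) -> tuple[int, int]:
    """Hypothetical helper: split the total budget into (initial, sequential)."""
    # Assumption: the sequential part cannot be negative.
    return n_initial, max(n_trials - n_initial, 0)


# The examples above use n_trials=5 and n_initial=3.
initial, sequential = split_budget(n_trials=5, n_initial=3)
print(f"{initial} initial design points, {sequential} surrogate-guided evaluations")
```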