model_selection.spotoptim_search

Hyperparameter search functions for forecasters using SpotOptim.

This module provides an alternative to Bayesian (Optuna-based) search by leveraging the SpotOptim surrogate-model-based optimizer. It follows the same interface as spotforecast2.model_selection.bayesian_search_forecaster, so the two can be used interchangeably.

Functions

Name Description
array_to_params Convert a SpotOptim parameter array back to a dict.
convert_search_space Convert search space into SpotOptim compatible format.
parse_lags_from_strings Parse a lags representation back to a Python object.
spotoptim_objective SpotOptim objective function to evaluate hyperparameter sets.
spotoptim_search Core implementation of the SpotOptim search logic.
spotoptim_search_forecaster Hyperparameter optimisation for a Forecaster using SpotOptim.

array_to_params

model_selection.spotoptim_search.array_to_params(
    params_array,
    var_name,
    var_type,
    bounds,
)

Convert a SpotOptim parameter array back to a dict.

Each element of params_array is mapped to the corresponding name / type / bounds entry, converting to the correct Python type.

Parameters

Name Type Description Default
params_array np.ndarray 1-D array of raw parameter values from SpotOptim. required
var_name list Parameter names (same order as params_array). required
var_type list Parameter types ("int", "float", "factor"). required
bounds list Parameter bounds. required

Returns

Name Type Description
Dict[str, Any] Dictionary mapping parameter names to typed values.

Examples

Basic usage:

>>> import numpy as np
>>> from spotforecast2.model_selection.spotoptim_search import (
...     array_to_params,
... )
>>> array_to_params(
...     np.array([100.0, 0.05]),
...     var_name=["n_estimators", "lr"],
...     var_type=["int", "float"],
...     bounds=[(50, 200), (0.01, 0.3)],
... )
{'n_estimators': 100, 'lr': 0.05}

Generating textual output of parameter mapping:

import numpy as np
from spotforecast2.model_selection.spotoptim_search import array_to_params

params_array = np.array([0.05, 5.0, 2.0])
var_name = ["alpha", "max_depth", "model"]
var_type = ["float", "int", "factor"]
bounds = [(0.01, 10.0), (2, 8), ["Ridge", "Lasso", "ElasticNet"]]

params_dict = array_to_params(params_array, var_name, var_type, bounds)

for k, v in params_dict.items():
    print(f"{k}: {v} (type: {type(v).__name__})")
alpha: 0.05 (type: float)
max_depth: 5 (type: int)
model: ElasticNet (type: str)
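
The mapping described above can be sketched in a few lines. This is a minimal illustration, not the library's implementation: the helper name array_to_params_sketch is hypothetical, and it assumes that a "factor" value is an index into the category list given in bounds, which is consistent with the example output above (2.0 maps to ElasticNet). A plain list stands in for the NumPy array.

```python
from typing import Any, Dict, Sequence


def array_to_params_sketch(
    params_array: Sequence[float],
    var_name: Sequence[str],
    var_type: Sequence[str],
    bounds: Sequence[Any],
) -> Dict[str, Any]:
    """Illustrative re-implementation; not the library's actual code."""
    params: Dict[str, Any] = {}
    for value, name, kind, bound in zip(params_array, var_name, var_type, bounds):
        if kind == "int":
            # Round before casting so 4.9999 does not truncate to 4.
            params[name] = int(round(float(value)))
        elif kind == "factor":
            # Assumption: factors are encoded as an index into the category list.
            params[name] = bound[int(round(float(value)))]
        else:
            params[name] = float(value)
    return params


result = array_to_params_sketch(
    [0.05, 5.0, 2.0],
    ["alpha", "max_depth", "model"],
    ["float", "int", "factor"],
    [(0.01, 10.0), (2, 8), ["Ridge", "Lasso", "ElasticNet"]],
)
print(result)
```

Rounding before the integer cast is a design choice of this sketch; whether the library rounds or truncates is not stated in this page.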

convert_search_space

model_selection.spotoptim_search.convert_search_space(search_space)

Convert search space into SpotOptim compatible format.

Parameters

Name Type Description Default
search_space ParameterSet | dict[str, Any] Search space as a SpotOptim ParameterSet or a dictionary. required

Returns

Name Type Description
tuple[list[Any], list[str], list[str], list[Callable | None]] Tuple (bounds, var_type, var_name, var_trans) containing:
- bounds: List of parameter bounds or categories.
- var_type: List of variable types ('float', 'int', or 'factor').
- var_name: List of variable names.
- var_trans: List of transformation functions (e.g., log10) or None.

Examples

Basic usage:

>>> from spotoptim.hyperparameters import ParameterSet
>>> from spotforecast2.model_selection.spotoptim_search import (
...     convert_search_space,
... )
>>> ps = ParameterSet()
>>> _ = ps.add_float("alpha", 0.01, 10.0)
>>> b, t, n, tr = convert_search_space(ps)
>>> b
[(0.01, 10.0)]
>>> t
['float']

Converting a complex dictionary search space:

from spotforecast2.model_selection.spotoptim_search import convert_search_space

search_space = {
    "learning_rate": (0.001, 0.1, "log10"),
    "max_depth": (2, 10),
    "model_type": ["RandomForest", "XGBoost"]
}

bounds, vt, vn, vtr = convert_search_space(search_space)

for name, typ, bound, trans in zip(vn, vt, bounds, vtr):
    print(f"{name} ({typ}): {bound} | transform: {trans}")
learning_rate (float): (0.001, 0.1) | transform: log10
max_depth (int): (2, 10) | transform: None
model_type (factor): ['RandomForest', 'XGBoost'] | transform: None
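
The conventions visible in the dictionary example above can be sketched as follows. This is a hedged illustration only: the helper name convert_dict_space_sketch is hypothetical, and the rules (list means factor, two integer limits mean int, an optional third tuple element names the transformation) are inferred from the printed output rather than taken from the library source. The sketch keeps the transformation as a string, whereas the real function is documented as returning callables or None.

```python
from typing import Any, Dict, List, Optional, Tuple


def convert_dict_space_sketch(
    search_space: Dict[str, Any],
) -> Tuple[List[Any], List[str], List[str], List[Optional[str]]]:
    """Illustrative only; not the library's actual conversion logic."""
    bounds: List[Any] = []
    var_type: List[str] = []
    var_name: List[str] = []
    var_trans: List[Optional[str]] = []

    for name, spec in search_space.items():
        if isinstance(spec, list):
            # A list is treated as a set of categories.
            bounds.append(list(spec))
            var_type.append("factor")
            var_trans.append(None)
        elif isinstance(spec, tuple) and len(spec) in (2, 3):
            low, high = spec[0], spec[1]
            # Two integer limits give an "int" variable, anything else a "float".
            both_int = isinstance(low, int) and isinstance(high, int)
            bounds.append((low, high))
            var_type.append("int" if both_int else "float")
            # Optional third element names a transformation, e.g. "log10".
            var_trans.append(spec[2] if len(spec) == 3 else None)
        else:
            raise ValueError(f"Unsupported specification for {name!r}: {spec!r}")
        var_name.append(name)

    return bounds, var_type, var_name, var_trans


bounds, var_type, var_name, var_trans = convert_dict_space_sketch(
    {
        "learning_rate": (0.001, 0.1, "log10"),
        "max_depth": (2, 10),
        "model_type": ["RandomForest", "XGBoost"],
    }
)
print(var_name, var_type, var_trans)
```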

parse_lags_from_strings

model_selection.spotoptim_search.parse_lags_from_strings(lags_str)

Parse a lags representation back to a Python object.

Handles three input scenarios:

1. Already an integer or list: returned as is.
2. Single integer as string: "24" → 24
3. List representation: "[1, 2, 3]" → [1, 2, 3]

Parameters

Name Type Description Default
lags_str str | int | list Lag specification (string, int, or list). required

Returns

Name Type Description
int | list Either an integer or a list of integers representing lags.

Examples

Basic parsing:

>>> from spotforecast2.model_selection.spotoptim_search import (
...     parse_lags_from_strings,
... )
>>> parse_lags_from_strings(24)
24
>>> parse_lags_from_strings("[1, 2, 3]")
[1, 2, 3]
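
The three scenarios can be reproduced with the standard library alone. The following is a minimal sketch under the assumption that string inputs are valid Python literals; the helper name parse_lags_sketch is hypothetical and the library may parse strings differently.

```python
import ast
from typing import List, Union


def parse_lags_sketch(lags: Union[str, int, list]) -> Union[int, List[int]]:
    """Illustrative only; not the library's actual parser."""
    # Scenario 1: already an integer or list, returned unchanged.
    if isinstance(lags, (int, list)):
        return lags
    # Scenarios 2 and 3: safely evaluate the string as a Python literal.
    parsed = ast.literal_eval(lags.strip())
    if isinstance(parsed, int):
        return parsed
    if isinstance(parsed, (list, tuple)):
        return [int(v) for v in parsed]
    raise ValueError(f"Cannot interpret {lags!r} as lags")


print(parse_lags_sketch(24))
print(parse_lags_sketch("24"))
print(parse_lags_sketch("[1, 2, 3]"))
```

ast.literal_eval is used here instead of eval because it only accepts literals, so arbitrary expressions in the string are rejected.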

Visualizing the safety threshold (Example of dynamic documentation):

import matplotlib.pyplot as plt
import numpy as np

def check_safety_threshold(val, threshold):
    return 1 if val >= threshold else 0

threshold = 0.95
x = np.linspace(0.8, 1.0, 50)
y = [check_safety_threshold(val, threshold) for val in x]

plt.step(x, y, where='post')
plt.axvline(threshold, color='red', linestyle='--')
plt.title("Safety Status Transition")
# plt.show()  # Commented for non-interactive environments
Text(0.5, 1.0, 'Safety Status Transition')

spotoptim_objective

model_selection.spotoptim_search.spotoptim_objective(
    X,
    forecaster_search,
    cv_name,
    cv,
    metric,
    y,
    exog,
    n_jobs,
    verbose,
    show_progress,
    suppress_warnings,
    var_name,
    var_type,
    bounds,
    all_metric_values,
    all_lags,
    all_params,
)

SpotOptim objective function to evaluate hyperparameter sets.

Evaluates a given array of hyperparameter configurations X and returns an array of the primary metric errors.

Parameters

Name Type Description Default
X np.ndarray 2D array of hyperparameters from SpotOptim. required
forecaster_search object The forecaster to evaluate. required
cv_name str Type of cross-validation (“TimeSeriesFold” or “OneStepAheadFold”). required
cv TimeSeriesFold | OneStepAheadFold Cross-validation configuration. required
metric list[Callable] List of metrics to compute. required
y pd.Series Target time series. required
exog pd.Series | pd.DataFrame | None Exogenous variables. required
n_jobs int Number of parallel jobs. required
verbose bool Verbosity level flag. required
show_progress bool Show progress bar flag. required
suppress_warnings bool Suppress warnings flag. required
var_name list Parameter names. required
var_type list Parameter types. required
bounds list Parameter bounds. required
all_metric_values list[list[float]] List to record all metric results. required
all_lags list List to record all evaluated lag configurations. required
all_params list[dict] List to record all evaluated parameters. required

Returns

Name Type Description
np.ndarray 1D array of results for the primary metric.

Examples

Generating textual output of parameter evaluation:

import numpy as np
import pandas as pd
from spotforecast2_safe.model_selection import TimeSeriesFold
from spotforecast2.model_selection.spotoptim_search import spotoptim_objective

# Mock forecaster for documentation
class MockForecaster:
    def set_params(self, **kwargs): pass
    def set_lags(self, lags): pass

# Provide dummy data and configuration
X = np.array([[0.05], [0.1]])
cv = TimeSeriesFold(initial_train_size=10, steps=2)
metric = [lambda y_true, y_pred: np.mean(np.abs(y_true - y_pred))]

# Track results
metric_vals, lags, params = [], [], []

# When evaluated for real, the mock objects would produce metrics.
# Here we just show the call structure.
print("Ready to evaluate hyperparameters.")
Ready to evaluate hyperparameters.
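
The contract described above (a 2D array of configurations in, a 1D array of primary-metric values out, with results appended to the tracking lists) can be illustrated with a toy objective. This is a sketch under stated assumptions, not the library's code: objective_sketch and toy_evaluate are hypothetical names, and toy_evaluate stands in for the backtesting step, returning made-up metric values for demonstration.

```python
from typing import Callable, Dict, List

import numpy as np


def objective_sketch(
    X: np.ndarray,
    evaluate: Callable[[Dict[str, float]], List[float]],
    var_name: List[str],
    all_metric_values: List[List[float]],
    all_params: List[Dict[str, float]],
) -> np.ndarray:
    """Evaluate each row of X and return the first metric per row."""
    X = np.atleast_2d(X)
    primary = []
    for row in X:
        params = {name: float(v) for name, v in zip(var_name, row)}
        metric_values = evaluate(params)
        # Record every evaluation so a results table can be built later.
        all_params.append(params)
        all_metric_values.append(metric_values)
        # Only the first (primary) metric is handed back to the optimizer.
        primary.append(metric_values[0])
    return np.asarray(primary, dtype=float)


def toy_evaluate(params: Dict[str, float]) -> List[float]:
    # Stand-in for backtesting: two toy "metrics" minimised at alpha = 0.1.
    alpha = params["alpha"]
    return [abs(alpha - 0.1), (alpha - 0.1) ** 2]


metric_log: List[List[float]] = []
params_log: List[Dict[str, float]] = []
scores = objective_sketch(
    np.array([[0.05], [0.1]]), toy_evaluate, ["alpha"], metric_log, params_log
)
print(scores.shape, len(params_log))
```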

spotoptim_search_forecaster

model_selection.spotoptim_search.spotoptim_search_forecaster(
    forecaster,
    y,
    cv,
    search_space,
    metric,
    exog=None,
    n_trials=10,
    n_initial=5,
    random_state=123,
    return_best=True,
    n_jobs='auto',
    verbose=False,
    show_progress=True,
    suppress_warnings=False,
    output_file=None,
    kwargs_spotoptim=None,
)

Hyperparameter optimisation for a Forecaster using SpotOptim.

Drop-in alternative to spotforecast2.model_selection.bayesian_search_forecaster that uses the SpotOptim surrogate-model-based optimizer instead of Optuna’s TPE sampler.

Parameters

Name Type Description Default
forecaster object Forecaster model (e.g. ForecasterRecursive). required
y pd.Series Training time series. Must have a datetime or numeric index. required
cv TimeSeriesFold | OneStepAheadFold Cross-validation strategy — TimeSeriesFold or OneStepAheadFold. required
search_space ParameterSet | Dict[str, Any] Hyperparameter search space. Either a spotoptim.hyperparameters.ParameterSet or a plain dict (see examples below). required
metric str | Callable | list[str | Callable] Metric name, callable, or list thereof. required
exog pd.Series | pd.DataFrame | None Optional exogenous variable(s). None
n_trials int Total evaluations (initial + sequential). 10
n_initial int Random initial points before surrogate kicks in. 5
random_state int RNG seed. 123
return_best bool Re-fit forecaster with best params after search. True
n_jobs int | str Parallel jobs for backtesting ("auto" or int). 'auto'
verbose bool Print optimisation progress. False
show_progress bool Show progress bar during backtesting/validation. True
suppress_warnings bool Suppress spotforecast warnings. False
output_file str | None Save results as TSV to this path. None
kwargs_spotoptim dict | None Extra kwargs passed to SpotOptim(). None

Returns

Name Type Description
tuple[pd.DataFrame, object] Tuple (results, optimizer), where results is a sorted DataFrame and optimizer is the SpotOptim instance.

Raises

Name Type Description
ValueError If exog length ≠ y length and return_best is True.
TypeError If cv is not TimeSeriesFold or OneStepAheadFold.

Examples

1 — Dict-based search space (no ParameterSet needed):

import numpy as np
import pandas as pd
from sklearn.linear_model import Ridge
from spotforecast2_safe.forecaster.recursive import ForecasterRecursive
from spotforecast2.model_selection import (
    TimeSeriesFold,
    spotoptim_search_forecaster,
)

np.random.seed(42)
y = pd.Series(
    np.random.randn(200).cumsum(),
    index=pd.date_range("2022-01-01", periods=200, freq="h"),
    name="load",
)

forecaster = ForecasterRecursive(estimator=Ridge(), lags=5)
cv = TimeSeriesFold(
    steps=5,
    initial_train_size=150,
    refit=False,
)

search_space = {"alpha": (0.01, 10.0)}

results, optimizer = spotoptim_search_forecaster(
    forecaster=forecaster,
    y=y,
    cv=cv,
    search_space=search_space,
    metric="mean_absolute_error",
    n_trials=5,
    n_initial=3,
    random_state=42,
    return_best=False,
    verbose=False,
    show_progress=False,
)

print(f"Is DataFrame: {isinstance(results, pd.DataFrame)}")
print(f"Contains 'alpha': {'alpha' in results.columns}")
Is DataFrame: True
Contains 'alpha': True

2 — ParameterSet-based search space:

from spotoptim.hyperparameters import ParameterSet

ps = ParameterSet()
_ = ps.add_float("alpha", low=0.01, high=10.0)

results2, _ = spotoptim_search_forecaster(
    forecaster=ForecasterRecursive(estimator=Ridge(), lags=5),
    y=y,
    cv=cv,
    search_space=ps,
    metric="mean_absolute_error",
    n_trials=5,
    n_initial=3,
    return_best=False,
    verbose=False,
    show_progress=False,
)

print(f"Number of configurations evaluated: {len(results2)}")
Number of configurations evaluated: 5
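
3 — Reading saved results back (sketch):

When output_file is set, the results are saved as TSV, so they can be loaded again with pandas. The sketch below uses an in-memory toy table rather than a real search result: the metric column name mean_absolute_error and all numeric values are placeholders assumed for illustration, and the actual column layout of the saved file is not specified in this page.

```python
import io

import pandas as pd

# Toy stand-in for a TSV written via output_file; values are placeholders.
tsv_text = (
    "alpha\tmean_absolute_error\n"
    "0.5\t1.20\n"
    "2.0\t1.35\n"
    "7.5\t1.90\n"
)

# A real file path can be passed to read_csv instead of the StringIO buffer.
results = pd.read_csv(io.StringIO(tsv_text), sep="\t")

# Assumption: lower metric is better. Sort explicitly instead of relying
# on the row order of the file.
best = results.sort_values("mean_absolute_error").iloc[0]
best_alpha = float(best["alpha"])
print(f"Best alpha in toy table: {best_alpha}")
```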