model_selection.spotoptim_search

Hyperparameter search functions for forecasters using SpotOptim.

This module provides an alternative to Bayesian (Optuna-based) search by leveraging the SpotOptim surrogate-model-based optimizer. It follows the same interface as spotforecast2.model_selection.bayesian_search_forecaster(), so the two can be used interchangeably.

Functions

Name	Description
array_to_params	Convert a SpotOptim parameter array back to a dict.
build_warm_start_x0	Build a single warm-start point `x0` for `SpotOptim`.
convert_search_space	Convert search space into SpotOptim compatible format.
parse_lags_from_strings	Parse a lags representation back to a Python object.
spotoptim_objective	SpotOptim objective function to evaluate hyperparameter sets.
spotoptim_search	Core implementation of the SpotOptim search logic.
spotoptim_search_forecaster	Hyperparameter optimisation for a Forecaster using SpotOptim.

array_to_params

model_selection.spotoptim_search.array_to_params(
    params_array,
    var_name,
    var_type,
    bounds,
)

Convert a SpotOptim parameter array back to a dict.

Each element of params_array is mapped to the corresponding name / type / bounds entry, converting to the correct Python type.

Parameters

Name	Type	Description	Default
params_array	np.ndarray	1-D array of raw parameter values from SpotOptim.	required
var_name	list	Parameter names (same order as params_array).	required
var_type	list	Parameter types (`"int"`, `"float"`, `"factor"`).	required
bounds	list	Parameter bounds.	required

Returns

Name	Type	Description
	Dict[str, Any]	Dictionary mapping parameter names to typed values.

Examples

import numpy as np
from spotforecast2.model_selection.spotoptim_search import (
    array_to_params,
)

result = array_to_params(
    np.array([100.0, 0.05]),
    var_name=["n_estimators", "lr"],
    var_type=["int", "float"],
    bounds=[(50, 200), (0.01, 0.3)],
)
print(result)
assert result == {"n_estimators": 100, "lr": 0.05}

{'n_estimators': 100, 'lr': 0.05}

import numpy as np
from spotforecast2.model_selection.spotoptim_search import array_to_params

params_array = np.array([0.05, 5.0, 2.0])
var_name = ["alpha", "max_depth", "model"]
var_type = ["float", "int", "factor"]
bounds = [(0.01, 10.0), (2, 8), ["Ridge", "Lasso", "ElasticNet"]]

params_dict = array_to_params(params_array, var_name, var_type, bounds)

for k, v in params_dict.items():
    print(f"{k}: {v} (type: {type(v).__name__})")

alpha: 0.05 (type: float)
max_depth: 5 (type: int)
model: ElasticNet (type: str)
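
The decoding rule implied by the output above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: `decode_point_sketch` is a hypothetical helper, and it assumes that `"int"` values are rounded and that a `"factor"` value is the index of a category in its bounds entry, which is consistent with `2.0` decoding to `ElasticNet`.

```python
from typing import Any, Dict, List, Sequence


def decode_point_sketch(
    values: Sequence[float],
    var_name: List[str],
    var_type: List[str],
    bounds: List[Any],
) -> Dict[str, Any]:
    """Hypothetical stand-in for array_to_params, for illustration only."""
    decoded: Dict[str, Any] = {}
    for value, name, kind, bound in zip(values, var_name, var_type, bounds):
        if kind == "int":
            # Assumption: integer dimensions arrive as floats and are rounded.
            decoded[name] = int(round(value))
        elif kind == "factor":
            # Assumption: a factor is encoded as the index of its category.
            decoded[name] = bound[int(round(value))]
        else:
            decoded[name] = float(value)
    return decoded


decoded = decode_point_sketch(
    [0.05, 5.0, 2.0],
    ["alpha", "max_depth", "model"],
    ["float", "int", "factor"],
    [(0.01, 10.0), (2, 8), ["Ridge", "Lasso", "ElasticNet"]],
)
print(decoded)
```

The sketch uses a plain list instead of a NumPy array so that it runs without third-party packages.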

build_warm_start_x0

model_selection.spotoptim_search.build_warm_start_x0(
    search_space,
    forecaster,
    lags_seed,
)

Build a single warm-start point x0 for SpotOptim.

The returned point seeds the optimizer’s first evaluation with the lag configuration lags_seed combined with the current estimator hyperparameters of forecaster. It is expressed in natural scale, because SpotOptim applies any var_trans (e.g. log10) itself when it validates x0. The point is inserted as the first point of the initial design, so the seeded configuration is always evaluated.

Parameters

Name	Type	Description	Default
search_space	`ParameterSet` \| dict[str, Any]	Search space that already contains `str(list(lags_seed))` as a candidate in its `"lags"` factor (see `SpotOptimStrategy.prepare_forecaster`).	required
forecaster	object	The pre-tuning forecaster; its `estimator` supplies the starting values for the numeric hyperparameter dimensions.	required
lags_seed	Any	The lag configuration to seed (e.g. `config.warm_start_lags`).	required

Returns

Name	Type	Description
	np.ndarray \| None	A 1-D float array of length `len(var_name)`, or `None` when the search space has no `"lags"` factor or the seed is not a candidate.

Examples

import numpy as np
from sklearn.linear_model import Ridge
from spotforecast2_safe.forecaster.recursive import ForecasterRecursive
from spotforecast2.model_selection.spotoptim_search import (
    build_warm_start_x0,
)

forecaster = ForecasterRecursive(estimator=Ridge(alpha=2.0), lags=3)
search_space = {"alpha": (0.1, 10.0), "lags": ["[1, 2, 24]", "24"]}
x0 = build_warm_start_x0(search_space, forecaster, [1, 2, 24])
print(x0)
assert x0[0] == 2.0      # Ridge alpha, clipped into (0.1, 10.0)
assert x0[1] == 0.0      # index of "[1, 2, 24]" in the lags candidates

[2. 0.]
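
The layout of x0 described in this section can be sketched without the library. The helper below is hypothetical (`warm_start_sketch` and its `current_params` argument are illustrative names, not part of spotforecast2). It assumes a dict search space whose numeric entries are `(low, high)` tuples and a list-like `lags_seed`; numeric values are clipped into their bounds and the `"lags"` dimension holds the index of `str(list(lags_seed))`.

```python
from typing import Any, Dict, List, Optional


def warm_start_sketch(
    search_space: Dict[str, Any],
    current_params: Dict[str, float],
    lags_seed: Any,
) -> Optional[List[float]]:
    """Hypothetical illustration of the documented x0 layout."""
    candidates = search_space.get("lags")
    seed_label = str(list(lags_seed))
    # No "lags" factor, or the seed is not a candidate: nothing to seed.
    if not isinstance(candidates, list) or seed_label not in candidates:
        return None
    x0: List[float] = []
    for name, spec in search_space.items():
        if name == "lags":
            # Factor dimension: position of the seed among the candidates.
            x0.append(float(candidates.index(seed_label)))
        else:
            # Numeric dimension: current value clipped into (low, high).
            low, high = spec[0], spec[1]
            x0.append(float(min(max(current_params[name], low), high)))
    return x0


space = {"alpha": (0.1, 10.0), "lags": ["[1, 2, 24]", "24"]}
print(warm_start_sketch(space, {"alpha": 2.0}, [1, 2, 24]))
print(warm_start_sketch(space, {"alpha": 50.0}, [1, 2, 24]))
print(warm_start_sketch(space, {"alpha": 2.0}, [1, 2, 3]))
```

The second call shows clipping at the upper bound, and the third shows the `None` result for a seed that is not among the candidates.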

convert_search_space

model_selection.spotoptim_search.convert_search_space(search_space)

Convert search space into SpotOptim compatible format.

Parameters

Name	Type	Description	Default
search_space	`ParameterSet` \| dict[str, Any]	Search space as a SpotOptim ParameterSet or a dictionary.	required

Returns

Name	Type	Description
	tuple[list[Any], list[str], list[str], list[Callable \| None]]	Tuple `(bounds, var_type, var_name, var_trans)` containing:
	list[Any]	- bounds: List of parameter bounds or categories.
	list[str]	- var_type: List of variable types (`"float"`, `"int"`, or `"factor"`).
	list[str]	- var_name: List of variable names.
	list[Callable \| None]	- var_trans: List of transformation functions (e.g., log10) or None.

Examples

from spotoptim.hyperparameters import ParameterSet
from spotforecast2.model_selection.spotoptim_search import (
    convert_search_space,
)

ps = ParameterSet()
_ = ps.add_float("alpha", 0.01, 10.0)
b, t, n, tr = convert_search_space(ps)
print(b)
assert b == [(0.01, 10.0)]
print(t)
assert t == ["float"]

[(0.01, 10.0)]
['float']

from spotforecast2.model_selection.spotoptim_search import convert_search_space

search_space = {
    "learning_rate": (0.001, 0.1, "log10"),
    "max_depth": (2, 10),
    "model_type": ["RandomForest", "XGBoost"],
}

bounds, vt, vn, vtr = convert_search_space(search_space)

for name, typ, bound, trans in zip(vn, vt, bounds, vtr):
    print(f"{name} ({typ}): {bound} | transform: {trans}")

learning_rate (float): (0.001, 0.1) | transform: log10
max_depth (int): (2, 10) | transform: None
model_type (factor): ['RandomForest', 'XGBoost'] | transform: None
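
The dict conventions used in the example above can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: `convert_dict_sketch` is a hypothetical helper, a list is treated as a factor, a tuple of two integers as an `"int"` range, any other tuple as a `"float"` range, and an optional third tuple element is kept as the transform name.

```python
from typing import Any, Dict, List, Optional, Tuple


def convert_dict_sketch(
    search_space: Dict[str, Any],
) -> Tuple[List[Any], List[str], List[str], List[Optional[str]]]:
    """Hypothetical illustration; not the library's implementation."""
    bounds: List[Any] = []
    var_type: List[str] = []
    var_name: List[str] = []
    var_trans: List[Optional[str]] = []
    for name, spec in search_space.items():
        var_name.append(name)
        if isinstance(spec, list):
            # A list of categories becomes a factor dimension.
            bounds.append(list(spec))
            var_type.append("factor")
            var_trans.append(None)
            continue
        low, high = spec[0], spec[1]
        both_int = isinstance(low, int) and isinstance(high, int)
        bounds.append((low, high))
        var_type.append("int" if both_int else "float")
        # Assumption: an optional third element names the transform.
        var_trans.append(spec[2] if len(spec) == 3 else None)
    return bounds, var_type, var_name, var_trans


sk_bounds, sk_types, sk_names, sk_trans = convert_dict_sketch(
    {
        "learning_rate": (0.001, 0.1, "log10"),
        "max_depth": (2, 10),
        "model_type": ["RandomForest", "XGBoost"],
    }
)
for row in zip(sk_names, sk_types, sk_bounds, sk_trans):
    print(row)
```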

parse_lags_from_strings

model_selection.spotoptim_search.parse_lags_from_strings(lags_str)

Parse a lags representation back to a Python object.

Handles three input scenarios:

1. Already an integer or list: returned as is.
2. Single integer as string: "24" → 24.
3. List representation: "[1, 2, 3]" → [1, 2, 3].

Parameters

Name	Type	Description	Default
lags_str	str \| int \| list	Lag specification (string, int, or list).	required

Returns

Name	Type	Description
	int \| list	Either an integer or a list of integers representing lags.

Examples

from spotforecast2.model_selection.spotoptim_search import (
    parse_lags_from_strings,
)

result_int = parse_lags_from_strings(24)
print(result_int)
assert result_int == 24

result_list = parse_lags_from_strings("[1, 2, 3]")
print(result_list)
assert result_list == [1, 2, 3]

result_passthrough = parse_lags_from_strings([4, 8, 12])
print(result_passthrough)
assert result_passthrough == [4, 8, 12]

24
[1, 2, 3]
[4, 8, 12]
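
One way to obtain the three behaviours listed above is the standard library's `ast.literal_eval`. The helper below is a hypothetical sketch (`parse_lags_sketch` is not part of spotforecast2) and makes no claim about how `parse_lags_from_strings` is implemented.

```python
import ast
from typing import List, Union

Lags = Union[int, List[int]]


def parse_lags_sketch(lags: Union[str, int, List[int]]) -> Lags:
    """Hypothetical illustration of the three documented scenarios."""
    if isinstance(lags, str):
        # literal_eval parses "24" to 24 and "[1, 2, 3]" to [1, 2, 3]
        # without executing arbitrary code.
        return ast.literal_eval(lags.strip())
    # Integers and lists are passed through unchanged.
    return lags


print(parse_lags_sketch("24"))
print(parse_lags_sketch("[1, 2, 3]"))
print(parse_lags_sketch([4, 8, 12]))
```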

spotoptim_objective

model_selection.spotoptim_search.spotoptim_objective(
    X,
    forecaster_search,
    cv_name,
    cv,
    metric,
    y,
    exog,
    n_jobs,
    verbose,
    show_progress,
    suppress_warnings,
    var_name,
    var_type,
    bounds,
    all_metric_values,
    all_lags,
    all_params,
    n_trials=None,
)

SpotOptim objective function to evaluate hyperparameter sets.

Evaluates a given array of hyperparameter configurations X and returns an array of the primary metric errors.

Parameters

Name	Type	Description	Default
X	np.ndarray	2D array of hyperparameters from SpotOptim.	required
forecaster_search	object	The forecaster to evaluate.	required
cv_name	str	Type of cross-validation (“TimeSeriesFold” or “OneStepAheadFold”).	required
cv	`TimeSeriesFold` \| `OneStepAheadFold`	Cross-validation configuration.	required
metric	list[Callable]	List of metrics to compute.	required
y	pd.Series	Target time series.	required
exog	pd.Series \| pd.DataFrame \| None	Exogenous variables.	required
n_jobs	int	Number of parallel jobs.	required
verbose	bool	Verbosity level flag.	required
show_progress	bool	Show progress bar flag.	required
suppress_warnings	bool	Suppress warnings flag.	required
var_name	list	Parameter names.	required
var_type	list	Parameter types.	required
bounds	list	Parameter bounds.	required
all_metric_values	list[list[float]]	List to record all metric results.	required
all_lags	list	List to record all evaluated lag configurations.	required
all_params	list[dict]	List to record all evaluated parameters.	required
n_trials	int \| None	Total number of candidate configurations in the SpotOptim budget. Used only to build the coarse-grained “config k/N” label shown as a prefix on each per-fold progress bar (when `show_progress` is `True`). When `None`, the label omits the total and reads “config k”. Does not affect the optimisation.	`None`

Returns

Name	Type	Description
	np.ndarray	1D array of results for the primary metric.

Examples

# Demonstrate the call structure of spotoptim_objective.
import numpy as np
import pandas as pd
from spotforecast2_safe.splitter import TimeSeriesFold
from spotforecast2.model_selection.spotoptim_search import spotoptim_objective

# Mock forecaster for documentation
class MockForecaster:
    def set_params(self, **kwargs): pass
    def set_lags(self, lags): pass

# Provide dummy data and configuration
X = np.array([[0.05], [0.1]])
cv = TimeSeriesFold(initial_train_size=10, steps=2)
metric = [lambda y_true, y_pred: np.mean(np.abs(y_true - y_pred))]

# Track results
metric_vals, lags, params = [], [], []

# A real run would pass these objects to spotoptim_objective together
# with a fitted-capable forecaster; this example only prepares the inputs.
print("Ready to evaluate hyperparameters.")

Ready to evaluate hyperparameters.
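
The contract described for this function (a 2D array in, one primary-metric value per row out, with results appended to the recording lists) can be sketched with a toy scoring function in place of backtesting. Everything here is hypothetical: `objective_sketch` and `toy_score` are illustrative names, and deriving the counter k from the number of recorded configurations is an assumption.

```python
from typing import Callable, Dict, List, Optional, Sequence


def objective_sketch(
    X: Sequence[Sequence[float]],
    var_name: List[str],
    score: Callable[[Dict[str, float]], float],
    all_metric_values: List[List[float]],
    all_params: List[Dict[str, float]],
    n_trials: Optional[int] = None,
) -> List[float]:
    """Hypothetical illustration of the objective's input/output shape."""
    results: List[float] = []
    for row in X:
        params = dict(zip(var_name, row))
        # Assumption: k counts configurations evaluated so far, plus one.
        k = len(all_params) + 1
        label = f"config {k}/{n_trials}" if n_trials is not None else f"config {k}"
        value = score(params)
        all_params.append(params)
        all_metric_values.append([value])
        results.append(value)
        print(label, params, value)
    return results


def toy_score(params: Dict[str, float]) -> float:
    # Stand-in for a backtesting error; smallest at alpha = 0.1.
    return abs(params["alpha"] - 0.1)


recorded_metrics: List[List[float]] = []
recorded_params: List[Dict[str, float]] = []
errors = objective_sketch(
    [[0.05], [0.1]], ["alpha"], toy_score, recorded_metrics, recorded_params, n_trials=2
)
```

The recording lists are mutated in place, which mirrors how `all_metric_values`, `all_lags`, and `all_params` are described in the parameter table.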

spotoptim_search

model_selection.spotoptim_search.spotoptim_search(
    forecaster,
    y,
    cv,
    search_space,
    metric,
    exog=None,
    n_trials=10,
    n_initial=5,
    random_state=123,
    return_best=True,
    n_jobs='auto',
    verbose=False,
    show_progress=True,
    suppress_warnings=False,
    output_file=None,
    kwargs_spotoptim=None,
)

Core implementation of the SpotOptim search logic.

This function performs the hyperparameter optimization process using SpotOptim, evaluating configurations via cross-validation or one-step-ahead forecasting.

Parameters

Name	Type	Description	Default
forecaster	object	The initial forecaster object.	required
y	pd.Series	The target time series.	required
cv	`TimeSeriesFold` \| `OneStepAheadFold`	Cross-validation or one-step-ahead configuration.	required
search_space	`ParameterSet` \| Dict[str, Any]	Parameter bounds for SpotOptim.	required
metric	str \| Callable \| list[str \| Callable]	Optimization metric(s).	required
exog	pd.Series \| pd.DataFrame \| None	Exogenous variables.	`None`
n_trials	int	Maximum number of trials.	`10`
n_initial	int	Number of initial evaluations.	`5`
random_state	int	Random seed.	`123`
return_best	bool	Refit internal forecaster with best params.	`True`
n_jobs	int \| str	Number of parallel jobs.	`'auto'`
verbose	bool	Verbosity flag.	`False`
show_progress	bool	Show progress bar flag.	`True`
suppress_warnings	bool	Suppress warnings during evaluation.	`False`
output_file	str \| None	File to save results to.	`None`
kwargs_spotoptim	dict \| None	Additional args for SpotOptim.	`None`

Returns

Name	Type	Description
tuple	tuple[pd.DataFrame, object]	`(results_df, optimizer)`

Examples

import numpy as np
import pandas as pd
from sklearn.linear_model import Ridge
from spotforecast2_safe.forecaster.recursive import ForecasterRecursive
from spotforecast2_safe.splitter import TimeSeriesFold
from spotforecast2.model_selection.spotoptim_search import spotoptim_search

np.random.seed(42)
y = pd.Series(
    np.random.randn(100).cumsum(),
    index=pd.date_range("2022-01-01", periods=100, freq="h"),
)
forecaster = ForecasterRecursive(estimator=Ridge(), lags=3)
cv = TimeSeriesFold(steps=5, initial_train_size=80, refit=False)
search_space = {"alpha": (0.01, 10.0)}

results, _ = spotoptim_search(
    forecaster=forecaster,
    y=y,
    cv=cv,
    search_space=search_space,
    metric="mean_absolute_error",
    n_trials=2,
    n_initial=1,
    return_best=False,
    show_progress=False,
)

print(f"Evaluated {len(results)} configurations.")

Evaluated 2 configurations.
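
The example above evaluates two configurations with n_trials=2 and n_initial=1, which matches reading n_trials as the total budget (initial plus sequential evaluations). The arithmetic can be sketched as follows; `split_budget_sketch` is a hypothetical helper, and capping the initial design at the total budget is an assumption rather than documented behaviour.

```python
from typing import Tuple


def split_budget_sketch(n_trials: int, n_initial: int) -> Tuple[int, int]:
    """Hypothetical split of the budget into (initial, sequential) counts."""
    if n_trials < 1 or n_initial < 0:
        raise ValueError("n_trials must be >= 1 and n_initial must be >= 0")
    # Assumption: the initial design never exceeds the total budget.
    initial = min(n_initial, n_trials)
    return initial, n_trials - initial


print(split_budget_sketch(10, 5))  # the documented defaults
print(split_budget_sketch(2, 1))   # the example above
```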

spotoptim_search_forecaster

model_selection.spotoptim_search.spotoptim_search_forecaster(
    forecaster,
    y,
    cv,
    search_space,
    metric,
    exog=None,
    n_trials=10,
    n_initial=5,
    random_state=123,
    return_best=True,
    n_jobs='auto',
    verbose=False,
    show_progress=True,
    suppress_warnings=False,
    output_file=None,
    kwargs_spotoptim=None,
)

Hyperparameter optimisation for a Forecaster using SpotOptim.

Drop-in alternative to bayesian_search_forecaster() that uses the SpotOptim surrogate-model-based optimizer instead of Optuna’s TPE sampler.

Parameters

Name	Type	Description	Default
forecaster	object	Forecaster model (e.g. `ForecasterRecursive`).	required
y	pd.Series	Training time series. Must have a datetime or numeric index.	required
cv	`TimeSeriesFold` \| `OneStepAheadFold`	Cross-validation strategy: `TimeSeriesFold` or `OneStepAheadFold`.	required
search_space	`ParameterSet` \| Dict[str, Any]	Hyperparameter search space. Either a `ParameterSet` or a plain `dict` (see examples below).	required
metric	str \| Callable \| list[str \| Callable]	Metric name, callable, or list thereof.	required
exog	pd.Series \| pd.DataFrame \| None	Optional exogenous variable(s).	`None`
n_trials	int	Total evaluations (initial + sequential).	`10`
n_initial	int	Random initial points before surrogate kicks in.	`5`
random_state	int	RNG seed.	`123`
return_best	bool	Re-fit forecaster with best params after search.	`True`
n_jobs	int \| str	Parallel jobs for backtesting (`"auto"` or int).	`'auto'`
verbose	bool	Print optimisation progress.	`False`
show_progress	bool	Show progress bar during backtesting/validation.	`True`
suppress_warnings	bool	Suppress spotforecast warnings.	`False`
output_file	str \| None	Save results as TSV to this path.	`None`
kwargs_spotoptim	dict \| None	Extra kwargs passed to `SpotOptim()`.	`None`

Returns

Name	Type	Description
tuple	tuple[pd.DataFrame, object]	`(results, optimizer)` where results is a sorted `DataFrame` and optimizer is the `SpotOptim` instance.

Raises

Name	Type	Description
	ValueError	If `exog` length ≠ `y` length and `return_best` is True.
	TypeError	If `cv` is not `TimeSeriesFold` or `OneStepAheadFold`.

Examples

# 1 — Dict-based search space (no ParameterSet needed):
import numpy as np
import pandas as pd
from sklearn.linear_model import Ridge
from spotforecast2_safe.forecaster.recursive import ForecasterRecursive
from spotforecast2_safe.splitter import TimeSeriesFold
from spotforecast2.model_selection import spotoptim_search_forecaster

np.random.seed(42)
y = pd.Series(
    np.random.randn(200).cumsum(),
    index=pd.date_range("2022-01-01", periods=200, freq="h"),
    name="load",
)

forecaster = ForecasterRecursive(estimator=Ridge(), lags=5)
cv = TimeSeriesFold(
    steps=5,
    initial_train_size=150,
    refit=False,
)

search_space = {"alpha": (0.01, 10.0)}

results, optimizer = spotoptim_search_forecaster(
    forecaster=forecaster,
    y=y,
    cv=cv,
    search_space=search_space,
    metric="mean_absolute_error",
    n_trials=5,
    n_initial=3,
    random_state=42,
    return_best=False,
    verbose=False,
    show_progress=False,
)

print(f"Is DataFrame: {isinstance(results, pd.DataFrame)}")
print(f"Contains 'alpha': {'alpha' in results.columns}")

Is DataFrame: True
Contains 'alpha': True

# 2 — ParameterSet-based search space:
import numpy as np
import pandas as pd
from sklearn.linear_model import Ridge
from spotoptim.hyperparameters import ParameterSet
from spotforecast2_safe.forecaster.recursive import ForecasterRecursive
from spotforecast2_safe.splitter import TimeSeriesFold
from spotforecast2.model_selection import spotoptim_search_forecaster

np.random.seed(42)
y = pd.Series(
    np.random.randn(200).cumsum(),
    index=pd.date_range("2022-01-01", periods=200, freq="h"),
    name="load",
)
cv = TimeSeriesFold(steps=5, initial_train_size=150, refit=False)

ps = ParameterSet()
_ = ps.add_float("alpha", low=0.01, high=10.0)

results2, _ = spotoptim_search_forecaster(
    forecaster=ForecasterRecursive(estimator=Ridge(), lags=5),
    y=y,
    cv=cv,
    search_space=ps,
    metric="mean_absolute_error",
    n_trials=5,
    n_initial=3,
    return_best=False,
    verbose=False,
    show_progress=False,
)

print(f"Number of configurations evaluated: {len(results2)}")

Number of configurations evaluated: 5