model_selection.bayesian_search
Bayesian hyperparameter search functions for forecasters using Optuna.
Functions
| Name | Description |
|---|---|
| bayesian_search_forecaster | Bayesian hyperparameter optimization for a Forecaster using Optuna. |
bayesian_search_forecaster
model_selection.bayesian_search.bayesian_search_forecaster(
forecaster,
y,
cv,
search_space,
metric,
exog=None,
n_trials=10,
random_state=123,
return_best=True,
n_jobs='auto',
verbose=False,
show_progress=False,
suppress_warnings=False,
output_file=None,
kwargs_create_study=None,
kwargs_study_optimize=None,
)
Bayesian hyperparameter optimization for a Forecaster using Optuna.
Performs Bayesian hyperparameter search using the Optuna library for a Forecaster object. Validation is done using time series backtesting with the provided cross-validation strategy.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| forecaster | object | Forecaster model. Can be ForecasterRecursive, ForecasterDirect, or any compatible forecaster class. | required |
| y | pd.Series | Training time series values. Must be a pandas Series with a datetime or numeric index. | required |
| cv | TimeSeriesFold | OneStepAheadFold | Cross-validation strategy with information needed to split the data into folds. Must be an instance of TimeSeriesFold or OneStepAheadFold. | required |
| search_space | Callable | Function that takes a single argument, trial, and returns a dictionary with parameter names (str) as keys and values suggested by the Optuna trial (trial.suggest_float, trial.suggest_int, trial.suggest_categorical). Can optionally include a ‘lags’ key to search over different lag configurations. | required |
| metric | str | Callable | list[str | Callable] | Metric(s) to quantify model goodness of fit. Can be: - str: One of ‘mean_squared_error’, ‘mean_absolute_error’, ‘mean_absolute_percentage_error’, ‘mean_squared_log_error’, ‘mean_absolute_scaled_error’, ‘root_mean_squared_scaled_error’. - Callable: Function with arguments (y_true, y_pred) or (y_true, y_pred, y_train) that returns a float. - list: List containing multiple strings and/or Callables. | required |
| exog | pd.Series | pd.DataFrame | None | Exogenous variable(s) included as predictors. Must have the same number of observations as y and aligned so that y[i] is regressed on exog[i]. Default is None. | None |
| n_trials | int | Number of parameter settings sampled during optimization. Default is 10. | 10 |
| random_state | int | Seed for sampling reproducibility. When passing a custom sampler in kwargs_create_study, set the seed within the sampler (e.g., {'sampler': TPESampler(seed=145)}). Default is 123. | 123 |
| return_best | bool | If True, refit the forecaster using the best parameters found on the whole dataset at the end. Default is True. | True |
| n_jobs | int | str | Number of parallel jobs. If -1, uses all cores. If ‘auto’, uses spotforecast.skforecast.utils.select_n_jobs_backtesting to automatically determine the number of jobs. Default is ‘auto’. | 'auto' |
| verbose | bool | If True, print number of folds used for cross-validation. Default is False. | False |
| show_progress | bool | Whether to show an Optuna progress bar during optimization. Default is False. | False |
| suppress_warnings | bool | If True, suppress spotforecast warnings during hyperparameter search. Default is False. | False |
| output_file | str | None | Filename or full path to save results as TSV. If None, results are not saved to file. Default is None. | None |
| kwargs_create_study | dict | None | Additional keyword arguments passed to optuna.create_study(). If not specified, direction is set to ‘minimize’ and TPESampler(seed=123) is used. Default is None. | None |
| kwargs_study_optimize | dict | None | Additional keyword arguments passed to study.optimize(). Default is None. | None |
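The `search_space` callable described in the table can be sketched as follows. This is a minimal illustration, not the library's own example: the hyperparameter names (`n_estimators`, `max_depth`, `learning_rate`), their ranges, and the candidate lag values are assumptions chosen for a generic gradient-boosting regressor. Only the `trial.suggest_*` methods and the optional `lags` key come from the description above.

```python
def search_space(trial):
    """Return a dict mapping parameter names to values suggested by the trial."""
    return {
        # Hypothetical regressor hyperparameters, for illustration only.
        "n_estimators": trial.suggest_int("n_estimators", 50, 300),
        "max_depth": trial.suggest_int("max_depth", 2, 10),
        # log=True samples the learning rate on a logarithmic scale.
        "learning_rate": trial.suggest_float("learning_rate", 0.01, 0.3, log=True),
        # Optional 'lags' key: search over alternative lag configurations.
        "lags": trial.suggest_categorical("lags", [12, 24, 48]),
    }
```

The function only needs an object exposing the `suggest_*` methods, so it can be checked in isolation before being passed as the `search_space` argument.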
Returns
| Name | Type | Description |
|---|---|---|
| | tuple[pd.DataFrame, object] | A tuple containing: - results: DataFrame with columns ‘lags’, ‘params’, metric values, and individual parameter columns. Sorted by the first metric. - best_trial: Best optimization result as an optuna.trial.FrozenTrial object containing the best parameters and metric value. |
Raises
| Name | Type | Description |
|---|---|---|
| | ValueError | If exog length doesn’t match y length when return_best=True. |
| | TypeError | If cv is not an instance of TimeSeriesFold or OneStepAheadFold. |
| | ValueError | If metric list contains duplicate metric names. |
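The `metric` argument accepts callables with the `(y_true, y_pred)` signature, and a list of metrics must not contain duplicate names. The sketch below shows one such custom metric together with a pre-check for duplicates. The helpers `metric_names` and `has_duplicate_names` are hypothetical, and the assumption that a callable's name is taken from its `__name__` attribute is not confirmed by this page.

```python
def symmetric_mape(y_true, y_pred):
    """Custom metric with the (y_true, y_pred) signature; returns a float."""
    terms = [
        2.0 * abs(t - p) / (abs(t) + abs(p))
        for t, p in zip(y_true, y_pred)
        if abs(t) + abs(p) > 0  # skip pairs where both values are zero
    ]
    return sum(terms) / len(terms) if terms else 0.0


def metric_names(metrics):
    """Hypothetical helper: the string itself, or the callable's __name__ (assumed)."""
    return [m if isinstance(m, str) else m.__name__ for m in metrics]


def has_duplicate_names(metrics):
    """Hypothetical pre-check for the duplicate-name ValueError listed above."""
    names = metric_names(metrics)
    return len(names) != len(set(names))


# A mixed list of a built-in metric name and a custom callable.
metrics = ["mean_absolute_error", symmetric_mape]
```

Running `has_duplicate_names(metrics)` before starting a search is a cheap way to catch a repeated metric early rather than at the first trial.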