Bayesian hyperparameter optimization for a Forecaster using Optuna.
Searches the hyperparameter space of a Forecaster object using the Optuna library. Each candidate configuration is validated by time series backtesting with the provided cross-validation strategy.
Function that takes a single argument, trial (an Optuna Trial object), and returns a dictionary with parameter names (str) as keys and values sampled through the trial's suggest methods (trial.suggest_float, trial.suggest_int, trial.suggest_categorical). Can optionally include a 'lags' key to search over different lag configurations.
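A minimal sketch of such a search-space function is shown below. The hyperparameter names (`n_estimators`, `max_depth`, `learning_rate`) and the candidate lag values are assumptions chosen for illustration; the valid names depend on the estimator wrapped by the Forecaster.

```python
def search_space(trial):
    """Return one candidate configuration for a single Optuna trial.

    The parameter names below are hypothetical and assume a
    gradient-boosting style estimator; adapt them to your model.
    """
    return {
        # Integer hyperparameters sampled from an inclusive range.
        "n_estimators": trial.suggest_int("n_estimators", 50, 500),
        "max_depth": trial.suggest_int("max_depth", 2, 10),
        # Float sampled on a log scale, useful for rates spanning several
        # orders of magnitude.
        "learning_rate": trial.suggest_float("learning_rate", 1e-3, 0.3, log=True),
        # Optional 'lags' key: each choice is one lag configuration to try.
        "lags": trial.suggest_categorical("lags", [3, 7, 14]),
    }
```

Each key passed to a suggest method matches the dictionary key, which keeps the names recorded by Optuna consistent with the names in the returned dictionary.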
Metric(s) to quantify model goodness of fit. Can be:
- str: One of 'mean_squared_error', 'mean_absolute_error', 'mean_absolute_percentage_error', 'mean_squared_log_error', 'mean_absolute_scaled_error', 'root_mean_squared_scaled_error'.
- Callable: Function with arguments (y_true, y_pred) or (y_true, y_pred, y_train) that returns a float.
- list: List containing multiple strings and/or Callables.
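As an illustration of the Callable form with the three-argument signature, the sketch below scales the forecast error by the in-sample error of a one-step naive forecast. The function name `scaled_mae` is hypothetical, and the sketch assumes all three inputs are one-dimensional and array-like.

```python
import numpy as np


def scaled_mae(y_true, y_pred, y_train):
    """MAE of the forecast divided by the in-sample MAE of a naive forecast.

    Hypothetical custom metric; assumes y_train has at least two
    observations and is not constant (otherwise the scale is zero).
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    y_train = np.asarray(y_train, dtype=float)

    # Error of predicting each training value with the previous one.
    naive_mae = np.mean(np.abs(np.diff(y_train)))
    forecast_mae = np.mean(np.abs(y_true - y_pred))
    return float(forecast_mae / naive_mae)
```

A function like this could be passed on its own or inside a list together with built-in metric names.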
Exogenous variable(s) included as predictors. Must have the same number of observations as y and aligned so that y[i] is regressed on exog[i]. Default is None.
Random seed for the sampler, used to make the search reproducible. When passing a custom sampler in kwargs_create_study, set the seed within the sampler itself (e.g., {'sampler': TPESampler(seed=145)}). Default is 123.
Number of parallel jobs. If -1, uses all cores. If 'auto', uses spotforecast.skforecast.utils.select_n_jobs_backtesting to automatically determine the number of jobs. Default is 'auto'.
Additional keyword arguments passed to optuna.create_study(). If not specified, direction is set to 'minimize' and TPESampler(seed=123) is used. Default is {}.
tuple[pd.DataFrame, object]: A tuple containing:
- results: DataFrame with columns 'lags', 'params', metric values, and individual parameter columns. Sorted by the first metric.
- best_trial: Best optimization result as an optuna.trial.FrozenTrial object containing the best parameters and metric value.