spotforecast2
model_selection.random_search

Random search hyperparameter optimization for forecasters.

Functions

Name                      Description
random_search_forecaster  Random search over parameter distributions for a Forecaster.

random_search_forecaster

model_selection.random_search.random_search_forecaster(
    forecaster,
    y,
    cv,
    param_distributions,
    metric,
    exog=None,
    lags_grid=None,
    n_iter=10,
    random_state=123,
    return_best=True,
    n_jobs='auto',
    verbose=False,
    show_progress=True,
    suppress_warnings=False,
    output_file=None,
)

Random search over parameter distributions for a Forecaster.

Samples n_iter parameter settings from the given distributions for each lags configuration and evaluates every sampled setting with time series backtesting under the provided cross-validation strategy. Because the number of evaluated settings is fixed by n_iter rather than by the size of the full grid, this is usually cheaper than grid search when the parameter space is large.
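To illustrate the cost difference between enumerating a grid and sampling a fixed budget, here is a minimal sketch that uses only the Python standard library. It does not call spotforecast2: the parameter names (alpha, max_depth) and the helper sample_settings are hypothetical, and the library's actual sampling mechanism may differ.

```python
import itertools
import random

# Hypothetical search space: candidate values per parameter.
grid = {
    "alpha": [0.01, 0.1, 1.0, 10.0, 100.0],
    "max_depth": [2, 4, 6, 8],
}

# Grid search evaluates every combination: 5 * 4 = 20 settings.
grid_settings = [
    dict(zip(grid, values)) for values in itertools.product(*grid.values())
]


def sample_settings(space, n_iter, seed):
    """Draw n_iter settings; callables are sampled, lists are chosen from."""
    rng = random.Random(seed)
    settings = []
    for _ in range(n_iter):
        setting = {}
        for name, candidates in space.items():
            if callable(candidates):
                setting[name] = candidates(rng)
            else:
                setting[name] = rng.choice(candidates)
        settings.append(setting)
    return settings


# Random search: a continuous range for alpha, a discrete list for max_depth.
space = {
    "alpha": lambda rng: rng.uniform(0.1, 10.1),
    "max_depth": [2, 4, 6, 8],
}
random_settings = sample_settings(space, n_iter=10, seed=123)

print(len(grid_settings))    # 20
print(len(random_settings))  # 10
```

The grid grows multiplicatively with every added parameter, while the sampled budget stays at n_iter regardless of how many parameters are searched. Fixing the seed makes the drawn settings reproducible, which is the role random_state plays in the signature above.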

Parameters

forecaster : object, required
    Forecaster model (ForecasterRecursive or ForecasterDirect).
y : pd.Series, required
    Training time series.
cv : TimeSeriesFold | OneStepAheadFold, required
    Cross-validation strategy (TimeSeriesFold or OneStepAheadFold) with the information needed to split the data into folds.
param_distributions : dict, required
    Dictionary with parameter names (str) as keys and distributions or lists of parameters to try as values. Use scipy.stats distributions for continuous parameters.
metric : str | Callable | list[str | Callable], required
    Metric(s) to quantify model goodness of fit.
    If str: 'mean_squared_error', 'mean_absolute_error', 'mean_absolute_percentage_error', 'mean_squared_log_error', 'mean_absolute_scaled_error', 'root_mean_squared_scaled_error'.
    If Callable: function with arguments (y_true, y_pred, y_train) that returns a float.
    If list: multiple strings and/or Callables.
exog : pd.Series | pd.DataFrame | None, default None
    Exogenous variable(s) included as predictors. Must have the same number of observations as y and be aligned so that y[i] is regressed on exog[i].
lags_grid : list[int | list[int] | np.ndarray[int] | range[int]] | dict[str, list[int | list[int] | np.ndarray[int] | range[int]]] | None, default None
    Lists of lags to try. Entries can be int, lists, numpy ndarray, or range objects. If a dict, its keys are used as labels in the results DataFrame.
n_iter : int, default 10
    Number of parameter settings sampled per lags configuration. Trades off runtime against solution quality.
random_state : int, default 123
    Seed for the random sampling, for reproducible output.
return_best : bool, default True
    If True, refit the forecaster on the whole dataset using the best parameters found.
n_jobs : int | str, default 'auto'
    Number of jobs to run in parallel. If -1, uses all cores. If 'auto', uses select_n_jobs_backtesting.
verbose : bool, default False
    If True, print the number of folds used for cv.
show_progress : bool, default True
    Whether to show a progress bar.
suppress_warnings : bool, default False
    If True, suppress spotforecast warnings during the hyperparameter search.
output_file : str | None, default None
    Filename or full path to save the results as TSV. If None, results are not saved to file.
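The metric parameter accepts a callable with arguments (y_true, y_pred, y_train) that returns a float. The following is a minimal sketch of such a callable, written with the standard library only. The name scaled_mae is hypothetical, and the function is a simplified illustration, not a reimplementation of the built-in 'mean_absolute_scaled_error'.

```python
def scaled_mae(y_true, y_pred, y_train):
    """Mean absolute error divided by the in-sample naive one-step MAE.

    Hypothetical custom metric; accepts any iterable of numbers.
    """
    y_true = [float(v) for v in y_true]
    y_pred = [float(v) for v in y_pred]
    y_train = [float(v) for v in y_train]

    # Forecast error on the validation window.
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
    # Error of the naive forecast y[t] = y[t-1] on the training data.
    naive = sum(abs(b - a) for a, b in zip(y_train, y_train[1:])) / (len(y_train) - 1)
    return float(mae / naive)


score = scaled_mae(
    y_true=[3.0, 5.0],
    y_pred=[2.0, 7.0],
    y_train=[1.0, 2.0, 4.0, 3.0, 3.0],
)
print(score)  # 1.5
```

Under the documented contract, such a function would presumably be passed as metric=scaled_mae, or inside a list alongside string metric names. The sketch assumes y_train has at least two values that are not all equal, since the denominator is zero otherwise.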

Returns

pd.DataFrame
    Results for each parameter combination, with columns: lags (lags configuration), lags_label (descriptive label), params (parameters configuration), the metric value (in a column named after the metric, e.g. mean_squared_error), and additional columns with param=value pairs.
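When output_file is set, the results are saved as TSV. The sketch below shows one way such a file could be read back and the best row selected using only the standard library. The TSV text is placeholder content for illustration, not real output: the column names follow the description above, while the exact file layout and the formatting of the lags and params cells are assumptions.

```python
import csv
import io

# Placeholder TSV content for illustration only; these are not real results.
tsv_text = (
    "lags\tlags_label\tparams\tmean_squared_error\testimator__alpha\n"
    "[1 2 3]\t[1 2 3]\t{'estimator__alpha': 3.8}\t1.20\t3.8\n"
    "[1 2 3]\t[1 2 3]\t{'estimator__alpha': 9.6}\t1.05\t9.6\n"
    "[1 2 3]\t[1 2 3]\t{'estimator__alpha': 0.7}\t1.31\t0.7\n"
)

# In practice, open the path given as output_file instead of a StringIO.
rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))

# Lower is better for an error metric such as mean squared error.
best = min(rows, key=lambda row: float(row["mean_squared_error"]))
print(best["estimator__alpha"])  # 9.6
```

Values read through csv arrive as strings, so the metric is converted with float before comparison.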

Examples

Basic random search with continuous parameter distributions:

>>> import pandas as pd
>>> import numpy as np
>>> from sklearn.linear_model import Ridge
>>> from scipy.stats import uniform
>>> from spotforecast2_safe.forecaster.recursive import ForecasterRecursive
>>> from spotforecast2.model_selection import TimeSeriesFold
>>> from spotforecast2.model_selection.random_search import random_search_forecaster
>>>
>>> # Create sample data
>>> np.random.seed(123)
>>> y = pd.Series(np.random.randn(50), name='y')
>>>
>>> # Set up forecaster and cross-validation
>>> forecaster = ForecasterRecursive(estimator=Ridge(), lags=3)
>>> cv = TimeSeriesFold(steps=3, initial_train_size=20, refit=False)
>>>
>>> # Define parameter distributions with scipy.stats
>>> param_distributions = {
...     'estimator__alpha': uniform(0.1, 10.0)  # loc=0.1, scale=10.0: uniform on [0.1, 10.1]
... }
>>>
>>> # Run random search
>>> results = random_search_forecaster(
...     forecaster=forecaster,
...     y=y,
...     cv=cv,
...     param_distributions=param_distributions,
...     metric='mean_squared_error',
...     n_iter=5,
...     random_state=42,
...     return_best=False,
...     verbose=False,
...     show_progress=False
... )
>>>
>>> # Check results
>>> print(results.shape[0])
5
>>> print('estimator__alpha' in results.columns)
True
>>> print('mean_squared_error' in results.columns)
True
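The example above uses a single lags setting, so n_iter=5 yields 5 rows. Since n_iter is documented as the number of settings sampled per lags configuration, passing a lags_grid should multiply the number of evaluated candidates. The sketch below only counts candidates and does not call the library; the labels in lags_grid are hypothetical, and the actual row count may be lower if the search space has fewer distinct settings than n_iter.

```python
# Hypothetical lags_grid in dict form: keys act as labels in the results.
lags_grid = {
    "last_3": 3,
    "weekly": [1, 7],
    "first_4": range(1, 5),
}
n_iter = 5

# One candidate per (lags configuration, sampled parameter setting) pair.
candidates = [
    (label, draw) for label in lags_grid for draw in range(n_iter)
]
print(len(candidates))  # 15
```

Under that reading, runtime scales with len(lags_grid) * n_iter backtests, which is worth keeping in mind when choosing both values.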
 

Copyright © 2024-2026 bartzbeielstein | AGPL-3.0-or-later
