forecaster.utils

Functions

Name Description
check_residuals_input Check residuals input arguments in Forecasters.
date_to_index_position Transform a datetime string or pandas Timestamp to an integer. The integer represents the position of the datetime in the index.
exog_to_direct_numpy Transforms exog to numpy ndarray with the shape needed for Direct forecasting.
initialize_estimator Helper to handle the deprecation of ‘regressor’ in favor of ‘estimator’.
initialize_transformer_series Initialize transformer_series_ attribute for multivariate/multiseries forecasters.
initialize_window_features Check window_features argument input and generate the corresponding list.
predict_multivariate Generate multi-output predictions using multiple baseline forecasters.
prepare_steps_direct Prepare list of steps to be predicted in Direct Forecasters.
select_n_jobs_fit_forecaster Select the number of jobs to run in parallel.
transform_numpy Transform raw values of a numpy ndarray with a scikit-learn-like transformer, preprocessor or ColumnTransformer.

check_residuals_input

forecaster.utils.check_residuals_input(
    forecaster_name,
    use_in_sample_residuals,
    in_sample_residuals_,
    out_sample_residuals_,
    use_binned_residuals,
    in_sample_residuals_by_bin_,
    out_sample_residuals_by_bin_,
    levels=None,
    encoding=None,
)

Check residuals input arguments in Forecasters.

Parameters

Name Type Description Default
forecaster_name str Forecaster name. required
use_in_sample_residuals bool Indicates whether in-sample or out-of-sample residuals are used. required
in_sample_residuals_ np.ndarray | dict[str, np.ndarray] | None Residuals of the model when predicting training data. required
out_sample_residuals_ np.ndarray | dict[str, np.ndarray] | None Residuals of the model when predicting non-training data. required
use_binned_residuals bool Indicates whether residuals are binned. required
in_sample_residuals_by_bin_ dict[str | int, np.ndarray | dict[int, np.ndarray]] | None In-sample residuals binned according to the predicted value each residual is associated with. required
out_sample_residuals_by_bin_ dict[str | int, np.ndarray | dict[int, np.ndarray]] | None Out-of-sample residuals binned according to the predicted value each residual is associated with. required
levels list[str] | None Names of the series (levels) to be predicted (multiseries forecasters). None
encoding str | None Encoding used to identify the different series (ForecasterRecursiveMultiSeries). None

Returns

Name Type Description
None None

date_to_index_position

forecaster.utils.date_to_index_position(
    index,
    date_input,
    method='prediction',
    date_literal='steps',
    kwargs_pd_to_datetime=None,
)

Transform a datetime string or pandas Timestamp to an integer. The integer represents the position of the datetime in the index.

Parameters

Name Type Description Default
index pd.Index Original datetime index (must be a pandas DatetimeIndex if date_input is not an int). required
date_input int | str | pd.Timestamp Datetime to transform to an integer. If int, the same integer is returned. If str or pandas Timestamp, it is converted to a datetime and its position relative to the index is computed. required
method str Can be 'prediction' or 'validation'. If 'prediction', the date must be later than the last date in the index. If 'validation', the date must be within the index range. 'prediction'
date_literal str Variable name used in error messages. 'steps'
kwargs_pd_to_datetime dict | None Additional keyword arguments to pass to pd.to_datetime(). None

Returns

Name Type Description
int Integer position of date_input in the index. If date_input is an integer, the same integer is returned. If method is 'prediction', the number of steps to predict from the last date in the index. If method is 'validation', the position of the date in the index plus one, so that the target date is included in the training set when slicing with pandas iloc.
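
The return rules above can be illustrated with plain pandas. The helper below, `sketch_date_position`, is a hypothetical re-implementation written only for illustration, not the library's code. It assumes the index is a `DatetimeIndex` with a set `freq` and ignores `date_literal` and `kwargs_pd_to_datetime`.

```python
import pandas as pd


def sketch_date_position(index, date_input, method="prediction"):
    """Illustrative only: mimic the documented return rules."""
    if isinstance(date_input, int):
        return date_input  # integers are returned unchanged
    target = pd.to_datetime(date_input)
    first, last = index[0], index[-1]
    if method == "prediction":
        if target <= last:
            raise ValueError("date must be later than the last date in the index")
        # Steps from the last date to the target (the last date itself is excluded).
        return len(pd.date_range(start=last, end=target, freq=index.freq)) - 1
    if method == "validation":
        if not first <= target <= last:
            raise ValueError("date must be within the index range")
        # Position plus one, so index[:result] includes the target date.
        return len(pd.date_range(start=first, end=target, freq=index.freq))
    raise ValueError("method must be 'prediction' or 'validation'")


index = pd.date_range("2024-01-01", periods=5, freq="D")  # Jan 1 .. Jan 5
steps = sketch_date_position(index, "2024-01-08", method="prediction")
cutoff = sketch_date_position(index, "2024-01-03", method="validation")
print(steps, cutoff)
```

With 'validation', the returned value is meant to be used as a slice end, so `index[:cutoff]` ends at the target date.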

exog_to_direct_numpy

forecaster.utils.exog_to_direct_numpy(exog, steps)

Transforms exog to numpy ndarray with the shape needed for Direct forecasting.

Parameters

Name Type Description Default
exog np.ndarray | pd.Series | pd.DataFrame Exogenous variables, shape (samples,). If exog is a pandas object, the direct exog names are also created. required
steps int Number of steps that will be predicted using exog. required

Returns

Name Type Description
tuple[np.ndarray, list[str] | None] A tuple containing: - exog_direct (numpy ndarray): Transformed exogenous variables. - exog_direct_names (list or None): Names of the columns of the transformed exogenous variables. Only created if exog is a pandas Series or DataFrame.
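
To give a feel for "the shape needed for Direct forecasting", here is a minimal numpy sketch. It assumes the usual direct layout, in which each output row holds the exogenous values for steps 1 to `steps` side by side. `sketch_exog_to_direct` is a hypothetical helper, and the exact layout and column naming used by the library are not confirmed by this page.

```python
import numpy as np


def sketch_exog_to_direct(exog, steps):
    """Illustrative only: stack `steps` shifted copies of exog column-wise."""
    exog = np.asarray(exog, dtype=float)
    if exog.ndim == 1:
        exog = exog.reshape(-1, 1)  # treat a 1-D input as a single column
    n_rows = len(exog)
    # Block i holds the exog values aligned with step i + 1.
    blocks = [exog[i : n_rows - (steps - 1 - i)] for i in range(steps)]
    return np.concatenate(blocks, axis=1)


exog = np.arange(5.0)  # 5 samples of one exogenous variable
exog_direct = sketch_exog_to_direct(exog, steps=2)
print(exog_direct.shape)
```

Under this assumption, an input with n samples and c columns yields an array of shape (n - steps + 1, steps * c).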

initialize_estimator

forecaster.utils.initialize_estimator(estimator=None, regressor=None)

Helper to handle the deprecation of ‘regressor’ in favor of ‘estimator’. Returns the valid estimator object.

Parameters

Name Type Description Default
estimator object | None An instance of an estimator or pipeline compatible with the scikit-learn API. None
regressor object | None Deprecated. An instance of an estimator or pipeline compatible with the scikit-learn API. None

Returns

Name Type Description
None estimator or pipeline compatible with the scikit-learn API The valid estimator object.
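
A deprecation helper of this kind typically follows the pattern sketched below. `sketch_initialize_estimator` is a hypothetical stand-in. The warning category, the message, and the rule that `estimator` takes precedence when both arguments are given are assumptions, not documented behavior.

```python
import warnings


def sketch_initialize_estimator(estimator=None, regressor=None):
    """Illustrative only: accept the deprecated name but return one object."""
    if regressor is not None:
        warnings.warn(
            "`regressor` is deprecated, use `estimator` instead.",
            FutureWarning,
            stacklevel=2,
        )
        if estimator is None:
            estimator = regressor  # fall back to the deprecated argument
    return estimator


model = object()  # stand-in for a scikit-learn compatible estimator
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    resolved = sketch_initialize_estimator(regressor=model)
print(resolved is model, len(caught))
```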

initialize_transformer_series

forecaster.utils.initialize_transformer_series(
    forecaster_name,
    series_names_in_,
    encoding=None,
    transformer_series=None,
)

Initialize transformer_series_ attribute for multivariate/multiseries forecasters.

Creates a dictionary of transformers for each time series in multivariate or multiseries forecasting. Handles three cases: no transformation (None), same transformer for all series (single object), or different transformers per series (dictionary). Clones transformer objects to avoid overwriting.

Parameters

Name Type Description Default
forecaster_name str Name of the forecaster using this function. Special handling is applied for ‘ForecasterRecursiveMultiSeries’. required
series_names_in_ list[str] Names of the time series (levels) used during training. These will be the keys in the returned transformer dictionary. required
encoding str | None Encoding used to identify different series. Only used for ForecasterRecursiveMultiSeries. If None, creates a single '_unknown_level' entry. Defaults to None. None
transformer_series object | dict[str, object | None] | None Transformer(s) to apply to series. Can be: - None: No transformation applied - Single transformer object: Same transformer cloned for all series - Dict mapping series names to transformers: Different transformer per series Defaults to None. None

Returns

Name Type Description
dict dict[str, object | None] Dictionary with series names as keys and transformer objects (or None) as values. Transformers are cloned to prevent overwriting.

Warns

If transformer_series is a dict and some series_names_in_ are not present in the dict keys (those series get no transformation).

Examples

No transformation:

>>> from spotforecast2.forecaster.utils import initialize_transformer_series
>>> series = ['series1', 'series2', 'series3']
>>> result = initialize_transformer_series(
...     forecaster_name='ForecasterDirectMultiVariate',
...     series_names_in_=series,
...     transformer_series=None
... )
>>> print(result)
{'series1': None, 'series2': None, 'series3': None}

Same transformer for all series:

>>> from sklearn.preprocessing import StandardScaler
>>> scaler = StandardScaler()
>>> result = initialize_transformer_series(
...     forecaster_name='ForecasterDirectMultiVariate',
...     series_names_in_=['series1', 'series2'],
...     transformer_series=scaler
... )
>>> len(result)
2
>>> all(isinstance(v, StandardScaler) for v in result.values())
True
>>> result['series1'] is result['series2']  # Different clones
False

Different transformer per series:

>>> from sklearn.preprocessing import MinMaxScaler
>>> transformers = {
...     'series1': StandardScaler(),
...     'series2': MinMaxScaler()
... }
>>> result = initialize_transformer_series(
...     forecaster_name='ForecasterDirectMultiVariate',
...     series_names_in_=['series1', 'series2'],
...     transformer_series=transformers
... )
>>> isinstance(result['series1'], StandardScaler)
True
>>> isinstance(result['series2'], MinMaxScaler)
True
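
The case described under Warns (a dict that omits some series) is not covered by the examples above. The sketch below shows one way that behavior could look. `sketch_transformer_series` and `ToyScaler` are hypothetical, `copy.deepcopy` stands in for whatever cloning the library performs, and the handling of `forecaster_name` and `encoding` is left out.

```python
import copy
import warnings


def sketch_transformer_series(series_names, transformer_series=None):
    """Illustrative only: one independent transformer (or None) per series."""
    result = {name: None for name in series_names}
    if transformer_series is None:
        return result
    if not isinstance(transformer_series, dict):
        # Same transformer for all series: each series gets its own copy.
        return {name: copy.deepcopy(transformer_series) for name in series_names}
    missing = [name for name in series_names if name not in transformer_series]
    if missing:
        warnings.warn(f"No transformer given for {missing}; they are left untransformed.")
    for name in series_names:
        if name in transformer_series:
            result[name] = copy.deepcopy(transformer_series[name])
    return result


class ToyScaler:  # hypothetical stand-in for a scikit-learn transformer
    pass


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = sketch_transformer_series(["series1", "series2"], {"series1": ToyScaler()})
print(result["series2"], len(caught))
```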

initialize_window_features

forecaster.utils.initialize_window_features(window_features)

Check window_features argument input and generate the corresponding list.

This function validates window feature objects and extracts their metadata, ensuring they have the required attributes (window_sizes, features_names) and methods (transform_batch, transform) for proper forecasting operations.

Parameters

Name Type Description Default
window_features Any Classes used to create window features. Can be a single object or a list of objects. Each object must have window_sizes, features_names attributes and transform_batch, transform methods. required

Returns

Name Type Description
tuple Tuple[Optional[List[object]], Optional[List[str]], Optional[int]] A tuple containing: - window_features (list or None): List of classes used to create window features. - window_features_names (list or None): List with all the features names of the window features. - max_size_window_features (int or None): Maximum value of the window_sizes attribute of all classes.

Raises

Name Type Description
ValueError If window_features is an empty list.
ValueError If a window feature is missing required attributes or methods.
TypeError If window_sizes or features_names have incorrect types.

Examples

>>> from spotforecast2.forecaster.preprocessing import RollingFeatures
>>> from spotforecast2.forecaster.utils import initialize_window_features
>>> wf = RollingFeatures(stats=['mean', 'std'], window_sizes=[7, 14])
>>> wf_list, names, max_size = initialize_window_features(wf)
>>> print(f"Max window size: {max_size}")
Max window size: 14
>>> print(f"Number of features: {len(names)}")
Number of features: 4

Multiple window features:

>>> wf1 = RollingFeatures(stats=['mean'], window_sizes=7)
>>> wf2 = RollingFeatures(stats=['max', 'min'], window_sizes=3)
>>> wf_list, names, max_size = initialize_window_features([wf1, wf2])
>>> print(f"Max window size: {max_size}")
Max window size: 7
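
Because the checks are based on attributes and methods rather than on a specific class, any object exposing the documented interface should qualify. The sketch below shows such a duck-typed object and a simplified version of the validation. `LastValueFeature` and `sketch_check_window_features` are hypothetical, and the accepted types for `window_sizes` and `features_names` are assumptions.

```python
class LastValueFeature:
    """Hypothetical window feature exposing the documented interface."""

    def __init__(self, window_size):
        self.window_sizes = window_size
        self.features_names = [f"last_{window_size}"]

    def transform_batch(self, X):
        return X  # placeholder, a real feature would compute values here

    def transform(self, X):
        return X  # placeholder


def sketch_check_window_features(window_features):
    """Illustrative only: validate objects, collect names and max window."""
    if window_features is None:
        return None, None, None
    if not isinstance(window_features, list):
        window_features = [window_features]
    if len(window_features) == 0:
        raise ValueError("window_features must not be an empty list")
    names, max_size = [], 0
    for wf in window_features:
        for attr in ("window_sizes", "features_names"):
            if not hasattr(wf, attr):
                raise ValueError(f"{type(wf).__name__} is missing attribute {attr!r}")
        for method in ("transform_batch", "transform"):
            if not callable(getattr(wf, method, None)):
                raise ValueError(f"{type(wf).__name__} is missing method {method!r}")
        sizes, feats = wf.window_sizes, wf.features_names
        if not isinstance(sizes, (int, list)):
            raise TypeError("window_sizes must be an int or a list of ints")
        if not isinstance(feats, (str, list)):
            raise TypeError("features_names must be a str or a list of str")
        max_size = max(max_size, sizes if isinstance(sizes, int) else max(sizes))
        names.extend([feats] if isinstance(feats, str) else feats)
    return window_features, names, max_size


wf_list, names, max_size = sketch_check_window_features(
    [LastValueFeature(7), LastValueFeature(3)]
)
print(names, max_size)
```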

predict_multivariate

forecaster.utils.predict_multivariate(
    forecasters,
    steps_ahead,
    exog=None,
    show_progress=False,
)

Generate multi-output predictions using multiple baseline forecasters.

Parameters

Name Type Description Default
forecasters dict Dictionary of fitted forecaster instances (one per target). Keys are target names, values are the fitted forecasters (e.g., ForecasterRecursive, ForecasterEquivalentDate). required
steps_ahead int Number of steps to forecast. required
exog pd.DataFrame | None Exogenous variables for prediction. If provided, they are passed to each forecaster’s predict method. None
show_progress bool Show progress bar while predicting per target forecaster. Default: False. False

Returns

Name Type Description
pd.DataFrame pd.DataFrame: DataFrame with predictions for all targets.

Examples

>>> import pandas as pd
>>> from sklearn.linear_model import LinearRegression
>>> from spotforecast2_safe.forecaster.recursive import ForecasterRecursive
>>> from spotforecast2.forecaster.utils import predict_multivariate
>>> y1 = pd.Series([1, 2, 3, 4, 5])
>>> y2 = pd.Series([2, 4, 6, 8, 10])
>>> f1 = ForecasterRecursive(estimator=LinearRegression(), lags=2)
>>> f2 = ForecasterRecursive(estimator=LinearRegression(), lags=2)
>>> f1.fit(y=y1)
>>> f2.fit(y=y2)
>>> forecasters = {'target1': f1, 'target2': f2}
>>> predictions = predict_multivariate(forecasters, steps_ahead=2)
>>> predictions
   target1  target2
5      6.0     12.0
6      7.0     14.0
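
Conceptually, the function calls each forecaster once and joins the results column-wise. The sketch below reproduces that idea with a stub forecaster so that it runs without any fitted model. `ConstantForecaster` and `sketch_predict_multivariate` are hypothetical, and the `predict(steps=..., exog=...)` signature is an assumption about the forecaster interface. The progress bar is omitted.

```python
import pandas as pd


class ConstantForecaster:
    """Hypothetical fitted forecaster: always predicts the same value."""

    def __init__(self, value):
        self.value = value

    def predict(self, steps, exog=None):
        return pd.Series([self.value] * steps, name="pred")


def sketch_predict_multivariate(forecasters, steps_ahead, exog=None):
    """Illustrative only: one predict call per target, joined column-wise."""
    columns = {}
    for target, forecaster in forecasters.items():
        columns[target] = forecaster.predict(steps=steps_ahead, exog=exog)
    # Dict keys become column names, in insertion order.
    return pd.DataFrame(columns)


predictions = sketch_predict_multivariate(
    {"target1": ConstantForecaster(1.0), "target2": ConstantForecaster(2.0)},
    steps_ahead=3,
)
print(predictions.shape)
```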

prepare_steps_direct

forecaster.utils.prepare_steps_direct(max_step, steps=None)

Prepare list of steps to be predicted in Direct Forecasters.

Parameters

Name Type Description Default
max_step int | list[int] | np.ndarray Maximum number of future steps the forecaster will predict when using predict methods. required
steps int | list[int] | None Predict n steps. The value of steps must be less than or equal to the value of steps defined when initializing the forecaster. Starts at 1. - If int: Only steps within the range of 1 to int are predicted. - If list: List of ints. Only the steps contained in the list are predicted. - If None: As many steps are predicted as were defined at initialization. None

Returns

Name Type Description
list[int] list[int]: Steps to be predicted.
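
The three cases for steps (int, list, None) can be sketched in a few lines of plain Python. `sketch_prepare_steps` is a hypothetical helper that follows the parameter descriptions above. It does not check that the requested steps stay within `max_step`, and its error messages are made up.

```python
def sketch_prepare_steps(max_step, steps=None):
    """Illustrative only: resolve `steps` into an explicit list of ints."""
    if steps is None:
        # Predict every step defined at initialization.
        if isinstance(max_step, int):
            return list(range(1, max_step + 1))
        return [int(s) for s in max_step]
    if isinstance(steps, int):
        return list(range(1, steps + 1))  # steps start at 1
    if isinstance(steps, list):
        if not all(isinstance(s, int) for s in steps):
            raise TypeError("steps must be an int, a list of ints, or None")
        return list(steps)
    raise TypeError("steps must be an int, a list of ints, or None")


print(sketch_prepare_steps(5))
print(sketch_prepare_steps(5, steps=3))
print(sketch_prepare_steps(5, steps=[2, 4]))
```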

select_n_jobs_fit_forecaster

forecaster.utils.select_n_jobs_fit_forecaster(forecaster_name, estimator)

Select the number of jobs to run in parallel.

transform_numpy

forecaster.utils.transform_numpy(
    array,
    transformer,
    fit=False,
    inverse_transform=False,
)

Transform raw values of a numpy ndarray with a scikit-learn-like transformer, preprocessor or ColumnTransformer. The transformer must implement the methods fit, transform, fit_transform and inverse_transform. ColumnTransformers cannot be used with inverse_transform=True because they do not have an inverse_transform method.

Parameters

Name Type Description Default
array np.ndarray numpy ndarray Array to be transformed. required
transformer object | None scikit-learn-like transformer (preprocessor) with the methods fit, transform, fit_transform and inverse_transform. required
fit bool Train the transformer before applying it. False
inverse_transform bool Transform the data back to the original representation. Not available when using scikit-learn ColumnTransformers. False

Returns

Name Type Description
array_transformed np.ndarray Transformed array.
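
The interplay of fit and inverse_transform can be sketched with a toy transformer that implements the four required methods. `MeanCenterer` and `sketch_transform_numpy` are hypothetical. Returning the array unchanged when the transformer is None and reshaping 1-D input to a single column are assumptions, not documented behavior.

```python
import numpy as np


class MeanCenterer:
    """Toy transformer exposing the four required methods (hypothetical)."""

    def fit(self, X, y=None):
        self.mean_ = X.mean(axis=0)
        return self

    def transform(self, X):
        return X - self.mean_

    def fit_transform(self, X, y=None):
        return self.fit(X).transform(X)

    def inverse_transform(self, X):
        return X + self.mean_


def sketch_transform_numpy(array, transformer, fit=False, inverse_transform=False):
    """Illustrative only: dispatch to the right transformer method."""
    if transformer is None:
        return array  # assumption: nothing to apply
    was_1d = array.ndim == 1
    values = array.reshape(-1, 1) if was_1d else array
    if inverse_transform:
        out = transformer.inverse_transform(values)
    elif fit:
        out = transformer.fit_transform(values)
    else:
        out = transformer.transform(values)
    return out.ravel() if was_1d else out  # restore the input dimensionality


y = np.array([1.0, 2.0, 3.0])
centerer = MeanCenterer()
scaled = sketch_transform_numpy(y, centerer, fit=True)
restored = sketch_transform_numpy(scaled, centerer, inverse_transform=True)
print(scaled, restored)
```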