forecaster.utils
Functions
| Name | Description |
|---|---|
| check_residuals_input | Check residuals input arguments in Forecasters. |
| date_to_index_position | Transform a datetime string or pandas Timestamp to an integer. The integer represents the position of the datetime in the index. |
| exog_to_direct_numpy | Transforms exog to numpy ndarray with the shape needed for Direct forecasting. |
| initialize_estimator | Helper to handle the deprecation of ‘regressor’ in favor of ‘estimator’. |
| initialize_transformer_series | Initialize transformer_series_ attribute for multivariate/multiseries forecasters. |
| initialize_window_features | Check window_features argument input and generate the corresponding list. |
| predict_multivariate | Generate multi-output predictions using multiple baseline forecasters. |
| prepare_steps_direct | Prepare list of steps to be predicted in Direct Forecasters. |
| select_n_jobs_fit_forecaster | Select the number of jobs to run in parallel. |
| transform_numpy | Transform raw values of a numpy ndarray with a scikit-learn alike transformer, preprocessor or ColumnTransformer. |
check_residuals_input
forecaster.utils.check_residuals_input(
forecaster_name,
use_in_sample_residuals,
in_sample_residuals_,
out_sample_residuals_,
use_binned_residuals,
in_sample_residuals_by_bin_,
out_sample_residuals_by_bin_,
levels=None,
encoding=None,
)
Check residuals input arguments in Forecasters.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| forecaster_name | str | Forecaster name. | required |
| use_in_sample_residuals | bool | Indicates whether in-sample or out-of-sample residuals are used. | required |
| in_sample_residuals_ | np.ndarray | dict[str, np.ndarray] | None | Residuals of the model when predicting training data. | required |
| out_sample_residuals_ | np.ndarray | dict[str, np.ndarray] | None | Residuals of the model when predicting non-training data. | required |
| use_binned_residuals | bool | Indicates whether residuals are binned. | required |
| in_sample_residuals_by_bin_ | dict[str | int, np.ndarray | dict[int, np.ndarray]] | None | In-sample residuals binned according to the predicted value each residual is associated with. | required |
| out_sample_residuals_by_bin_ | dict[str | int, np.ndarray | dict[int, np.ndarray]] | None | Out-of-sample residuals binned according to the predicted value each residual is associated with. | required |
| levels | list[str] | None | Names of the series (levels) to be predicted (multiseries forecasters). | None |
| encoding | str | None | Encoding used to identify the different series (ForecasterRecursiveMultiSeries). | None |
Returns
| Name | Type | Description |
|---|---|---|
| | None | None |
date_to_index_position
forecaster.utils.date_to_index_position(
index,
date_input,
method='prediction',
date_literal='steps',
kwargs_pd_to_datetime=None,
)
Transform a datetime string or pandas Timestamp to an integer. The integer represents the position of the datetime in the index.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| index | pd.Index | Original datetime index (must be a pandas DatetimeIndex if date_input is not an int). | required |
| date_input | int | str | pd.Timestamp | Datetime to transform to integer. If int, the same integer is returned. If str or pandas Timestamp, it is converted to a pandas Timestamp and mapped to a position relative to the index. | required |
| method | str | Can be ‘prediction’ or ‘validation’. If ‘prediction’, the date must be later than the last date in the index. If ‘validation’, the date must be within the index range. | 'prediction' |
| date_literal | str | Variable name used in error messages. | 'steps' |
| kwargs_pd_to_datetime | dict | None | Additional keyword arguments to pass to pd.to_datetime(). | None |
Returns
| Name | Type | Description |
|---|---|---|
| int | int | date_input transformed to an integer position in the index. If date_input is an integer, the same integer is returned. If method is ‘prediction’, the number of steps to predict from the last date in the index. If method is ‘validation’, the position of the date in the index plus one, so that the target date is included in the training set when slicing with pandas iloc. |
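The two date-based cases can be illustrated without the library. The sketch below is a plain-Python approximation, not the function's implementation: `position_sketch` is a hypothetical helper, and it assumes a sorted, evenly spaced list of `datetime` objects standing in for the pandas DatetimeIndex.

```python
from datetime import datetime, timedelta


def position_sketch(index, date_input, method="prediction"):
    """Hypothetical stand-in for date_to_index_position.

    Assumes `index` is a sorted, evenly spaced list of datetimes
    (the real function works on a pandas DatetimeIndex).
    """
    if isinstance(date_input, int):
        return date_input  # integers pass through unchanged

    target = (
        datetime.fromisoformat(date_input)
        if isinstance(date_input, str)
        else date_input
    )
    freq = index[1] - index[0]  # spacing between consecutive dates

    if method == "prediction":
        if target <= index[-1]:
            raise ValueError("date must be later than the last date in the index")
        # number of steps to predict from the last date in the index
        return (target - index[-1]) // freq

    if method == "validation":
        if not index[0] <= target <= index[-1]:
            raise ValueError("date must be within the index range")
        # position plus one, so that index[:result] includes the target date
        return (target - index[0]) // freq + 1

    raise ValueError("method must be 'prediction' or 'validation'")


# Five daily observations: 2024-01-01 through 2024-01-05
index = [datetime(2024, 1, 1) + timedelta(days=i) for i in range(5)]

steps = position_sketch(index, "2024-01-08", method="prediction")
end = position_sketch(index, "2024-01-03", method="validation")
train = index[:end]  # slice that ends at the target date
```

With this index, `steps` is 3 (three days past the last date) and `end` is 3, so `index[:end]` ends at 2024-01-03.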
exog_to_direct_numpy
forecaster.utils.exog_to_direct_numpy(exog, steps)
Transforms exog to numpy ndarray with the shape needed for Direct forecasting.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| exog | np.ndarray | pd.Series | pd.DataFrame | Exogenous variables, shape (samples,). If exog is a pandas Series or DataFrame, the direct exog names are also created. | required |
| steps | int | Number of steps that will be predicted using exog. | required |
Returns
| Name | Type | Description |
|---|---|---|
| | tuple[np.ndarray, list[str] | None] | exog_direct (numpy ndarray): Exogenous variables transformed. exog_direct_names (list or None): Names of the columns of the transformed exogenous variables. Only created if exog is a pandas Series or DataFrame. |
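The reference does not spell out the resulting shape. As a rough illustration only, the sketch below assumes a layout in which each output row places the exogenous values for `steps` consecutive time points side by side. `exog_to_direct_sketch` is a hypothetical helper working on plain lists, not the library function, and the layout itself is an assumption.

```python
def exog_to_direct_sketch(rows, steps):
    """Hypothetical layout: output row r holds input rows r, r+1, ..., r+steps-1."""
    n_rows = len(rows)
    return [
        # concatenate the values of `steps` consecutive input rows
        [value for offset in range(steps) for value in rows[r + offset]]
        for r in range(n_rows - steps + 1)
    ]


exog = [[10], [20], [30], [40]]  # 4 samples, 1 exogenous column
direct = exog_to_direct_sketch(exog, steps=2)
```

Under this assumption, 4 samples and 2 steps give 3 rows of 2 values each: `[[10, 20], [20, 30], [30, 40]]`.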
initialize_estimator
forecaster.utils.initialize_estimator(estimator=None, regressor=None)
Helper to handle the deprecation of ‘regressor’ in favor of ‘estimator’. Returns the valid estimator object.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| estimator | object | None | An instance of an estimator or pipeline compatible with the scikit-learn API. | None |
| regressor | object | None | Deprecated; use estimator instead. An instance of an estimator or pipeline compatible with the scikit-learn API. | None |
Returns
| Name | Type | Description |
|---|---|---|
| | None | The valid estimator object: an estimator or pipeline compatible with the scikit-learn API. |
initialize_transformer_series
forecaster.utils.initialize_transformer_series(
forecaster_name,
series_names_in_,
encoding=None,
transformer_series=None,
)
Initialize transformer_series_ attribute for multivariate/multiseries forecasters.
Creates a dictionary of transformers for each time series in multivariate or multiseries forecasting. Handles three cases: no transformation (None), same transformer for all series (single object), or different transformers per series (dictionary). Clones transformer objects to avoid overwriting.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| forecaster_name | str | Name of the forecaster using this function. Special handling is applied for ‘ForecasterRecursiveMultiSeries’. | required |
| series_names_in_ | list[str] | Names of the time series (levels) used during training. These will be the keys in the returned transformer dictionary. | required |
| encoding | str | None | Encoding used to identify different series. Only used for ForecasterRecursiveMultiSeries. If None, creates a single ’_unknown_level’ entry. Defaults to None. | None |
| transformer_series | object | dict[str, object | None] | None | Transformer(s) to apply to series. Can be: - None: No transformation applied - Single transformer object: Same transformer cloned for all series - Dict mapping series names to transformers: Different transformer per series Defaults to None. | None |
Returns
| Name | Type | Description |
|---|---|---|
| dict | dict[str, object | None] | Dictionary with series names as keys and transformer objects (or None) as values. Transformers are cloned to prevent overwriting. |
Warns
If transformer_series is a dict and some series_names_in_ are not present in the dict keys (those series get no transformation).
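The warning case can be sketched in plain Python. This is not the library's implementation: `transformer_series_sketch` and `Scaler` are hypothetical names, `copy.deepcopy` stands in for scikit-learn's cloning, and the special handling for ForecasterRecursiveMultiSeries is not modelled.

```python
import copy
import warnings


def transformer_series_sketch(series_names, transformer_series=None):
    """Hypothetical stand-in covering the three documented cases."""
    if transformer_series is None:
        # no transformation for any series
        return {name: None for name in series_names}

    if not isinstance(transformer_series, dict):
        # single transformer: one independent copy per series
        return {name: copy.deepcopy(transformer_series) for name in series_names}

    # dict: warn about series that have no entry; they get no transformation
    missing = [name for name in series_names if name not in transformer_series]
    if missing:
        warnings.warn(f"No transformer for {missing}; no transformation is applied.")
    return {
        name: copy.deepcopy(transformer_series.get(name)) for name in series_names
    }


class Scaler:
    """Hypothetical transformer used only for this demonstration."""


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = transformer_series_sketch(["series1", "series2"], {"series1": Scaler()})
```

Here `result['series2']` is `None` and one warning is recorded, because `series2` has no entry in the dictionary.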
Examples
No transformation:
>>> from spotforecast2.forecaster.utils import initialize_transformer_series
>>> series = ['series1', 'series2', 'series3']
>>> result = initialize_transformer_series(
... forecaster_name='ForecasterDirectMultiVariate',
... series_names_in_=series,
... transformer_series=None
... )
>>> print(result)
{'series1': None, 'series2': None, 'series3': None}
Same transformer for all series:
>>> from sklearn.preprocessing import StandardScaler
>>> scaler = StandardScaler()
>>> result = initialize_transformer_series(
... forecaster_name='ForecasterDirectMultiVariate',
... series_names_in_=['series1', 'series2'],
... transformer_series=scaler
... )
>>> len(result)
2
>>> all(isinstance(v, StandardScaler) for v in result.values())
True
>>> result['series1'] is result['series2'] # Different clones
False
Different transformer per series:
>>> from sklearn.preprocessing import MinMaxScaler
>>> transformers = {
... 'series1': StandardScaler(),
... 'series2': MinMaxScaler()
... }
>>> result = initialize_transformer_series(
... forecaster_name='ForecasterDirectMultiVariate',
... series_names_in_=['series1', 'series2'],
... transformer_series=transformers
... )
>>> isinstance(result['series1'], StandardScaler)
True
>>> isinstance(result['series2'], MinMaxScaler)
True
initialize_window_features
forecaster.utils.initialize_window_features(window_features)
Check window_features argument input and generate the corresponding list.
This function validates window feature objects and extracts their metadata, ensuring they have the required attributes (window_sizes, features_names) and methods (transform_batch, transform) for proper forecasting operations.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| window_features | Any | Classes used to create window features. Can be a single object or a list of objects. Each object must have window_sizes, features_names attributes and transform_batch, transform methods. | required |
Returns
| Name | Type | Description |
|---|---|---|
| tuple | Tuple[Optional[List[object]], Optional[List[str]], Optional[int]] | A tuple containing: - window_features (list or None): List of classes used to create window features. - window_features_names (list or None): List with all the features names of the window features. - max_size_window_features (int or None): Maximum value of the window_sizes attribute of all classes. |
Raises
| Name | Type | Description |
|---|---|---|
| | ValueError | If window_features is an empty list. |
| | ValueError | If a window feature is missing required attributes or methods. |
| | TypeError | If window_sizes or features_names have incorrect types. |
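The interface a window feature object has to expose can be sketched with a small duck-typed class. Everything below is illustrative: `MeanWindow` and `check_window_features_sketch` are hypothetical, the accepted types (int or list, str or list) and the `(None, None, None)` result for a missing argument are assumptions, and the real function may check more than this.

```python
REQUIRED_ATTRIBUTES = ("window_sizes", "features_names")
REQUIRED_METHODS = ("transform_batch", "transform")


class MeanWindow:
    """Hypothetical window feature exposing the required interface."""

    def __init__(self, window_size):
        self.window_sizes = window_size
        self.features_names = f"mean_{window_size}"

    def transform(self, values):
        # mean of the last `window_sizes` observations
        window = values[-self.window_sizes:]
        return sum(window) / len(window)

    def transform_batch(self, values):
        # one mean per complete window
        size = self.window_sizes
        return [sum(values[i - size:i]) / size for i in range(size, len(values) + 1)]


def check_window_features_sketch(window_features):
    if window_features is None:
        return None, None, None  # assumed result when nothing is passed
    if not isinstance(window_features, list):
        window_features = [window_features]
    if not window_features:
        raise ValueError("window_features must not be an empty list")

    names, sizes = [], []
    for wf in window_features:
        missing = [a for a in REQUIRED_ATTRIBUTES if not hasattr(wf, a)]
        missing += [m for m in REQUIRED_METHODS if not callable(getattr(wf, m, None))]
        if missing:
            raise ValueError(f"{type(wf).__name__} is missing: {missing}")
        if not isinstance(wf.window_sizes, (int, list)):
            raise TypeError("window_sizes must be an int or a list of ints")
        if not isinstance(wf.features_names, (str, list)):
            raise TypeError("features_names must be a str or a list of str")
        # normalise scalars to lists before collecting
        sizes += wf.window_sizes if isinstance(wf.window_sizes, list) else [wf.window_sizes]
        names += wf.features_names if isinstance(wf.features_names, list) else [wf.features_names]
    return window_features, names, max(sizes)


wf_list, names, max_size = check_window_features_sketch([MeanWindow(3), MeanWindow(7)])
```

The sketch returns the list of objects, the collected names `['mean_3', 'mean_7']`, and the largest window size, 7.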
Examples
>>> from spotforecast2.forecaster.preprocessing import RollingFeatures
>>> from spotforecast2.forecaster.utils import initialize_window_features
>>> wf = RollingFeatures(stats=['mean', 'std'], window_sizes=[7, 14])
>>> wf_list, names, max_size = initialize_window_features(wf)
>>> print(f"Max window size: {max_size}")
Max window size: 14
>>> print(f"Number of features: {len(names)}")
Number of features: 4
Multiple window features:
>>> wf1 = RollingFeatures(stats=['mean'], window_sizes=7)
>>> wf2 = RollingFeatures(stats=['max', 'min'], window_sizes=3)
>>> wf_list, names, max_size = initialize_window_features([wf1, wf2])
>>> print(f"Max window size: {max_size}")
Max window size: 7
predict_multivariate
forecaster.utils.predict_multivariate(
forecasters,
steps_ahead,
exog=None,
show_progress=False,
)
Generate multi-output predictions using multiple baseline forecasters.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| forecasters | dict | Dictionary of fitted forecaster instances (one per target). Keys are target names, values are the fitted forecasters (e.g., ForecasterRecursive, ForecasterEquivalentDate). | required |
| steps_ahead | int | Number of steps to forecast. | required |
| exog | pd.DataFrame | Exogenous variables for prediction. If provided, will be passed to each forecaster’s predict method. | None |
| show_progress | bool | Show progress bar while predicting per target forecaster. Default: False. | False |
Returns
| Name | Type | Description |
|---|---|---|
| | pd.DataFrame | DataFrame with predictions for all targets. |
Examples
>>> import pandas as pd
>>> from sklearn.linear_model import LinearRegression
>>> from spotforecast2_safe.forecaster.recursive import ForecasterRecursive
>>> from spotforecast2.forecaster.utils import predict_multivariate
>>> y1 = pd.Series([1, 2, 3, 4, 5])
>>> y2 = pd.Series([2, 4, 6, 8, 10])
>>> f1 = ForecasterRecursive(estimator=LinearRegression(), lags=2)
>>> f2 = ForecasterRecursive(estimator=LinearRegression(), lags=2)
>>> f1.fit(y=y1)
>>> f2.fit(y=y2)
>>> forecasters = {'target1': f1, 'target2': f2}
>>> predictions = predict_multivariate(forecasters, steps_ahead=2)
>>> predictions
target1 target2
5 6.0 12.0
6 7.0 14.0
prepare_steps_direct
forecaster.utils.prepare_steps_direct(max_step, steps=None)
Prepare list of steps to be predicted in Direct Forecasters.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| max_step | int | list[int] | np.ndarray | Maximum number of future steps the forecaster will predict when using predict methods. | required |
| steps | int | list[int] | None | Steps to predict, starting at 1. The value of steps must be less than or equal to the value of steps defined when initializing the forecaster. If int, only steps 1 to int are predicted. If list, only the steps contained in the list are predicted. If None, as many steps are predicted as were defined at initialization. | None |
Returns
| Name | Type | Description |
|---|---|---|
| | list[int] | Steps to be predicted. |
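The three cases for `steps` can be written out as a short sketch. `prepare_steps_sketch` is a hypothetical helper that mirrors the documented behaviour; it does not check that `steps` stays within `max_step`, and the real function may differ in its error handling.

```python
def prepare_steps_sketch(max_step, steps=None):
    """Hypothetical stand-in for prepare_steps_direct."""
    if isinstance(steps, int):
        # int: predict steps 1..steps
        return list(range(1, steps + 1))

    if steps is None:
        # None: predict every step defined at initialization
        if isinstance(max_step, int):
            return list(range(1, max_step + 1))
        return [int(step) for step in max_step]

    if isinstance(steps, list):
        # list: predict only the listed steps
        if not all(isinstance(step, int) for step in steps):
            raise TypeError("steps must be a list of ints")
        return steps

    raise TypeError("steps must be an int, a list of ints or None")


all_steps = prepare_steps_sketch(max_step=5)               # [1, 2, 3, 4, 5]
first_three = prepare_steps_sketch(max_step=5, steps=3)    # [1, 2, 3]
selected = prepare_steps_sketch(max_step=5, steps=[1, 4])  # [1, 4]
```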
select_n_jobs_fit_forecaster
forecaster.utils.select_n_jobs_fit_forecaster(forecaster_name, estimator)
Select the number of jobs to run in parallel.
transform_numpy
forecaster.utils.transform_numpy(
array,
transformer,
fit=False,
inverse_transform=False,
)
Transform raw values of a numpy ndarray with a scikit-learn-like transformer, preprocessor or ColumnTransformer. The transformer must provide the methods fit, transform, fit_transform and inverse_transform. ColumnTransformers do not have an inverse_transform method, so they cannot be used when inverse_transform is True.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| array | np.ndarray | Array to be transformed. | required |
| transformer | object | None | Scikit-learn-like transformer (preprocessor) with the methods fit, transform, fit_transform and inverse_transform. | required |
| fit | bool | Train the transformer before applying it. | False |
| inverse_transform | bool | Transform the data back to the original representation. Not available when the transformer is a scikit-learn ColumnTransformer. | False |
Returns
| Name | Type | Description |
|---|---|---|
| array_transformed | np.ndarray | Transformed array. |
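How the `fit` and `inverse_transform` flags select a transformer method can be sketched as follows. This is an illustration under assumptions, not the library code: `transform_sketch` and `ShiftScaler` are hypothetical, plain lists replace numpy arrays, and returning the input unchanged when the transformer is None is assumed from the `object | None` type.

```python
class ShiftScaler:
    """Hypothetical transformer with the four required methods."""

    def fit(self, values):
        self.min_ = min(values)
        return self

    def transform(self, values):
        return [value - self.min_ for value in values]

    def fit_transform(self, values):
        return self.fit(values).transform(values)

    def inverse_transform(self, values):
        return [value + self.min_ for value in values]


def transform_sketch(values, transformer, fit=False, inverse_transform=False):
    """Hypothetical stand-in for transform_numpy, operating on lists."""
    if transformer is None:
        return values  # assumed: nothing to apply
    if inverse_transform:
        # back to the original representation
        return transformer.inverse_transform(values)
    if fit:
        # train the transformer, then apply it
        return transformer.fit_transform(values)
    # apply an already fitted transformer
    return transformer.transform(values)


scaler = ShiftScaler()
scaled = transform_sketch([3, 5, 9], scaler, fit=True)
new_scaled = transform_sketch([4], scaler)
restored = transform_sketch(scaled, scaler, inverse_transform=True)
```

After fitting on `[3, 5, 9]`, `scaled` is `[0, 2, 6]`, `new_scaled` is `[1]`, and `restored` equals the original values.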