forecaster.utils

Functions

Name Description
align_series_and_exog_multiseries Align series and exog according to their index.
check_extract_values_and_index Extract values and index from a pandas Series or DataFrame, ensuring they are valid.
check_preprocess_series Check and preprocess series argument in ForecasterRecursiveMultiSeries class.
check_residuals_input Check residuals input arguments in Forecasters.
date_to_index_position Transform a datetime string or pandas Timestamp to an integer position.
exog_to_direct Transforms exog to a pandas DataFrame with the shape needed for Direct forecasting.
exog_to_direct_numpy Transforms exog to numpy ndarray with the shape needed for Direct forecasting.
get_style_repr_html Generate CSS style for HTML representation of the Forecaster.
initialize_differentiator_multiseries Initialize differentiator_ attribute for multiseries forecasters.
initialize_estimator Handle the deprecation of ‘regressor’ in favor of ‘estimator’.
initialize_transformer_series Initialize transformer_series_ attribute for multivariate/multiseries forecasters.
initialize_window_features Check window_features argument input and generate the corresponding list.
predict_multivariate Generate multi-output predictions using multiple baseline forecasters.
prepare_levels_multiseries Prepare list of levels to be predicted in multiseries Forecasters.
prepare_steps_direct Prepare list of steps to be predicted in Direct Forecasters.
preprocess_levels_self_last_window_multiseries Preprocess levels and last_window arguments for prediction.
select_n_jobs_fit_forecaster Select the number of jobs to run in parallel during the fit process.
set_cpu_gpu_device Set the device for the estimator.
transform_numpy Transform raw values of a numpy ndarray with a scikit-learn-like transformer, preprocessor or ColumnTransformer.

align_series_and_exog_multiseries

forecaster.utils.align_series_and_exog_multiseries(series_dict, exog_dict=None)

Align series and exog according to their index.

Leading and trailing NaNs are removed from all series in series_dict. If needed, exog_dict is reindexed to match.

Parameters

Name Type Description Default
series_dict dict Dictionary with the series used during training. required
exog_dict (dict, None) Dictionary with the exogenous variable/s used during training. Defaults to None. None

Returns

Name Type Description
tuple tuple[dict[str, pd.Series], dict[str, pd.DataFrame | None]] A tuple containing: - series_dict (dict): Dictionary with the aligned series. - exog_dict (dict): Dictionary with the aligned exogenous variables.
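
Examples

The alignment described above can be pictured with a small pandas sketch. This is not the library's implementation: `trim_and_align` is a hypothetical helper written only for illustration, and it assumes that "aligning" means trimming each series to the span between its first and last valid values and reindexing the matching exog to that span.

```python
import numpy as np
import pandas as pd


def trim_and_align(series_dict, exog_dict=None):
    """Hypothetical helper (not part of the library): trim NaN edges, reindex exog."""
    exog_dict = exog_dict or {name: None for name in series_dict}
    aligned_series, aligned_exog = {}, {}
    for name, s in series_dict.items():
        # Span between the first and last non-NaN observation (inclusive).
        first, last = s.first_valid_index(), s.last_valid_index()
        trimmed = s.loc[first:last]
        aligned_series[name] = trimmed
        exog = exog_dict.get(name)
        # Assumption: exog is simply reindexed to the trimmed series index.
        aligned_exog[name] = None if exog is None else exog.reindex(trimmed.index)
    return aligned_series, aligned_exog


dates = pd.date_range("2020-01-01", periods=6, freq="D")
series_dict = {
    "s1": pd.Series([np.nan, 1.0, 2.0, np.nan, 4.0, np.nan], index=dates),
}
exog_dict = {"s1": pd.DataFrame({"temp": list(range(6))}, index=dates)}

aligned_series, aligned_exog = trim_and_align(series_dict, exog_dict)
print(aligned_series["s1"])
print(aligned_exog["s1"])
```

Note that only the NaNs at the edges are dropped; the interior NaN on 2020-01-04 is kept in this sketch.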

check_extract_values_and_index

forecaster.utils.check_extract_values_and_index(
    data,
    data_label='`y`',
    ignore_freq=False,
    return_values=True,
)

Extract values and index from a pandas Series or DataFrame, ensuring they are valid.

Validates that the input data has a proper DatetimeIndex or RangeIndex and extracts its values and index for use in forecasting operations. Optionally checks for index frequency consistency.

Parameters

Name Type Description Default
data Union[pd.Series, pd.DataFrame] Input data (pandas Series or DataFrame) to extract values and index from. required
data_label str Label used in exception messages for better error reporting. Defaults to '`y`'. '`y`'
ignore_freq bool If True, the frequency of the index is not checked. Defaults to False. False
return_values bool If True, the values of the data are returned. Defaults to True. True

Returns

Name Type Description
tuple Tuple[Optional[np.ndarray], pd.Index] A tuple containing: - values (numpy.ndarray or None): Values of the data as numpy array, or None if return_values is False. - index (pandas.Index): Index of the data.

Raises

Name Type Description
TypeError If data is not a pandas Series or DataFrame.
TypeError If data index is not a DatetimeIndex or RangeIndex.

Warns

If the DatetimeIndex has no frequency; in that case the frequency is inferred automatically.

Examples

>>> import pandas as pd
>>> import numpy as np
>>> from spotforecast2_safe.forecaster.utils import check_extract_values_and_index
>>> dates = pd.date_range('2020-01-01', periods=10, freq='D')
>>> series = pd.Series(np.arange(10), index=dates)
>>> values, index = check_extract_values_and_index(series)
>>> print(values.shape)
(10,)
>>> print(type(index))
<class 'pandas.core.indexes.datetimes.DatetimeIndex'>

Extract index only:

>>> _, index = check_extract_values_and_index(series, return_values=False)
>>> print(index[0])
2020-01-01 00:00:00

check_preprocess_series

forecaster.utils.check_preprocess_series(series)

Check and preprocess series argument in ForecasterRecursiveMultiSeries class.

- If `series` is a wide-format pandas DataFrame, each column represents a
different time series, and the index must be either a `DatetimeIndex` or
a `RangeIndex` with frequency or step size, as appropriate.
- If `series` is a long-format pandas DataFrame with a MultiIndex, the
first level of the index must contain the series IDs, and the second
level must be a `DatetimeIndex` with the same frequency across all series.
- If `series` is a dictionary, each key must be a series ID, and each value
must be a named pandas Series. All series must have the same index, which
must be either a `DatetimeIndex` or a `RangeIndex`, and they must share the
same frequency or step size, as appropriate.

When series is a pandas DataFrame, it is converted to a dictionary of pandas Series, where the keys are the series IDs and the values are the Series with the same index as the original DataFrame.

Parameters

Name Type Description Default
series (pd.DataFrame, dict) pandas DataFrame or dictionary of pandas Series/DataFrames. required

Returns

Name Type Description
tuple tuple[dict[str, pd.Series], dict[str, pd.Index]] A tuple containing: - series_dict (dict): Dictionary where keys are series IDs and values are pandas Series. - series_indexes (dict): Dictionary where keys are series IDs and values are the index of each series.

Raises

Name Type Description
TypeError If series is not a pandas DataFrame or a dictionary of pandas Series/DataFrames.
TypeError If the index of series is not a DatetimeIndex or RangeIndex with frequency/step size.
ValueError If the series in series have different frequencies or step sizes.
ValueError If all values of any series are NaN.

Warns

UserWarning If series is a wide-format DataFrame, only the first column will be used as series values.
UserWarning If series is a DataFrame (either wide or long format), additional internal transformations are required, which can increase computational time. It is recommended to use a dictionary of pandas Series instead.

Examples

>>> import pandas as pd
>>> from spotforecast2_safe.forecaster.utils import check_preprocess_series
>>> # Example with wide-format DataFrame
>>> dates = pd.date_range('2020-01-01', periods=5, freq='D')
>>> df_wide = pd.DataFrame({
...     'series_1': [1, 2, 3, 4, 5],
...     'series_2': [5, 4, 3, 2, 1],
... }, index=dates)
>>> series_dict, series_indexes = check_preprocess_series(df_wide)
UserWarning: `series` DataFrame has multiple columns. Only the values of first column, 'series_1', will be used as series values. All other columns will be ignored.
UserWarning: Passing a DataFrame (either wide or long format) as `series` requires additional internal transformations, which can increase computational time.
It is recommended to use a dictionary of pandas Series instead.
>>> print(series_dict['series_1'])
2020-01-01    1
2020-01-02    2
2020-01-03    3
2020-01-04    4
2020-01-05    5
Name: series_1, dtype: int64
>>> print(series_indexes['series_1'])
DatetimeIndex(['2020-01-01', '2020-01-02', '2020-01-03', '2020-01-04',
               '2020-01-05'],
              dtype='datetime64[ns]', freq='D')
>>> # Example with long-format DataFrame
>>> df_long = pd.DataFrame({
...     'series_id': ['series_1'] * 5 + ['series_2'] * 5,
...     'value': [1, 2, 3, 4, 5, 5, 4, 3, 2, 1],
... }, index=pd.MultiIndex.from_product([['series_1', 'series_2'], dates], names=['series_id', 'date']))
>>> series_dict, series_indexes = check_preprocess_series(df_long)
UserWarning: `series` DataFrame has multiple columns. Only the values of first column, 'value', will be used as series values. All other columns will be ignored.
UserWarning: Passing a DataFrame (either wide or long format) as `series` requires additional internal transformations, which can increase computational time.
It is recommended to use a dictionary of pandas Series instead.
>>> print(series_dict['series_1'])
2020-01-01    1
2020-01-02    2
2020-01-03    3
2020-01-04    4
2020-01-05    5
Name: series_1, dtype: int64
>>> print(series_indexes['series_1'])
DatetimeIndex(['2020-01-01', '2020-01-02', '2020-01-03', '2020-01-04',
               '2020-01-05'],
              dtype='datetime64[ns]', freq='D')
>>> # Example with dictionary of Series
>>> series_dict_input = {
...     'series_1': pd.Series([1, 2, 3, 4, 5], index=dates),
...     'series_2': pd.Series([5, 4, 3, 2, 1], index=dates),
... }
>>> series_dict, series_indexes = check_preprocess_series(series_dict_input)
>>> print(series_dict['series_1'])
2020-01-01    1
2020-01-02    2
2020-01-03    3
2020-01-04    4
2020-01-05    5
Name: series_1, dtype: int64
>>> print(series_indexes['series_1'])
DatetimeIndex(['2020-01-01', '2020-01-02', '2020-01-03', '2020-01-04',
               '2020-01-05'],
              dtype='datetime64[ns]', freq='D')
>>> # Example with dictionary of DataFrames
>>> df_series_1 = pd.DataFrame({'value': [1, 2, 3, 4, 5]}, index=dates)
>>> df_series_2 = pd.DataFrame({'value': [5, 4, 3, 2, 1]}, index=dates)
>>> series_dict_input = {
...     'series_1': df_series_1,
...     'series_2': df_series_2,
... }
>>> series_dict, series_indexes = check_preprocess_series(series_dict_input)
>>> print(series_dict['series_1'])
2020-01-01    1
2020-01-02    2
2020-01-03    3
2020-01-04    4
2020-01-05    5
Name: series_1, dtype: int64
>>> print(series_indexes['series_1'])
DatetimeIndex(['2020-01-01', '2020-01-02', '2020-01-03', '2020-01-04',
               '2020-01-05'],
              dtype='datetime64[ns]', freq='D')

check_residuals_input

forecaster.utils.check_residuals_input(
    forecaster_name,
    use_in_sample_residuals,
    in_sample_residuals_,
    out_sample_residuals_,
    use_binned_residuals,
    in_sample_residuals_by_bin_,
    out_sample_residuals_by_bin_,
    levels=None,
    encoding=None,
)

Check residuals input arguments in Forecasters.

Parameters

Name Type Description Default
forecaster_name str Forecaster name. required
use_in_sample_residuals bool Indicates whether in-sample (True) or out-of-sample (False) residuals are used. required
in_sample_residuals_ (np.ndarray, dict, None) Residuals of the model when predicting training data. required
out_sample_residuals_ (np.ndarray, dict, None) Residuals of the model when predicting non training data. required
use_binned_residuals bool Indicates if residuals are binned. required
in_sample_residuals_by_bin_ (dict, None) In sample residuals binned according to the predicted value. required
out_sample_residuals_by_bin_ (dict, None) Out of sample residuals binned according to the predicted value. required
levels (list, None) Names of the series (levels) to be predicted. Defaults to None. None
encoding (str, None) Encoding used to identify the different series. Defaults to None. None

Returns

Name Type Description
None None

Examples

>>> from spotforecast2_safe.forecaster.utils import check_residuals_input
>>> import numpy as np
>>> forecaster_name = "ForecasterRecursiveMultiSeries"
>>> use_in_sample_residuals = True
>>> in_sample_residuals_ = {'series_1': np.array([0.1, -0.2]), 'series_2': np.array([0.3, -0.1])}
>>> out_sample_residuals_ = None
>>> use_binned_residuals = False
>>> check_residuals_input(
...     forecaster_name,
...     use_in_sample_residuals,
...     in_sample_residuals_,
...     out_sample_residuals_,
...     use_binned_residuals,
...     in_sample_residuals_by_bin_=None,
...     out_sample_residuals_by_bin_=None,
...     levels=['series_1', 'series_2'],
...     encoding='onehot'
... )

date_to_index_position

forecaster.utils.date_to_index_position(
    index,
    date_input,
    method='prediction',
    date_literal='steps',
    kwargs_pd_to_datetime={},
)

Transform a datetime string or pandas Timestamp to an integer position.

The integer represents the position of the datetime in the index.

Parameters

Name Type Description Default
index pd.Index Original datetime index. required
date_input (int, str, pd.Timestamp) Datetime to transform. required
method str Strategy to use. Options: ‘prediction’, ‘validation’. Defaults to 'prediction'. 'prediction'
date_literal str Variable name used in error messages. Defaults to 'steps'. 'steps'
kwargs_pd_to_datetime dict Keyword arguments for pd.to_datetime(). Defaults to {}. {}

Returns

Name Type Description
int int date_input transformed to integer position in the index.

Raises

Name Type Description
ValueError If method is not ‘prediction’ or ‘validation’.
TypeError If date_input is not an int, str, or pandas Timestamp.
TypeError If index is not a pandas DatetimeIndex when date_input is not an int.
ValueError If date_input is a date that does not meet the requirement of the selected method.

Examples

>>> from spotforecast2_safe.forecaster.utils import date_to_index_position
>>> import pandas as pd
>>> index = pd.date_range(start='2020-01-01', periods=10, freq='D')
>>> # Using an integer input
>>> date_to_index_position(index, 5)
5
>>> # Using a date input for prediction
>>> date_to_index_position(index, '2020-01-15', method='prediction')
5
>>> # Using a date input for validation
>>> date_to_index_position(index, '2020-01-05', method='validation')
5

exog_to_direct

forecaster.utils.exog_to_direct(exog, steps)

Transforms exog to a pandas DataFrame with the shape needed for Direct forecasting.

Parameters

Name Type Description Default
exog (pd.Series, pd.DataFrame) Exogenous variables. required
steps int Number of steps that will be predicted using exog. required

Returns

Name Type Description
tuple tuple[pd.DataFrame, list[str]] A tuple containing: - exog_direct (pd.DataFrame): Exogenous variables transformed. - exog_direct_names (list): Names of the columns of the exogenous variables transformed.
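
Examples

As a rough illustration of the reshaping, the sketch below builds one shifted copy of exog per step and places the copies side by side, which reproduces the windowing shown in the exog_to_direct_numpy example. It is a sketch under stated assumptions rather than the library's code: `exog_to_direct_sketch` is a hypothetical name, and the `<column>_step_<i>` naming scheme and the reset index are assumptions.

```python
import pandas as pd


def exog_to_direct_sketch(exog, steps):
    """Hypothetical helper (not part of the library): one shifted block per step."""
    if isinstance(exog, pd.Series):
        exog = exog.to_frame()
    n_rows = len(exog)
    blocks = []
    for i in range(steps):
        # Rows i .. n_rows - (steps - 1 - i) - 1: the values seen at step i + 1.
        block = exog.iloc[i : n_rows - (steps - 1 - i)].reset_index(drop=True)
        # Assumed naming scheme for the output columns.
        block.columns = [f"{col}_step_{i + 1}" for col in exog.columns]
        blocks.append(block)
    exog_direct = pd.concat(blocks, axis=1)
    return exog_direct, exog_direct.columns.to_list()


exog = pd.DataFrame({"temp": [10, 20, 30, 40, 50]})
exog_direct, exog_direct_names = exog_to_direct_sketch(exog, steps=3)
print(exog_direct)
print(exog_direct_names)
```

With 5 rows and 3 steps, the result has 5 - 3 + 1 = 3 rows and one column per (variable, step) pair.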

exog_to_direct_numpy

forecaster.utils.exog_to_direct_numpy(exog, steps)

Transforms exog to numpy ndarray with the shape needed for Direct forecasting.

Parameters

Name Type Description Default
exog (np.ndarray, pd.Series, pd.DataFrame) Exogenous variables, shape (samples,). If exog is a pandas format, the direct exog names are created. required
steps int Number of steps that will be predicted using exog. required

Returns

Name Type Description
tuple tuple[np.ndarray, list[str] | None] A tuple containing: - exog_direct (np.ndarray): Exogenous variables transformed. - exog_direct_names (list, None): Names of the columns of the exogenous variables transformed. Only created if exog is a pandas format.

Examples

>>> from spotforecast2_safe.forecaster.utils import exog_to_direct_numpy
>>> import numpy as np
>>> exog = np.array([10, 20, 30, 40, 50])
>>> steps = 3
>>> exog_direct, exog_direct_names = exog_to_direct_numpy(exog, steps)
>>> print(exog_direct)
[[10 20 30]
 [20 30 40]
 [30 40 50]]
>>> print(exog_direct_names)
None

get_style_repr_html

forecaster.utils.get_style_repr_html(is_fitted=False)

Generate CSS style for HTML representation of the Forecaster.

Creates a unique CSS style block with a container ID for rendering forecaster objects in Jupyter notebooks or HTML documents. The styling provides a clean, monospace display with a light gray background.

Parameters

Name Type Description Default
is_fitted bool Parameter to indicate if the Forecaster has been fitted. Currently not used in styling but reserved for future extensions. False

Returns

Name Type Description
tuple Tuple[str, str] A tuple containing: - style (str): CSS style block as a string with unique container class. - unique_id (str): Unique 8-character ID for the container element.

Examples

>>> from spotforecast2_safe.forecaster.utils import get_style_repr_html
>>> style, uid = get_style_repr_html(is_fitted=True)
>>> print(f"Container ID: {uid}")
Container ID: a1b2c3d4
>>> print(f"Style contains CSS: {'container-' in style}")
Style contains CSS: True

Using in HTML rendering:

>>> style, uid = get_style_repr_html(is_fitted=False)
>>> html = f"{style}<div class='container-{uid}'>Forecaster Info</div>"
>>> print("background-color" in html)
True

initialize_differentiator_multiseries

forecaster.utils.initialize_differentiator_multiseries(
    series_names_in_,
    differentiator=None,
)

Initialize differentiator_ attribute for multiseries forecasters.

Parameters

Name Type Description Default
series_names_in_ list Names of the series (levels) used during training. required
differentiator (object, dict, None) Skforecast object (or dict of objects) used to differentiate the time series. Defaults to None. None

Returns

Name Type Description
dict dict[str, object | None] Dictionary with the differentiator for each series.
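
Examples

The three accepted input forms (None, a single object, or a dict) suggest a mapping like the one sketched below. This is an assumption-based illustration, not the library's code: `ToyDifferentiator` and `init_differentiator_sketch` are hypothetical names, and copying the object per series is assumed by analogy with the cloning described for initialize_transformer_series.

```python
from copy import deepcopy


class ToyDifferentiator:
    """Hypothetical stand-in for a differentiator object."""

    def __init__(self, order=1):
        self.order = order


def init_differentiator_sketch(series_names_in_, differentiator=None):
    """Hypothetical helper: build one differentiator entry per series."""
    if differentiator is None:
        # No differentiation for any series.
        return {name: None for name in series_names_in_}
    if isinstance(differentiator, dict):
        # Per-series objects; series missing from the dict get None.
        return {name: deepcopy(differentiator.get(name)) for name in series_names_in_}
    # Single object: an independent copy for every series.
    return {name: deepcopy(differentiator) for name in series_names_in_}


names = ["series_1", "series_2"]
shared = init_differentiator_sketch(names, ToyDifferentiator(order=1))
per_series = init_differentiator_sketch(names, {"series_1": ToyDifferentiator(order=2)})
print(shared["series_1"] is shared["series_2"])  # False: independent copies
print(per_series["series_2"])  # None: no differentiator given for series_2
```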

initialize_estimator

forecaster.utils.initialize_estimator(estimator=None, regressor=None)

Handle the deprecation of ‘regressor’ in favor of ‘estimator’.

Parameters

Name Type Description Default
estimator (object, None) Estimator or pipeline compatible with the scikit-learn API. Defaults to None. None
regressor (object, None) Deprecated. Alias for estimator. Defaults to None. None

Returns

Name Type Description
object object The valid estimator object.

Raises

Name Type Description
ValueError If both estimator and regressor are provided.

Examples

>>> from spotforecast2_safe.forecaster.utils import initialize_estimator
>>> from sklearn.linear_model import LinearRegression
>>> # Using the `estimator` argument
>>> estimator = LinearRegression()
>>> initialize_estimator(estimator=estimator)
LinearRegression()
>>> # Using the deprecated `regressor` argument
>>> regressor = LinearRegression()
>>> initialize_estimator(regressor=regressor)
LinearRegression()

initialize_transformer_series

forecaster.utils.initialize_transformer_series(
    forecaster_name,
    series_names_in_,
    encoding=None,
    transformer_series=None,
)

Initialize transformer_series_ attribute for multivariate/multiseries forecasters.

Creates a dictionary of transformers for each time series in multivariate or multiseries forecasting. Handles three cases: no transformation (None), same transformer for all series (single object), or different transformers per series (dictionary). Clones transformer objects to avoid overwriting.

Parameters

Name Type Description Default
forecaster_name str Name of the forecaster using this function. Special handling is applied for ‘ForecasterRecursiveMultiSeries’. required
series_names_in_ list[str] Names of the time series (levels) used during training. These will be the keys in the returned transformer dictionary. required
encoding str | None Encoding used to identify different series. Only used for ForecasterRecursiveMultiSeries. If None, creates a single ‘_unknown_level’ entry. Defaults to None. None
transformer_series object | dict[str, object | None] | None Transformer(s) to apply to series. Can be: - None: No transformation applied - Single transformer object: Same transformer cloned for all series - Dict mapping series names to transformers: Different transformer per series Defaults to None. None

Returns

Name Type Description
dict dict[str, object | None] Dictionary with series names as keys and transformer objects (or None) as values. Transformers are cloned to prevent overwriting.

Warns

If transformer_series is a dict and some series_names_in_ are not present in the dict keys (those series get no transformation).

Examples

No transformation:

>>> from spotforecast2_safe.forecaster.utils import initialize_transformer_series
>>> series = ['series1', 'series2', 'series3']
>>> result = initialize_transformer_series(
...     forecaster_name='ForecasterDirectMultiVariate',
...     series_names_in_=series,
...     transformer_series=None
... )
>>> print(result)
{'series1': None, 'series2': None, 'series3': None}

Same transformer for all series:

>>> from sklearn.preprocessing import StandardScaler
>>> scaler = StandardScaler()
>>> result = initialize_transformer_series(
...     forecaster_name='ForecasterDirectMultiVariate',
...     series_names_in_=['series1', 'series2'],
...     transformer_series=scaler
... )
>>> len(result)
2
>>> all(isinstance(v, StandardScaler) for v in result.values())
True
>>> result['series1'] is result['series2']  # Different clones
False

Different transformer per series:

>>> from sklearn.preprocessing import MinMaxScaler
>>> transformers = {
...     'series1': StandardScaler(),
...     'series2': MinMaxScaler()
... }
>>> result = initialize_transformer_series(
...     forecaster_name='ForecasterDirectMultiVariate',
...     series_names_in_=['series1', 'series2'],
...     transformer_series=transformers
... )
>>> isinstance(result['series1'], StandardScaler)
True
>>> isinstance(result['series2'], MinMaxScaler)
True

initialize_window_features

forecaster.utils.initialize_window_features(window_features)

Check window_features argument input and generate the corresponding list.

This function validates window feature objects and extracts their metadata, ensuring they have the required attributes (window_sizes, features_names) and methods (transform_batch, transform) for proper forecasting operations.

Parameters

Name Type Description Default
window_features Any Classes used to create window features. Can be a single object or a list of objects. Each object must have window_sizes, features_names attributes and transform_batch, transform methods. required

Returns

Name Type Description
tuple Tuple[Optional[List[object]], Optional[List[str]], Optional[int]] A tuple containing: - window_features (list or None): List of classes used to create window features. - window_features_names (list or None): List with all the features names of the window features. - max_size_window_features (int or None): Maximum value of the window_sizes attribute of all classes.

Raises

Name Type Description
ValueError If window_features is an empty list.
ValueError If a window feature is missing required attributes or methods.
TypeError If window_sizes or features_names have incorrect types.

Examples

>>> from spotforecast2_safe.forecaster.preprocessing import RollingFeatures
>>> from spotforecast2_safe.forecaster.utils import initialize_window_features
>>> wf = RollingFeatures(stats=['mean', 'std'], window_sizes=[7, 14])
>>> wf_list, names, max_size = initialize_window_features(wf)
>>> print(f"Max window size: {max_size}")
Max window size: 14
>>> print(f"Number of features: {len(names)}")
Number of features: 4

Multiple window features:

>>> class MockWF:
...     def __init__(self, names, sizes):
...         self.features_names = names
...         self.window_sizes = sizes
...     def transform_batch(self, X): pass
...     def transform(self, X): pass
>>> wf1 = MockWF(['f1'], 7)
>>> wf2 = MockWF(['f2', 'f3'], 3)
>>> wf_list, names, max_size = initialize_window_features([wf1, wf2])
>>> print(f"Max window size: {max_size}")
Max window size: 7

predict_multivariate

forecaster.utils.predict_multivariate(
    forecasters,
    steps_ahead,
    exog=None,
    show_progress=False,
)

Generate multi-output predictions using multiple baseline forecasters.

Parameters

Name Type Description Default
forecasters dict Dictionary of fitted forecaster instances (one per target). Keys are target names, values are the fitted forecasters (e.g., ForecasterRecursive, ForecasterEquivalentDate). required
steps_ahead int Number of steps to forecast. required
exog (pd.DataFrame, None) Exogenous variables for prediction. If provided, they are passed to each forecaster’s predict method. Defaults to None. None
show_progress bool Show a progress bar while predicting with each target forecaster. Defaults to False. False

Returns

Name Type Description
pd.DataFrame pd.DataFrame DataFrame with predictions for all targets.

Examples

>>> import pandas as pd
>>> from sklearn.linear_model import LinearRegression
>>> from spotforecast2_safe.forecaster.recursive import ForecasterRecursive
>>> from spotforecast2_safe.forecaster.utils import predict_multivariate
>>> y1 = pd.Series([1, 2, 3, 4, 5])
>>> y2 = pd.Series([2, 4, 6, 8, 10])
>>> f1 = ForecasterRecursive(estimator=LinearRegression(), lags=2)
>>> f2 = ForecasterRecursive(estimator=LinearRegression(), lags=2)
>>> f1.fit(y=y1)
>>> f2.fit(y=y2)
>>> forecasters = {'target1': f1, 'target2': f2}
>>> predictions = predict_multivariate(forecasters, steps_ahead=2)
>>> predictions
   target1  target2
5      6.0     12.0
6      7.0     14.0

prepare_levels_multiseries

forecaster.utils.prepare_levels_multiseries(
    X_train_series_names_in_,
    levels=None,
)

Prepare list of levels to be predicted in multiseries Forecasters.

Parameters

Name Type Description Default
X_train_series_names_in_ list Names of the series (levels) included in the matrix X_train. required
levels (str, list, None) Names of the series (levels) to be predicted. Defaults to None. None

Returns

Name Type Description
tuple tuple[list[str], bool] A tuple containing: - levels (list): Names of the series (levels) to be predicted. - input_levels_is_list (bool): Indicates if input levels argument is a list.
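
Examples

The normalization implied by the parameter and return descriptions can be sketched in a few lines. `prepare_levels_sketch` is a hypothetical name, and the fallback to all training series when levels is None is an assumption drawn from the parameter descriptions, not confirmed library behavior.

```python
def prepare_levels_sketch(X_train_series_names_in_, levels=None):
    """Hypothetical helper: normalize `levels` to a list and record its input form."""
    input_levels_is_list = isinstance(levels, list)
    if levels is None:
        # Assumption: predict every series seen during training.
        levels = list(X_train_series_names_in_)
    elif isinstance(levels, str):
        # A single series name becomes a one-element list.
        levels = [levels]
    return levels, input_levels_is_list


trained = ["series_1", "series_2", "series_3"]
print(prepare_levels_sketch(trained))                            # all series, False
print(prepare_levels_sketch(trained, "series_2"))                # one series, False
print(prepare_levels_sketch(trained, ["series_1", "series_3"]))  # two series, True
```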

prepare_steps_direct

forecaster.utils.prepare_steps_direct(max_step, steps=None)

Prepare list of steps to be predicted in Direct Forecasters.

Parameters

Name Type Description Default
max_step (int, list, np.ndarray) Maximum number of future steps the forecaster will predict. required
steps (int, list, None) Predict n steps. The value of steps must be less than or equal to the value of steps defined when initializing the forecaster. Starts at 1. Defaults to None. None

Returns

Name Type Description
list list[int] Steps to be predicted.

Examples

>>> from spotforecast2_safe.forecaster.utils import prepare_steps_direct
>>> max_step = 5
>>> steps = 3
>>> prepare_steps_direct(max_step, steps)
[1, 2, 3]
>>> max_step = 5
>>> steps = [1, 3, 5]
>>> prepare_steps_direct(max_step, steps)
[1, 3, 5]
>>> max_step = 5
>>> steps = None
>>> prepare_steps_direct(max_step, steps)
[1, 2, 3, 4, 5]

preprocess_levels_self_last_window_multiseries

forecaster.utils.preprocess_levels_self_last_window_multiseries(
    levels,
    input_levels_is_list,
    last_window_,
)

Preprocess levels and last_window arguments for prediction.

Only levels whose last window ends at the same datetime index will be predicted together.

Parameters

Name Type Description Default
levels list Names of the series (levels) to be predicted. required
input_levels_is_list bool Indicates if input levels argument is a list. required
last_window_ dict Dictionary with the last window of each series. required

Returns

Name Type Description
tuple tuple[list[str], pd.DataFrame] A tuple containing: - levels (list): Names of the series (levels) to be predicted. - last_window (pd.DataFrame): Series values used to create predictors.
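
Examples

To make the "same ending datetime" rule concrete, the sketch below keeps only the levels whose last window ends at the most recent timestamp and stacks them into a DataFrame. It is an illustration under assumptions, not the library's code: `select_levels_sketch` is a hypothetical name, and choosing the latest end date as the reference is an assumption.

```python
import pandas as pd


def select_levels_sketch(levels, last_window_):
    """Hypothetical helper: keep levels whose last window ends at the latest index."""
    last_index = {k: last_window_[k].index[-1] for k in levels}
    # Assumption: the most recent end date is the reference point.
    latest = max(last_index.values())
    kept = [k for k in levels if last_index[k] == latest]
    last_window = pd.DataFrame({k: last_window_[k] for k in kept})
    return kept, last_window


dates = pd.date_range("2020-01-01", periods=4, freq="D")
last_window_ = {
    "s1": pd.Series([1.0, 2.0, 3.0], index=dates[1:]),
    "s2": pd.Series([4.0, 5.0, 6.0], index=dates[1:]),
    "s3": pd.Series([7.0, 8.0, 9.0], index=dates[:3]),  # ends one day earlier
}

kept, last_window = select_levels_sketch(["s1", "s2", "s3"], last_window_)
print(kept)
print(last_window)
```

Here s3 is dropped because its window ends on 2020-01-03 while the others end on 2020-01-04.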

select_n_jobs_fit_forecaster

forecaster.utils.select_n_jobs_fit_forecaster(forecaster_name, estimator)

Select the number of jobs to run in parallel during the fit process.

This function determines the optimal number of parallel processes for fitting the forecaster based on the available system resources. In safety-critical environments, this helps manage computational load and ensures system predictability.

Parameters

Name Type Description Default
forecaster_name str Name of the forecaster being fitted. required
estimator object The estimator object being used by the forecaster. required

Returns

Name Type Description
int int The number of jobs (CPUs) to use for parallel processing.

set_cpu_gpu_device

forecaster.utils.set_cpu_gpu_device(estimator, device='cpu')

Set the device for the estimator.

Parameters

Name Type Description Default
estimator object Estimator object. required
device (str, None) Device to use. One of ‘cpu’, ‘gpu’, ‘cuda’, or None. Defaults to ‘cpu’. 'cpu'

Returns

Name Type Description
str | None str | None The original device of the estimator.
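
Examples

Returning the original device suggests a switch-and-restore pattern, sketched below with a toy estimator. Nothing here is the library's implementation: `ToyEstimator` and `set_device_sketch` are hypothetical, and the assumption that the device is exposed as a `device` parameter via get_params/set_params may not hold for real estimators, which can name this setting differently.

```python
class ToyEstimator:
    """Hypothetical estimator exposing a `device` parameter."""

    def __init__(self, device="cuda"):
        self.device = device

    def get_params(self):
        return {"device": self.device}

    def set_params(self, **params):
        for key, value in params.items():
            setattr(self, key, value)
        return self


def set_device_sketch(estimator, device="cpu"):
    """Hypothetical helper: switch the device and return the previous one."""
    original = estimator.get_params().get("device")
    if device is not None and original != device:
        estimator.set_params(device=device)
    return original


est = ToyEstimator(device="cuda")
original_device = set_device_sketch(est, device="cpu")  # run a step on the CPU
device_during = est.device
set_device_sketch(est, device=original_device)          # restore afterwards
print(original_device, device_during, est.device)
```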

transform_numpy

forecaster.utils.transform_numpy(
    array,
    transformer,
    fit=False,
    inverse_transform=False,
)

Transform raw values of a numpy ndarray with a scikit-learn-like transformer, preprocessor or ColumnTransformer. The transformer must provide the methods fit, transform, fit_transform and inverse_transform. ColumnTransformers cannot be used with inverse_transform=True, since they do not have an inverse_transform method.

Parameters

Name Type Description Default
array np.ndarray Array to be transformed. required
transformer (object, None) Scikit-learn-like transformer, preprocessor, or ColumnTransformer with methods: fit, transform, fit_transform and inverse_transform. required
fit bool Train the transformer before applying it. Defaults to False. False
inverse_transform bool Transform the data back to the original representation. Not available when the transformer is a scikit-learn ColumnTransformer. Defaults to False. False

Returns

Name Type Description
np.ndarray np.ndarray Transformed array.

Raises

Name Type Description
TypeError If array is not a numpy ndarray.
TypeError If transformer is not a scikit-learn-like transformer.
ValueError If inverse_transform is True and transformer is a ColumnTransformer.

Examples

>>> from spotforecast2_safe.forecaster.utils import transform_numpy
>>> from sklearn.preprocessing import StandardScaler
>>> import numpy as np
>>> array = np.array([1, 2, 3, 4, 5])
>>> transformer = StandardScaler()
>>> array_transformed = transform_numpy(array, transformer, fit=True)
>>> print(array_transformed)
[-1.41421356 -0.70710678  0.          0.70710678  1.41421356]
>>> array_inversed = transform_numpy(array_transformed, transformer, inverse_transform=True)
>>> print(array_inversed)
[1. 2. 3. 4. 5.]