forecaster.utils
Functions
| Name | Description |
|---|---|
| align_series_and_exog_multiseries | Align series and exog according to their index. |
| check_extract_values_and_index | Extract values and index from a pandas Series or DataFrame, ensuring they are valid. |
| check_preprocess_series | Check and preprocess series argument in ForecasterRecursiveMultiSeries class. |
| check_residuals_input | Check residuals input arguments in Forecasters. |
| date_to_index_position | Transform a datetime string or pandas Timestamp to an integer position. |
| exog_to_direct | Transforms exog to a pandas DataFrame with the shape needed for Direct forecasting. |
| exog_to_direct_numpy | Transforms exog to a numpy ndarray with the shape needed for Direct forecasting. |
| get_style_repr_html | Generate CSS style for HTML representation of the Forecaster. |
| initialize_differentiator_multiseries | Initialize differentiator_ attribute for multiseries forecasters. |
| initialize_estimator | Handle the deprecation of ‘regressor’ in favor of ‘estimator’. |
| initialize_transformer_series | Initialize transformer_series_ attribute for multivariate/multiseries forecasters. |
| initialize_window_features | Check window_features argument input and generate the corresponding list. |
| predict_multivariate | Generate multi-output predictions using multiple baseline forecasters. |
| prepare_levels_multiseries | Prepare list of levels to be predicted in multiseries Forecasters. |
| prepare_steps_direct | Prepare list of steps to be predicted in Direct Forecasters. |
| preprocess_levels_self_last_window_multiseries | Preprocess levels and last_window arguments for prediction. |
| select_n_jobs_fit_forecaster | Select the number of jobs to run in parallel during the fit process. |
| set_cpu_gpu_device | Set the device for the estimator. |
| transform_numpy | Transform raw values of a numpy ndarray with a scikit-learn-like transformer, preprocessor or ColumnTransformer. |
align_series_and_exog_multiseries
forecaster.utils.align_series_and_exog_multiseries(series_dict, exog_dict=None)
Align series and exog according to their index.
Leading and trailing NaNs are removed from all series in series_dict. If needed, reindexing is applied to exog_dict.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| series_dict | dict | Dictionary with the series used during training. | required |
| exog_dict | (dict, None) | Dictionary with the exogenous variable/s used during training. Defaults to None. | None |
Returns
| Name | Type | Description |
|---|---|---|
| tuple | tuple[dict[str, pd.Series], dict[str, pd.DataFrame \| None]] | A tuple containing: - series_dict (dict): Dictionary with the aligned series. - exog_dict (dict): Dictionary with the aligned exogenous variables. |
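The trimming and reindexing described above can be illustrated with plain pandas. This is a minimal sketch under stated assumptions, not the library's implementation: `trim_and_align` is a hypothetical helper, and it assumes that interior NaNs are kept and that exog is simply reindexed to the trimmed series index.

```python
from typing import Optional

import numpy as np
import pandas as pd


def trim_and_align(series: pd.Series, exog: Optional[pd.DataFrame] = None):
    """Hypothetical helper: drop edge NaNs, then align exog to the trimmed index."""
    first = series.first_valid_index()
    last = series.last_valid_index()
    trimmed = series.loc[first:last]  # interior NaNs are kept
    aligned_exog = None if exog is None else exog.reindex(trimmed.index)
    return trimmed, aligned_exog


dates = pd.date_range("2020-01-01", periods=6, freq="D")
y = pd.Series([np.nan, 1.0, 2.0, np.nan, 4.0, np.nan], index=dates, name="series_1")
x = pd.DataFrame({"temp": [10, 11, 12, 13, 14, 15]}, index=dates)

y_trimmed, x_aligned = trim_and_align(y, x)
print(y_trimmed.index[0], y_trimmed.index[-1])
```

Only the NaNs at the edges are removed here; the gap on 2020-01-04 stays in place, so the trimmed series still spans a regular daily index.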
check_extract_values_and_index
forecaster.utils.check_extract_values_and_index(
data,
data_label='`y`',
ignore_freq=False,
return_values=True,
)
Extract values and index from a pandas Series or DataFrame, ensuring they are valid.
Validates that the input data has a proper DatetimeIndex or RangeIndex and extracts its values and index for use in forecasting operations. Optionally checks for index frequency consistency.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| data | Union[pd.Series, pd.DataFrame] | Input data (pandas Series or DataFrame) to extract values and index from. | required |
| data_label | str | Label used in exception messages for better error reporting. Defaults to 'y'. | 'y' |
| ignore_freq | bool | If True, the frequency of the index is not checked. Defaults to False. | False |
| return_values | bool | If True, the values of the data are returned. Defaults to True. | True |
Returns
| Name | Type | Description |
|---|---|---|
| tuple | Tuple[Optional[np.ndarray], pd.Index] | A tuple containing: - values (numpy.ndarray or None): Values of the data as numpy array, or None if return_values is False. - index (pandas.Index): Index of the data. |
Raises
| Name | Type | Description |
|---|---|---|
| | TypeError | If data is not a pandas Series or DataFrame. |
| | TypeError | If data index is not a DatetimeIndex or RangeIndex. |
Warns
If DatetimeIndex has no frequency (inferred automatically).
Examples
>>> import pandas as pd
>>> import numpy as np
>>> from spotforecast2_safe.forecaster.utils import check_extract_values_and_index
>>> dates = pd.date_range('2020-01-01', periods=10, freq='D')
>>> series = pd.Series(np.arange(10), index=dates)
>>> values, index = check_extract_values_and_index(series)
>>> print(values.shape)
(10,)
>>> print(type(index))
<class 'pandas.core.indexes.datetimes.DatetimeIndex'>
Extract index only:
>>> _, index = check_extract_values_and_index(series, return_values=False)
>>> print(index[0])
2020-01-01 00:00:00
check_preprocess_series
forecaster.utils.check_preprocess_series(series)
Check and preprocess series argument in ForecasterRecursiveMultiSeries class.
- If `series` is a wide-format pandas DataFrame, each column represents a
different time series, and the index must be either a `DatetimeIndex` or
a `RangeIndex` with frequency or step size, as appropriate.
- If `series` is a long-format pandas DataFrame with a MultiIndex, the
first level of the index must contain the series IDs, and the second
level must be a `DatetimeIndex` with the same frequency across all series.
- If series is a dictionary, each key must be a series ID, and each value
must be a named pandas Series. All series must have the same index, which
must be either a `DatetimeIndex` or a `RangeIndex`, and they must share the
same frequency or step size, as appropriate.
When series is a pandas DataFrame, it is converted to a dictionary of pandas Series, where the keys are the series IDs and the values are the Series with the same index as the original DataFrame.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| series | (pd.DataFrame, dict) | pandas DataFrame or dictionary of pandas Series/DataFrames. | required |
Returns
| Name | Type | Description |
|---|---|---|
| tuple | tuple[dict[str, pd.Series], dict[str, pd.Index]] | A tuple containing: - series_dict (dict): Dictionary where keys are series IDs and values are pandas Series. - series_indexes (dict): Dictionary where keys are series IDs and values are the index of each series. |
Raises
| Name | Type | Description |
|---|---|---|
| | TypeError | If series is not a pandas DataFrame or a dictionary of pandas Series/DataFrames. |
| | TypeError | If the index of series is not a DatetimeIndex or RangeIndex with frequency/step size. |
| | ValueError | If the series in series have different frequencies or step sizes. |
| | ValueError | If all values of any series are NaN. |
Warns
If series is a wide-format DataFrame, only the first column will be used as series values.
If series is a DataFrame (either wide or long format), additional internal transformations are required, which can increase computational time. It is recommended to use a dictionary of pandas Series instead.
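Since a dictionary of pandas Series is the recommended input, it can help to convert a long-format DataFrame before calling the function. The following is a minimal plain-pandas sketch of such a conversion; the column name `value` and the level names `series_id` and `date` are assumptions chosen for illustration, and the conversion is not necessarily what the function does internally.

```python
import pandas as pd

dates = pd.date_range("2020-01-01", periods=3, freq="D")
index = pd.MultiIndex.from_product(
    [["series_1", "series_2"], dates], names=["series_id", "date"]
)
df_long = pd.DataFrame({"value": [1, 2, 3, 3, 2, 1]}, index=index)

# Build one named Series per ID, indexed by the date level only.
series_dict = {
    series_id: group["value"].droplevel("series_id").rename(series_id)
    for series_id, group in df_long.groupby(level="series_id")
}

print(list(series_dict))
```

Each value is a Series named after its ID with a `DatetimeIndex`, which matches the dictionary layout described above.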
Examples
>>> import pandas as pd
>>> from spotforecast2_safe.forecaster.utils import check_preprocess_series
>>> # Example with wide-format DataFrame
>>> dates = pd.date_range('2020-01-01', periods=5, freq='D')
>>> df_wide = pd.DataFrame({
... 'series_1': [1, 2, 3, 4, 5],
... 'series_2': [5, 4, 3, 2, 1],
... }, index=dates)
>>> series_dict, series_indexes = check_preprocess_series(df_wide)
UserWarning: `series` DataFrame has multiple columns. Only the values of first column, 'series_1', will be used as series values. All other columns will be ignored.
UserWarning: Passing a DataFrame (either wide or long format) as `series` requires additional internal transformations, which can increase computational time.
It is recommended to use a dictionary of pandas Series instead.
>>> print(series_dict['series_1'])
2020-01-01 1
2020-01-02 2
2020-01-03 3
2020-01-04 4
2020-01-05 5
Name: series_1, dtype: int64
>>> print(series_indexes['series_1'])
DatetimeIndex(['2020-01-01', '2020-01-02', '2020-01-03', '2020-01-04',
'2020-01-05'],
dtype='datetime64[ns]', freq='D')
>>> # Example with long-format DataFrame
>>> df_long = pd.DataFrame({
... 'series_id': ['series_1'] * 5 + ['series_2'] * 5,
... 'value': [1, 2, 3, 4, 5, 5, 4, 3, 2, 1],
... }, index=pd.MultiIndex.from_product([['series_1', 'series_2'], dates], names=['series_id', 'date']))
>>> series_dict, series_indexes = check_preprocess_series(df_long)
UserWarning: `series` DataFrame has multiple columns. Only the values of first column, 'value', will be used as series values. All other columns will be ignored.
UserWarning: Passing a DataFrame (either wide or long format) as `series` requires additional internal transformations, which can increase computational time.
It is recommended to use a dictionary of pandas Series instead.
>>> print(series_dict['series_1'])
2020-01-01 1
2020-01-02 2
2020-01-03 3
2020-01-04 4
2020-01-05 5
Name: series_1, dtype: int64
>>> print(series_indexes['series_1'])
DatetimeIndex(['2020-01-01', '2020-01-02', '2020-01-03', '2020-01-04',
'2020-01-05'],
dtype='datetime64[ns]', freq='D')
>>> # Example with dictionary of Series
>>> series_dict_input = {
... 'series_1': pd.Series([1, 2, 3, 4, 5], index=dates),
... 'series_2': pd.Series([5, 4, 3, 2, 1], index=dates),
... }
>>> series_dict, series_indexes = check_preprocess_series(series_dict_input)
>>> print(series_dict['series_1'])
2020-01-01 1
2020-01-02 2
2020-01-03 3
2020-01-04 4
2020-01-05 5
Name: series_1, dtype: int64
>>> print(series_indexes['series_1'])
DatetimeIndex(['2020-01-01', '2020-01-02', '2020-01-03', '2020-01-04',
'2020-01-05'],
dtype='datetime64[ns]', freq='D')
>>> # Example with dictionary of DataFrames
>>> df_series_1 = pd.DataFrame({'value': [1, 2, 3, 4, 5]}, index=dates)
>>> df_series_2 = pd.DataFrame({'value': [5, 4, 3, 2, 1]}, index=dates)
>>> series_dict_input = {
... 'series_1': df_series_1,
... 'series_2': df_series_2,
... }
>>> series_dict, series_indexes = check_preprocess_series(series_dict_input)
>>> print(series_dict['series_1'])
2020-01-01 1
2020-01-02 2
2020-01-03 3
2020-01-04 4
2020-01-05 5
Name: series_1, dtype: int64
>>> print(series_indexes['series_1'])
DatetimeIndex(['2020-01-01', '2020-01-02', '2020-01-03', '2020-01-04',
'2020-01-05'],
dtype='datetime64[ns]', freq='D')
check_residuals_input
forecaster.utils.check_residuals_input(
forecaster_name,
use_in_sample_residuals,
in_sample_residuals_,
out_sample_residuals_,
use_binned_residuals,
in_sample_residuals_by_bin_,
out_sample_residuals_by_bin_,
levels=None,
encoding=None,
)
Check residuals input arguments in Forecasters.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| forecaster_name | str | Forecaster name. | required |
| use_in_sample_residuals | bool | Indicates if in sample or out sample residuals are used. | required |
| in_sample_residuals_ | (np.ndarray, dict, None) | Residuals of the model when predicting training data. | required |
| out_sample_residuals_ | (np.ndarray, dict, None) | Residuals of the model when predicting non training data. | required |
| use_binned_residuals | bool | Indicates if residuals are binned. | required |
| in_sample_residuals_by_bin_ | (dict, None) | In sample residuals binned according to the predicted value. | required |
| out_sample_residuals_by_bin_ | (dict, None) | Out of sample residuals binned according to the predicted value. | required |
| levels | (list, None) | Names of the series (levels) to be predicted. Defaults to None. | None |
| encoding | (str, None) | Encoding used to identify the different series. Defaults to None. | None |
Returns
| Name | Type | Description |
|---|---|---|
| None | None |
Examples
>>> from spotforecast2_safe.forecaster.utils import check_residuals_input
>>> import numpy as np
>>> forecaster_name = "ForecasterRecursiveMultiSeries"
>>> use_in_sample_residuals = True
>>> in_sample_residuals_ = {'series_1': np.array([0.1, -0.2]), 'series_2': np.array([0.3, -0.1])}
>>> out_sample_residuals_ = None
>>> use_binned_residuals = False
>>> check_residuals_input(
... forecaster_name,
... use_in_sample_residuals,
... in_sample_residuals_,
... out_sample_residuals_,
... use_binned_residuals,
... in_sample_residuals_by_bin_=None,
... out_sample_residuals_by_bin_=None,
... levels=['series_1', 'series_2'],
... encoding='onehot'
... )
date_to_index_position
forecaster.utils.date_to_index_position(
index,
date_input,
method='prediction',
date_literal='steps',
kwargs_pd_to_datetime={},
)
Transform a datetime string or pandas Timestamp to an integer position.
The integer represents the position of the datetime in the index.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| index | pd.Index | Original datetime index. | required |
| date_input | (int, str, pd.Timestamp) | Datetime to transform. | required |
| method | str | Strategy to use. Options: ‘prediction’, ‘validation’. Defaults to 'prediction'. | 'prediction' |
| date_literal | str | Variable name used in error messages. Defaults to 'steps'. | 'steps' |
| kwargs_pd_to_datetime | dict | Keyword arguments for pd.to_datetime(). Defaults to {}. | {} |
Returns
| Name | Type | Description |
|---|---|---|
| int | int | date_input transformed to integer position in the index. |
Raises
| Name | Type | Description |
|---|---|---|
| | ValueError | If method is not ‘prediction’ or ‘validation’. |
| | TypeError | If date_input is not an int, str, or pandas Timestamp. |
| | TypeError | If index is not a pandas DatetimeIndex when date_input is not an int. |
| | ValueError | If date_input is a date and does not meet the requirements of the selected method. |
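One way to picture the 'prediction' case is to count how many periods lie between the end of the index and the target date. The sketch below uses only pandas; `steps_until` is a hypothetical helper written for illustration, and the rule that the date must fall after the last index value is an assumption, not a statement about the library's internals.

```python
import pandas as pd


def steps_until(index: pd.DatetimeIndex, date) -> int:
    """Hypothetical helper: periods between the end of `index` and `date`."""
    target = pd.to_datetime(date)
    if target <= index[-1]:
        # Assumption: a prediction date must be later than the training index.
        raise ValueError("`date` must be later than the last index value.")
    span = pd.date_range(start=index[-1], end=target, freq=index.freq)
    return len(span) - 1  # exclude the last observed timestamp itself


index = pd.date_range(start="2020-01-01", periods=10, freq="D")
n_steps = steps_until(index, "2020-01-15")
print(n_steps)  # 5
```

The index ends on 2020-01-10, so 2020-01-15 is five daily periods ahead.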
Examples
>>> from spotforecast2_safe.forecaster.utils import date_to_index_position
>>> import pandas as pd
>>> index = pd.date_range(start='2020-01-01', periods=10, freq='D')
>>> # Using an integer input
>>> date_to_index_position(index, 5)
5
>>> # Using a date input for prediction
>>> date_to_index_position(index, '2020-01-15', method='prediction')
5
>>> # Using a date input for validation
>>> date_to_index_position(index, '2020-01-05', method='validation')
5
exog_to_direct
forecaster.utils.exog_to_direct(exog, steps)
Transforms exog to a pandas DataFrame with the shape needed for Direct forecasting.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| exog | (pd.Series, pd.DataFrame) | Exogenous variables. | required |
| steps | int | Number of steps that will be predicted using exog. | required |
Returns
| Name | Type | Description |
|---|---|---|
| tuple | tuple[pd.DataFrame, list[str]] | A tuple containing: - exog_direct (pd.DataFrame): Exogenous variables transformed. - exog_direct_names (list): Names of the columns of the exogenous variables transformed. |
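The reshaping can be pictured as a sliding window over exog, with one column per predicted step. The sketch below reproduces that layout with NumPy's `sliding_window_view` for a single exogenous variable; the column names (`exog_step_1`, ...) and the reset integer index are assumptions made for illustration and may differ from what the function returns.

```python
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

exog = pd.Series([10, 20, 30, 40, 50], name="exog")
steps = 3

# Each row holds the exog values seen at steps 1..steps from a given origin.
windows = sliding_window_view(exog.to_numpy(), window_shape=steps).copy()
columns = [f"{exog.name}_step_{i + 1}" for i in range(steps)]  # hypothetical names
exog_direct = pd.DataFrame(windows, columns=columns)

print(exog_direct.shape)  # (3, 3)
```

With five observations and three steps, only three complete windows exist, so the result has three rows.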
exog_to_direct_numpy
forecaster.utils.exog_to_direct_numpy(exog, steps)
Transforms exog to a numpy ndarray with the shape needed for Direct forecasting.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| exog | (np.ndarray, pd.Series, pd.DataFrame) | Exogenous variables, shape (samples,). If exog is a pandas object, the direct exog names are also created. | required |
| steps | int | Number of steps that will be predicted using exog. | required |
Returns
| Name | Type | Description |
|---|---|---|
| tuple | tuple[np.ndarray, list[str] \| None] | A tuple containing: - exog_direct (np.ndarray): Exogenous variables transformed. - exog_direct_names (list, None): Names of the columns of the exogenous variables transformed. Only created if exog is a pandas object. |
Examples
>>> from spotforecast2_safe.forecaster.utils import exog_to_direct_numpy
>>> import numpy as np
>>> exog = np.array([10, 20, 30, 40, 50])
>>> steps = 3
>>> exog_direct, exog_direct_names = exog_to_direct_numpy(exog, steps)
>>> print(exog_direct)
[[10 20 30]
[20 30 40]
[30 40 50]]
>>> print(exog_direct_names)
None
get_style_repr_html
forecaster.utils.get_style_repr_html(is_fitted=False)
Generate CSS style for HTML representation of the Forecaster.
Creates a unique CSS style block with a container ID for rendering forecaster objects in Jupyter notebooks or HTML documents. The styling provides a clean, monospace display with a light gray background.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| is_fitted | bool | Parameter to indicate if the Forecaster has been fitted. Currently not used in styling but reserved for future extensions. | False |
Returns
| Name | Type | Description |
|---|---|---|
| tuple | Tuple[str, str] | A tuple containing: - style (str): CSS style block as a string with unique container class. - unique_id (str): Unique 8-character ID for the container element. |
Examples
>>> from spotforecast2_safe.forecaster.utils import get_style_repr_html
>>> style, uid = get_style_repr_html(is_fitted=True)
>>> print(f"Container ID: {uid}")
Container ID: a1b2c3d4
>>> print(f"Style contains CSS: {'container-' in style}")
Style contains CSS: True
Using in HTML rendering:
>>> style, uid = get_style_repr_html(is_fitted=False)
>>> html = f"{style}<div class='container-{uid}'>Forecaster Info</div>"
>>> print("background-color" in html)
True
initialize_differentiator_multiseries
forecaster.utils.initialize_differentiator_multiseries(
series_names_in_,
differentiator=None,
)
Initialize differentiator_ attribute for multiseries forecasters.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| series_names_in_ | list | Names of the series (levels) used during training. | required |
| differentiator | (object, dict, None) | Skforecast object (or dict of objects) used to differentiate the time series. Defaults to None. | None |
Returns
| Name | Type | Description |
|---|---|---|
| dict | dict[str, object \| None] | Dictionary with the differentiator for each series. |
initialize_estimator
forecaster.utils.initialize_estimator(estimator=None, regressor=None)
Handle the deprecation of ‘regressor’ in favor of ‘estimator’.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| estimator | (object, None) | Estimator or pipeline compatible with the scikit-learn API. Defaults to None. | None |
| regressor | (object, None) | Deprecated. Alias for estimator. Defaults to None. | None |
Returns
| Name | Type | Description |
|---|---|---|
| | object \| None | The valid estimator object. |
Raises
| Name | Type | Description |
|---|---|---|
| | ValueError | If both estimator and regressor are provided. |
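The documented contract (one argument wins, both together raise) follows a common deprecation pattern. The sketch below shows that pattern with the standard library only; `resolve_estimator` is a hypothetical stand-in, and the `FutureWarning` emitted for the deprecated alias is an assumption, since the warning type is not stated here.

```python
import warnings


def resolve_estimator(estimator=None, regressor=None):
    """Hypothetical stand-in illustrating the documented behaviour."""
    if estimator is not None and regressor is not None:
        raise ValueError("Pass either `estimator` or `regressor`, not both.")
    if regressor is not None:
        # Assumption: using the deprecated alias triggers a warning.
        warnings.warn(
            "`regressor` is deprecated; use `estimator` instead.", FutureWarning
        )
        return regressor
    return estimator


model = resolve_estimator(estimator="my_model")
print(model)  # my_model
```

Keeping the alias as a separate keyword lets existing calls continue to work while steering new code toward `estimator`.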
Examples
>>> from spotforecast2_safe.forecaster.utils import initialize_estimator
>>> from sklearn.linear_model import LinearRegression
>>> # Using the `estimator` argument
>>> estimator = LinearRegression()
>>> initialize_estimator(estimator=estimator)
LinearRegression()
>>> # Using the deprecated `regressor` argument
>>> regressor = LinearRegression()
>>> initialize_estimator(regressor=regressor)
LinearRegression()
initialize_transformer_series
forecaster.utils.initialize_transformer_series(
forecaster_name,
series_names_in_,
encoding=None,
transformer_series=None,
)
Initialize transformer_series_ attribute for multivariate/multiseries forecasters.
Creates a dictionary of transformers for each time series in multivariate or multiseries forecasting. Handles three cases: no transformation (None), same transformer for all series (single object), or different transformers per series (dictionary). Clones transformer objects to avoid overwriting.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| forecaster_name | str | Name of the forecaster using this function. Special handling is applied for ‘ForecasterRecursiveMultiSeries’. | required |
| series_names_in_ | list[str] | Names of the time series (levels) used during training. These will be the keys in the returned transformer dictionary. | required |
| encoding | str \| None | Encoding used to identify different series. Only used for ForecasterRecursiveMultiSeries. If None, creates a single '_unknown_level' entry. Defaults to None. | None |
| transformer_series | object \| dict[str, object \| None] \| None | Transformer(s) to apply to series. Can be: None (no transformation applied); a single transformer object (the same transformer is cloned for all series); or a dict mapping series names to transformers (a different transformer per series). Defaults to None. | None |
Returns
| Name | Type | Description |
|---|---|---|
| dict | dict[str, object \| None] | Dictionary with series names as keys and transformer objects (or None) as values. Transformers are cloned to prevent overwriting. |
Warns
If transformer_series is a dict and some series_names_in_ are not present in the dict keys (those series get no transformation).
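The three cases described above can be sketched with the standard library alone. `build_transformer_map` and `DummyScaler` are hypothetical names, and `copy.deepcopy` stands in for whatever cloning mechanism the library uses; the special handling for ForecasterRecursiveMultiSeries and the warning for missing keys are deliberately left out.

```python
from copy import deepcopy


class DummyScaler:
    """Hypothetical placeholder for a scikit-learn-style transformer."""


def build_transformer_map(series_names, transformer_series=None):
    """Hypothetical sketch: one independent transformer (or None) per series."""
    if transformer_series is None:
        return {name: None for name in series_names}
    if isinstance(transformer_series, dict):
        # Series missing from the dict get no transformation.
        return {name: deepcopy(transformer_series.get(name)) for name in series_names}
    # A single object is copied so that fitting one series never affects another.
    return {name: deepcopy(transformer_series) for name in series_names}


shared = build_transformer_map(["series1", "series2"], DummyScaler())
partial = build_transformer_map(["series1", "series2"], {"series1": DummyScaler()})
print(shared["series1"] is shared["series2"])  # False
```

Copying per series matters because transformers store fitted state; sharing one instance would overwrite the statistics learned from the previous series.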
Examples
No transformation:
>>> from spotforecast2_safe.forecaster.utils import initialize_transformer_series
>>> series = ['series1', 'series2', 'series3']
>>> result = initialize_transformer_series(
... forecaster_name='ForecasterDirectMultiVariate',
... series_names_in_=series,
... transformer_series=None
... )
>>> print(result)
{'series1': None, 'series2': None, 'series3': None}
Same transformer for all series:
>>> from sklearn.preprocessing import StandardScaler
>>> scaler = StandardScaler()
>>> result = initialize_transformer_series(
... forecaster_name='ForecasterDirectMultiVariate',
... series_names_in_=['series1', 'series2'],
... transformer_series=scaler
... )
>>> len(result)
2
>>> all(isinstance(v, StandardScaler) for v in result.values())
True
>>> result['series1'] is result['series2'] # Different clones
False
Different transformer per series:
>>> from sklearn.preprocessing import MinMaxScaler
>>> transformers = {
... 'series1': StandardScaler(),
... 'series2': MinMaxScaler()
... }
>>> result = initialize_transformer_series(
... forecaster_name='ForecasterDirectMultiVariate',
... series_names_in_=['series1', 'series2'],
... transformer_series=transformers
... )
>>> isinstance(result['series1'], StandardScaler)
True
>>> isinstance(result['series2'], MinMaxScaler)
True
initialize_window_features
forecaster.utils.initialize_window_features(window_features)
Check window_features argument input and generate the corresponding list.
This function validates window feature objects and extracts their metadata, ensuring they have the required attributes (window_sizes, features_names) and methods (transform_batch, transform) for proper forecasting operations.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| window_features | Any | Classes used to create window features. Can be a single object or a list of objects. Each object must have window_sizes, features_names attributes and transform_batch, transform methods. | required |
Returns
| Name | Type | Description |
|---|---|---|
| tuple | Tuple[Optional[List[object]], Optional[List[str]], Optional[int]] | A tuple containing: - window_features (list or None): List of classes used to create window features. - window_features_names (list or None): List with all the features names of the window features. - max_size_window_features (int or None): Maximum value of the window_sizes attribute of all classes. |
Raises
| Name | Type | Description |
|---|---|---|
| | ValueError | If window_features is an empty list. |
| | ValueError | If a window feature is missing required attributes or methods. |
| | TypeError | If window_sizes or features_names have incorrect types. |
Examples
>>> from spotforecast2_safe.forecaster.preprocessing import RollingFeatures
>>> from spotforecast2_safe.forecaster.utils import initialize_window_features
>>> wf = RollingFeatures(stats=['mean', 'std'], window_sizes=[7, 14])
>>> wf_list, names, max_size = initialize_window_features(wf)
>>> print(f"Max window size: {max_size}")
Max window size: 14
>>> print(f"Number of features: {len(names)}")
Number of features: 4
Multiple window features:
>>> class MockWF:
... def __init__(self, names, sizes):
... self.features_names = names
... self.window_sizes = sizes
... def transform_batch(self, X): pass
... def transform(self, X): pass
>>> wf1 = MockWF(['f1'], 7)
>>> wf2 = MockWF(['f2', 'f3'], 3)
>>> wf_list, names, max_size = initialize_window_features([wf1, wf2])
>>> print(f"Max window size: {max_size}")
Max window size: 7
predict_multivariate
forecaster.utils.predict_multivariate(
forecasters,
steps_ahead,
exog=None,
show_progress=False,
)
Generate multi-output predictions using multiple baseline forecasters.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| forecasters | dict | Dictionary of fitted forecaster instances (one per target). Keys are target names, values are the fitted forecasters (e.g., ForecasterRecursive, ForecasterEquivalentDate). | required |
| steps_ahead | int | Number of steps to forecast. | required |
| exog | pd.DataFrame | Exogenous variables for prediction. If provided, will be passed to each forecaster’s predict method. | None |
| show_progress | bool | Show progress bar while predicting per target forecaster. Default: False. | False |
Returns
| Name | Type | Description |
|---|---|---|
| | pd.DataFrame | DataFrame with predictions for all targets. |
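Conceptually, the result has one column per target, built from each forecaster's own predictions. The sketch below shows that combination step with pandas; `ConstantForecaster` is a hypothetical stub, and the `predict(steps, exog)` signature is an assumption about the forecaster interface rather than something confirmed here.

```python
import pandas as pd


class ConstantForecaster:
    """Hypothetical stand-in for a fitted forecaster exposing `predict`."""

    def __init__(self, value, last_position):
        self.value = value
        self.last_position = last_position

    def predict(self, steps, exog=None):
        start = self.last_position + 1
        index = pd.RangeIndex(start, start + steps)
        return pd.Series([self.value] * steps, index=index, name="pred")


forecasters = {
    "target1": ConstantForecaster(6.0, last_position=4),
    "target2": ConstantForecaster(12.0, last_position=4),
}

# Dictionary keys become column names when concatenating along axis=1.
predictions = pd.concat(
    {name: f.predict(steps=2) for name, f in forecasters.items()}, axis=1
)
print(predictions.shape)  # (2, 2)
```

Because every forecaster predicts over the same horizon, the per-target Series share an index and line up row by row.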
Examples
>>> import pandas as pd
>>> from sklearn.linear_model import LinearRegression
>>> from spotforecast2_safe.forecaster.recursive import ForecasterRecursive
>>> from spotforecast2_safe.forecaster.utils import predict_multivariate
>>> y1 = pd.Series([1, 2, 3, 4, 5])
>>> y2 = pd.Series([2, 4, 6, 8, 10])
>>> f1 = ForecasterRecursive(estimator=LinearRegression(), lags=2)
>>> f2 = ForecasterRecursive(estimator=LinearRegression(), lags=2)
>>> f1.fit(y=y1)
>>> f2.fit(y=y2)
>>> forecasters = {'target1': f1, 'target2': f2}
>>> predictions = predict_multivariate(forecasters, steps_ahead=2)
>>> predictions
target1 target2
5 6.0 12.0
6 7.0 14.0
prepare_levels_multiseries
forecaster.utils.prepare_levels_multiseries(
X_train_series_names_in_,
levels=None,
)
Prepare list of levels to be predicted in multiseries Forecasters.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| X_train_series_names_in_ | list | Names of the series (levels) included in the matrix X_train. | required |
| levels | (str, list, None) | Names of the series (levels) to be predicted. Defaults to None. | None |
Returns
| Name | Type | Description |
|---|---|---|
| tuple | tuple[list[str], bool] | A tuple containing: - levels (list): Names of the series (levels) to be predicted. - input_levels_is_list (bool): Indicates if input levels argument is a list. |
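The return contract above can be sketched in a few lines of plain Python. `resolve_levels` is a hypothetical helper, and the rule that `None` selects every series seen during training is an assumption; the real function may apply additional checks.

```python
def resolve_levels(series_names, levels=None):
    """Hypothetical sketch of the documented return contract."""
    input_levels_is_list = isinstance(levels, list)
    if levels is None:
        levels = list(series_names)  # assumption: None selects all trained series
    elif isinstance(levels, str):
        levels = [levels]  # a single name is wrapped in a list
    return levels, input_levels_is_list


names = ["series_1", "series_2"]
print(resolve_levels(names))              # (['series_1', 'series_2'], False)
print(resolve_levels(names, "series_2"))  # (['series_2'], False)
```

Returning the flag alongside the normalised list lets callers later decide whether to hand back a single result or a collection.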
prepare_steps_direct
forecaster.utils.prepare_steps_direct(max_step, steps=None)
Prepare list of steps to be predicted in Direct Forecasters.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| max_step | (int, list, np.ndarray) | Maximum number of future steps the forecaster will predict. | required |
| steps | (int, list, None) | Predict n steps. The value of steps must be less than or equal to the value of steps defined when initializing the forecaster. Starts at 1. Defaults to None. | None |
Returns
| Name | Type | Description |
|---|---|---|
| list | list[int] | Steps to be predicted. |
Examples
>>> from spotforecast2_safe.forecaster.utils import prepare_steps_direct
>>> max_step = 5
>>> steps = 3
>>> prepare_steps_direct(max_step, steps)
[1, 2, 3]
>>> max_step = 5
>>> steps = [1, 3, 5]
>>> prepare_steps_direct(max_step, steps)
[1, 3, 5]
>>> max_step = 5
>>> steps = None
>>> prepare_steps_direct(max_step, steps)
[1, 2, 3, 4, 5]
preprocess_levels_self_last_window_multiseries
forecaster.utils.preprocess_levels_self_last_window_multiseries(
levels,
input_levels_is_list,
last_window_,
)
Preprocess levels and last_window arguments for prediction.
Only levels whose last window ends at the same datetime index will be predicted together.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| levels | list | Names of the series (levels) to be predicted. | required |
| input_levels_is_list | bool | Indicates if input levels argument is a list. | required |
| last_window_ | dict | Dictionary with the last window of each series. | required |
Returns
| Name | Type | Description |
|---|---|---|
| tuple | tuple[list[str], pd.DataFrame] | A tuple containing: - levels (list): Names of the series (levels) to be predicted. - last_window (pd.DataFrame): Series values used to create predictors. |
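The rule that only levels ending at the same datetime are predicted together can be illustrated with pandas. This is a sketch under a stated assumption: it keeps the levels whose last window ends at the latest date, which may or may not be the selection rule the function applies.

```python
import pandas as pd

recent = pd.date_range("2020-01-01", periods=3, freq="D")
older = pd.date_range("2019-12-30", periods=3, freq="D")
last_window_ = {
    "series_1": pd.Series([1.0, 2.0, 3.0], index=recent),
    "series_2": pd.Series([4.0, 5.0, 6.0], index=recent),
    "series_3": pd.Series([7.0, 8.0, 9.0], index=older),
}
levels = ["series_1", "series_2", "series_3"]

# Assumption: keep the levels whose window ends at the latest date.
latest_end = max(last_window_[name].index[-1] for name in levels)
selected = [name for name in levels if last_window_[name].index[-1] == latest_end]
last_window = pd.DataFrame({name: last_window_[name] for name in selected})

print(selected)  # ['series_1', 'series_2']
```

Here `series_3` ends two days earlier than the others, so it is left out and the remaining windows form a DataFrame with a common index.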
select_n_jobs_fit_forecaster
forecaster.utils.select_n_jobs_fit_forecaster(forecaster_name, estimator)
Select the number of jobs to run in parallel during the fit process.
This function determines the optimal number of parallel processes for fitting the forecaster based on the available system resources. In safety-critical environments, this helps manage computational load and ensures system predictability.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| forecaster_name | str | Name of the forecaster being fitted. | required |
| estimator | object | The estimator object being used by the forecaster. | required |
Returns
| Name | Type | Description |
|---|---|---|
| int | int | The number of jobs (CPUs) to use for parallel processing. |
set_cpu_gpu_device
forecaster.utils.set_cpu_gpu_device(estimator, device='cpu')
Set the device for the estimator.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| estimator | object | Estimator object. | required |
| device | (str, None) | Device to use. One of ‘cpu’, ‘gpu’, ‘cuda’, or None. Defaults to ‘cpu’. | 'cpu' |
Returns
| Name | Type | Description |
|---|---|---|
| | str \| None | The original device of the estimator. |
transform_numpy
forecaster.utils.transform_numpy(
array,
transformer,
fit=False,
inverse_transform=False,
)
Transform raw values of a numpy ndarray with a scikit-learn-like transformer, preprocessor or ColumnTransformer.
The transformer must provide the following methods: fit, transform, fit_transform and inverse_transform. ColumnTransformers do not have an inverse_transform method, so they cannot be used with inverse_transform=True.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| array | np.ndarray | Array to be transformed. | required |
| transformer | (object, None) | Scikit-learn alike transformer, preprocessor, or ColumnTransformer with methods: fit, transform, fit_transform and inverse_transform. | required |
| fit | bool | Train the transformer before applying it. Defaults to False. | False |
| inverse_transform | bool | Transform back the data to the original representation. This is not available when using transformers of class scikit-learn ColumnTransformers. Defaults to False. | False |
Returns
| Name | Type | Description |
|---|---|---|
| | np.ndarray | Transformed array. |
Raises
| Name | Type | Description |
|---|---|---|
| | TypeError | If array is not a numpy ndarray. |
| | TypeError | If transformer is not a scikit-learn alike transformer. |
| | ValueError | If inverse_transform is True and transformer is a ColumnTransformer. |
Examples
>>> from spotforecast2_safe.forecaster.utils import transform_numpy
>>> from sklearn.preprocessing import StandardScaler
>>> import numpy as np
>>> array = np.array([1, 2, 3, 4, 5])
>>> transformer = StandardScaler()
>>> array_transformed = transform_numpy(array, transformer, fit=True)
>>> print(array_transformed)
[-1.41421356 -0.70710678 0. 0.70710678 1.41421356]
>>> array_inversed = transform_numpy(array_transformed, transformer, inverse_transform=True)
>>> print(array_inversed)
[1. 2. 3. 4. 5.]