Extract values and index from a pandas Series or DataFrame, ensuring they are valid.
Validates that the input data has a proper DatetimeIndex or RangeIndex and extracts its values and index for use in forecasting operations. Optionally checks for index frequency consistency.
A tuple containing:
- values (numpy.ndarray or None): Values of the data as a numpy array, or None if return_values is False.
- index (pandas.Index): Index of the data.
If data index is not a DatetimeIndex or RangeIndex.
Warns
If DatetimeIndex has no frequency (inferred automatically).
Examples
import numpy as np
import pandas as pd
from spotforecast2_safe.forecaster.utils import check_extract_values_and_index

dates = pd.date_range('2020-01-01', periods=10, freq='D')
series = pd.Series(np.arange(10), index=dates)
values, index = check_extract_values_and_index(series)
print(values.shape)
print(type(index))
(10,)
<class 'pandas.DatetimeIndex'>
import numpy as np
import pandas as pd
from spotforecast2_safe.forecaster.utils import check_extract_values_and_index

dates = pd.date_range('2020-01-01', periods=10, freq='D')
series = pd.Series(np.arange(10), index=dates)
_, index = check_extract_values_and_index(series, return_values=False)
print(index[0])
2020-01-01 00:00:00
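The checks described for this function can be illustrated with plain pandas. The following is only a minimal sketch, not the library's implementation: `sketch_check_index` is a hypothetical helper, and the exception type and warning category are assumptions, since the reference above does not state them.

```python
import warnings

import numpy as np
import pandas as pd


def sketch_check_index(data):
    """Hypothetical stand-in for the index checks, not the library function."""
    index = data.index
    if not isinstance(index, (pd.DatetimeIndex, pd.RangeIndex)):
        # Assumption: TypeError is chosen here only for illustration.
        raise TypeError("data must have a DatetimeIndex or a RangeIndex.")
    if isinstance(index, pd.DatetimeIndex) and index.freq is None:
        # Assumption: UserWarning is chosen here only for illustration.
        warnings.warn("DatetimeIndex has no frequency; inferring it.", UserWarning)
        inferred = pd.infer_freq(index)
        if inferred is not None:
            index = pd.DatetimeIndex(index, freq=inferred)
    return data.to_numpy(), index


# A DatetimeIndex built from a list of dates carries no frequency.
irregular = pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04"])
series = pd.Series(np.arange(4), index=irregular)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    values, index = sketch_check_index(series)

print(values.shape, index.freqstr, len(caught))
```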
check_preprocess_series
forecaster.utils.check_preprocess_series(series)
Check and preprocess series argument in ForecasterRecursiveMultiSeries class.
- If `series` is a wide-format pandas DataFrame, each column represents a
different time series, and the index must be either a `DatetimeIndex` or
a `RangeIndex` with frequency or step size, as appropriate.
- If `series` is a long-format pandas DataFrame with a MultiIndex, the
first level of the index must contain the series IDs, and the second
level must be a `DatetimeIndex` with the same frequency across all series.
- If `series` is a dictionary, each key must be a series ID, and each value
must be a named pandas Series. All series must have the same index, which
must be either a `DatetimeIndex` or a `RangeIndex`, and they must share the
same frequency or step size, as appropriate.
When series is a pandas DataFrame, it is converted to a dictionary of pandas Series, where the keys are the series IDs and the values are the Series with the same index as the original DataFrame.
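The three accepted input layouts can be built with plain pandas. This is a sketch for orientation only: the column names, the level names `series_id` and `datetime`, and the conversion shown at the end are illustrative assumptions, not the function's internal code.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("2020-01-01", periods=5, freq="D")

# Wide format: one column per series, shared DatetimeIndex with a frequency.
wide = pd.DataFrame(
    {"series_a": np.arange(5.0), "series_b": np.arange(5.0) * 10},
    index=dates,
)

# Long format: MultiIndex with the series ID first and the timestamp second.
long = pd.concat(
    {name: wide[name] for name in wide.columns},
    names=["series_id", "datetime"],
).to_frame("value")

# Dictionary format: series ID -> named pandas Series.
# This also mirrors the wide-to-dict conversion described above.
series_dict = {name: wide[name].copy() for name in wide.columns}
series_indexes = {name: s.index for name, s in series_dict.items()}

print(list(series_dict))
print(long.index.names)
```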
A tuple containing:
- series_dict (dict): Dictionary where keys are series IDs and values are pandas Series.
- series_indexes (dict): Dictionary where keys are series IDs and values are the index of each series.
Raises:
- TypeError: If `series` is not a pandas DataFrame or a dictionary of pandas Series/DataFrames.
- TypeError: If the index of `series` is not a DatetimeIndex or RangeIndex with frequency/step size.
- ValueError: If the individual series in `series` have different frequencies or step sizes.
- ValueError: If all values of any series are NaN.
Warns:
- UserWarning: If `series` is a wide-format DataFrame, only the first column will be used as series values.
- UserWarning: If `series` is a DataFrame (either wide or long format), additional internal transformations are required, which can increase computational time. It is recommended to use a dictionary of pandas Series instead.
A tuple containing:
- exog_direct (pd.DataFrame): Exogenous variables transformed.
- exog_direct_names (list): Names of the columns of the exogenous variables transformed.
A tuple containing:
- exog_direct (np.ndarray): Exogenous variables transformed.
- exog_direct_names (list, None): Names of the columns of the exogenous variables transformed. Only created if exog is a pandas format.
Generate CSS style for HTML representation of the Forecaster.
Creates a unique CSS style block with a container ID for rendering forecaster objects in Jupyter notebooks or HTML documents. The styling provides a clean, monospace display with a light gray background.
A tuple containing:
- style (str): CSS style block as a string with unique container class.
- unique_id (str): Unique 8-character ID for the container element.
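A style block scoped by a unique container class could be produced roughly as follows. This is a hedged sketch: `sketch_forecaster_style`, the `container-` class prefix, and the concrete CSS values are assumptions made for illustration, and the library's actual CSS may differ.

```python
import uuid


def sketch_forecaster_style():
    """Hypothetical helper: returns a scoped CSS block and its 8-character ID."""
    # The first 8 hex characters of a random UUID serve as the unique ID.
    unique_id = uuid.uuid4().hex[:8]
    style = f"""
<style>
    .container-{unique_id} {{
        font-family: monospace;
        background-color: #f0f0f0;
        padding: 10px;
    }}
</style>
"""
    return style, unique_id


style, unique_id = sketch_forecaster_style()
print(unique_id)
```

Scoping the rules to a per-object class keeps the styling of one rendered forecaster from leaking into other cells of the same notebook.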
Initialize transformer_series_ attribute for multivariate/multiseries forecasters.
Creates a dictionary of transformers for each time series in multivariate or multiseries forecasting. Handles three cases: no transformation (None), same transformer for all series (single object), or different transformers per series (dictionary). Clones transformer objects to avoid overwriting.
Encoding used to identify different series. Only used for ForecasterRecursiveMultiSeries. If None, creates a single '_unknown_level' entry. Defaults to None.
Transformer(s) to apply to series. Can be:
- None: No transformation applied.
- Single transformer object: Same transformer cloned for all series.
- Dict mapping series names to transformers: Different transformer per series.
Defaults to None.
from sklearn.preprocessing import StandardScaler
from spotforecast2_safe.forecaster.utils import initialize_transformer_series

scaler = StandardScaler()
result = initialize_transformer_series(
    forecaster_name='ForecasterDirectMultiVariate',
    series_names_in_=['series1', 'series2'],
    transformer_series=scaler,
)
print(len(result))
print(all(isinstance(v, StandardScaler) for v in result.values()))
print(result['series1'] is result['series2'])
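The three cases (None, a single transformer, a dictionary) can be sketched with `sklearn.base.clone`. This is not the library's code: `sketch_transformer_series` is a hypothetical helper, and assigning None to series that are missing from the dictionary is an assumption.

```python
from sklearn.base import clone
from sklearn.preprocessing import MinMaxScaler, StandardScaler


def sketch_transformer_series(series_names, transformer_series=None):
    """Hypothetical helper mapping each series name to its own transformer."""
    if transformer_series is None:
        # Case 1: no transformation for any series.
        return {name: None for name in series_names}
    if isinstance(transformer_series, dict):
        # Case 2: one transformer per series.
        # Assumption: series missing from the dict get no transformer.
        return {
            name: clone(transformer_series[name]) if name in transformer_series else None
            for name in series_names
        }
    # Case 3: the same transformer, cloned so that fitted state is never shared.
    return {name: clone(transformer_series) for name in series_names}


names = ["series1", "series2"]
shared = sketch_transformer_series(names, StandardScaler())
per_series = sketch_transformer_series(
    names, {"series1": StandardScaler(), "series2": MinMaxScaler()}
)
print(type(per_series["series2"]).__name__)
```

Cloning matters because a fitted transformer stores per-series statistics; sharing one instance would let the last fitted series overwrite the others.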
Check window_features argument input and generate the corresponding list.
This function validates window feature objects and extracts their metadata, ensuring they have the required attributes (window_sizes, features_names) and methods (transform_batch, transform) for proper forecasting operations.
Classes used to create window features. Can be a single object or a list of objects. Each object must have window_sizes, features_names attributes and transform_batch, transform methods.
A tuple containing:
- window_features (list or None): List of classes used to create window features.
- window_features_names (list or None): List with all the features names of the window features.
- max_size_window_features (int or None): Maximum value of the window_sizes attribute of all classes.
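As an illustration of the required interface, the sketch below defines a duck-typed window-feature class and a simplified validator. Both `RollingMean` and `sketch_check_window_features` are hypothetical, and the method signatures, the list-valued `window_sizes`, and the exception types are assumptions rather than the library's definitions.

```python
import numpy as np

REQUIRED_ATTRS = ("window_sizes", "features_names")
REQUIRED_METHODS = ("transform_batch", "transform")


class RollingMean:
    """Hypothetical window-feature class exposing the required interface."""

    def __init__(self, window_size):
        self.window_sizes = [window_size]
        self.features_names = [f"roll_mean_{window_size}"]

    def transform_batch(self, X):
        # One feature row for every complete window in X.
        windows = np.lib.stride_tricks.sliding_window_view(X, self.window_sizes[0])
        return windows.mean(axis=1).reshape(-1, 1)

    def transform(self, X):
        # A single feature value computed from the last window of X.
        return np.array([X[-self.window_sizes[0]:].mean()])


def sketch_check_window_features(window_features):
    """Hypothetical validator returning (objects, feature names, max window)."""
    if window_features is None:
        return None, None, None
    if not isinstance(window_features, list):
        window_features = [window_features]
    names, max_size = [], 0
    for wf in window_features:
        for attr in REQUIRED_ATTRS:
            if not hasattr(wf, attr):
                raise ValueError(f"{type(wf).__name__} lacks attribute {attr!r}.")
        for method in REQUIRED_METHODS:
            if not callable(getattr(wf, method, None)):
                raise ValueError(f"{type(wf).__name__} lacks method {method!r}.")
        names.extend(wf.features_names)
        max_size = max(max_size, max(wf.window_sizes))
    return window_features, names, max_size


features, names, max_size = sketch_check_window_features(
    [RollingMean(3), RollingMean(7)]
)
print(names, max_size)
```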
Dictionary of fitted forecaster instances (one per target). Keys are target names, values are the fitted forecasters (e.g., ForecasterRecursive, ForecasterEquivalentDate).
A tuple containing:
- levels (list): Names of the series (levels) to be predicted.
- input_levels_is_list (bool): Indicates if input levels argument is a list.
Predict n steps. The value of steps must be less than or equal to the value of steps defined when initializing the forecaster. Starts at 1. Defaults to None.
A tuple containing:
- levels (list): Names of the series (levels) to be predicted.
- last_window (pd.DataFrame): Series values used to create predictors.
Select the number of jobs to run in parallel during the fit process.
This function determines the optimal number of parallel processes for fitting the forecaster based on the available system resources. In safety-critical environments, this helps manage computational load and ensures system predictability.
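The reference does not spell out the selection rule, so the following is only one plausible policy, shown as a sketch: `sketch_select_n_jobs` and its `reserve_cores` parameter are hypothetical, and the idea of leaving a core free for the rest of the system is an assumption.

```python
import os


def sketch_select_n_jobs(reserve_cores=1):
    """Hypothetical policy: use the available cores minus a reserved number."""
    # os.cpu_count() may return None when the count cannot be determined.
    available = os.cpu_count() or 1
    # Never return fewer than one job.
    return max(1, available - reserve_cores)


n_jobs = sketch_select_n_jobs()
print(n_jobs)
```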
Transform raw values of a numpy ndarray with a scikit-learn-like transformer, preprocessor or ColumnTransformer. The transformer used must have the following methods: fit, transform, fit_transform and inverse_transform. ColumnTransformers are not allowed since they do not have an inverse_transform method.
Transform the data back to its original representation. This is not available for scikit-learn ColumnTransformer objects. Defaults to False.
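A forward and inverse round trip on a numpy array can be sketched with any object that offers the four required methods. Everything named here is hypothetical: `SketchStandardizer` is a minimal duck-typed transformer, `sketch_transform_numpy` is not the library function, and its `fit` flag is an assumption.

```python
import numpy as np


class SketchStandardizer:
    """Hypothetical transformer with fit, transform, fit_transform, inverse_transform."""

    def fit(self, X):
        self.mean_ = X.mean(axis=0)
        self.scale_ = X.std(axis=0)
        return self

    def transform(self, X):
        return (X - self.mean_) / self.scale_

    def fit_transform(self, X):
        return self.fit(X).transform(X)

    def inverse_transform(self, X):
        return X * self.scale_ + self.mean_


def sketch_transform_numpy(array, transformer, fit=False, inverse_transform=False):
    """Hypothetical helper: apply a transformer to a 1D or 2D numpy array."""
    if transformer is None:
        return array
    # Transformers in the scikit-learn style expect 2D input.
    was_1d = array.ndim == 1
    X = array.reshape(-1, 1) if was_1d else array
    if inverse_transform:
        out = transformer.inverse_transform(X)
    elif fit:
        out = transformer.fit_transform(X)
    else:
        out = transformer.transform(X)
    # Restore the original dimensionality.
    return out.ravel() if was_1d else out


y = np.array([1.0, 2.0, 3.0, 4.0])
scaler = SketchStandardizer()
scaled = sketch_transform_numpy(y, scaler, fit=True)
restored = sketch_transform_numpy(scaled, scaler, inverse_transform=True)
print(np.allclose(restored, y))
```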