manager.features.merge_data_and_covariates(
data,
exogenous_features,
target_columns,
exog_features,
start,
end,
cov_end,
forecast_horizon,
cast_dtype='float32',
)
Merge target data with exogenous features and split into train/predict slices.
Performs an inner join of the selected target_columns from data with the selected exog_features from exogenous_features over the training window [start, end]. A separate prediction covariate slice covering [end+1h, cov_end], that is, starting one hour after end, is also returned for use during inference.
String timestamps are converted to UTC-aware pandas.Timestamp objects automatically.
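The string handling described above can be illustrated with plain pandas. This is a minimal sketch that assumes the conversion behaves like pd.to_datetime(..., utc=True), as the parameter descriptions suggest; the variable names are illustrative and not part of the library.

```python
import pandas as pd

# A naive string is interpreted as UTC and becomes timezone-aware.
ts_naive = pd.to_datetime("2024-01-01 00:00", utc=True)

# A string that carries an offset is converted to the same UTC instant.
ts_offset = pd.to_datetime("2024-01-01 01:00+01:00", utc=True)

print(ts_naive)               # 2024-01-01 00:00:00+00:00
print(ts_naive == ts_offset)  # True
```

Passing an already tz-aware pd.Timestamp, as the example further down does, avoids any ambiguity about how a string would be interpreted.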
Parameters
data
pd.DataFrame
DataFrame containing one or more target time series with a tz-aware pandas.DatetimeIndex.
required
exogenous_features
pd.DataFrame
DataFrame with all exogenous feature columns, covering at least the window [start, cov_end].
required
target_columns
List[str]
Column names of the target variables to keep from data.
required
exog_features
List[str]
Column names of the exogenous features to include in the merged output and the prediction slice.
required
start
Union[str, pd.Timestamp]
Inclusive start of the training window. String values are parsed with utc=True.
required
end
Union[str, pd.Timestamp]
Inclusive end of the training window. String values are parsed with utc=True.
required
cov_end
Union[str, pd.Timestamp]
Inclusive end of the covariate (forecast) window. String values are parsed with utc=True.
required
forecast_horizon
int
Number of forecast steps ahead (informational; used by calling code to validate slice length).
required
cast_dtype
Optional[str]
NumPy dtype string applied to the merged training DataFrame via pandas.DataFrame.astype. Pass None to skip casting. Defaults to "float32".
'float32'
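The effect of cast_dtype can be pictured with a small stand-alone helper. The function cast_frame below is hypothetical and only mirrors the behaviour documented for the parameter (cast via astype, skip when None); it is not taken from the library.

```python
from typing import Optional

import pandas as pd


def cast_frame(df: pd.DataFrame, cast_dtype: Optional[str] = "float32") -> pd.DataFrame:
    """Hypothetical helper: cast every column, or return df unchanged for None."""
    if cast_dtype is None:
        return df
    return df.astype(cast_dtype)


df = pd.DataFrame({"load": [100.0, 101.5], "hour_sin": [0.0, 0.5]})

casted = cast_frame(df)                       # all columns become float32
untouched = cast_frame(df, cast_dtype=None)   # columns stay float64

print(casted.dtypes.unique())
print(untouched.dtypes.unique())
```

Casting to float32 halves the memory of a float64 frame, at the cost of precision; passing None keeps the original dtypes.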
Returns
Tuple[pd.DataFrame, pd.DataFrame, pd.DataFrame]
A three-tuple (data_with_exog, exo_tmp, exo_pred) where:
- data_with_exog: training-window DataFrame with target and exogenous columns merged (inner join on index).
- exo_tmp: full exogenous slice over [start, end] (all columns, not just exog_features).
- exo_pred: forecast-window exogenous slice over [end+1h, cov_end] (all columns).
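How the three return values relate to each other can be sketched with plain pandas. The function sketch_merge below is a hypothetical re-creation of the documented behaviour, assuming an hourly index; it is not the library's implementation, and the column name dow is invented for the illustration.

```python
import numpy as np
import pandas as pd


def sketch_merge(data, exog, target_columns, exog_features, start, end, cov_end):
    """Illustrative only: mimics the documented slices, assuming hourly data."""
    start, end, cov_end = (pd.to_datetime(t, utc=True) for t in (start, end, cov_end))
    # Full exogenous slice over the training window (all columns).
    exo_tmp = exog.loc[start:end]
    # Forecast-window slice, starting one hour after `end` (all columns).
    exo_pred = exog.loc[end + pd.Timedelta(hours=1):cov_end]
    # Inner join on the index: rows missing on either side are dropped.
    data_with_exog = data.loc[start:end, target_columns].join(
        exo_tmp[exog_features], how="inner"
    )
    return data_with_exog, exo_tmp, exo_pred


idx = pd.date_range("2024-01-01", periods=72, freq="h", tz="UTC")
hours = np.arange(72) % 24
data = pd.DataFrame({"load": np.linspace(90.0, 110.0, 72)}, index=idx)
data = data.drop(idx[5])  # simulate one missing target observation
exog = pd.DataFrame(
    {
        "hour_sin": np.sin(2 * np.pi * hours / 24),
        "hour_cos": np.cos(2 * np.pi * hours / 24),
        "dow": np.arange(72) // 24,  # extra column that is not selected
    },
    index=idx,
)

merged, exo_tmp, exo_pred = sketch_merge(
    data, exog, ["load"], ["hour_sin", "hour_cos"],
    "2024-01-01 00:00", "2024-01-02 23:00", "2024-01-03 23:00",
)
print(merged.shape)    # (47, 3): the missing hour is dropped by the inner join
print(exo_tmp.shape)   # (48, 3): all exogenous columns are kept
print(exo_pred.shape)  # (24, 3)
```

In this sketch the inner join means a gap in either input silently shortens the merged frame, so comparing its length with that of the exogenous training slice is a cheap sanity check.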
Examples
Merge a toy target series with calendar features over a 4-day (96 h) training window, followed by a 24 h forecast window:
import numpy as np
import pandas as pd
from spotforecast2_safe.manager.features import merge_data_and_covariates

idx = pd.date_range("2024-01-01", periods=120, freq="h", tz="UTC")
data = pd.DataFrame({"load": np.random.default_rng(42).normal(100, 10, 120)}, index=idx)
exog = pd.DataFrame(
    {"hour_sin": np.sin(2 * np.pi * idx.hour / 24),
     "hour_cos": np.cos(2 * np.pi * idx.hour / 24)},
    index=idx,
)
start = pd.Timestamp("2024-01-01 00:00", tz="UTC")
end = pd.Timestamp("2024-01-04 23:00", tz="UTC")      # 96 h training
cov_end = pd.Timestamp("2024-01-05 23:00", tz="UTC")  # 24 h forecast
merged, exo_train, exo_pred = merge_data_and_covariates(
    data=data,
    exogenous_features=exog,
    target_columns=["load"],
    exog_features=["hour_sin", "hour_cos"],
    start=start,
    end=end,
    cov_end=cov_end,
    forecast_horizon=24,
)
print("merged shape:   ", merged.shape)
print("exo_train shape:", exo_train.shape)
print("exo_pred shape: ", exo_pred.shape)
merged shape: (96, 3)
exo_train shape: (96, 2)
exo_pred shape: (24, 2)
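Since forecast_horizon is described as informational, the caller is responsible for checking that the forecast window matches it. A minimal sketch of such a check, using only pandas and assuming hourly data, might look as follows; the name pred_index is illustrative.

```python
import pandas as pd

end = pd.Timestamp("2024-01-04 23:00", tz="UTC")
cov_end = pd.Timestamp("2024-01-05 23:00", tz="UTC")
forecast_horizon = 24

# Hourly stamps from one step after `end` through `cov_end`, both inclusive.
pred_index = pd.date_range(end + pd.Timedelta(hours=1), cov_end, freq="h")

if len(pred_index) != forecast_horizon:
    raise ValueError(
        f"forecast window has {len(pred_index)} steps, expected {forecast_horizon}"
    )
print(len(pred_index))  # 24
```

The same comparison can be made against len(exo_pred) after the call, which also catches gaps in the exogenous data.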