manager.features.select_exogenous_features(
    exogenous_features,
    weather_aligned,
    cyclical_regex='_sin$|_cos$',
    include_weather_windows=False,
    include_holiday_features=False,
    include_poly_features=False,
)
Select and deduplicate exogenous feature columns for model training.
Builds a prioritised, deduplicated list of column names from exogenous_features suitable for passing as exog to a recursive forecaster. The selection order is:
1. Cyclical sine/cosine columns matching cyclical_regex (always included).
2. Weather rolling-window columns (optional, include_weather_windows).
3. Raw weather columns shared with weather_aligned.
4. Holiday-related columns starting with "holiday" (optional, include_holiday_features).
5. Polynomial interaction columns starting with "poly_" (optional, include_poly_features).
Duplicates are removed while preserving insertion order.
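The selection order described above can be sketched in plain Python. This is an illustrative reconstruction from the description only, not the library's implementation: the name sketch_select_columns is hypothetical, it works on lists of column names instead of DataFrames, and the use of re.search for the cyclical match is an assumption.

```python
import re
from typing import Iterable, List


def sketch_select_columns(
    exog_columns: Iterable[str],
    weather_columns: Iterable[str],
    cyclical_regex: str = "_sin$|_cos$",
    include_weather_windows: bool = False,
    include_holiday_features: bool = False,
    include_poly_features: bool = False,
) -> List[str]:
    """Hypothetical sketch of the documented priority order (not the real code)."""
    cols = list(exog_columns)
    weather = set(weather_columns)
    selected: List[str] = []

    # 1. Cyclical columns are always included (assumption: search-style match).
    selected += [c for c in cols if re.search(cyclical_regex, c)]

    # 2. Rolling-window weather columns: "_window_" plus "_mean", "_min" or "_max".
    if include_weather_windows:
        selected += [
            c for c in cols
            if "_window_" in c and any(s in c for s in ("_mean", "_min", "_max"))
        ]

    # 3. Raw weather columns that also appear in the weather frame.
    selected += [c for c in cols if c in weather]

    # 4. Holiday columns (name starts with "holiday").
    if include_holiday_features:
        selected += [c for c in cols if c.startswith("holiday")]

    # 5. Polynomial interaction columns (name starts with "poly_").
    if include_poly_features:
        selected += [c for c in cols if c.startswith("poly_")]

    # dict.fromkeys drops duplicates while keeping first-seen order.
    return list(dict.fromkeys(selected))


columns = [
    "hour_sin",
    "hour_cos",
    "wind_speed",
    "wind_speed_window_24_mean",
    "holiday_flag",
    "poly_hour_sin__wind_speed",
]
result = sketch_select_columns(
    columns,
    ["wind_speed"],
    include_weather_windows=True,
    include_holiday_features=True,
)
print(result)
```

With include_poly_features left at False, the "poly_" column is skipped, and the window column is placed before the raw weather column because of the priority order.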
Parameters
exogenous_features
pd.DataFrame
DataFrame containing the full set of candidate feature columns.
required
weather_aligned
pd.DataFrame
DataFrame whose column names identify the raw (non-window, non-polynomial) weather variables.
required
cyclical_regex
str
Regular expression matched against column names to detect cyclical sine/cosine features. Defaults to "_sin$|_cos$".
'_sin$|_cos$'
include_weather_windows
bool
If True, include rolling-window weather columns (those containing "_window_" plus "_mean", "_min", or "_max"). Defaults to False.
False
include_holiday_features
bool
If True, include columns whose names start with "holiday". Defaults to False.
False
include_poly_features
bool
If True, include polynomial interaction columns whose names start with "poly_". Defaults to False.
False
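To see which names the default cyclical_regex picks up, the pattern can be tried with the standard re module. This is a minimal check under the assumption that the library applies the pattern search-style (anywhere in the name, with "$" anchoring the end); the column names are made up for illustration.

```python
import re

CYCLICAL_REGEX = "_sin$|_cos$"  # default value documented above

# Hypothetical column names, chosen to show what does and does not match.
names = ["hour_sin", "hour_cos", "sin_feature", "wind_speed", "month_sin_lag"]

# Assumption: search-style matching, so only names ending in _sin or _cos qualify.
cyclical = [n for n in names if re.search(CYCLICAL_REGEX, n)]
print(cyclical)
```

Names that merely contain "sin" or "_sin_" are not selected, because both alternatives in the pattern are anchored to the end of the name.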
Returns
List[str]
Deduplicated list of selected column names in priority order.
Examples
Select cyclical and raw weather columns from a feature matrix:
import numpy as np
import pandas as pd
from spotforecast2_safe.manager.features import select_exogenous_features

rng = np.random.default_rng(1)
idx = pd.date_range("2024-01-01", periods=24, freq="h", tz="UTC")

# Raw weather frame: its column names mark which columns count as weather.
weather = pd.DataFrame({"wind_speed": rng.uniform(0, 10, 24)}, index=idx)

# Candidate feature matrix with cyclical, weather, and holiday columns.
exog = pd.DataFrame(
    {
        "hour_sin": np.sin(2 * np.pi * idx.hour / 24),
        "hour_cos": np.cos(2 * np.pi * idx.hour / 24),
        "wind_speed": weather["wind_speed"],
        "holiday_flag": 0,
    },
    index=idx,
)

selected = select_exogenous_features(
    exogenous_features=exog,
    weather_aligned=weather,
    include_holiday_features=False,
)
print("selected:", selected)
selected: ['hour_sin', 'hour_cos', 'wind_speed']