Tuple[pd.DataFrame, pd.Series]: A tuple containing the forward- and backward-filled DataFrame and a numeric weight Series with values 0.0 or 1.0, where 0.0 marks rows affected by missing values or gaps and 1.0 marks all other rows.
Examples
import numpy as np
import pandas as pd
from spotforecast2_safe.preprocessing.imputation import get_missing_weights

# Synthetic DataFrame with a deliberate two-row NaN gap at positions 3-4
idx = pd.date_range("2024-01-01", periods=10, freq="h")
values = [1.0, 2.0, 3.0, None, None, 6.0, 7.0, 8.0, 9.0, 10.0]
df = pd.DataFrame({"A": values}, index=idx)

filled, weights = get_missing_weights(df, window_size=3, verbose=True)

# No NaNs remain after forward/backward fill
assert filled.isnull().sum().sum() == 0

# Rows inside (and immediately after) the gap receive weight 0
gap_weights = weights.loc[idx[3:5]]
print(gap_weights.tolist())
assert (gap_weights == 0.0).all()

# Rows well before the gap retain weight 1
assert weights.loc[idx[0]] == 1.0
Number of rows with missing values: 2
Percentage of rows with missing values: 20.00%
missing_indices: DatetimeIndex(['2024-01-01 03:00:00', '2024-01-01 04:00:00'], dtype='datetime64[us]', freq='h')
Number of rows with missing weights after processing: 5
Percentage of rows with missing weights after processing: 50.00%
[0.0, 0.0]
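The output above reports five zero-weight rows for a two-row gap, so the zero weights evidently extend past the gap itself. The following is a minimal sketch of one way such weights could be derived with plain pandas. It is not the spotforecast2_safe implementation: the function name `sketch_missing_weights` is hypothetical, and the trailing window of `window_size + 1` rows is an assumption chosen only because it matches the count printed in this example.

```python
import pandas as pd


def sketch_missing_weights(df: pd.DataFrame, window_size: int):
    """Hypothetical stand-in, not the spotforecast2_safe implementation."""
    # Flag rows that contain at least one NaN before any filling.
    is_missing = df.isnull().any(axis=1)

    # Fill forward first, then backward to cover any leading NaNs.
    filled = df.ffill().bfill()

    # Assumption: a row is penalized if it, or any of the window_size rows
    # before it, was missing (a trailing window of window_size + 1 rows).
    near_gap = (
        is_missing.astype(float)
        .rolling(window=window_size + 1, min_periods=1)
        .max()
    )

    # Weight 0.0 near a gap, 1.0 everywhere else.
    weights = 1.0 - near_gap
    return filled, weights


idx = pd.date_range("2024-01-01", periods=10, freq="h")
values = [1.0, 2.0, 3.0, None, None, 6.0, 7.0, 8.0, 9.0, 10.0]
df = pd.DataFrame({"A": values}, index=idx)

filled_sketch, weights_sketch = sketch_missing_weights(df, window_size=3)
zero_rows = weights_sketch[weights_sketch == 0.0].index
print(len(zero_rows))
```

Under this assumption the sketch zeroes positions 3 through 7: the two gap rows plus the three rows whose trailing window still overlaps the gap. Whether the library uses the same window definition is not confirmed by the output shown here.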