Function reference
Data
Utilities for fetching and loading time series, weather, and holiday data.
| data.fetch_data.fetch_data | Fetches a dataset from a CSV file or processes a DataFrame. |
| data.fetch_data.fetch_holiday_data | Fetches holiday data for the dataset period. |
| data.fetch_data.fetch_weather_data | Fetch weather data for the dataset period plus forecast horizon. |
| data.fetch_data.get_cache_home | Return the location where persistent models are to be cached. |
| data.fetch_data.get_data_home | Return the location where datasets are to be stored. |
| data.fetch_data.get_package_data_home | Return the location of the internal package datasets. |
| data.fetch_data.load_timeseries | Load the actual-load time series from interim/energy_load.csv. |
| data.fetch_data.load_timeseries_forecast | Load the day-ahead forecast time series from interim/energy_load.csv. |
| data.fetch_data.load_renewable_forecast | Load the ENTSO-E day-ahead wind/solar generation forecast. |
| data.fetch_data.load_day_ahead_price | Load the ENTSO-E day-ahead spot price (DE/LU) as an hourly series. |
| data.data_classes | Data structures for input and processed data. |
| data.demo_loader | Demo data loader for safety-critical forecasting tasks. |
| data.entsoe_loader | ENTSO-E interim-CSV data loaders. |
Preprocessing
Hardened tools for data curation, resampling, outlier detection, feature engineering, and temporal train/test splitting.
| preprocessing.coverage.assert_frontier_fresh | Raise CoverageError if the data frontier is stale. |
| preprocessing.coverage.assert_actual_lag_within | Raise CoverageError if the last published actual is too old. |
| preprocessing.coverage.assert_no_interior_gaps | Raise CoverageError if the recent actuals contain large holes. |
| preprocessing.coverage.last_complete_hour | Return the latest hour having a complete set of intra-hour samples. |
| preprocessing.curate_data.agg_and_resample_data | Aggregates and resamples the data to a target frequency (e.g., hourly) by computing the specified aggregation over each interval. |
| preprocessing.curate_data.basic_ts_checks | Checks if the time series data has a datetime index and is sorted. |
| preprocessing.curate_data.curate_holidays | Checks if the holiday dataframe has the correct shape. |
| preprocessing.curate_data.curate_weather | Checks if the weather dataframe has the correct shape. |
| preprocessing.curate_data.get_start_end | Get start and end date strings for data and covariate ranges. |
| preprocessing.curate_data.remove_duplicate_timestamps | Resolve duplicate timestamps across all data columns. |
| preprocessing.curate_data.reset_index | Resets the index of the dataframe and assigns a name to the index column. |
| preprocessing.outlier.get_outliers | Detect outliers in each column using Isolation Forest. |
| preprocessing.outlier.manual_outlier_removal | Manual outlier removal function. |
| preprocessing.outlier.mark_outliers | Marks outliers as NaN in the dataset using Isolation Forest. |
| preprocessing.checking.check_exog | Validate that exog is a pandas Series or DataFrame. |
| preprocessing.checking.check_exog_dtypes | Check that exogenous variables have valid data types (int, float, category). |
| preprocessing.checking.check_interval | Validate that a confidence interval specification is valid. |
| preprocessing.checking.check_predict_input | Check all inputs of the predict method. |
| preprocessing.checking.check_residuals_input | Check residuals input arguments in Forecasters. |
| preprocessing.checking.check_y | Validate that y is a pandas Series without missing values. |
| preprocessing.checking.get_exog_dtypes | Extract and store the data types of exogenous variables. |
| preprocessing.checking.set_cpu_gpu_device | Set the device for the estimator to ‘cpu’, ‘gpu’, ‘cuda’, or None. |
| preprocessing.exog_builder.ExogBuilder | Builds a set of exogenous features for a given date range. |
| preprocessing.exog_providers.ExogFeatureProvider | Contract for a pluggable exogenous-feature source. |
| preprocessing.exog_providers.CovidInfectionRateProvider | German national COVID-19 7-day incidence as an exogenous level regressor. |
| preprocessing.exog_providers.EntsoeForecastLoadProvider | ENTSO-E day-ahead Forecasted Load as an exogenous near-oracle prior. |
| preprocessing.exog_providers.EntsoeRenewableForecastProvider | ENTSO-E day-ahead wind and solar generation forecast. |
| preprocessing.exog_providers.EntsoeNetLoadProvider | ENTSO-E day-ahead net load = Forecasted Load − (wind + solar) forecast. |
| preprocessing.exog_providers.EntsoeDayAheadPriceProvider | ENTSO-E day-ahead spot price (DE/LU) as an exogenous input. |
| preprocessing.exog_providers.EventWindowProvider | Generic event-window provider driven by a bundled CSV file. |
| preprocessing.exog_providers.FootballMatchWindowProvider | German football match event-window provider. |
| preprocessing.exog_providers.EnergyCrisisWindowProvider | German energy-saving regulatory window provider. |
| preprocessing.exog_providers.build_providers | Construct the providers whose flags are truthy, in registry order. |
| preprocessing.exog_providers.build_providers_from_config | Construct providers by reading the registry flags off a config object. |
| preprocessing.imputation.apply_imputation | Apply imputation to a DataFrame based on the method specified in config. |
| preprocessing.imputation.WeightFunction | Callable class for sample weights that can be pickled. |
| preprocessing.target_corruption.TargetCorruptionReport | Immutable summary of a target-corruption detection and policy run. |
| preprocessing.target_corruption.detect_target_corruption | Detect physically-impossible target-column corruption in the native frame. |
| preprocessing.target_corruption.apply_target_corruption_policy | Apply the configured corruption policy to the native-cadence frame. |
| preprocessing.imputation.custom_weights | Return 0 if index is in or near any gap. |
| preprocessing.imputation.get_missing_weights | Return imputed DataFrame and a series indicating missing weights. |
| preprocessing.data_transform | Data transformation utilities for time series forecasting. |
| preprocessing.forecaster_config | Forecaster configuration utilities. |
| preprocessing.linearly_interpolate_ts | Linear interpolation transformer for time series data. |
| preprocessing.repeating_basis_function | Repeating Basis Function transformer for cyclical features. |
| preprocessing.rolling.RollingFeatures | Compute rolling window statistics over time series data. |
Processing
Utilities for aggregated and n-to-n predictions.
| processing.agg_predict.agg_predict | Aggregates multiple prediction columns into a single combined prediction series. |
| processing.n2n_predict.n2n_predict | End-to-end baseline forecasting using equivalent date method. |
| processing.n2n_predict_with_covariates.n2n_predict_with_covariates | End-to-end recursive forecasting with exogenous covariates. |
| processing.shape_check.ShapeCheckReport | Immutable result of a forecast shape plausibility check. |
| processing.shape_check.check_forecast_shape | Measure correlation and daily-range ratio between a forecast and its reference. |
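
The shape check listed above is described as measuring correlation and a daily-range ratio between a forecast and its reference. The following is a minimal, standard-library sketch of those two quantities, not the package's implementation. The function name `shape_metrics`, the `samples_per_day` parameter, and the definition of the ratio (mean per-day max minus min of the forecast divided by that of the reference) are assumptions made for illustration.

```python
import math


def shape_metrics(forecast, reference, samples_per_day=24):
    """Return (pearson_r, daily_range_ratio) for two equally long sequences.

    Hypothetical helper: the exact definitions used by the library may differ.
    """
    if len(forecast) != len(reference):
        raise ValueError("forecast and reference must have the same length")

    # Pearson correlation computed from centred values.
    n = len(forecast)
    mean_f = sum(forecast) / n
    mean_r = sum(reference) / n
    cov = sum((f - mean_f) * (r - mean_r) for f, r in zip(forecast, reference))
    var_f = sum((f - mean_f) ** 2 for f in forecast)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    pearson_r = cov / math.sqrt(var_f * var_r)

    def mean_daily_range(values):
        # Cut the series into consecutive days and average (max - min) per day.
        days = [values[i:i + samples_per_day]
                for i in range(0, len(values), samples_per_day)]
        return sum(max(day) - min(day) for day in days) / len(days)

    ratio = mean_daily_range(forecast) / mean_daily_range(reference)
    return pearson_r, ratio


# Two "days" of four samples each; the forecast has the right shape
# but only half the amplitude of the reference.
reference = [0.0, 1.0, 2.0, 3.0] * 2
forecast = [0.0, 0.5, 1.0, 1.5] * 2
r, ratio = shape_metrics(forecast, reference, samples_per_day=4)
```

A correlation near 1 combined with a range ratio well below 1 would flag a forecast that follows the daily pattern but is too flat. The sketch divides by zero when either series is constant, so a real check would need to guard that case.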
Forecaster
Recursive forecasting classes, seasonal baselines, and metrics.
| forecaster.base | ForecasterBase class. |
| forecaster.recursive._forecaster_recursive.ForecasterRecursive | Recursive autoregressive forecaster for scikit-learn compatible estimators. |
| forecaster.recursive._forecaster_equivalent_date.ForecasterEquivalentDate | Forecaster that predicts future values based on the most recent equivalent date. |
| forecaster.recursive._forecaster_recursive_multiseries | |
| forecaster.metrics.add_y_train_argument | Add y_train argument to a function if it is not already present. |
| forecaster.metrics.calculate_coverage | Calculate coverage of a given interval. |
| forecaster.metrics.create_mean_pinball_loss | Create pinball loss for a given quantile. |
| forecaster.metrics.crps_from_predictions | Compute the Continuous Ranked Probability Score (CRPS) from predictions. |
| forecaster.metrics.crps_from_quantiles | Calculate the Continuous Ranked Probability Score (CRPS) from quantiles. |
| forecaster.metrics.mean_absolute_scaled_error | Mean Absolute Scaled Error (MASE). |
| forecaster.metrics.root_mean_squared_scaled_error | Root Mean Squared Scaled Error (RMSSE). |
| forecaster.metrics.symmetric_mean_absolute_percentage_error | Compute the Symmetric Mean Absolute Percentage Error (SMAPE). |
| forecaster.wrappers.model | Recursive forecaster model wrappers for different estimators. |
| forecaster.wrappers.lgbm | Recursive forecaster wrapper using LightGBM. |
| forecaster.wrappers.xgb | Recursive forecaster wrapper using XGBoost. |
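
Two of the metrics named in the table above, the pinball loss for a given quantile and the Mean Absolute Scaled Error, have standard textbook definitions. The sketch below writes them out in plain Python for reference. It does not reproduce the signatures of `forecaster.metrics`; the function names and the seasonal-lag parameter `m` are assumptions.

```python
def mean_pinball_loss(y_true, y_pred, quantile):
    """Average pinball (quantile) loss for one quantile level in (0, 1)."""
    losses = []
    for y, y_hat in zip(y_true, y_pred):
        diff = y - y_hat
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        losses.append(max(quantile * diff, (quantile - 1.0) * diff))
    return sum(losses) / len(losses)


def mase(y_true, y_pred, y_train, m=1):
    """MAE of the forecast divided by the in-sample MAE of a lag-m naive forecast."""
    mae = sum(abs(y - y_hat) for y, y_hat in zip(y_true, y_pred)) / len(y_true)
    naive_errors = [abs(y_train[t] - y_train[t - m]) for t in range(m, len(y_train))]
    scale = sum(naive_errors) / len(naive_errors)
    return mae / scale


y_train = [1.0, 2.0, 4.0, 7.0]   # naive lag-1 errors: 1, 2, 3 -> scale 2.0
y_true = [10.0, 12.0]
y_pred = [9.0, 14.0]             # absolute errors: 1, 2 -> MAE 1.5

pinball_90 = mean_pinball_loss(y_true, y_pred, quantile=0.9)
mase_value = mase(y_true, y_pred, y_train)
```

A MASE below 1 means the forecast beats the naive in-sample benchmark on average. This also explains why scaled metrics need the training series, which is presumably the reason a helper for adding a `y_train` argument exists.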
Splitter
Time-series-aware cross-validation folds and train/val/test holdout helpers.
| splitter.split_base | Base class for time series cross-validation splitting. |
| splitter.split_one_step | One step ahead cross-validation splitting. |
| splitter.split_ts_cv | Time series cross-validation splitting. |
| splitter.split.split_abs_train_val_test | Splits a time series DataFrame into training, validation, and test sets based on absolute timestamps. |
| splitter.split.split_rel_train_val_test | Splits a time series DataFrame into training, validation, and test sets by percentages. |
| splitter.utils_common | Common validation and initialization utilities for model selection. |
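
Splitting a time series by percentages, as described above, must preserve temporal order: the validation block follows the training block and the test block comes last, with no shuffling. A minimal sketch on a plain list follows. The function name `split_by_fraction` and its parameters are hypothetical and do not mirror `split_rel_train_val_test`.

```python
def split_by_fraction(rows, train=0.7, val=0.15):
    """Chronological train/val/test split; the test set takes the remainder."""
    if not (0.0 < train < 1.0 and 0.0 <= val < 1.0 and train + val < 1.0):
        raise ValueError("fractions must be non-negative and leave room for a test set")
    n = len(rows)
    n_train = int(n * train)
    n_val = int(n * val)
    # Slices are contiguous and ordered, so no future row leaks into the past.
    return (rows[:n_train],
            rows[n_train:n_train + n_val],
            rows[n_train + n_val:])


rows = list(range(20))  # stand-in for 20 time-ordered observations
train_rows, val_rows, test_rows = split_by_fraction(rows, train=0.5, val=0.25)
```

The same slicing logic carries over to a DataFrame with a sorted datetime index by using positional slicing on the rows.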
Backtesting
Backtesting orchestration for time series forecasters.
| backtesting.validation.backtesting_forecaster | Backtesting of forecaster model following the folds generated by TimeSeriesFold. |
| backtesting.validation.backtesting_forecaster_one_step | Backtesting of forecaster model using one-step-ahead predictions. |
| backtesting.validation | |
Configurator
Task-level configuration objects for the forecasting pipelines.
| configurator.config_demo.ConfigDemo | Configuration for the safety-critical demo task. |
| configurator.config_entsoe.ConfigEntsoe | Configuration for the ENTSO-E forecasting pipeline. |
| configurator.config_multi.ConfigMulti | Configuration for the multi-input forecasting pipeline. |
Manager
High-level training, prediction, and model persistence orchestration.
| manager.demo_metrics.calculate_metrics | Calculate MAE and MSE for numeric evaluation. |
| manager.trainer.get_last_model | Get the latest trained model from the cache. |
| manager.trainer.get_path_model | Yield the path to a model file for a given iteration and model name. |
| manager.trainer.load_iteration | Load a saved model at a given iteration. |
| manager.trainer.should_retrain | Decide whether a forecaster should be retrained. |
| manager.predictor.build_prediction_package | Build a prediction package for downstream plotting/reporting consumers. |
| manager.predictor.get_model_prediction | Get the prediction package from the latest trained model. |
| manager.logger | Audit-grade logging for spotforecast2-safe. |
| manager.persistence | |
| manager.features.apply_cyclical_encoding | Apply cyclical (sine/cosine) encoding to periodic integer features. |
| manager.features.create_interaction_features | Append bilinear interaction terms to an exogenous feature matrix. |
| manager.features.select_exogenous_features | Select and deduplicate exogenous feature columns for model training. |
| manager.features.merge_data_and_covariates | Merge target data with exogenous features and split into train/predict slices. |
| manager.features.get_target_data | Extract the training series and exogenous slices for one target column. |
| manager.features.select_top_poly_features | Rank polynomial interaction columns by mutual information, keep the top K. |
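
Cyclical (sine/cosine) encoding, mentioned in the table above, maps a periodic integer such as the hour of day onto the unit circle so that the last and first values of the cycle end up close together. The sketch below shows the general technique only. The function name `cyclical_encode` is hypothetical, and the output format of `apply_cyclical_encoding` is not specified here.

```python
import math


def cyclical_encode(value, period):
    """Return the (sin, cos) pair for a value on a cycle of the given period."""
    angle = 2.0 * math.pi * (value % period) / period
    return math.sin(angle), math.cos(angle)


def distance(a, b):
    """Euclidean distance between two encoded points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


midnight = cyclical_encode(0, 24)
one_am = cyclical_encode(1, 24)
eleven_pm = cyclical_encode(23, 24)

# With raw integers, 23 and 0 are 23 apart; on the circle they are
# exactly as close as 0 and 1.
gap_wraparound = distance(eleven_pm, midnight)
gap_adjacent = distance(midnight, one_am)
```

Both components are needed: the sine alone cannot distinguish, for example, 03:00 from 09:00, which share the same sine value.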
Multitask
Config-driven multi-target forecasting orchestrator. A tuning-free, plotting-free safe subset of the sibling spotforecast2.multitask. Provides BaseTask (shared pipeline steps), MultiTask (task dispatcher), the task-specific classes LazyTask, DefaultsTask, PredictTask, and CleanTask, the module-level agg_predictor helper, LightGBM forecaster factories, training strategies, and a one-call runner.
| multitask.base.BaseTask | Shared base for all multi-target forecasting pipeline tasks. |
| multitask.base.agg_predictor | Aggregate per-target prediction packages into a weighted forecast. |
| multitask.multi.MultiTask | Orchestrates a multi-target time-series forecasting pipeline. |
| multitask.lazy.LazyTask | Task 1 — Lazy Fitting with default LightGBM parameters. |
| multitask.defaults.DefaultsTask | Task 2 — Defaults fitting (no tuning, no cached params). |
| multitask.predict.PredictTask | Task 5 — Predict-only using previously saved models. |
| multitask.clean.CleanTask | Cache-cleaning task — removes all cached data from the pipeline cache. |
| multitask.factories.default_lgbm_forecaster_factory | Return a fresh, unfitted LightGBM ForecasterRecursive. |
| multitask.factories.quantile_lgbm_forecaster_factory | Return one quantile-regression LightGBM ForecasterRecursive per quantile. |
| multitask.factories.predict_quantile_band | Assemble per-quantile forecasts into one non-crossing band. |
| multitask.guards.assert_no_leakage | Raise LeakageError if any forbidden column reached the model. |
| multitask.strategies.TrainingStrategy | Strategy interface for preparing a forecaster before the final fit. |
| multitask.strategies.LazyStrategy | Approach 1 — Lazy fitting with optional cached tuning. |
| multitask.strategies.DefaultsStrategy | Approach 2 — Train with defaults, no tuning, no cached params. |
| multitask.runner.run | Run the MultiTask forecasting pipeline and return predictions. |
Utils
General-purpose utility functions: CPE generation, validation, data transforms, and a generic TTL-aware atomic snapshot store.
| utils.cpe.get_cpe_identifier | Generates the CPE 2.3 identifier for the spotforecast2-safe project. |
| utils.convert_to_utc | Utility functions for timezone conversion. |
| utils.parse.parse_bool | Parse case-insensitive boolean strings for CLI arguments. |
| utils.snapshot_store.SnapshotStore | Generic TTL-aware atomic snapshot store. |
| utils.snapshot_store.parse_snapshot_timestamp | Parse the UTC timestamp encoded in a snapshot filename stem. |
Stats
Statistical primitives for time-series diagnostics. Pure compute, no plotting dependencies — safety-critical-friendly.
| stats.stationarity.augmented_dickey_fuller | Run an Augmented Dickey-Fuller test on a time series. |
| stats.spectral.compute_periodogram | Compute the periodogram of a time series. |
| stats.spectral.PeriodogramResult | Container for the output of compute_periodogram(). |
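
A periodogram estimates how much of a series' variance sits at each frequency, which is a common way to detect daily or weekly seasonality. The sketch below computes one with a direct discrete Fourier transform in pure Python. It is an illustration under assumed conventions (mean removal, power scaled by 1/n), and the function name `periodogram` and its return format are not those of `compute_periodogram`.

```python
import cmath
import math


def periodogram(x, sampling_rate=1.0):
    """Return (frequencies, power) for the non-negative frequencies of x."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]  # remove the mean so frequency 0 carries no power
    freqs, power = [], []
    for k in range(n // 2 + 1):
        # k-th DFT coefficient, computed directly in O(n) per frequency.
        coeff = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                    for t, v in enumerate(centered))
        freqs.append(k * sampling_rate / n)
        power.append(abs(coeff) ** 2 / n)
    return freqs, power


# A pure sine with a period of 8 samples, observed for 32 samples.
signal = [math.sin(2.0 * math.pi * t / 8.0) for t in range(32)]
freqs, power = periodogram(signal)
peak_frequency = freqs[power.index(max(power))]
peak_period = 1.0 / peak_frequency
```

The direct transform costs O(n²) and is only meant to make the definition explicit. For long series an FFT-based routine is the practical choice.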
Security
Masking and PII-protection utilities for safe logging (CWE-532, CWE-312).
| security.masking.mask_estimator | Return a non-sensitive string representation of an estimator. |
Calendar
Calendar, holiday, and day/night feature engineering for covariate construction.
| calendar.holiday.create_holiday_df | Create a DataFrame with datetime index and a binary holiday indicator column. |
| calendar.holiday.get_holiday_features | Build public-holiday indicators and align them to a regular time grid. |
| calendar.holiday.create_holiday_adjacency_df | Create a DataFrame with binary adjacency indicators for public holidays. |
| calendar.holiday.get_holiday_adjacency_features | Build holiday-adjacency indicators and align them to a regular time grid. |
| calendar.holiday.create_day_type_df | Create a day-type refinement of the public-holiday column. |
| calendar.holiday.get_day_type_features | Build day-type indicators and align them to a regular time grid. |
| calendar.holiday.create_school_holiday_df | Create a DataFrame with a binary school-holiday indicator for a German state. |
| calendar.holiday.get_school_holiday_features | Build per-Bundesland school-holiday indicators and align them to a forecast grid. |
| calendar.features.get_calendar_features | Create calendar-based features for a contiguous time range. |
| calendar.features.get_day_night_features | Create day/night features using astronomical sunrise and sunset times. |
| calendar.features.get_ephemeris_features | Create continuous solar-geometry features from the ephemeris. |
Weather
Weather data integration using the Open-Meteo API, derived weather features (degree-hours, apparent temperature, dew point), and population-weighted multi-city spatial aggregation.
| weather.client.WeatherClient | Client for fetching weather data from Open-Meteo API. |
| weather.client.WeatherService | High-level service for weather data generation. |
| weather.features.get_weather_features | Fetch weather data and compute rolling-window features. |
| weather.derived.heating_degree_hours | Heating degree-hours, max(base - T, 0). |
| weather.derived.cooling_degree_hours | Cooling degree-hours, max(T - base, 0). |
| weather.derived.dew_point | Dew-point temperature via the Magnus-Tetens approximation. |
| weather.derived.apparent_temperature | Steadman apparent (“feels-like”) temperature. |
| weather.derived.population_weighted_average | Combine per-location weather frames into one demand-weighted index. |
| weather.derived.add_derived_weather_features | Append the requested derived columns to a raw weather frame (fail-safe). |
| weather.locations.WeatherLocation | A single weather sampling location with a population weight. |
| weather.locations.default_german_locations | Return the default population-weighted German load-centre registry. |
| weather.locations.coordinates | Extract (latitude, longitude) pairs in order. |
| weather.locations.weights | Extract the raw (un-normalised) weights in order. |
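
The derived weather features above follow simple closed-form formulas. As a rough sketch, the degree-hour definitions from the table and one common form of the Magnus approximation for the dew point can be written as below. The default base temperature of 18 °C and the Magnus coefficients a = 17.62 and b = 243.12 °C are assumptions; the values used by `weather.derived` may differ.

```python
import math


def heating_degree_hours(temp_c, base_c=18.0):
    """max(base - T, 0): how far the temperature lies below the base."""
    return max(base_c - temp_c, 0.0)


def cooling_degree_hours(temp_c, base_c=18.0):
    """max(T - base, 0): how far the temperature lies above the base."""
    return max(temp_c - base_c, 0.0)


def dew_point_magnus(temp_c, rel_humidity_pct, a=17.62, b=243.12):
    """Dew point in degrees Celsius from temperature and relative humidity (%)."""
    gamma = math.log(rel_humidity_pct / 100.0) + a * temp_c / (b + temp_c)
    return b * gamma / (a - gamma)


hdh_cold = heating_degree_hours(10.0)      # 8 degrees below the base
hdh_warm = heating_degree_hours(20.0)      # above the base, so clipped to 0
cdh_hot = cooling_degree_hours(25.0)       # 7 degrees above the base
td_saturated = dew_point_magnus(20.0, 100.0)
td_half = dew_point_magnus(20.0, 50.0)
```

At 100 % relative humidity the formula returns the air temperature itself, which is a quick sanity check for any implementation.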
Downloader
Data downloaders for external data sources (e.g. ENTSO-E).
| downloader.entsoe.download_new_data | Download new load and forecast data from ENTSO-E. |
| downloader.entsoe.download_renewable_forecast | Download the ENTSO-E day-ahead wind/solar generation forecast. |
| downloader.entsoe.download_day_ahead_price | Download the ENTSO-E day-ahead spot price (DE/LU). |
| downloader.entsoe.merge_build_manual | Merge all raw CSV files from the ‘raw’ directory into a single interim file. |
| downloader.entsoe.download_zone_loads | Download Actual Total Load separately for each German TSO control area. |
| downloader.entsoe.assemble_zone_loads | Join the per-zone interim load files into one aligned, validated frame. |
| downloader.entsoe.ZoneResult | Structured result record for one zone in a download_zone_loads collect run. |
| downloader.entsoe.build_zone_qc_frame | Build a bottom-up QC frame from per-zone interim CSVs. |
Exceptions
Custom exception types for safety-critical failure signalling.
| exceptions | Custom exceptions and warnings for spotforecast2. |
| exceptions.CoverageError | Exception raised when operational data-coverage requirements are violated. |
| exceptions.LeakageError | Exception raised when forbidden columns are detected in model inputs. |
Tasks
Executable tasks for demonstration and production pipelines.
| tasks.task_safe_demo | Task demo: compare baseline, covariate, and custom LightGBM forecasts against ground truth. |
| tasks.task_safe_zone_load_demo | Task demo: four-zone bottom-up total-load forecast vs. a direct aggregate. |