Function reference

Data

Utilities for fetching and loading time series, weather, and holiday data.

data.fetch_data.fetch_data Fetches a dataset from a CSV file or processes a DataFrame.
data.fetch_data.fetch_holiday_data Fetches holiday data for the dataset period.
data.fetch_data.fetch_weather_data Fetch weather data for the dataset period plus forecast horizon.
data.fetch_data.get_cache_home Return the location where persistent models are to be cached.
data.fetch_data.get_data_home Return the location where datasets are to be stored.
data.fetch_data.get_package_data_home Return the location of the internal package datasets.
data.fetch_data.load_timeseries Load the actual-load time series from interim/energy_load.csv.
data.fetch_data.load_timeseries_forecast Load the day-ahead forecast time series from interim/energy_load.csv.
data.fetch_data.load_renewable_forecast Load the ENTSO-E day-ahead wind/solar generation forecast.
data.fetch_data.load_day_ahead_price Load the ENTSO-E day-ahead spot price (DE/LU) as an hourly series.
data.data_classes Data structures for input and processed data.
data.demo_loader Demo data loader for safety-critical forecasting tasks.
data.entsoe_loader ENTSO-E interim-CSV data loaders.

Preprocessing

Hardened tools for data curation, resampling, outlier detection, feature engineering, and temporal train/test splitting.

preprocessing.coverage.assert_frontier_fresh Raise CoverageError if the data frontier is stale.
preprocessing.coverage.assert_actual_lag_within Raise CoverageError if the last published actual is too old.
preprocessing.coverage.assert_no_interior_gaps Raise CoverageError if the recent actuals contain large holes.
preprocessing.coverage.last_complete_hour Return the latest hour having a complete set of intra-hour samples.
preprocessing.curate_data.agg_and_resample_data Aggregates and resamples the data to a target frequency (e.g., hourly) by applying the specified aggregation within each interval.
preprocessing.curate_data.basic_ts_checks Checks if the time series data has a datetime index and is sorted.
preprocessing.curate_data.curate_holidays Checks if the holiday dataframe has the correct shape.
preprocessing.curate_data.curate_weather Checks if the weather dataframe has the correct shape.
preprocessing.curate_data.get_start_end Get start and end date strings for data and covariate ranges.
preprocessing.curate_data.remove_duplicate_timestamps Resolve duplicate timestamps across all data columns.
preprocessing.curate_data.reset_index Resets the index of the dataframe and assigns a name to the index column.
preprocessing.outlier.get_outliers Detect outliers in each column using Isolation Forest.
preprocessing.outlier.manual_outlier_removal Manual outlier removal function.
preprocessing.outlier.mark_outliers Marks outliers as NaN in the dataset using Isolation Forest.
preprocessing.checking.check_exog Validate that exog is a pandas Series or DataFrame.
preprocessing.checking.check_exog_dtypes Check that exogenous variables have valid data types (int, float, category).
preprocessing.checking.check_interval Validate that a confidence interval specification is valid.
preprocessing.checking.check_predict_input Check all inputs of the predict method.
preprocessing.checking.check_residuals_input Check residuals input arguments in Forecasters.
preprocessing.checking.check_y Validate that y is a pandas Series without missing values.
preprocessing.checking.get_exog_dtypes Extract and store the data types of exogenous variables.
preprocessing.checking.set_cpu_gpu_device Set the device for the estimator to either ‘cpu’, ‘gpu’, ‘cuda’, or None.
preprocessing.exog_builder.ExogBuilder Builds a set of exogenous features for a given date range.
preprocessing.exog_providers.ExogFeatureProvider Contract for a pluggable exogenous-feature source.
preprocessing.exog_providers.CovidInfectionRateProvider German national COVID-19 7-day incidence as an exogenous level regressor.
preprocessing.exog_providers.EntsoeForecastLoadProvider ENTSO-E day-ahead Forecasted Load as an exogenous near-oracle prior.
preprocessing.exog_providers.EntsoeRenewableForecastProvider ENTSO-E day-ahead wind and solar generation forecast.
preprocessing.exog_providers.EntsoeNetLoadProvider ENTSO-E day-ahead net load = Forecasted Load − (wind + solar) forecast.
preprocessing.exog_providers.EntsoeDayAheadPriceProvider ENTSO-E day-ahead spot price (DE/LU) as an exogenous input.
preprocessing.exog_providers.EventWindowProvider Generic event-window provider driven by a bundled CSV file.
preprocessing.exog_providers.FootballMatchWindowProvider German football match event-window provider.
preprocessing.exog_providers.EnergyCrisisWindowProvider German energy-saving regulatory window provider.
preprocessing.exog_providers.build_providers Construct the providers whose flags are truthy, in registry order.
preprocessing.exog_providers.build_providers_from_config Construct providers by reading the registry flags off a config object.
preprocessing.imputation.apply_imputation Apply imputation to a DataFrame based on the method specified in config.
preprocessing.imputation.WeightFunction Callable class for sample weights that can be pickled.
preprocessing.target_corruption.TargetCorruptionReport Immutable summary of a target-corruption detection and policy run.
preprocessing.target_corruption.detect_target_corruption Detect physically-impossible target-column corruption in the native frame.
preprocessing.target_corruption.apply_target_corruption_policy Apply the configured corruption policy to the native-cadence frame.
preprocessing.imputation.custom_weights Return 0 if index is in or near any gap.
preprocessing.imputation.get_missing_weights Return imputed DataFrame and a series indicating missing weights.
preprocessing.data_transform Data transformation utilities for time series forecasting.
preprocessing.forecaster_config Forecaster configuration utilities.
preprocessing.linearly_interpolate_ts Linear interpolation transformer for time series data.
preprocessing.repeating_basis_function Repeating Basis Function transformer for cyclical features.
preprocessing.rolling.RollingFeatures Compute rolling window statistics over time series data.
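
The coverage guards listed above fail loudly when the input data is stale. As a rough illustration of that idea only, the following sketch raises an error when the newest actual is older than an allowed lag. The function name, its parameters, and the StaleDataError class are hypothetical stand-ins, not the package's own signatures.

```python
from datetime import datetime, timedelta, timezone


class StaleDataError(RuntimeError):
    """Hypothetical stand-in for the package's CoverageError."""


def assert_lag_within_sketch(last_actual, now, max_lag):
    """Return the lag, or raise StaleDataError if it exceeds max_lag."""
    lag = now - last_actual
    if lag > max_lag:
        raise StaleDataError(f"last actual is {lag} old; allowed: {max_lag}")
    return lag


# Example: the newest actual is two hours old and three hours are allowed.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
fresh = datetime(2024, 1, 1, 10, 0, tzinfo=timezone.utc)
lag = assert_lag_within_sketch(fresh, now, timedelta(hours=3))
```

Raising instead of returning a flag means a stale input cannot be ignored silently by the caller.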

Processing

Utilities for aggregated and n-to-n predictions.

processing.agg_predict.agg_predict Aggregates multiple prediction columns into a single combined prediction series.
processing.n2n_predict.n2n_predict End-to-end baseline forecasting using equivalent date method.
processing.n2n_predict_with_covariates.n2n_predict_with_covariates End-to-end recursive forecasting with exogenous covariates.
processing.shape_check.ShapeCheckReport Immutable result of a forecast shape plausibility check.
processing.shape_check.check_forecast_shape Measure correlation and daily-range ratio between a forecast and its reference.
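
The equivalent date method behind the baseline forecast can be pictured with a minimal sketch: each future value is copied from the value observed a fixed number of steps earlier (for hourly data, 168 steps would mean "same hour last week"). The function name and parameters below are illustrative assumptions and do not reproduce the signature of n2n_predict.

```python
def equivalent_date_forecast_sketch(history, steps, offset):
    """Forecast `steps` values by repeating the value `offset` steps back.

    Hypothetical helper: when the horizon is longer than `offset`,
    earlier predictions are reused as the reference values.
    """
    if offset < 1 or offset > len(history):
        raise ValueError("offset must be between 1 and len(history)")
    extended = list(history)
    for _ in range(steps):
        # The reference point sits exactly `offset` positions before the new one.
        extended.append(extended[len(extended) - offset])
    return extended[len(history):]


forecast = equivalent_date_forecast_sketch([1, 2, 3, 4, 5, 6], steps=5, offset=3)
```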

Forecaster

Recursive forecasting classes, seasonal baselines, and metrics.

forecaster.base ForecasterBase class.
forecaster.recursive._forecaster_recursive.ForecasterRecursive Recursive autoregressive forecaster for scikit-learn compatible estimators.
forecaster.recursive._forecaster_equivalent_date.ForecasterEquivalentDate This forecaster predicts future values based on the most recent equivalent date.
forecaster.recursive._forecaster_recursive_multiseries
forecaster.metrics.add_y_train_argument Add y_train argument to a function if it is not already present.
forecaster.metrics.calculate_coverage Calculate coverage of a given interval.
forecaster.metrics.create_mean_pinball_loss Create pinball loss for a given quantile.
forecaster.metrics.crps_from_predictions Compute the Continuous Ranked Probability Score (CRPS) from predictions.
forecaster.metrics.crps_from_quantiles Calculate the Continuous Ranked Probability Score (CRPS) from quantiles.
forecaster.metrics.mean_absolute_scaled_error Mean Absolute Scaled Error (MASE).
forecaster.metrics.root_mean_squared_scaled_error Root Mean Squared Scaled Error (RMSSE).
forecaster.metrics.symmetric_mean_absolute_percentage_error Compute the Symmetric Mean Absolute Percentage Error (SMAPE).
forecaster.wrappers.model Recursive forecaster model wrappers for different estimators.
forecaster.wrappers.lgbm Recursive forecaster wrapper using LightGBM.
forecaster.wrappers.xgb Recursive forecaster wrapper using XGBoost.
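
Two of the metrics listed above have compact textbook definitions. The sketch below implements the mean pinball loss for one quantile and the Mean Absolute Scaled Error with a one-step naive in-sample baseline. These are plain-Python illustrations with hypothetical names; the package's own functions may differ in arguments, seasonality handling, and input types.

```python
def mean_pinball_loss_sketch(y_true, y_pred, quantile):
    """Average pinball (quantile) loss for a single quantile level."""
    total = 0.0
    for actual, predicted in zip(y_true, y_pred):
        diff = actual - predicted
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += quantile * diff if diff >= 0 else (quantile - 1.0) * diff
    return total / len(y_true)


def mase_sketch(y_true, y_pred, y_train):
    """MAE of the forecast divided by the in-sample MAE of a naive forecast."""
    mae = sum(abs(a - p) for a, p in zip(y_true, y_pred)) / len(y_true)
    naive_errors = [abs(b - a) for a, b in zip(y_train, y_train[1:])]
    return mae / (sum(naive_errors) / len(naive_errors))


pinball = mean_pinball_loss_sketch([1, 2, 3], [2, 2, 1], quantile=0.5)
mase = mase_sketch([8, 9], [7, 11], y_train=[1, 2, 4, 7])
```

At quantile 0.5 the pinball loss equals half the mean absolute error, which is a convenient sanity check.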

Splitter

Time-series-aware cross-validation folds and train/val/test holdout helpers.

splitter.split_base Base class for time series cross-validation splitting.
splitter.split_one_step One step ahead cross-validation splitting.
splitter.split_ts_cv Time series cross-validation splitting.
splitter.split.split_abs_train_val_test Splits a time series DataFrame into training, validation, and test sets based on absolute timestamps.
splitter.split.split_rel_train_val_test Splits a time series DataFrame into training, validation, and test sets by percentages.
splitter.utils_common Common validation and initialization utilities for model selection.
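
A percentage-based temporal split keeps the rows in chronological order and cuts them at two positions, so no future observation leaks into training. The following sketch shows the idea on a plain list; the function name and parameters are assumptions and are not the signature of split_rel_train_val_test.

```python
def split_rel_sketch(rows, train_frac, val_frac):
    """Split ordered rows into train/val/test without shuffling."""
    if train_frac <= 0 or val_frac < 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave a non-empty test share")
    n_train = int(len(rows) * train_frac)
    n_val = int(len(rows) * val_frac)
    train = rows[:n_train]
    val = rows[n_train:n_train + n_val]
    test = rows[n_train + n_val:]  # everything after the validation block
    return train, val, test


train, val, test = split_rel_sketch(list(range(20)), train_frac=0.5, val_frac=0.25)
```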

Backtesting

Backtesting orchestration for time series forecasters.

backtesting.validation.backtesting_forecaster Backtesting of forecaster model following the folds generated by the TimeSeriesFold class.
backtesting.validation.backtesting_forecaster_one_step Backtesting of forecaster model using one-step-ahead predictions.
backtesting.validation

Configurator

Task-level configuration objects for the forecasting pipelines.

configurator.config_demo.ConfigDemo Configuration for the safety-critical demo task.
configurator.config_entsoe.ConfigEntsoe Configuration for the ENTSO-E forecasting pipeline.
configurator.config_multi.ConfigMulti Configuration for the multi-input forecasting pipeline.

Manager

High-level training, prediction, and model persistence orchestration.

manager.demo_metrics.calculate_metrics Calculate MAE and MSE for numeric evaluation.
manager.trainer.get_last_model Get the latest trained model from the cache.
manager.trainer.get_path_model Yield the path to a model file for a given iteration and model name.
manager.trainer.load_iteration Load a saved model at a given iteration.
manager.trainer.should_retrain Decide whether a forecaster should be retrained.
manager.predictor.build_prediction_package Build a prediction package for downstream plotting/reporting consumers.
manager.predictor.get_model_prediction Get the prediction package from the latest trained model.
manager.logger Audit-grade logging for spotforecast2-safe.
manager.persistence
manager.features.apply_cyclical_encoding Apply cyclical (sine/cosine) encoding to periodic integer features.
manager.features.create_interaction_features Append bilinear interaction terms to an exogenous feature matrix.
manager.features.select_exogenous_features Select and deduplicate exogenous feature columns for model training.
manager.features.merge_data_and_covariates Merge target data with exogenous features and split into train/predict slices.
manager.features.get_target_data Extract the training series and exogenous slices for one target column.
manager.features.select_top_poly_features Rank polynomial interaction columns by mutual information, keep the top K.
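
Cyclical (sine/cosine) encoding maps a periodic integer such as the hour of day onto the unit circle, so that hour 23 ends up close to hour 0. The sketch below shows the standard transformation; the function name and signature are illustrative and are not those of apply_cyclical_encoding.

```python
import math


def cyclical_encode_sketch(value, period):
    """Return the (sin, cos) pair for a value on a cycle of length `period`."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def distance(a, b):
    """Euclidean distance between two encoded points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


midnight = cyclical_encode_sketch(0, 24)
late_evening = cyclical_encode_sketch(23, 24)
noon = cyclical_encode_sketch(12, 24)
```

In the encoded space, 23:00 is much closer to midnight than noon is, which a raw integer hour cannot express.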

Multitask

Config-driven multi-target forecasting orchestrator: a tuning-free, plotting-free safe subset of the sibling spotforecast2.multitask. It provides BaseTask (shared pipeline steps), MultiTask (task dispatcher), task-specific classes (LazyTask, DefaultsTask, PredictTask, CleanTask), the module-level agg_predictor helper, LightGBM forecaster factories, training strategies, and a one-call runner.

multitask.base.BaseTask Shared base for all multi-target forecasting pipeline tasks.
multitask.base.agg_predictor Aggregate per-target prediction packages into a weighted forecast.
multitask.multi.MultiTask Orchestrates a multi-target time-series forecasting pipeline.
multitask.lazy.LazyTask Task 1 — Lazy Fitting with default LightGBM parameters.
multitask.defaults.DefaultsTask Task 2 — Defaults fitting (no tuning, no cached params).
multitask.predict.PredictTask Task 5 — Predict-only using previously saved models.
multitask.clean.CleanTask Cache-cleaning task — removes all cached data from the pipeline cache.
multitask.factories.default_lgbm_forecaster_factory Return a fresh, unfitted LightGBM ForecasterRecursive.
multitask.factories.quantile_lgbm_forecaster_factory Return one quantile-regression LightGBM ForecasterRecursive per quantile.
multitask.factories.predict_quantile_band Assemble per-quantile forecasts into one non-crossing band.
multitask.guards.assert_no_leakage Raise LeakageError if any forbidden column reached the model.
multitask.strategies.TrainingStrategy Strategy interface for preparing a forecaster before the final fit.
multitask.strategies.LazyStrategy Approach 1 — Lazy fitting with optional cached tuning.
multitask.strategies.DefaultsStrategy Approach 2 — Train with defaults, no tuning, no cached params.
multitask.runner.run Run the MultiTask forecasting pipeline and return predictions.
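
Independently fitted quantile models can produce forecasts that cross, for example a 10% quantile above the median. One common remedy, shown here as an assumption rather than as the method used by predict_quantile_band, is to sort the predicted values across quantile levels at every forecast step. All names in the sketch are hypothetical.

```python
def non_crossing_band_sketch(quantile_forecasts):
    """Reorder per-step values so lower quantile levels never exceed higher ones.

    `quantile_forecasts` maps a quantile level to a list of forecasts,
    one per step; all lists must have the same length.
    """
    levels = sorted(quantile_forecasts)
    horizon = len(quantile_forecasts[levels[0]])
    band = {level: [] for level in levels}
    for step in range(horizon):
        ordered = sorted(quantile_forecasts[level][step] for level in levels)
        for level, value in zip(levels, ordered):
            band[level].append(value)
    return band


raw = {0.1: [1.0, 5.0], 0.5: [2.0, 4.0], 0.9: [3.0, 6.0]}
band = non_crossing_band_sketch(raw)
```

In the example, the second step has the 10% forecast above the median; after sorting, the values are reassigned in increasing order.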

Utils

General-purpose utility functions: CPE generation, validation, data transforms, and a generic TTL-aware atomic snapshot store.

utils.cpe.get_cpe_identifier Generates the CPE 2.3 identifier for the spotforecast2-safe project.
utils.convert_to_utc Utility functions for timezone conversion.
utils.parse.parse_bool Parse case-insensitive boolean strings for CLI arguments.
utils.snapshot_store.SnapshotStore Generic TTL-aware atomic snapshot store.
utils.snapshot_store.parse_snapshot_timestamp Parse the UTC timestamp encoded in a snapshot filename stem.
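
Case-insensitive boolean parsing for command-line arguments usually normalizes the text and checks it against two small token sets. The sketch below illustrates that pattern; the accepted tokens and the function name are assumptions and may not match what utils.parse.parse_bool accepts.

```python
TRUE_TOKENS = {"true", "1", "yes", "y", "on"}    # assumed token set
FALSE_TOKENS = {"false", "0", "no", "n", "off"}  # assumed token set


def parse_bool_sketch(text):
    """Convert a boolean-like string to bool, rejecting anything else."""
    normalized = text.strip().lower()
    if normalized in TRUE_TOKENS:
        return True
    if normalized in FALSE_TOKENS:
        return False
    raise ValueError(f"not a boolean string: {text!r}")


flag = parse_bool_sketch(" TRUE ")
```

Rejecting unknown strings, instead of treating them as false, avoids silently misreading a typo on the command line.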

Stats

Statistical primitives for time-series diagnostics. Pure compute, no plotting dependencies — safety-critical-friendly.

stats.stationarity.augmented_dickey_fuller Run an Augmented Dickey-Fuller test on a time series.
stats.spectral.compute_periodogram Compute the periodogram of a time series.
stats.spectral.PeriodogramResult Container for the output of compute_periodogram().
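
A periodogram estimates how much of a series' variance sits at each frequency by taking the squared magnitude of its discrete Fourier transform. The sketch below uses a direct DFT from the standard library with one common normalization (power divided by the sample count). The function name, return shape, and normalization are assumptions and need not match compute_periodogram.

```python
import cmath
import math


def periodogram_sketch(values):
    """Return (frequencies, power) for frequencies k/n, k = 1..n//2."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]  # remove the zero-frequency term
    freqs, power = [], []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        freqs.append(k / n)
        power.append(abs(coeff) ** 2 / n)
    return freqs, power


# A pure cosine with a period of 4 samples should peak at frequency 1/4.
signal = [math.cos(2 * math.pi * t / 4) for t in range(16)]
freqs, power = periodogram_sketch(signal)
peak_frequency = freqs[power.index(max(power))]
```

The direct DFT costs O(n^2) operations; an FFT-based routine would normally be preferred for long series.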

Security

Masking and PII-protection utilities for safe logging (CWE-532, CWE-312).

security.masking.mask_estimator Return a non-sensitive string representation of an estimator.

Calendar

Calendar, holiday, and day/night feature engineering for covariate construction.

calendar.holiday.create_holiday_df Create a DataFrame with datetime index and a binary holiday indicator column.
calendar.holiday.get_holiday_features Build public-holiday indicators and align them to a regular time grid.
calendar.holiday.create_holiday_adjacency_df Create a DataFrame with binary adjacency indicators for public holidays.
calendar.holiday.get_holiday_adjacency_features Build holiday-adjacency indicators and align them to a regular time grid.
calendar.holiday.create_day_type_df Create a day-type refinement of the public-holiday column.
calendar.holiday.get_day_type_features Build day-type indicators and align them to a regular time grid.
calendar.holiday.create_school_holiday_df Create a DataFrame with a binary school-holiday indicator for a German state.
calendar.holiday.get_school_holiday_features Build per-Bundesland school-holiday indicators and align them to a forecast grid.
calendar.features.get_calendar_features Create calendar-based features for a contiguous time range.
calendar.features.get_day_night_features Create day/night features using astronomical sunrise and sunset times.
calendar.features.get_ephemeris_features Create continuous solar-geometry features from the ephemeris.
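
Aligning a daily holiday calendar to a regular time grid amounts to flagging every timestamp whose calendar date is a holiday. The sketch below does this for an hourly grid with the standard library only. The function name and arguments are hypothetical, and time zone handling is deliberately left out.

```python
from datetime import date, datetime, timedelta


def holiday_indicator_sketch(start, periods, holidays):
    """Return (timestamp, flag) pairs on an hourly grid; flag is 1 on holidays."""
    grid = [start + timedelta(hours=i) for i in range(periods)]
    return [(ts, 1 if ts.date() in holidays else 0) for ts in grid]


# Four hours spanning the midnight before 25 December.
rows = holiday_indicator_sketch(
    start=datetime(2024, 12, 24, 22, 0),
    periods=4,
    holidays={date(2024, 12, 25)},
)
flags = [flag for _, flag in rows]
```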

Weather

Weather data integration using the Open-Meteo API, derived weather features (degree-hours, apparent temperature, dew point), and population-weighted multi-city spatial aggregation.

weather.client.WeatherClient Client for fetching weather data from Open-Meteo API.
weather.client.WeatherService High-level service for weather data generation.
weather.features.get_weather_features Fetch weather data and compute rolling-window features.
weather.derived.heating_degree_hours Heating degree-hours, max(base - T, 0).
weather.derived.cooling_degree_hours Cooling degree-hours, max(T - base, 0).
weather.derived.dew_point Dew-point temperature via the Magnus-Tetens approximation.
weather.derived.apparent_temperature Steadman apparent (“feels-like”) temperature.
weather.derived.population_weighted_average Combine per-location weather frames into one demand-weighted index.
weather.derived.add_derived_weather_features Append the requested derived columns to a raw weather frame (fail-safe).
weather.locations.WeatherLocation A single weather sampling location with a population weight.
weather.locations.default_german_locations Return the default population-weighted German load-centre registry.
weather.locations.coordinates Extract (latitude, longitude) pairs in order.
weather.locations.weights Extract the raw (un-normalised) weights in order.
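
The degree-hour formulas and the population-weighted average listed above are simple enough to sketch directly. The functions below apply max(base - T, 0) and max(T - base, 0) to hourly temperatures and combine per-location values with normalized weights. Names, argument order, and the use of plain lists are illustrative assumptions, not the package's signatures.

```python
def heating_degree_hours_sketch(temps, base):
    """Per-hour heating demand proxy: max(base - T, 0)."""
    return [max(base - t, 0.0) for t in temps]


def cooling_degree_hours_sketch(temps, base):
    """Per-hour cooling demand proxy: max(T - base, 0)."""
    return [max(t - base, 0.0) for t in temps]


def weighted_average_sketch(values, weights):
    """Combine one value per location using un-normalised weights."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total


hdh = heating_degree_hours_sketch([10.0, 20.0], base=18.0)
cdh = cooling_degree_hours_sketch([10.0, 20.0], base=18.0)
index = weighted_average_sketch([10.0, 20.0], weights=[3.0, 1.0])
```

Dividing by the weight sum means the weights can be raw population counts and do not have to add up to one.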

Downloader

Data downloaders for external data sources (e.g. ENTSO-E).

downloader.entsoe.download_new_data Download new load and forecast data from ENTSO-E.
downloader.entsoe.download_renewable_forecast Download the ENTSO-E day-ahead wind/solar generation forecast.
downloader.entsoe.download_day_ahead_price Download the ENTSO-E day-ahead spot price (DE/LU).
downloader.entsoe.merge_build_manual Merge all raw CSV files from the ‘raw’ directory into a single interim file.
downloader.entsoe.download_zone_loads Download Actual Total Load separately for each German TSO control area.
downloader.entsoe.assemble_zone_loads Join the per-zone interim load files into one aligned, validated frame.
downloader.entsoe.ZoneResult Structured result record for one zone in a download_zone_loads collect run.
downloader.entsoe.build_zone_qc_frame Build a bottom-up QC frame from per-zone interim CSVs.

Exceptions

Custom exception types for safety-critical failure signalling.

exceptions Custom exceptions and warnings for spotforecast2.
exceptions.CoverageError Exception raised when operational data-coverage requirements are violated.
exceptions.LeakageError Exception raised when forbidden columns are detected in model inputs.

Tasks

Executable tasks for demonstration and production pipelines.

tasks.task_safe_demo Task demo: compare baseline, covariate, and custom LightGBM forecasts against ground truth.
tasks.task_safe_zone_load_demo Task demo: four-zone bottom-up total-load forecast vs. a direct aggregate.