Model/Method Card: spotforecast2-safe
This card describes what spotforecast2-safe is, how to use it safely, the conditions under which its results are valid, and the responsibilities it places on anyone who deploys it. It follows the Hugging Face Model Card Guidebook taxonomy.
1. Model Details
| Field | Value |
|---|---|
| Name | spotforecast2-safe |
| Version | 22.3.0-rc.1 |
| Type | Deterministic Python library for time series feature engineering and recursive multi-step forecasting. It performs no training of its own. |
| Developed by | Thomas Bartz-Beielstein, ORCID 0000-0002-5938-5158 |
| Distributed by | the sequential-parameter-optimization GitHub organization |
| Language | Python 3.13 or newer |
| License | AGPL-3.0-or-later (GNU Affero General Public License) |
| Repository | https://github.com/sequential-parameter-optimization/spotforecast2-safe |
| Technical report | bart26h/index.qmd, shipped in the source tree |
The library depends only on numpy, pandas, scikit-learn, lightgbm, numba, pyarrow, requests, feature-engine, holidays, astral, and tqdm. It deliberately excludes plotly, matplotlib, spotoptim, optuna, torch, and tensorflow, so no plotting or automated-tuning code ships in this package.
Two Common Platform Enumeration (CPE) identifiers let vulnerability-tracking and software bill of materials (SBOM) tools recognize the package. The wildcard identifier cpe:2.3:a:sequential_parameter_optimization:spotforecast2_safe:*:*:*:*:*:*:*:* matches any release; the current release is cpe:2.3:a:sequential_parameter_optimization:spotforecast2_safe:22.3.0-rc.1:*:*:*:*:*:*:*.
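The two identifiers follow the CPE 2.3 formatted-string layout, which can be sketched as below. The helper name `build_cpe` is hypothetical and is not the library's `get_cpe_identifier`; it only shows how the colon-separated fields line up.

```python
def build_cpe(version: str = "*") -> str:
    """Assemble a CPE 2.3 formatted string for the package (illustrative helper)."""
    fields = [
        "cpe", "2.3",
        "a",                                  # part: application
        "sequential_parameter_optimization",  # vendor
        "spotforecast2_safe",                 # product
        version,                              # "*" matches any release
    ]
    # update, edition, language, sw_edition, target_sw, target_hw, other
    fields += ["*"] * 7
    return ":".join(fields)


wildcard = build_cpe()
current = build_cpe("22.3.0-rc.1")
print(wildcard)
print(current)
```

Both strings carry 13 colon-separated fields; only the version field differs between them.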
The library itself is a low-risk component: it is deterministic, its source is fully inspectable, and it fails safe on invalid input. It is built to support high-risk AI systems in the sense of the EU AI Act, but it is not itself such a system. When it is embedded in a high-risk deployment, the duties that attach to that system fall on the integrator, not on the library.
Responsibilities are divided as follows.
| Responsibility | Party | Contact |
|---|---|---|
| Library development and maintenance | Thomas Bartz-Beielstein | bartzbeielstein@gmail.com |
| Distribution | sequential-parameter-optimization on GitHub | repository issue tracker |
| Deployment, operation, and audit | the system integrator | defined per deployment |
The current release is 22.3.0-rc.1, with a stable public interface pinned in spotforecast2_safe.__init__.__all__. The full version history, including release dates, is recorded in CHANGELOG.md and on the GitHub Releases page; it is maintained automatically by the release pipeline and is not repeated here.
2. Intended Use and Scope
spotforecast2-safe prepares time series data for regression models in auditable settings such as energy supply, finance, and industrial monitoring. It runs in resource-constrained or embedded environments where heavier machine-learning frameworks are unavailable, and it produces bit-for-bit reproducible lag transformations with no hidden randomness.
The feature matrices it builds feed directly into scikit-learn regressors, LightGBM, or XGBoost, either through the bundled ForecasterRecursiveLGBM and ForecasterRecursiveXGB wrappers or through custom forecasters built on the ForecasterRecursiveModel base class.
The package has clear limits. It does not visualize data, since no plotting backend ships with it. It does not tune hyperparameters; tuning belongs in a separate workflow outside the safety-critical environment. It does not clean data silently: missing (NaN) or infinite (Inf) values raise an error rather than being imputed without the caller’s consent.
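The fail-safe behavior described here can be pictured with a small guard. This is a minimal sketch using only numpy and pandas; `require_finite` is a hypothetical name and does not reproduce the library's own validation code.

```python
import numpy as np
import pandas as pd


def require_finite(series: pd.Series, name: str = "y") -> pd.Series:
    """Return the series unchanged, or raise if it holds NaN or Inf values."""
    values = series.to_numpy(dtype=float)
    n_nan = int(np.isnan(values).sum())
    n_inf = int(np.isinf(values).sum())
    if n_nan or n_inf:
        raise ValueError(
            f"{name} contains {n_nan} NaN and {n_inf} Inf values; "
            "clean or impute explicitly before forecasting."
        )
    return series


clean = pd.Series([1.0, 2.0, 3.0])
dirty = pd.Series([1.0, np.nan, np.inf])

require_finite(clean)  # passes silently
try:
    require_finite(dirty)
    rejected = False
except ValueError as err:
    rejected = True
    print(err)
```

The guard never repairs the data; any imputation stays a separate, explicit step that the caller chooses.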
3. How to Get Started
```shell
pip install spotforecast2-safe
```

```python
import numpy as np
import pandas as pd
from spotforecast2_safe import ForecasterRecursiveLGBM

# A short hourly demonstration series
idx = pd.date_range("2023-01-01", periods=50, freq="h")
y = pd.Series(100 + 10 * np.sin(np.linspace(0, 4 * np.pi, 50)), index=idx)

forecaster = ForecasterRecursiveLGBM(iteration=0, lags=3)
forecaster.fit(y=y)
predictions = forecaster.forecaster.predict(steps=2)
```

A complete reference workflow, which compares a seasonal baseline, a covariate model, and a tuned LightGBM forecaster against ground truth, is registered as a console script:

```shell
uv run spotforecast-safe-demo
```

Its source is in src/spotforecast2_safe/tasks/task_safe_demo.py.
4. Technical Specification
Task and model family
The library addresses recursive multi-step forecasting of a univariate time series from its own past values (lags), rolling-window features, and exogenous regressors. The forecasters are scikit-learn-compatible wrappers around a regressor that the caller supplies, such as LightGBM or XGBoost. The wrapper handles feature construction, the recursive prediction loop, and persistence, while the supplied regressor handles learning. The library fixes no model size, because size is a property of the chosen regressor and its configuration.
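The recursive prediction loop can be sketched independently of any particular regressor. In the example below, `recursive_forecast` and `mean_model` are hypothetical stand-ins: the "model" simply averages its lag window, so that the feedback of each prediction into the next window is easy to follow by hand.

```python
import numpy as np


def recursive_forecast(history, predict_one, n_lags, steps):
    """Forecast `steps` values, feeding each prediction back in as the newest lag."""
    window = list(np.asarray(history, dtype=float)[-n_lags:])
    forecasts = []
    for _ in range(steps):
        y_hat = float(predict_one(np.array(window)))
        forecasts.append(y_hat)
        window = window[1:] + [y_hat]  # drop the oldest lag, append the prediction
    return np.array(forecasts)


def mean_model(lags):
    """Stand-in for a fitted regressor: predicts the mean of the lag window."""
    return lags.mean()


forecast = recursive_forecast([1.0, 2.0, 3.0], mean_model, n_lags=3, steps=2)
print(forecast)
```

The first step averages 1, 2, 3 and yields 2; the second step averages 2, 3 and the previous prediction 2, which yields 7/3. A real deployment would replace `mean_model` with the fitted regressor's one-step prediction.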
Mathematical description
For a univariate series \(X = \{x_1, x_2, \ldots, x_T\}\) and a window of \(w\) lags, the transformation builds one feature row per target value:
\[X_{\mathrm{row},t} = [x_{t-w}, x_{t-w+1}, \ldots, x_{t-1}] \rightarrow y_t = x_t, \qquad t = w+1, \ldots, T.\]
The target \(y_t\) never appears in its own feature row, which prevents look-ahead leakage by construction.
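The transformation above can be reproduced in a few lines of numpy. This is a sketch under the stated definition, not the library's implementation; `make_lag_matrix` is a hypothetical name.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def make_lag_matrix(x, w):
    """Build one row of w past values per target, with the target held out."""
    x = np.asarray(x, dtype=float)
    if x.ndim != 1 or len(x) <= w:
        raise ValueError("need a one-dimensional series longer than the window")
    windows = sliding_window_view(x, w)  # shape (T - w + 1, w)
    X = windows[:-1].copy()              # the last window has no following target
    y = x[w:]                            # target t sits right after its own window
    return X, y


X, y = make_lag_matrix([10, 11, 12, 13, 14, 15], w=3)
print(X)
print(y)
```

For this six-point series the rows are [10, 11, 12], [11, 12, 13], and [12, 13, 14], with targets 13, 14, and 15, so no target appears in its own row.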
Architecture
The package is layered. The forecaster layer holds the low-level estimator wrappers. The preprocessing layer holds deterministic transformers such as ExogBuilder, RepeatingBasisFunction, QuantileBinner, and TimeSeriesDifferentiator. The model_selection layer holds time-aware cross-validation (TimeSeriesFold, OneStepAheadFold, and backtesting_forecaster), which avoids the future-data leakage that ordinary random splits cause. The manager layer orchestrates these into ForecasterRecursiveLGBM, ForecasterRecursiveXGB, and the ConfigEntsoe configuration object. The processing and tasks layers compose these into end-to-end pipelines and console entry points.
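The idea behind time-aware cross-validation can be illustrated with an expanding-window splitter. The generator below is a hypothetical sketch in plain numpy; it does not reproduce the interface of TimeSeriesFold or OneStepAheadFold.

```python
import numpy as np


def expanding_window_folds(n_samples, initial_train, horizon):
    """Yield (train_idx, test_idx) pairs in which training always precedes testing."""
    start = initial_train
    while start + horizon <= n_samples:
        yield np.arange(0, start), np.arange(start, start + horizon)
        start += horizon  # the next fold absorbs the previous test block


folds = list(expanding_window_folds(n_samples=10, initial_train=4, horizon=2))
for train_idx, test_idx in folds:
    print(train_idx.tolist(), "->", test_idx.tolist())
```

Every test index is larger than every training index in the same fold, which is the property a random split cannot guarantee.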
Training
The library trains nothing on its own. Training happens in the downstream regressor and is the integrator’s responsibility, so the training data, hyperparameters, and learning schedule belong to each deployment rather than to the library. The bundled demo is a concrete, reproducible example of such a deployment.
Running uv run spotforecast-safe-demo forecasts an aggregated hourly energy series from a bundled CSV fixture. The demo uses a 24-hour forecast horizon, 24 lags, a 72-hour rolling window, an outlier-contamination fraction of 0.01, and an 80/20 train and test split, with a fixed random seed of 42. It builds calendar, cyclic, day-and-night (from sunrise and sunset), weather, and holiday features for the Dortmund location (51.5136, 7.4653). One of the three models it compares is a LightGBM regressor with the following configuration:
| Hyperparameter | Value |
|---|---|
| n_estimators | 1059 |
| learning_rate | 0.0419 |
| num_leaves | 212 |
| min_child_samples | 54 |
| subsample | 0.501 |
| colsample_bytree | 0.608 |
| random_state | 42 |
| verbose | -1 |
These are the demo’s chosen values, shown to illustrate a deployment, not a recommended default.
Design objectives
Three properties hold by design. The library is deterministic, so the same input produces the same output bit for bit. Its construction is leakage-free, so a target value is never part of its own feature row. Its behavior is fail-safe, so invalid input raises an explicit exception instead of being silently repaired.
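One way an integrator might check the determinism property is to fingerprint the feature matrix and compare digests across runs. The sketch below is an assumption about how such a check could look; `fingerprint` and `build_features` are hypothetical helpers, and the feature builder is a toy two-lag example rather than the library's pipeline.

```python
import hashlib

import numpy as np


def fingerprint(array) -> str:
    """SHA-256 digest over dtype, shape, and raw bytes of an array."""
    arr = np.ascontiguousarray(array)
    digest = hashlib.sha256()
    digest.update(str(arr.dtype).encode())
    digest.update(str(arr.shape).encode())
    digest.update(arr.tobytes())
    return digest.hexdigest()


def build_features(x):
    """Toy feature builder: two lag columns in 32-bit floating point."""
    x = np.asarray(x, dtype=np.float32)
    return np.column_stack([x[:-2], x[1:-1]])


series = np.sin(np.linspace(0.0, 4.0 * np.pi, 50))
first = fingerprint(build_features(series))
second = fingerprint(build_features(series))
print(first == second)
```

Storing such a digest alongside a release gives an auditor a cheap regression test: a changed digest on unchanged input signals that the transformation is no longer bit-for-bit identical.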
5. Interfaces and Runtime
The target is a numeric univariate series (a pandas Series or NumPy array) carrying a regular, monotonic date-time index. Exogenous features are a numeric pandas DataFrame aligned to that index and complete; any missing entry in the exogenous features raises a ValueError before a prediction is made. Inside the pipeline the data passes through lag-matrix construction, cyclic encoding of calendar features, optional outlier handling and imputation that the caller enables explicitly, and a cast of the feature matrix to 32-bit floating point for memory efficiency. The forecaster returns predictions as a series whose length equals the requested horizon (24 steps in the demo), in the same units as the target. A fitted forecaster is serialized with joblib at compression level 3 using the .joblib extension, and is reloaded through the same persistence helpers.
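The persistence settings named above map directly onto joblib's public API. The following sketch uses `joblib.dump` with `compress=3` and a `.joblib` suffix; the helper names `save_model` and `load_model` are hypothetical and are not the library's persistence helpers, and a plain dictionary stands in for a fitted forecaster.

```python
import tempfile
from pathlib import Path

import joblib


def save_model(model, directory, name):
    """Serialize an object with joblib at compression level 3."""
    path = Path(directory) / f"{name}.joblib"
    joblib.dump(model, path, compress=3)
    return path


def load_model(path):
    """Reload an object written by save_model."""
    return joblib.load(path)


# Stand-in for a fitted forecaster (assumption for illustration only)
fitted_state = {"lags": 24, "coef": [0.5, 0.25, 0.25]}

with tempfile.TemporaryDirectory() as tmp:
    saved_path = save_model(fitted_state, tmp, "demo_forecaster")
    suffix = saved_path.suffix
    restored = load_model(saved_path)

print(suffix, restored == fitted_state)
```

Because joblib files are pickle-based, they should only be loaded from trusted locations.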
The library runs on Python 3.13 or newer on a central processing unit (CPU). It needs no graphics processing unit (GPU) and ships no GPU code. Building the lag matrix duplicates the input, so peak memory grows in proportion to the series length times the window size. Bit-for-bit reproducibility assumes deterministic regressor settings: the bundled LightGBM wrapper enables LightGBM’s deterministic and column-wise flags, and single-threaded execution removes any remaining floating-point reordering.
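The memory scaling can be made concrete with a back-of-the-envelope estimate. The function below is a rough sketch that counts only the lag columns at 4 bytes per 32-bit value; it ignores rolling-window and exogenous columns as well as any temporary copies, so real peak usage will be higher.

```python
def lag_matrix_bytes(n_rows: int, n_lags: int, bytes_per_value: int = 4) -> int:
    """Approximate size of a lag matrix with one row per target value."""
    return (n_rows - n_lags) * n_lags * bytes_per_value


one_million = lag_matrix_bytes(1_000_000, 24)
ten_million = lag_matrix_bytes(10_000_000, 24)
print(f"{one_million / 1e6:.1f} MB, {ten_million / 1e6:.1f} MB")
```

With 24 lags, one million rows need roughly 96 MB for the lag matrix alone, and ten million rows roughly 960 MB.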
The runtime dependencies carry permissive licenses, with one partial exception: tqdm combines MIT with MPL-2.0, a file-level (weak) copyleft license. This keeps the combined distribution simple for integrators. The library itself is distributed under the copyleft AGPL-3.0-or-later license.
| Dependency | License |
|---|---|
| numpy | BSD-3-Clause |
| pandas | BSD-3-Clause |
| scikit-learn | BSD-3-Clause |
| feature-engine | BSD-3-Clause |
| numba | BSD-2-Clause |
| lightgbm | MIT |
| holidays | MIT |
| pyarrow | Apache-2.0 |
| requests | Apache-2.0 |
| astral | Apache-2.0 |
| tqdm | MPL-2.0 and MIT |
Because the library performs no training and uses no GPU, its own energy cost is small. Runtime cost is dominated by vector operations during feature engineering and by whatever regressor the caller trains. A typical LightGBM fit on an hourly series of about 100,000 rows completes in seconds on one commodity CPU core. No pretrained weights ship with the package, so there are no embedded training emissions to report.
6. Data and Operational Design Domain
The fixtures under src/spotforecast2_safe/datasets/csv/ and the demo dataset support reproducible documentation and tests; production data is supplied by the integrator. Docstring examples in the source are executed as tests, and time-aware cross-validation is used during validation so that no future observation can influence a past prediction.
The Operational Design Domain (ODD) is the set of conditions under which the library’s results are valid. Outside these conditions the library is designed to raise an error rather than return an unreliable result.
| Condition | Valid range | Outside the range |
|---|---|---|
| Target series | numeric, univariate, with a regular and monotonic date-time index | error |
| Exogenous features | numeric, complete, aligned to the target index | ValueError on any missing entry |
| Sampling interval | uniform; hourly in the demo | unreliable result |
| Minimum history | longer than the window size plus the number of lags, about 96 hourly points for the demo defaults | the model cannot be called |
| Missing target values | rejected unless the caller explicitly enables imputation | error |
| Series length | validated to about one million rows; beyond about ten million the caller must process the series in chunks | memory exhaustion |
| Numeric precision | feature matrices computed in 32-bit floating point | values needing higher precision fall outside the domain |
| Any invalid input | not applicable | explicit ValueError or TypeError, never silent repair |
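Several rows of the table can be checked before a series reaches the forecaster. The pre-flight check below is a minimal sketch with pandas; `check_target_index` is a hypothetical helper, and the threshold of 96 points is the demo figure quoted in the table, not a library constant.

```python
import pandas as pd


def check_target_index(y: pd.Series, min_history: int) -> pd.Timedelta:
    """Validate index type, ordering, history length, and uniform spacing."""
    idx = y.index
    if not isinstance(idx, pd.DatetimeIndex):
        raise TypeError("target must carry a DatetimeIndex")
    if not (idx.is_monotonic_increasing and idx.is_unique):
        raise ValueError("timestamps must be strictly increasing")
    if len(y) <= min_history:
        raise ValueError(f"need more than {min_history} observations, got {len(y)}")
    steps = idx.to_series().diff().dropna().unique()
    if len(steps) != 1:
        raise ValueError("sampling interval is not uniform")
    return pd.Timedelta(steps[0])


idx = pd.date_range("2023-01-01", periods=100, freq="h")
y = pd.Series(range(100), index=idx, dtype=float)
interval = check_target_index(y, min_history=96)
print(interval)

gapped = y.drop(y.index[50])  # remove one timestamp to create a gap
try:
    check_target_index(gapped, min_history=96)
    gap_rejected = False
except ValueError as err:
    gap_rejected = True
    print(err)
```

The check raises rather than resampling, in keeping with the table's "never silent repair" row.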
Forecast accuracy is bounded by the downstream regressor and its training data, so concept drift, seasonal shifts, or regime changes degrade forecasts even when the feature engineering stays correct. Users who build lag or calendar features outside the provided builders risk leaking a target value into its own feature row: the bundled ExogBuilder and task paths are leakage-free, whereas hand-rolled pipelines carry no such guarantee. Operating states that are scarce in the training data are forecast less reliably than common ones.
To stay inside the valid domain, validate every new deployment against historical ground truth before it carries live traffic, build features only through ExogBuilder or the bundled tasks, keep the regressor deterministic when reproducibility is required, and process very long series in windowed chunks.
7. Evaluation
Because no training runs inside the library, classical accuracy metrics do not describe the library itself. The library is evaluated on software-quality properties, while forecast accuracy is a property of each deployment.
DataFrames that contain missing or infinite values raise a ValueError. The public loaders load_timeseries, load_timeseries_forecast, and WeatherService.get_dataframe refuse to return silently imputed values unless the caller opts in. Input types are checked at runtime, identical input yields identical output bytes, and new code carries at least 80 percent line coverage. CPE identifier generation is tested directly. Excluding heavy dependencies keeps the attack surface, and with it the exposure to Common Vulnerabilities and Exposures (CVE) entries, small: there is no web server, no deep-learning runtime, and no plotting backend.
For deployments, the demo computes mean absolute error (MAE) and mean squared error (MSE) when it compares each model against ground truth. These metrics are deterministic and can be reproduced by running the demo. Their numerical values depend on the data vintage and are therefore not fixed in this card.
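For reference, the two metrics are defined as follows. This sketch uses plain numpy and made-up numbers; it does not reproduce the demo's code or its results.

```python
import numpy as np


def mae(y_true, y_pred) -> float:
    """Mean absolute error."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(diff)))


def mse(y_true, y_pred) -> float:
    """Mean squared error."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(diff ** 2))


y_true = [1.0, 2.0, 3.0]
y_pred = [2.0, 2.0, 5.0]
print(mae(y_true, y_pred), mse(y_true, y_pred))
```

The errors here are 1, 0, and 2, which gives an MAE of 1 and an MSE of 5/3. MSE penalizes the single large miss more heavily than MAE does, which is why reporting both is informative.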
8. Model Transparency
The library produces point forecasts. It does not natively quantify or calibrate predictive uncertainty, so a deployment that needs prediction intervals or calibrated probabilities must add them in the downstream regressor or a wrapper around it.
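As one example of what a deployment might add, a symmetric band can be derived from the empirical quantile of absolute hold-out residuals. This is an illustrative sketch, not a feature of the library: `symmetric_band` is a hypothetical helper, the residuals are invented, and the method assumes that future errors resemble the hold-out errors and ignores the growth of error with the forecast horizon.

```python
import numpy as np


def symmetric_band(point_forecast, holdout_residuals, coverage=0.9):
    """Return (lower, upper) bounds around a point forecast."""
    half_width = float(np.quantile(np.abs(holdout_residuals), coverage))
    point = np.asarray(point_forecast, dtype=float)
    return point - half_width, point + half_width


# Invented hold-out residuals (observed minus predicted)
residuals = np.array([-2.0, -1.0, 0.5, 1.0, 3.0])
lower, upper = symmetric_band([100.0, 101.0], residuals, coverage=0.8)
print(lower, upper)
```

With these residuals the 0.8 quantile of the absolute errors is 2.2, so each point forecast receives a band of plus or minus 2.2. The realized coverage should always be verified on data that was not used to compute the band.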
The code is white-box: there are no compiled inference kernels and no opaque weights, so every transformation can be read and audited in source. Feature attributions are available through the downstream regressor’s own importance measures, for example LightGBM’s split and gain importances. The package ships no separate explainability backend such as SHAP or LIME, consistent with its minimal-dependency policy.
9. Operation: Monitoring and Response
A deployment should watch the quality of incoming data (missing or out-of-range values and gaps in the timestamps), the drift of the input and target distributions away from the training period, and the forecast error measured against a simple baseline. The production configuration carries a refit cadence, with a default of seven days, and a maximum-model-age policy that signals when retraining is due.
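The refit cadence and the baseline comparison reduce to two small predicates. The sketch below is an assumption about how an integrator could encode them; the function names are hypothetical, and only the seven-day default comes from the text above.

```python
from datetime import datetime, timedelta

REFIT_CADENCE = timedelta(days=7)  # default cadence named in this card


def refit_due(last_fit: datetime, now: datetime, cadence: timedelta = REFIT_CADENCE) -> bool:
    """True once the model is at least one cadence old."""
    return now - last_fit >= cadence


def beats_baseline(model_mae: float, baseline_mae: float) -> bool:
    """True while the model's error stays below the simple baseline's error."""
    return model_mae < baseline_mae


last_fit = datetime(2026, 1, 1, 0, 0)
print(refit_due(last_fit, datetime(2026, 1, 7, 23, 0)))  # one hour short of seven days
print(refit_due(last_fit, datetime(2026, 1, 8, 0, 0)))   # exactly seven days
print(beats_baseline(model_mae=4.2, baseline_mae=5.0))
```

Passing `now` explicitly, rather than reading the clock inside the function, keeps the check deterministic and easy to test.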
When monitoring crosses a deployment-defined threshold, the usual responses are to refit or retrain the model, to fall back to the seasonal baseline forecaster or the last known-good model, and to alert the responsible team. A dual-handler logger writes timestamped records to the console and to files under ~/spotforecast2_safe_models/logs/, which supports audit retention. The thresholds and escalation steps are owned by the integrator.
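A dual-handler logger of the kind described can be assembled from the standard library alone. The sketch below is not the library's logger: `make_audit_logger`, the logger name, and the file name `audit.log` are assumptions, and a temporary directory replaces the real log location so that the example leaves no files behind.

```python
import logging
import tempfile
from pathlib import Path


def make_audit_logger(log_dir, name="forecast_audit"):
    """Create a logger that writes timestamped records to console and file."""
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        fmt = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        handlers = [
            logging.StreamHandler(),
            logging.FileHandler(log_dir / "audit.log", encoding="utf-8"),
        ]
        for handler in handlers:
            handler.setFormatter(fmt)
            logger.addHandler(handler)
    return logger


with tempfile.TemporaryDirectory() as tmp:
    logger = make_audit_logger(tmp)
    logger.warning("MAE above threshold; falling back to seasonal baseline")
    for handler in logger.handlers:
        handler.flush()
    log_text = (Path(tmp) / "audit.log").read_text(encoding="utf-8")
    # Release the file before the temporary directory is removed
    for handler in list(logger.handlers):
        handler.close()
        logger.removeHandler(handler)

print(log_text.strip())
```

Logging the chosen response (refit, fallback, or alert) together with the triggering metric gives an auditor a traceable record of each intervention.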
10. Compliance Support
The package is built to support the development of high-risk AI systems under the EU AI Act. The package itself is not certified; full-system certification is the integrator’s responsibility.
It rejects missing or infinite data by default, which supports the data-governance duty of Article 10. This card together with the technical report forms a technical-documentation baseline for Article 11, and the CPE identifiers in Section 1 feed SBOM and vulnerability-tracking pipelines. The logging facility supports the record-keeping duty of Article 12. The white-box code supports the transparency duty of Article 13. The deterministic, reproducible transformations support the accuracy-and-robustness duty of Article 15, while formal system-level verification remains the integrator’s responsibility.
The authoritative mapping to IEC 61508, ISO 26262, ISA/IEC 62443, and the individual EU AI Act articles is maintained in the technical report (bart26h/index.qmd, section Compliance Mapping). These references reflect the standards as of 2026-04-19; users must track later amendments themselves.
11. Glossary
| Term | Meaning |
|---|---|
| EU AI Act | Regulation (EU) 2024/1689 on artificial intelligence, in force since 2024-08-01. |
| IEC 61508 | International standard for the functional safety of electrical, electronic, and programmable electronic safety-related systems. |
| ISA/IEC 62443 | Standard series for the security of industrial automation and control systems. |
| ISO 26262 | International standard for the functional safety of road vehicles. |
12. How to Audit
An auditor can validate this package as follows.
- Inspect `pyproject.toml` to confirm that none of the prohibited libraries (plotly, matplotlib, spotoptim, optuna, torch, tensorflow) are present.
- Run `uv run pytest tests/` to execute the full test suite and confirm functional correctness.
- Run `uv run pytest tests/test_cpe.py` to confirm CPE identifier generation.
- Record the CPE identifiers from Section 1 in vulnerability-tracking systems and supply-chain disclosures.
- Read `get_cpe_identifier` in `src/spotforecast2_safe/utils/cpe.py` for use in automated workflows.
- Run `uv run reuse lint` to confirm license and copyright-header compliance.
14. Disclaimer and Liability
Limitation of liability. This library is designed with safety principles and deterministic logic, but it is provided as is, without warranty of any kind. The authors and contributors accept no liability for any direct or indirect damage, system failure, or financial loss arising from its use.
It is the sole responsibility of the system integrator to perform full system-level safety validation, for example under ISO 26262, IEC 61508, or the EU AI Act, before deploying this software in a production or safety-critical environment.