Boundaries, transformations, PCA, OCBA, scaling, and parallel helpers.
spotoptim ships a collection of utility functions that support the optimization loop and post-hoc analysis. This page covers the most commonly used helpers in spotoptim.utils.
Boundaries and Mapping
get_boundaries computes the column-wise minimum and maximum of a NumPy array. This is useful for determining the range of evaluated points or for setting up scaling.
map_to_original_scale maps points from the \([0, 1]\) unit hypercube back to the original variable ranges defined by lower and upper bounds.
```python
import numpy as np
from spotoptim.utils import get_boundaries, map_to_original_scale

np.random.seed(0)
data = np.random.uniform(low=-5, high=5, size=(20, 3))
min_vals, max_vals = get_boundaries(data)
print(f"Min per column: {min_vals}")
print(f"Max per column: {max_vals}")
```

```
Min per column: [-4.128707 -4.812102 -4.28963942]
Max per column: [4.44668917 4.88373838 4.78618342]
```
Given boundaries, you can map unit-scaled search points back to the original scale:
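The mapping itself is a simple affine transformation. The following plain-NumPy sketch illustrates it; the exact signature of `map_to_original_scale` may differ, so the helper name used here is hypothetical:

```python
import numpy as np

# Sketch of the unit-cube -> original-scale mapping; the actual
# signature of map_to_original_scale may differ from this helper.
def map_unit_to_original(X, lower, upper):
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    return lower + np.asarray(X, dtype=float) * (upper - lower)

unit_points = np.array([[0.0, 0.5], [1.0, 0.25]])
lower, upper = np.array([-5.0, 0.0]), np.array([5.0, 10.0])
mapped = map_unit_to_original(unit_points, lower, upper)
print(mapped)  # rows map to [-5, 5] and [5, 2.5]
```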
Principal Component Analysis (PCA)
The get_pca function scales numeric columns of a DataFrame and performs Principal Component Analysis. It returns the fitted PCA object, scaled data, feature names, sample names, and the transformed data.
get_pca_topk identifies the top \(k\) features with the strongest influence on PC1 and PC2, i.e. the original features that load most heavily on the first two components:
```python
import numpy as np
import pandas as pd
from spotoptim.utils import get_pca, get_pca_topk

np.random.seed(0)
df = pd.DataFrame({
    "feature_a": np.random.randn(50),
    "feature_b": np.random.randn(50) * 2,
    "feature_c": np.random.randn(50) + 1,
    "feature_d": np.random.randn(50) * 0.5,
})
pca, _, feature_names, _, _ = get_pca(df, n_components=2)
top_pc1, top_pc2 = get_pca_topk(pca, feature_names, k=2)
print(f"Top features for PC1: {top_pc1}")
print(f"Top features for PC2: {top_pc2}")
```

```
Top features for PC1: ['feature_b', 'feature_c']
Top features for PC2: ['feature_a', 'feature_c']
```
OCBA (Optimal Computing Budget Allocation)
When the objective function is noisy, repeated evaluations of the same design can be allocated smartly using OCBA. Given the current sample means, variances, and an incremental budget \(\delta\), get_ocba returns an allocation vector that concentrates evaluations on the most promising and most uncertain designs.
get_ranks is a helper that returns the rank of each element in an array (0 = smallest).
The allocation vector tells you how many additional evaluations each design should receive. Designs with lower means (better objectives, assuming minimization) and higher variance tend to receive more budget.
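To make the allocation rule concrete, here is a self-contained sketch of the classical OCBA formula for minimization, with a rank computation in the style get_ranks describes. This is illustrative only; spotoptim's get_ocba may differ in signature and rounding:

```python
import numpy as np

def ocba_allocation(means, variances, delta):
    """Split an incremental budget `delta` across designs using the
    classical OCBA rule (minimization). Sketch only, not spotoptim's
    get_ocba implementation."""
    means = np.asarray(means, dtype=float)
    var = np.asarray(variances, dtype=float)
    b = int(np.argmin(means))        # current best design
    diff = means - means[b]
    diff[b] = np.inf                 # best design handled separately below
    ratio = var / diff**2            # N_i proportional to sigma_i^2 / delta_i^2
    ratio[b] = np.sqrt(var[b] * np.sum(ratio**2 / var))
    share = delta * ratio / ratio.sum()
    alloc = np.floor(share).astype(int)
    leftover = delta - alloc.sum()   # hand out remainder by largest fraction
    alloc[np.argsort(alloc - share)[:leftover]] += 1
    return alloc

means = np.array([1.0, 2.0, 3.0])
variances = np.array([1.0, 1.0, 1.0])
ranks = np.argsort(np.argsort(means))  # 0 = smallest, as get_ranks returns
alloc = ocba_allocation(means, variances, delta=10)
print(ranks)  # [0 1 2]
print(alloc)  # the best design and its closest competitor get most budget
```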
See The SpotOptim Class for how OCBA integrates into noisy optimization runs.
TorchStandardScaler
TorchStandardScaler standardizes PyTorch tensors to zero mean and unit variance, analogous to sklearn’s StandardScaler but operating on torch.Tensor objects directly.
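The minimal stand-in below illustrates the behavior; the fit/transform interface shown is assumed to mirror sklearn's StandardScaler, so consult the spotoptim API reference for the actual class:

```python
import torch

# Hypothetical minimal scaler mirroring the described behavior;
# TorchStandardScaler in spotoptim is the real implementation.
class MinimalTorchScaler:
    def fit(self, x: torch.Tensor) -> "MinimalTorchScaler":
        self.mean = x.mean(dim=0, keepdim=True)
        self.std = x.std(dim=0, keepdim=True)
        return self

    def transform(self, x: torch.Tensor) -> torch.Tensor:
        return (x - self.mean) / self.std

torch.manual_seed(0)
x = torch.randn(8, 2) * 3.0 + 5.0  # mean ~5, std ~3 per column
scaled = MinimalTorchScaler().fit(x).transform(x)
print(f"Shape: {scaled.shape}")
print(f"Mean after scaling: {scaled.mean(dim=0)}")
```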
```
Shape: torch.Size([8, 2])
Mean after scaling: tensor([ 3.7253e-08, -2.9802e-08])
```
Parallel Evaluation
is_gil_disabled checks whether the current Python interpreter is a free-threaded build (PEP 703). On standard CPython the GIL is enabled and this returns False. spotoptim uses this check internally to decide whether thread-based parallelism is safe for objective evaluation.
```python
from spotoptim.utils import is_gil_disabled

result = is_gil_disabled()
print(f"GIL disabled: {result}")
print(f"Return type: {type(result).__name__}")
```
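On Python 3.13+, such a check can be built on sys._is_gil_enabled(), which only exists on builds that support free threading. The sketch below is one possible implementation, not spotoptim's actual code:

```python
import sys

def gil_disabled() -> bool:
    """Return True on a free-threaded (PEP 703) CPython build.
    Sketch only; not spotoptim's implementation."""
    # sys._is_gil_enabled() exists only on Python 3.13+;
    # on older interpreters the GIL is always enabled.
    check = getattr(sys, "_is_gil_enabled", None)
    return (not check()) if check is not None else False

print(f"GIL disabled: {gil_disabled()}")
```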