## Overview

This module adds a Kriging (Gaussian Process) surrogate model to SpotOptim, providing a lightweight alternative to scikit-learn's `GaussianProcessRegressor`.
## Module Structure

```
src/spotoptim/surrogate/
├── __init__.py   # Module exports
├── kriging.py    # Kriging implementation (~350 lines)
└── README.md     # Module documentation
```
### Kriging Class (`src/spotoptim/surrogate/kriging.py`)

Key features:

- Scikit-learn compatible interface (`fit()`, `predict()`)
- Gaussian (RBF) kernel: `R = exp(-D)`
- Automatic hyperparameter optimization via maximum likelihood
- Cholesky decomposition for efficient linear algebra
- Prediction with uncertainty (`return_std=True`)
- Reproducible results via the `seed` parameter
Implementation details:

- Lean, well-documented code
- No external dependencies beyond NumPy and SciPy
- Simplified from `spotpython.surrogate.kriging`
- Focused on the core functionality needed for SpotOptim
Parameters:

- `noise`: Regularization (nugget effect)
- `kernel`: Currently only `'gauss'` (Gaussian/RBF)
- `n_theta`: Number of length-scale parameters
- `min_theta`, `max_theta`: Bounds for the hyperparameter optimization
- `seed`: Random seed for reproducibility
## Integration with SpotOptim

No changes to the SpotOptim core are required: the existing `surrogate` parameter already accepts any scikit-learn compatible model.
```python
from spotoptim import SpotOptim, Kriging
import numpy as np

def rosenbrock(X):
    """Rosenbrock function."""
    x = X[:, 0]
    y = X[:, 1]
    return (1 - x)**2 + 100 * (y - x**2)**2

kriging = Kriging(seed=42)

optimizer = SpotOptim(
    fun=rosenbrock,
    bounds=[(-2, 2), (-2, 2)],
    surrogate=kriging,  # Just pass the Kriging instance
    max_iter=30,
    seed=42,
)

result = optimizer.optimize()
print(f"Best value: {result.fun:.6f}")
print(f"Best point: {result.x}")
```
```
Best value: 0.012619
Best point: [0.92767604 0.86917826]
```
## Documentation

An example was added to `notebooks/demos.ipynb`; it:

- Demonstrates a comparison of Kriging and `GaussianProcessRegressor`
- Shows custom parameter usage
## Usage Examples

### Basic Usage

```python
from spotoptim import SpotOptim, Kriging
import numpy as np

def sphere(X):
    """Sphere function: f(x) = sum(x^2)"""
    return np.sum(X**2, axis=1)

kriging = Kriging(noise=1e-6, seed=42)

optimizer = SpotOptim(
    fun=sphere,
    bounds=[(-5, 5), (-5, 5)],
    surrogate=kriging,
    max_iter=20,
    seed=42,
)

result = optimizer.optimize()
print(f"Best value: {result.fun:.6f}")
print(f"Best point: {result.x}")
```

```
Best value: 0.000000
Best point: [2.04416182e-05 5.67177247e-04]
```
### Custom Parameters

```python
from spotoptim import SpotOptim, Kriging
import numpy as np

def ackley(X):
    """Ackley function - multimodal test function."""
    a = 20
    b = 0.2
    c = 2 * np.pi
    n = X.shape[1]
    sum_sq = np.sum(X**2, axis=1)
    sum_cos = np.sum(np.cos(c * X), axis=1)
    return -a * np.exp(-b * np.sqrt(sum_sq / n)) - np.exp(sum_cos / n) + a + np.e

kriging = Kriging(
    noise=1e-4,
    min_theta=-2.0,
    max_theta=3.0,
    seed=123,
)

optimizer = SpotOptim(
    fun=ackley,
    bounds=[(-5, 5), (-5, 5)],
    surrogate=kriging,
    max_iter=40,
    seed=123,
)

result = optimizer.optimize()
print(f"Best value: {result.fun:.6f}")
print(f"Best point: {result.x}")
```

```
Best value: 0.004040
Best point: [ 0.00140328 -0.00013417]
```
### Prediction with Uncertainty

```python
from spotoptim import Kriging
import numpy as np

# Generate training data
np.random.seed(42)
X_train = np.random.uniform(-5, 5, (20, 2))
y_train = np.sum(X_train**2, axis=1)

# Generate test data
X_test = np.random.uniform(-5, 5, (10, 2))

# Fit model and predict with uncertainty
model = Kriging(seed=42)
model.fit(X_train, y_train)
y_pred, y_std = model.predict(X_test, return_std=True)

print(f"Predictions shape: {y_pred.shape}")
print(f"Uncertainties shape: {y_std.shape}")
print(f"Mean prediction: {y_pred.mean():.4f}")
print(f"Mean uncertainty: {y_std.mean():.4f}")
```

```
Predictions shape: (10,)
Uncertainties shape: (10,)
Mean prediction: 20.8065
Mean uncertainty: 0.0302
```
## Technical Details

### Kriging vs. GaussianProcessRegressor

|                     | Kriging                 | GaussianProcessRegressor          |
|---------------------|-------------------------|-----------------------------------|
| Lines of code       | ~350                    | Complex internal implementation   |
| Dependencies        | NumPy, SciPy            | scikit-learn + dependencies       |
| Kernel              | Gaussian (RBF)          | Multiple types (Matern, RQ, etc.) |
| Hyperparameter opt. | Differential evolution  | L-BFGS-B with restarts            |
| Use case            | Simplified, explicit    | Production, flexible              |
### Algorithm

Correlation matrix:

- Compute squared weighted distances: `D_ij = Σ_k θ_k (x_ik - x_jk)²`
- Apply the kernel: `R_ij = exp(-D_ij)`
- Add the nugget: `R_ii += noise`
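The correlation-matrix steps can be sketched in plain NumPy; the function name and default values below are illustrative, not the module's internal API:

```python
import numpy as np

def corr_matrix(X, theta, noise=1e-6):
    """Gaussian correlation matrix R with a nugget on the diagonal (sketch).

    X     : (n, k) array of sample points
    theta : (k,) array of length-scale parameters
    noise : nugget added to the diagonal for regularization
    """
    # Squared weighted distances: D_ij = sum_k theta_k * (x_ik - x_jk)^2
    diff = X[:, None, :] - X[None, :, :]      # (n, n, k) pairwise differences
    D = np.sum(theta * diff**2, axis=-1)      # (n, n)
    R = np.exp(-D)                            # Gaussian (RBF) kernel
    R[np.diag_indices_from(R)] += noise       # nugget effect: R_ii += noise
    return R

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
R = corr_matrix(X, theta=np.array([1.0, 1.0]))
```

With unit length scales, points at distance 1 get correlation `exp(-1)`, and the diagonal is `1 + noise`.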
Maximum likelihood:

- Optimize `θ` via differential evolution
- Minimize the concentrated negative log-likelihood `(n/2) log(σ̂²) + (1/2) log|R|`
- `μ` and `σ²` are profiled out analytically (concentrated likelihood)
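A sketch of this step, assuming `θ` is searched on a `log10` scale within `[min_theta, max_theta]` (an assumption suggested by the parameter names above; the bounds and data here are arbitrary):

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve
from scipy.optimize import differential_evolution

def neg_concentrated_loglik(log10_theta, X, y, noise=1e-6):
    """Negative concentrated log-likelihood for given length scales.

    mu and sigma^2 are profiled out analytically, leaving only theta
    as a free parameter (searched on a log10 scale).
    """
    theta = 10.0 ** np.asarray(log10_theta)
    n = X.shape[0]
    diff = X[:, None, :] - X[None, :, :]
    R = np.exp(-np.sum(theta * diff**2, axis=-1))
    R[np.diag_indices_from(R)] += noise
    c, low = cho_factor(R)                           # Cholesky factor of R
    ones = np.ones(n)
    mu = ones @ cho_solve((c, low), y) / (ones @ cho_solve((c, low), ones))
    resid = y - mu                                   # profiled mean removed
    sigma2 = resid @ cho_solve((c, low), resid) / n  # profiled variance
    logdet_R = 2.0 * np.sum(np.log(np.diag(c)))      # log|R| from Cholesky
    return 0.5 * n * np.log(sigma2) + 0.5 * logdet_R

rng = np.random.default_rng(42)
X = rng.uniform(-5, 5, (20, 2))
y = np.sum(X**2, axis=1)
res = differential_evolution(
    neg_concentrated_loglik,
    bounds=[(-3.0, 2.0)] * 2,   # illustrative (min_theta, max_theta) per dim
    args=(X, y),
    seed=42,
    maxiter=50,
)
```

Using the Cholesky factor both for the solves and for `log|R|` avoids forming `R⁻¹` explicitly.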
Prediction:

- Mean: `f̂(x) = μ̂ + ψ(x)ᵀ R⁻¹ (y - 1μ̂)`
- Variance: `s²(x) = σ̂² [1 + λ - ψ(x)ᵀ R⁻¹ ψ(x)]`, where `λ` is the nugget
- Uses the Cholesky decomposition of `R` for efficiency
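The prediction formulas translate to NumPy roughly as follows (a sketch, with `λ` taken to be the nugget `noise`; names and defaults are illustrative):

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def kriging_predict(x, X, y, theta, noise=1e-6):
    """Kriging mean and variance at a single point x (sketch)."""
    n = X.shape[0]
    # Correlation matrix of the training data with nugget
    diff = X[:, None, :] - X[None, :, :]
    R = np.exp(-np.sum(theta * diff**2, axis=-1))
    R[np.diag_indices_from(R)] += noise
    c_low = cho_factor(R)                        # Cholesky factorization of R
    ones = np.ones(n)
    # Profiled mean and process variance
    mu = ones @ cho_solve(c_low, y) / (ones @ cho_solve(c_low, ones))
    resid = y - mu
    sigma2 = resid @ cho_solve(c_low, resid) / n
    # Correlations psi(x) between x and the training points
    psi = np.exp(-np.sum(theta * (x - X) ** 2, axis=1))
    mean = mu + psi @ cho_solve(c_low, resid)    # f_hat(x)
    var = sigma2 * (1.0 + noise - psi @ cho_solve(c_low, psi))
    return mean, max(var, 0.0)

rng = np.random.default_rng(0)
X = rng.uniform(-5, 5, (20, 2))
y = np.sum(X**2, axis=1)
mean, var = kriging_predict(X[0], X, y, theta=np.ones(2))
```

At a training point the predictor nearly interpolates (`mean ≈ y[0]`) and the variance collapses toward zero, up to the nugget.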
### Key Arguments Passed from SpotOptim

SpotOptim passes these to the surrogate via the standard interface.

During `fit`:

```python
# Example of how SpotOptim uses the surrogate internally
surrogate.fit(X, y)
```

- `X`: Training points (`n_initial` or accumulated evaluations)
- `y`: Function values

During `predict`:

```python
# Example of internal usage
mu = surrogate.predict(x)[0]                        # For acquisition='y'
mu, sigma = surrogate.predict(x, return_std=True)   # For acquisition='ei', 'pi'
```
Implicit parameters via `seed`:

- `random_state=seed` (for `GaussianProcessRegressor`)
- `seed=seed` (for `Kriging`)
## Benefits

- Self-contained: no heavy scikit-learn dependency for the surrogate
- Explicit: clear hyperparameter bounds and optimization
- Educational: readable implementation of Kriging/GP
- Flexible: easy to extend with new kernels or features
- Compatible: works seamlessly with the existing SpotOptim API
## Future Enhancements

Potential additions: