Surrogate Models

Kriging, SimpleKriging, and MLPSurrogate: the models that approximate your objective function.

A surrogate model is a cheap-to-evaluate approximation of the expensive objective function. spotoptim fits the surrogate to all evaluated points, then uses it to decide where to evaluate next. The surrogate must implement the sklearn estimator interface: fit(X, y) and predict(X, return_std=False).
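Any object exposing that interface can act as a surrogate. As an illustration of the interface only (this class is not part of spotoptim), a toy nearest-neighbour surrogate might look like:

```python
import numpy as np

class NearestNeighborSurrogate:
    """Toy surrogate: predicts the y of the closest training point.
    Illustrates the fit/predict interface only; not a spotoptim class."""

    def fit(self, X, y):
        self.X_ = np.asarray(X, dtype=float)
        self.y_ = np.asarray(y, dtype=float)
        return self

    def predict(self, X, return_std=False):
        X = np.asarray(X, dtype=float)
        # pairwise distances: (n_query, n_train)
        d = np.linalg.norm(X[:, None, :] - self.X_[None, :, :], axis=2)
        idx = d.argmin(axis=1)
        y_pred = self.y_[idx]
        if return_std:
            # crude "uncertainty": distance to the nearest training point
            return y_pred, d.min(axis=1)
        return y_pred

model = NearestNeighborSurrogate().fit([[0.0], [1.0]], [0.0, 1.0])
print(model.predict(np.array([[0.1], [0.9]])))  # → [0. 1.]
```

A real surrogate would interpolate between points, but this is all the structure spotoptim requires of the object passed as surrogate=.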


Kriging (Default)

Kriging is the default surrogate. It models the objective as a Gaussian process and provides both predictions and uncertainty estimates.

import numpy as np
from spotoptim.surrogate import Kriging

X_train = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
y_train = np.array([0.0, 1.0, 4.0, 9.0, 16.0])

model = Kriging(method="regression", seed=0)
model.fit(X_train, y_train)

X_test = np.array([[0.5], [2.5]])
y_pred = model.predict(X_test)
print(f"Predictions: {y_pred}")
Predictions: [0.25416718 6.24855725]

Uncertainty Estimation

Use return_std=True to get both the predicted mean and standard deviation. Regions far from the training points have not been explored yet, so the model reports higher uncertainty there.

import numpy as np
from spotoptim.surrogate import Kriging

X_train = np.array([[0.0], [1.0], [3.0], [4.0]])
y_train = np.array([0.0, 1.0, 9.0, 16.0])

model = Kriging(method="regression", seed=0)
model.fit(X_train, y_train)

X_test = np.array([[0.5], [2.0], [3.5]])
y_pred, y_std = model.predict(X_test, return_std=True)
for i in range(len(X_test)):
    print(f"x={X_test[i, 0]:.1f}: pred={y_pred[i]:.2f}, std={y_std[i]:.4f}")
x=0.5: pred=0.26, std=0.1724
x=2.0: pred=3.95, std=0.1914
x=3.5: pred=12.31, std=0.1724

Kriging Methods

The method parameter controls the fitting approach:

  • "regression" (default) — generalized least-squares regression Kriging
  • "interpolation" — exact interpolation through all data points
  • "reinterpolation" — Forrester’s re-interpolation for noisy data

Key Parameters

Parameter   Default         Description
---------   -------         -----------
method      "regression"    Fitting method
noise       None            Regularization (nugget); None = auto-select
seed        124             Random seed for theta optimization
n_theta     None            Number of theta parameters (None = one per dimension)
min_theta   -3.0            Lower bound for log10(theta)
max_theta   2.0             Upper bound for log10(theta)
kernel      (implicit)      Gaussian/RBF kernel by default
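The theta parameters are the lengthscales of the Gaussian correlation function, with one theta per dimension by default. As a sketch of what an anisotropic theta means (plain NumPy, assuming the usual Gaussian-kernel form; not spotoptim's internal code):

```python
import numpy as np

def corr(x1, x2, theta):
    """Gaussian correlation with one theta per dimension (anisotropic)."""
    diff = np.asarray(x1, dtype=float) - np.asarray(x2, dtype=float)
    return np.exp(-np.sum(theta * diff ** 2))

theta = np.array([10.0, 0.1])  # dimension 0 is far more "active" than dimension 1
print(corr([0, 0], [1, 0], theta))  # small: moving along dim 0 decorrelates quickly
print(corr([0, 0], [0, 1], theta))  # near 1: dim 1 barely matters
```

Larger theta values mean the function varies quickly along that dimension; min_theta and max_theta bound the search for these values on a log10 scale.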

SimpleKriging

A lightweight alternative to Kriging, suitable for simple continuous problems where speed matters more than flexibility.

import numpy as np
from spotoptim.surrogate import SimpleKriging

X_train = np.array([[0.0], [1.0], [2.0], [3.0]])
y_train = np.array([0.0, 1.0, 4.0, 9.0])

model = SimpleKriging(seed=0)
model.fit(X_train, y_train)

y_pred = model.predict(np.array([[1.5], [2.5]]))
print(f"Predictions: {y_pred}")
Predictions: [2.23557167 6.27502847]

MLPSurrogate

A neural-network-based surrogate using a Multi-Layer Perceptron with MC Dropout for uncertainty estimation. Useful when the response surface is highly non-linear or when you have many data points.

import numpy as np
from spotoptim.surrogate import MLPSurrogate

np.random.seed(0)
X_train = np.random.uniform(-5, 5, size=(50, 2))
y_train = np.sum(X_train**2, axis=1)

model = MLPSurrogate(l1=64, num_hidden_layers=2, epochs=100, seed=0)
model.fit(X_train, y_train)

X_test = np.array([[0.0, 0.0], [1.0, 1.0]])
y_pred = model.predict(X_test)
print(f"Predictions: {y_pred}")
Predictions: [-2.6765175  -0.09913826]

Uncertainty via MC Dropout

import numpy as np
from spotoptim.surrogate import MLPSurrogate

np.random.seed(0)
X_train = np.random.uniform(-5, 5, size=(50, 2))
y_train = np.sum(X_train**2, axis=1)

model = MLPSurrogate(
    l1=64, num_hidden_layers=2, dropout=0.1,
    mc_dropout_passes=30, epochs=100, seed=0,
)
model.fit(X_train, y_train)

X_test = np.array([[0.0, 0.0], [3.0, 3.0]])
y_pred, y_std = model.predict(X_test, return_std=True)
for i in range(len(X_test)):
    print(f"pred={y_pred[i]:.2f}, std={y_std[i]:.4f}")
pred=2.10, std=4.4744
pred=18.69, std=3.7247
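MC Dropout keeps dropout active at prediction time and averages several stochastic forward passes; the spread across passes serves as the uncertainty estimate. The idea can be sketched for a fixed single-hidden-layer network in NumPy (an illustration of the technique, not MLPSurrogate's internals; the random weights stand in for trained ones):

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 64)), rng.normal(size=64)  # stand-ins for trained weights
w2, b2 = rng.normal(size=64), 0.0
p_drop = 0.1

def mc_forward(x, passes=30):
    preds = []
    for _ in range(passes):
        h = np.maximum(x @ W1 + b1, 0.0)     # ReLU hidden layer
        mask = rng.random(64) >= p_drop      # dropout stays ON at predict time
        h = h * mask / (1.0 - p_drop)        # inverted-dropout scaling
        preds.append(h @ w2 + b2)
    preds = np.array(preds)
    return preds.mean(), preds.std()         # mean = prediction, std = uncertainty

mean, std = mc_forward(np.array([1.0, 1.0]))
print(f"pred={mean:.2f}, std={std:.4f}")
```

More passes (mc_dropout_passes) give a more stable std estimate at the cost of proportionally more forward passes per prediction.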

Using a Custom sklearn Surrogate

Any estimator that implements fit(X, y) and predict(X) can serve as a surrogate. Acquisition functions that need uncertainty ("ei", "pi") additionally require predict(X, return_std=True).

from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern
from spotoptim import SpotOptim
from spotoptim.function import sphere

gpr = GaussianProcessRegressor(
    kernel=Matern(nu=2.5),
    n_restarts_optimizer=5,
    random_state=0,
)

opt = SpotOptim(
    fun=sphere,
    bounds=[(-5, 5), (-5, 5)],
    surrogate=gpr,
    acquisition="ei",
    max_iter=20,
    n_initial=10,
    seed=0,
)
result = opt.optimize()

print(f"Best f(x) : {result.fun:.6f}")
Best f(x) : 0.000248

Using Kriging Inside SpotOptim

from spotoptim import SpotOptim
from spotoptim.surrogate import Kriging
from spotoptim.function import rosenbrock

kriging = Kriging(
    method="regression",
    noise=1e-3,
    min_theta=-3.0,
    max_theta=2.0,
    seed=0,
)

opt = SpotOptim(
    fun=rosenbrock,
    bounds=[(-2, 2), (-2, 2)],
    surrogate=kriging,
    acquisition="ei",
    max_iter=25,
    n_initial=10,
    seed=0,
)
result = opt.optimize()

print(f"Best x    : {result.x}")
print(f"Best f(x) : {result.fun:.6f}")
Best x    : [0.89033462 0.7894651 ]
Best f(x) : 0.013070

See Also