12 Surrogate Model Selection in SpotOptim
This section demonstrates how to select and configure different surrogate models in SpotOptim. We compare various surrogate options:
- Gaussian Process with different kernels (Matern, RBF, Rational Quadratic)
- SpotOptim Kriging model
- Random Forest regressor
- XGBoost regressor
- Support Vector Regression (SVR)
- Gradient Boosting regressor
All methods are evaluated on the Aircraft Wing Weight Example (AWWE) function. We visualize the fitted surrogates and compare optimization performance.
12.1 Introduction
Surrogate models are the heart of Bayesian optimization. SpotOptim supports any scikit-learn compatible regressor, allowing you to choose the surrogate that best fits your problem characteristics:
- Gaussian Processes: Provide uncertainty estimates, smooth interpolation, good for continuous functions. Support Expected Improvement (EI) acquisition.
- Kriging: Similar to GP but with customizable correlation functions. Supports EI acquisition.
- Random Forests: Robust to noise, handle discontinuities. Don’t provide uncertainty, so use acquisition='y' (greedy).
- XGBoost: Excellent for high-dimensional problems, fast training and prediction. Use acquisition='y'.
- SVR: Good for high-dimensional problems with smooth structure. Use acquisition='y'.
- Gradient Boosting: Strong performance on structured problems. Use acquisition='y'.
Models that provide uncertainty estimates (Gaussian Process, Kriging) work with all acquisition functions: ‘ei’ (Expected Improvement), ‘pi’ (Probability of Improvement), and ‘y’ (greedy).
Tree-based and other models (Random Forest, XGBoost, SVR, Gradient Boosting) don’t provide uncertainty estimates by default, so they should use acquisition='y' for greedy optimization. SpotOptim automatically handles this gracefully.
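To make the role of uncertainty concrete, the following sketch shows the standard Expected Improvement formula for minimization. This is an illustration of the textbook formula, not SpotOptim's internal code: when a surrogate cannot supply a standard deviation, the exploration term vanishes and only greedy exploitation of the mean remains.
# Illustration only: textbook Expected Improvement for minimization,
# built from a surrogate's predictive mean mu and standard deviation sigma.
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, y_best):
    sigma = np.maximum(sigma, 1e-12)  # guard against zero uncertainty
    z = (y_best - mu) / sigma
    # Exploitation term + exploration term; sigma -> 0 drives EI toward 0
    return (y_best - mu) * norm.cdf(z) + sigma * norm.pdf(z)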
12.2 Setup and Imports
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import time
import warnings
warnings.filterwarnings('ignore')

from spotoptim import SpotOptim
from spotoptim.surrogate import Kriging

# Scikit-learn models
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import (
    Matern, RBF, ConstantKernel, WhiteKernel, RationalQuadratic
)
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.svm import SVR

# XGBoost (if available)
try:
    import xgboost as xgb
    XGBOOST_AVAILABLE = True
except ImportError:
    XGBOOST_AVAILABLE = False
    print("XGBoost not available. Install with: pip install xgboost")

print("Libraries imported successfully!")
Libraries imported successfully!
12.3 The AWWE Objective Function
We use the Aircraft Wing Weight Example function, which models the weight of an unpainted light aircraft wing. The function accepts inputs in the unit cube \([0,1]^9\) and returns the wing weight in pounds.
def wingwt(x):
    """
    Aircraft Wing Weight function.

    Args:
        x: array-like of 9 values in [0,1]
           [Sw, Wfw, A, L, q, l, Rtc, Nz, Wdg]

    Returns:
        Wing weight(s) as a 1D array
    """
    # Ensure x is a 2D array for batch evaluation
    x = np.atleast_2d(x)
    # Transform from unit cube to natural scales
    Sw = x[:, 0] * (200 - 150) + 150  # Wing area (ft²)
    Wfw = x[:, 1] * (300 - 220) + 220  # Fuel weight (lb)
    A = x[:, 2] * (10 - 6) + 6  # Aspect ratio
    L = (x[:, 3] * (10 - (-10)) - 10) * np.pi / 180  # Sweep angle (rad)
    q = x[:, 4] * (45 - 16) + 16  # Dynamic pressure (lb/ft²)
    l = x[:, 5] * (1 - 0.5) + 0.5  # Taper ratio
    Rtc = x[:, 6] * (0.18 - 0.08) + 0.08  # Root thickness/chord
    Nz = x[:, 7] * (6 - 2.5) + 2.5  # Ultimate load factor
    Wdg = x[:, 8] * (2500 - 1700) + 1700  # Design gross weight (lb)
    # Calculate weight on natural scale
    W = 0.036 * Sw**0.758 * Wfw**0.0035 * (A / np.cos(L)**2)**0.6 * q**0.006
    W = W * l**0.04 * (100 * Rtc / np.cos(L))**(-0.3) * (Nz * Wdg)**0.49
    return W.ravel()
# Problem setup
bounds = [(0, 1)] * 9
param_names = ['Sw', 'Wfw', 'A', 'L', 'q', 'l', 'Rtc', 'Nz', 'Wdg']
max_iter = 30
n_initial = 10
seed = 42
print(f"Problem dimension: {len(bounds)}")
print(f"Optimization budget: {max_iter} evaluations")
Problem dimension: 9
Optimization budget: 30 evaluations
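Before running any optimizer, a quick sanity check of the objective helps catch shape or scaling bugs early. The following sketch (our addition, not part of the original study; output values are not shown) evaluates wingwt at the cube center and at a small random batch:
# Sanity check: single-point and batch evaluation of the objective
x_center = np.full((1, 9), 0.5)
print(f"Weight at cube center: {wingwt(x_center)[0]:.2f} lb")
rng = np.random.default_rng(seed)
x_batch = rng.random((5, 9))  # five random points in [0,1]^9
print(f"Weights at random batch: {np.round(wingwt(x_batch), 2)}")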
12.4 1. Default Surrogate: Gaussian Process with Matern Kernel
SpotOptim’s default surrogate is a Gaussian Process with a Matern kernel (ν=2.5), which provides twice-differentiable sample paths and good performance for most optimization problems.
print("=" * 80)
print("1. DEFAULT: Gaussian Process with Matern ν=2.5 Kernel")
print("=" * 80)
start_time = time.time()
# Default GP (no surrogate specified)
optimizer_default = SpotOptim(
    fun=wingwt,
    bounds=bounds,
    max_iter=max_iter,
    n_initial=n_initial,
    var_name=param_names,
    acquisition='ei',
    seed=seed,
    verbose=False
)
result_default = optimizer_default.optimize()
time_default = time.time() - start_time
print(f"\nResults:")
print(f" Best weight: {result_default.fun:.4f} lb")
print(f" Function evaluations: {result_default.nfev}")
print(f" Time: {time_default:.2f}s")
print(f" Success: {result_default.success}")
# Store for comparison
results_comparison = [{
    'Surrogate': 'GP Matern ν=2.5 (Default)',
    'Best Weight': result_default.fun,
    'Evaluations': result_default.nfev,
    'Time (s)': time_default,
    'Success': result_default.success
}]
================================================================================
1. DEFAULT: Gaussian Process with Matern ν=2.5 Kernel
================================================================================
Results:
Best weight: 119.5041 lb
Function evaluations: 30
Time: 40.79s
Success: True
12.4.1 Visualization: Default Surrogate
# Plot convergence
optimizer_default.plot_progress(log_y=False, figsize=(10, 5))
# Plot most important hyperparameters
optimizer_default.plot_important_hyperparameter_contour(max_imp=3)
plt.suptitle('Default GP Matern ν=2.5: Most Important Parameters', y=1.02)
plt.show()
Plotting surrogate contours for top 3 most important parameters:
A: importance = 19.62% (type: float)
Nz: importance = 19.50% (type: float)
Rtc: importance = 19.29% (type: float)
Generating 3 surrogate plots: A vs Nz, A vs Rtc, Nz vs Rtc.
12.5 2. Gaussian Process with RBF (Radial Basis Function) Kernel
The RBF kernel (also called squared exponential) produces infinitely differentiable sample paths, resulting in very smooth predictions.
print("=" * 80)
print("2. Gaussian Process with RBF Kernel")
print("=" * 80)
start_time = time.time()
# Configure GP with RBF kernel
kernel_rbf = ConstantKernel(1.0, (1e-3, 1e3)) * RBF(
    length_scale=1.0,
    length_scale_bounds=(1e-2, 1e2)
)
gp_rbf = GaussianProcessRegressor(
    kernel=kernel_rbf,
    n_restarts_optimizer=10,
    normalize_y=True,
    random_state=seed
)
optimizer_rbf = SpotOptim(
    fun=wingwt,
    bounds=bounds,
    surrogate=gp_rbf,
    max_iter=max_iter,
    n_initial=n_initial,
    var_name=param_names,
    acquisition='ei',
    seed=seed,
    verbose=False
)
result_rbf = optimizer_rbf.optimize()
time_rbf = time.time() - start_time
print(f"\nResults:")
print(f" Best weight: {result_rbf.fun:.4f} lb")
print(f" Function evaluations: {result_rbf.nfev}")
print(f" Time: {time_rbf:.2f}s")
print(f" Success: {result_rbf.success}")
results_comparison.append({
    'Surrogate': 'GP RBF',
    'Best Weight': result_rbf.fun,
    'Evaluations': result_rbf.nfev,
    'Time (s)': time_rbf,
    'Success': result_rbf.success
})
================================================================================
2. Gaussian Process with RBF Kernel
================================================================================
Results:
Best weight: 119.5037 lb
Function evaluations: 30
Time: 46.60s
Success: True
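Note that WhiteKernel is imported above but not used in this noise-free study. For noisy objectives, one might add it so the GP estimates a noise level jointly with the other hyperparameters; a minimal sketch (not run here):
# Hedged sketch: RBF kernel plus a fitted noise term for noisy objectives
kernel_rbf_noisy = (
    ConstantKernel(1.0, (1e-3, 1e3))
    * RBF(length_scale=1.0, length_scale_bounds=(1e-2, 1e2))
    + WhiteKernel(noise_level=1e-5, noise_level_bounds=(1e-10, 1e-1))
)
gp_rbf_noisy = GaussianProcessRegressor(
    kernel=kernel_rbf_noisy,
    n_restarts_optimizer=10,
    normalize_y=True,
    random_state=seed
)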
12.5.1 Visualization: RBF Kernel
optimizer_rbf.plot_progress(log_y=False, figsize=(10, 5))
optimizer_rbf.plot_important_hyperparameter_contour(max_imp=3)
plt.suptitle('GP RBF Kernel: Most Important Parameters', y=1.02)
plt.show()
Plotting surrogate contours for top 3 most important parameters:
A: importance = 19.40% (type: float)
Nz: importance = 19.27% (type: float)
Rtc: importance = 19.06% (type: float)
Generating 3 surrogate plots: A vs Nz, A vs Rtc, Nz vs Rtc.
12.6 3. Gaussian Process with Matern ν=1.5 Kernel
The Matern ν=1.5 kernel produces once-differentiable sample paths, allowing for more flexible (less smooth) fits than ν=2.5.
print("=" * 80)
print("3. Gaussian Process with Matern ν=1.5 Kernel")
print("=" * 80)
start_time = time.time()
# Configure GP with Matern nu=1.5
kernel_matern15 = ConstantKernel(1.0, (1e-3, 1e3)) * Matern(
    length_scale=1.0,
    length_scale_bounds=(1e-2, 1e2),
    nu=1.5
)
gp_matern15 = GaussianProcessRegressor(
    kernel=kernel_matern15,
    n_restarts_optimizer=10,
    normalize_y=True,
    random_state=seed
)
optimizer_matern15 = SpotOptim(
    fun=wingwt,
    bounds=bounds,
    surrogate=gp_matern15,
    max_iter=max_iter,
    n_initial=n_initial,
    var_name=param_names,
    acquisition='ei',
    seed=seed,
    verbose=False
)
result_matern15 = optimizer_matern15.optimize()
time_matern15 = time.time() - start_time
print(f"\nResults:")
print(f" Best weight: {result_matern15.fun:.4f} lb")
print(f" Function evaluations: {result_matern15.nfev}")
print(f" Time: {time_matern15:.2f}s")
print(f" Success: {result_matern15.success}")
results_comparison.append({
    'Surrogate': 'GP Matern ν=1.5',
    'Best Weight': result_matern15.fun,
    'Evaluations': result_matern15.nfev,
    'Time (s)': time_matern15,
    'Success': result_matern15.success
})
================================================================================
3. Gaussian Process with Matern ν=1.5 Kernel
================================================================================
Results:
Best weight: 119.5056 lb
Function evaluations: 30
Time: 43.48s
Success: True
12.6.1 Visualization: Matern ν=1.5 Kernel
optimizer_matern15.plot_progress(log_y=False, figsize=(10, 5))
optimizer_matern15.plot_important_hyperparameter_contour(max_imp=3)
plt.suptitle('GP Matern ν=1.5 Kernel: Most Important Parameters', y=1.02)
plt.show()
Plotting surrogate contours for top 3 most important parameters:
Wdg: importance = 18.62% (type: float)
A: importance = 18.41% (type: float)
Nz: importance = 18.30% (type: float)
Generating 3 surrogate plots: Wdg vs A, Wdg vs Nz, A vs Nz.
12.7 4. Gaussian Process with Rational Quadratic Kernel
The Rational Quadratic kernel is a scale mixture of RBF kernels with different length scales, providing more flexibility than a single RBF.
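Concretely, the kernel is \(k(x, x') = \left(1 + \frac{\lVert x - x'\rVert^2}{2\alpha\ell^2}\right)^{-\alpha}\), where \(\alpha\) controls the mixture of length scales; as \(\alpha \to \infty\) it converges to the RBF kernel with length scale \(\ell\).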
print("=" * 80)
print("4. Gaussian Process with Rational Quadratic Kernel")
print("=" * 80)
start_time = time.time()
# Configure GP with Rational Quadratic kernel
kernel_rq = ConstantKernel(1.0, (1e-3, 1e3)) * RationalQuadratic(
    length_scale=1.0,
    alpha=1.0,
    length_scale_bounds=(1e-2, 1e2),
    alpha_bounds=(1e-2, 1e2)
)
gp_rq = GaussianProcessRegressor(
    kernel=kernel_rq,
    n_restarts_optimizer=10,
    normalize_y=True,
    random_state=seed
)
optimizer_rq = SpotOptim(
    fun=wingwt,
    bounds=bounds,
    surrogate=gp_rq,
    max_iter=max_iter,
    n_initial=n_initial,
    var_name=param_names,
    acquisition='ei',
    seed=seed,
    verbose=False
)
result_rq = optimizer_rq.optimize()
time_rq = time.time() - start_time
print(f"\nResults:")
print(f" Best weight: {result_rq.fun:.4f} lb")
print(f" Function evaluations: {result_rq.nfev}")
print(f" Time: {time_rq:.2f}s")
print(f" Success: {result_rq.success}")
results_comparison.append({
    'Surrogate': 'GP Rational Quadratic',
    'Best Weight': result_rq.fun,
    'Evaluations': result_rq.nfev,
    'Time (s)': time_rq,
    'Success': result_rq.success
})
================================================================================
4. Gaussian Process with Rational Quadratic Kernel
================================================================================
Results:
Best weight: 119.5061 lb
Function evaluations: 30
Time: 49.84s
Success: True
12.7.1 Visualization: Rational Quadratic Kernel
optimizer_rq.plot_progress(log_y=False, figsize=(10, 5))
optimizer_rq.plot_important_hyperparameter_contour(max_imp=3)
plt.suptitle('GP Rational Quadratic Kernel: Most Important Parameters', y=1.02)
plt.show()
Plotting surrogate contours for top 3 most important parameters:
A: importance = 18.94% (type: float)
Nz: importance = 18.82% (type: float)
Rtc: importance = 18.61% (type: float)
Generating 3 surrogate plots: A vs Nz, A vs Rtc, Nz vs Rtc.
12.8 5. SpotOptim Kriging Model
SpotOptim includes its own Kriging implementation, optimized for sequential design. It uses a Gaussian correlation function and tunes its hyperparameters via differential evolution.
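For a Gaussian kernel, the correlation between two points is \(R(x, x') = \exp\left(-\sum_{i=1}^{d} \theta_i (x_i - x'_i)^2\right)\), where each activity parameter \(\theta_i\) is searched on a \(\log_{10}\) scale within the min_theta/max_theta bounds set below.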
print("=" * 80)
print("5. SpotOptim Kriging Model")
print("=" * 80)
start_time = time.time()
# Configure Kriging model
kriging_model = Kriging(
    noise=1e-10,     # Regularization parameter
    kernel='gauss',  # Gaussian/RBF kernel
    n_theta=None,    # Auto: use number of dimensions
    min_theta=-3.0,  # Min log10(theta) bound
    max_theta=2.0,   # Max log10(theta) bound
    seed=seed
)
optimizer_kriging = SpotOptim(
    fun=wingwt,
    bounds=bounds,
    surrogate=kriging_model,
    max_iter=max_iter,
    n_initial=n_initial,
    var_name=param_names,
    acquisition='ei',
    seed=seed,
    verbose=False
)
result_kriging = optimizer_kriging.optimize()
time_kriging = time.time() - start_time
print(f"\nResults:")
print(f" Best weight: {result_kriging.fun:.4f} lb")
print(f" Function evaluations: {result_kriging.nfev}")
print(f" Time: {time_kriging:.2f}s")
print(f" Success: {result_kriging.success}")
results_comparison.append({
    'Surrogate': 'SpotOptim Kriging',
    'Best Weight': result_kriging.fun,
    'Evaluations': result_kriging.nfev,
    'Time (s)': time_kriging,
    'Success': result_kriging.success
})
================================================================================
5. SpotOptim Kriging Model
================================================================================
Results:
Best weight: 121.1616 lb
Function evaluations: 30
Time: 62.39s
Success: True
12.8.1 Visualization: Kriging Model
optimizer_kriging.plot_progress(log_y=False, figsize=(10, 5))
optimizer_kriging.plot_important_hyperparameter_contour(max_imp=3)
plt.suptitle('SpotOptim Kriging: Most Important Parameters', y=1.02)
plt.show()
Plotting surrogate contours for top 3 most important parameters:
Nz: importance = 19.13% (type: float)
A: importance = 16.01% (type: float)
Wdg: importance = 14.70% (type: float)
Generating 3 surrogate plots: Nz vs A, Nz vs Wdg, A vs Wdg.
12.9 6. Random Forest Regressor
Random Forests are ensemble methods that handle noise well and can model discontinuities. They don’t naturally provide uncertainty estimates, so the acquisition function uses predictions only.
print("=" * 80)
print("6. Random Forest Regressor")
print("=" * 80)
start_time = time.time()
# Configure Random Forest
rf_model = RandomForestRegressor(
    n_estimators=100,
    max_depth=15,
    min_samples_split=2,
    min_samples_leaf=1,
    random_state=seed,
    n_jobs=-1
)
optimizer_rf = SpotOptim(
    fun=wingwt,
    bounds=bounds,
    surrogate=rf_model,
    max_iter=max_iter,
    n_initial=n_initial,
    var_name=param_names,
    acquisition='y',  # Use 'y' (greedy) since RF doesn't provide std
    seed=seed,
    verbose=False
)
result_rf = optimizer_rf.optimize()
time_rf = time.time() - start_time
print(f"\nResults:")
print(f" Best weight: {result_rf.fun:.4f} lb")
print(f" Function evaluations: {result_rf.nfev}")
print(f" Time: {time_rf:.2f}s")
print(f" Success: {result_rf.success}")
print(f" Note: Using acquisition='y' (greedy) since RF doesn't provide uncertainty")
results_comparison.append({
    'Surrogate': 'Random Forest',
    'Best Weight': result_rf.fun,
    'Evaluations': result_rf.nfev,
    'Time (s)': time_rf,
    'Success': result_rf.success
})
================================================================================
6. Random Forest Regressor
================================================================================
Results:
Best weight: 158.2359 lb
Function evaluations: 30
Time: 1093.84s
Success: True
Note: Using acquisition='y' (greedy) since RF doesn't provide uncertainty
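Although the run above uses acquisition='y', a rough uncertainty proxy for a fitted forest can be derived from the spread of its per-tree predictions. A hedged sketch using scikit-learn's public estimators_ attribute (this assumes rf_model was fitted in place during the run; if not, fit it on the collected data first):
# Hedged sketch: approximate predictive std from per-tree predictions
X_query = np.random.default_rng(seed).random((4, 9))
tree_preds = np.stack([tree.predict(X_query) for tree in rf_model.estimators_])
mu_rf = tree_preds.mean(axis=0)   # ensemble mean, matches rf_model.predict
std_rf = tree_preds.std(axis=0)   # crude epistemic-uncertainty proxy
print(np.round(mu_rf, 2), np.round(std_rf, 2))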
12.9.1 Visualization: Random Forest
optimizer_rf.plot_progress(log_y=False, figsize=(10, 5))
optimizer_rf.plot_important_hyperparameter_contour(max_imp=3)
plt.suptitle('Random Forest: Most Important Parameters', y=1.02)
plt.show()
Plotting surrogate contours for top 3 most important parameters:
Wdg: importance = 13.97% (type: float)
A: importance = 13.96% (type: float)
Rtc: importance = 13.43% (type: float)
Generating 3 surrogate plots: Wdg vs A, Wdg vs Rtc, A vs Rtc.
12.10 7. XGBoost Regressor
XGBoost is a gradient boosting implementation known for excellent performance on structured data and fast training/prediction times.
if XGBOOST_AVAILABLE:
    print("=" * 80)
    print("7. XGBoost Regressor")
    print("=" * 80)
    start_time = time.time()
    # Configure XGBoost
    xgb_model = xgb.XGBRegressor(
        n_estimators=100,
        max_depth=6,
        learning_rate=0.1,
        subsample=0.8,
        colsample_bytree=0.8,
        random_state=seed,
        n_jobs=-1
    )
    optimizer_xgb = SpotOptim(
        fun=wingwt,
        bounds=bounds,
        surrogate=xgb_model,
        max_iter=max_iter,
        n_initial=n_initial,
        var_name=param_names,
        acquisition='y',  # Use 'y' (greedy) since XGBoost doesn't provide std
        seed=seed,
        verbose=False
    )
    result_xgb = optimizer_xgb.optimize()
    time_xgb = time.time() - start_time
    print(f"\nResults:")
    print(f" Best weight: {result_xgb.fun:.4f} lb")
    print(f" Function evaluations: {result_xgb.nfev}")
    print(f" Time: {time_xgb:.2f}s")
    print(f" Success: {result_xgb.success}")
    print(f" Note: Using acquisition='y' (greedy) since XGBoost doesn't provide uncertainty")
    results_comparison.append({
        'Surrogate': 'XGBoost',
        'Best Weight': result_xgb.fun,
        'Evaluations': result_xgb.nfev,
        'Time (s)': time_xgb,
        'Success': result_xgb.success
    })
    # Visualization
    optimizer_xgb.plot_progress(log_y=False, figsize=(10, 5))
    plt.title('XGBoost: Convergence')
    plt.show()
    optimizer_xgb.plot_important_hyperparameter_contour(max_imp=3)
    plt.suptitle('XGBoost: Most Important Parameters', y=1.02)
    plt.show()
else:
    print("=" * 80)
    print("7. XGBoost Regressor - SKIPPED (not installed)")
    print("=" * 80)
    print("Install XGBoost with: pip install xgboost")
================================================================================
7. XGBoost Regressor
================================================================================
Results:
Best weight: 150.8548 lb
Function evaluations: 30
Time: 19.15s
Success: True
Note: Using acquisition='y' (greedy) since XGBoost doesn't provide uncertainty


Plotting surrogate contours for top 3 most important parameters:
Nz: importance = 15.18% (type: float)
A: importance = 15.04% (type: float)
Wdg: importance = 14.49% (type: float)
Generating 3 surrogate plots: Nz vs A, Nz vs Wdg, A vs Wdg.
12.11 8. Support Vector Regression (SVR)
SVR with RBF kernel can model complex non-linear relationships. It’s particularly good for high-dimensional problems with smooth structure.
print("=" * 80)
print("8. Support Vector Regression (SVR)")
print("=" * 80)
start_time = time.time()
# Configure SVR
svr_model = SVR(
    kernel='rbf',
    C=100.0,
    epsilon=0.1,
    gamma='scale'
)
optimizer_svr = SpotOptim(
    fun=wingwt,
    bounds=bounds,
    surrogate=svr_model,
    max_iter=max_iter,
    n_initial=n_initial,
    var_name=param_names,
    acquisition='y',  # Use 'y' (greedy) since SVR doesn't provide std by default
    seed=seed,
    verbose=False
)
result_svr = optimizer_svr.optimize()
time_svr = time.time() - start_time
print(f"\nResults:")
print(f" Best weight: {result_svr.fun:.4f} lb")
print(f" Function evaluations: {result_svr.nfev}")
print(f" Time: {time_svr:.2f}s")
print(f" Success: {result_svr.success}")
results_comparison.append({
    'Surrogate': 'SVR (RBF)',
    'Best Weight': result_svr.fun,
    'Evaluations': result_svr.nfev,
    'Time (s)': time_svr,
    'Success': result_svr.success
})
================================================================================
8. Support Vector Regression (SVR)
================================================================================
Results:
Best weight: 140.0109 lb
Function evaluations: 30
Time: 1.79s
Success: True
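One caveat worth noting: SVR's epsilon is expressed in the units of the target (pounds here), so its fit quality depends on the target scale. A hedged sketch that standardizes the target via scikit-learn's TransformedTargetRegressor (untested here; SpotOptim should accept it since it is sklearn-compatible, as stated in the introduction):
from sklearn.compose import TransformedTargetRegressor
from sklearn.preprocessing import StandardScaler

# Hedged sketch: standardize y so epsilon=0.1 acts on standardized residuals
svr_scaled = TransformedTargetRegressor(
    regressor=SVR(kernel='rbf', C=100.0, epsilon=0.1, gamma='scale'),
    transformer=StandardScaler()
)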
12.11.1 Visualization: SVR
optimizer_svr.plot_progress(log_y=False, figsize=(10, 5))
optimizer_svr.plot_important_hyperparameter_contour(max_imp=3)
plt.suptitle('Support Vector Regression: Most Important Parameters', y=1.02)
plt.show()
Plotting surrogate contours for top 3 most important parameters:
A: importance = 18.74% (type: float)
Rtc: importance = 18.17% (type: float)
Wdg: importance = 18.14% (type: float)
Generating 3 surrogate plots: A vs Rtc, A vs Wdg, Rtc vs Wdg.
12.12 9. Gradient Boosting Regressor
Gradient Boosting from scikit-learn is another ensemble method that builds trees sequentially, with each tree correcting errors of the previous ones.
print("=" * 80)
print("9. Gradient Boosting Regressor")
print("=" * 80)
start_time = time.time()
# Configure Gradient Boosting
gb_model = GradientBoostingRegressor(
    n_estimators=100,
    max_depth=5,
    learning_rate=0.1,
    subsample=0.8,
    random_state=seed
)
optimizer_gb = SpotOptim(
    fun=wingwt,
    bounds=bounds,
    surrogate=gb_model,
    max_iter=max_iter,
    n_initial=n_initial,
    var_name=param_names,
    acquisition='y',  # Use 'y' (greedy) since GB doesn't provide std
    seed=seed,
    verbose=False
)
result_gb = optimizer_gb.optimize()
time_gb = time.time() - start_time
print(f"\nResults:")
print(f" Best weight: {result_gb.fun:.4f} lb")
print(f" Function evaluations: {result_gb.nfev}")
print(f" Time: {time_gb:.2f}s")
print(f" Success: {result_gb.success}")
results_comparison.append({
    'Surrogate': 'Gradient Boosting',
    'Best Weight': result_gb.fun,
    'Evaluations': result_gb.nfev,
    'Time (s)': time_gb,
    'Success': result_gb.success
})
================================================================================
9. Gradient Boosting Regressor
================================================================================
Results:
Best weight: 124.6772 lb
Function evaluations: 30
Time: 7.39s
Success: True
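Unlike Random Forests, GradientBoostingRegressor can also be trained with a quantile loss, which some practitioners use to build a crude uncertainty band. A hedged, illustrative sketch (these extra models are our addition and are not used in the comparison below):
# Hedged sketch: 10%/90% quantile models as a rough uncertainty band
gb_lo = GradientBoostingRegressor(loss='quantile', alpha=0.1, n_estimators=100,
                                  max_depth=5, learning_rate=0.1, random_state=seed)
gb_hi = GradientBoostingRegressor(loss='quantile', alpha=0.9, n_estimators=100,
                                  max_depth=5, learning_rate=0.1, random_state=seed)
# After fitting both on the same (X, y), gb_hi.predict(X) - gb_lo.predict(X)
# gives an 80% interval width that flags regions of high model disagreement.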
12.12.1 Visualization: Gradient Boosting
optimizer_gb.plot_progress(log_y=False, figsize=(10, 5))
optimizer_gb.plot_important_hyperparameter_contour(max_imp=3)
plt.suptitle('Gradient Boosting: Most Important Parameters', y=1.02)
plt.show()
Plotting surrogate contours for top 3 most important parameters:
Wdg: importance = 16.80% (type: float)
A: importance = 16.28% (type: float)
Rtc: importance = 15.69% (type: float)
Generating 3 surrogate plots: Wdg vs A, Wdg vs Rtc, A vs Rtc.
12.13 Comprehensive Comparison
Now let’s compare all surrogate models side-by-side.
# Create comparison DataFrame
df_comparison = pd.DataFrame(results_comparison)
# Calculate improvement from best
best_weight = df_comparison['Best Weight'].min()
df_comparison['Gap to Best (%)'] = (
    (df_comparison['Best Weight'] - best_weight) / best_weight * 100
)
# Sort by best weight
df_comparison = df_comparison.sort_values('Best Weight')
print("\n" + "=" * 100)
print("SURROGATE MODEL COMPARISON")
print("=" * 100)
print(df_comparison.to_string(index=False))
print("=" * 100)
====================================================================================================
SURROGATE MODEL COMPARISON
====================================================================================================
Surrogate Best Weight Evaluations Time (s) Success Gap to Best (%)
GP RBF 119.503672 30 46.604268 True 0.000000
GP Matern ν=2.5 (Default) 119.504103 30 40.790576 True 0.000361
GP Matern ν=1.5 119.505612 30 43.481080 True 0.001624
GP Rational Quadratic 119.506083 30 49.837051 True 0.002017
SpotOptim Kriging 121.161582 30 62.392756 True 1.387330
Gradient Boosting 124.677233 30 7.394153 True 4.329207
SVR (RBF) 140.010945 30 1.792096 True 17.160371
XGBoost 150.854809 30 19.152899 True 26.234455
Random Forest 158.235859 30 1093.836132 True 32.410877
====================================================================================================
12.13.1 Visualization: Performance Comparison
fig, axes = plt.subplots(2, 2, figsize=(16, 12))
# Plot 1: Best weight comparison
ax1 = axes[0, 0]
colors = ['green' if i == 0 else 'steelblue' for i in range(len(df_comparison))]
ax1.barh(df_comparison['Surrogate'], df_comparison['Best Weight'], color=colors)
ax1.set_xlabel('Best Weight (lb)')
ax1.set_title('Best Weight Found by Each Surrogate')
ax1.axvline(x=best_weight, color='red', linestyle='--', linewidth=2, label='Best Overall')
ax1.legend()
ax1.grid(True, alpha=0.3, axis='x')
# Plot 2: Computational time
ax2 = axes[0, 1]
ax2.barh(df_comparison['Surrogate'], df_comparison['Time (s)'], color='coral')
ax2.set_xlabel('Time (seconds)')
ax2.set_title('Computational Time')
ax2.grid(True, alpha=0.3, axis='x')
# Plot 3: Gap to best
ax3 = axes[1, 0]
colors_gap = ['green' if gap < 0.1 else 'orange' if gap < 1.0 else 'red'
              for gap in df_comparison['Gap to Best (%)']]
ax3.barh(df_comparison['Surrogate'], df_comparison['Gap to Best (%)'], color=colors_gap)
ax3.set_xlabel('Gap to Best Solution (%)')
ax3.set_title('Solution Quality (Lower is Better)')
ax3.axvline(x=1.0, color='black', linestyle='--', alpha=0.5, linewidth=1)
ax3.grid(True, alpha=0.3, axis='x')
# Plot 4: Efficiency (weight reduction per second)
ax4 = axes[1, 1]
baseline_weight = wingwt(np.array([[0.48, 0.4, 0.38, 0.5, 0.62, 0.344, 0.4, 0.37, 0.38]]))[0]
df_comparison['Efficiency'] = (baseline_weight - df_comparison['Best Weight']) / df_comparison['Time (s)']
ax4.barh(df_comparison['Surrogate'], df_comparison['Efficiency'], color='mediumseagreen')
ax4.set_xlabel('Weight Reduction per Second (lb/s)')
ax4.set_title('Optimization Efficiency')
ax4.grid(True, alpha=0.3, axis='x')
plt.tight_layout()
plt.show()
12.14 Convergence Comparison
Let’s compare how different surrogates converge over iterations.
fig, ax = plt.subplots(figsize=(14, 8))
# Collect convergence data from all optimizers
optimizers = [
    (optimizer_default, 'GP Matern ν=2.5 (Default)', 'blue'),
    (optimizer_rbf, 'GP RBF', 'green'),
    (optimizer_matern15, 'GP Matern ν=1.5', 'red'),
    (optimizer_rq, 'GP Rational Quadratic', 'purple'),
    (optimizer_kriging, 'SpotOptim Kriging', 'orange'),
    (optimizer_rf, 'Random Forest', 'brown'),
    (optimizer_svr, 'SVR', 'pink'),
    (optimizer_gb, 'Gradient Boosting', 'gray')
]
if XGBOOST_AVAILABLE:
    optimizers.append((optimizer_xgb, 'XGBoost', 'cyan'))
for opt, label, color in optimizers:
    y_history = opt.y_
    best_so_far = np.minimum.accumulate(y_history)
    ax.plot(range(len(best_so_far)), best_so_far, linewidth=2, label=label, color=color)
ax.set_xlabel('Function Evaluation', fontsize=12)
ax.set_ylabel('Best Wing Weight Found (lb)', fontsize=12)
ax.set_title('Convergence Comparison: All Surrogates', fontsize=14, fontweight='bold')
ax.legend(loc='upper right', fontsize=10)
ax.grid(True, alpha=0.3)
ax.set_xlim(0, max_iter)
plt.tight_layout()
plt.show()
12.15 Key Insights and Recommendations
print("\n" + "=" * 100)
print("KEY INSIGHTS AND RECOMMENDATIONS")
print("=" * 100)
# Find best surrogate
best_surrogate = df_comparison.iloc[0]['Surrogate']
best_value = df_comparison.iloc[0]['Best Weight']
best_time = df_comparison.iloc[0]['Time (s)']
print(f"\n1. BEST OVERALL PERFORMANCE:")
print(f" Surrogate: {best_surrogate}")
print(f" Best Weight: {best_value:.4f} lb")
print(f" Computation Time: {best_time:.2f}s")
# Find fastest
fastest_idx = df_comparison['Time (s)'].idxmin()
fastest_surrogate = df_comparison.loc[fastest_idx, 'Surrogate']
fastest_time = df_comparison.loc[fastest_idx, 'Time (s)']
print(f"\n2. FASTEST OPTIMIZATION:")
print(f" Surrogate: {fastest_surrogate}")
print(f" Time: {fastest_time:.2f}s")
print(f" Best Weight: {df_comparison.loc[fastest_idx, 'Best Weight']:.4f} lb")
# Find most efficient
most_efficient_idx = df_comparison['Efficiency'].idxmax()
most_efficient = df_comparison.loc[most_efficient_idx, 'Surrogate']
print(f"\n3. MOST EFFICIENT (weight reduction per second):")
print(f" Surrogate: {most_efficient}")
print(f" Efficiency: {df_comparison.loc[most_efficient_idx, 'Efficiency']:.4f} lb/s")
print(f"\n4. RECOMMENDATIONS BY PROBLEM TYPE:")
print(f" - Smooth, continuous functions: Gaussian Process with RBF or Matern ν=2.5")
print(f" - Functions with noise: Random Forest or Gradient Boosting")
print(f" - High-dimensional problems (>20D): XGBoost or Random Forest")
print(f" - Limited budget (<50 evals): Gaussian Process with Expected Improvement")
print(f" - Fast evaluation needed: XGBoost or Random Forest")
print(f" - Need uncertainty estimates: Gaussian Process or Kriging")
print(f" - Non-smooth/discontinuous: Random Forest or Gradient Boosting")
print(f"\n5. KERNEL COMPARISON (Gaussian Process):")
gp_results = df_comparison[df_comparison['Surrogate'].str.contains('GP')]
print(gp_results[['Surrogate', 'Best Weight', 'Time (s)']].to_string(index=False))
print("\n" + "=" * 100)
====================================================================================================
KEY INSIGHTS AND RECOMMENDATIONS
====================================================================================================
1. BEST OVERALL PERFORMANCE:
Surrogate: GP RBF
Best Weight: 119.5037 lb
Computation Time: 46.60s
2. FASTEST OPTIMIZATION:
Surrogate: SVR (RBF)
Time: 1.79s
Best Weight: 140.0109 lb
3. MOST EFFICIENT (weight reduction per second):
Surrogate: SVR (RBF)
Efficiency: 52.3953 lb/s
4. RECOMMENDATIONS BY PROBLEM TYPE:
- Smooth, continuous functions: Gaussian Process with RBF or Matern ν=2.5
- Functions with noise: Random Forest or Gradient Boosting
- High-dimensional problems (>20D): XGBoost or Random Forest
- Limited budget (<50 evals): Gaussian Process with Expected Improvement
- Fast evaluation needed: XGBoost or Random Forest
- Need uncertainty estimates: Gaussian Process or Kriging
- Non-smooth/discontinuous: Random Forest or Gradient Boosting
5. KERNEL COMPARISON (Gaussian Process):
Surrogate Best Weight Time (s)
GP RBF 119.503672 46.604268
GP Matern ν=2.5 (Default) 119.504103 40.790576
GP Matern ν=1.5 119.505612 43.481080
GP Rational Quadratic 119.506083 49.837051
====================================================================================================
12.16 Summary Statistics
# Summary statistics
print("\n" + "=" * 100)
print("SUMMARY STATISTICS")
print("=" * 100)
summary_stats = pd.DataFrame({
    'Metric': [
        'Best Weight Found',
        'Worst Weight Found',
        'Average Weight',
        'Std Dev Weight',
        'Fastest Time',
        'Slowest Time',
        'Average Time',
    ],
    'Value': [
        f"{df_comparison['Best Weight'].min():.4f} lb",
        f"{df_comparison['Best Weight'].max():.4f} lb",
        f"{df_comparison['Best Weight'].mean():.4f} lb",
        f"{df_comparison['Best Weight'].std():.4f} lb",
        f"{df_comparison['Time (s)'].min():.2f} s",
        f"{df_comparison['Time (s)'].max():.2f} s",
        f"{df_comparison['Time (s)'].mean():.2f} s",
    ]
})
print(summary_stats.to_string(index=False))
print("=" * 100)
====================================================================================================
SUMMARY STATISTICS
====================================================================================================
Metric Value
Best Weight Found 119.5037 lb
Worst Weight Found 158.2359 lb
Average Weight 130.3289 lb
Std Dev Weight 15.3235 lb
Fastest Time 1.79 s
Slowest Time 1093.84 s
Average Time 151.70 s
====================================================================================================
12.17 Conclusion
This comprehensive comparison demonstrates that:
- Gaussian Processes with appropriate kernels (Matern, RBF) provide excellent performance for smooth optimization problems and naturally support Expected Improvement acquisition
- SpotOptim Kriging offers a lightweight alternative to sklearn’s GP with comparable performance
- Random Forest and XGBoost are robust alternatives that handle noise and discontinuities well, though they require greedy acquisition
- SVR and Gradient Boosting offer middle-ground solutions with good scalability
- The choice of surrogate should be based on:
- Function smoothness
- Computational budget
- Need for uncertainty quantification
- Problem dimensionality
- Noise characteristics
For the AWWE problem, Gaussian Process surrogates generally performed best due to the function’s smooth structure, but tree-based methods (RF, XGBoost, GB) can be preferable for more complex, noisy, or high-dimensional problems.
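As a practical rule of thumb, the acquisition function can be chosen programmatically by checking whether the surrogate's predict method accepts return_std. The helper below is a hypothetical sketch of ours, not part of SpotOptim's API:
import inspect

def choose_acquisition(surrogate):
    """Return 'ei' if the surrogate reports predictive std, else 'y' (hypothetical helper)."""
    params = inspect.signature(surrogate.predict).parameters
    return 'ei' if 'return_std' in params else 'y'

# Example: returns 'ei' for GaussianProcessRegressor(), 'y' for RandomForestRegressor()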
12.18 Jupyter Notebook
- The Jupyter notebook of this chapter is available on GitHub in the Sequential Parameter Optimization Cookbook Repository.