12  Expected Improvement

This chapter describes, analyzes, and compares different infill criteria. An infill criterion defines how the next point \(x_{n+1}\) is selected from the surrogate model \(S\). Expected improvement is a popular infill criterion in Bayesian optimization.
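
For orientation, expected improvement (EI) for a minimization problem is usually written in the following closed form. The symbols are introduced here for illustration and are not taken from the spotPython source: \(\hat{y}(x)\) is the surrogate's mean prediction, \(s(x)\) its predicted standard deviation, \(y_{\min}\) the best objective value observed so far, and \(\Phi\) and \(\varphi\) are the standard normal distribution and density functions.

\[ \mathrm{EI}(x) = \bigl(y_{\min} - \hat{y}(x)\bigr)\, \Phi(z) + s(x)\, \varphi(z), \qquad z = \frac{y_{\min} - \hat{y}(x)}{s(x)}, \quad s(x) > 0, \]

and \(\mathrm{EI}(x) = \max\bigl(y_{\min} - \hat{y}(x), 0\bigr)\) if \(s(x) = 0\). The first term rewards points whose predicted value is low (exploitation), the second rewards points where the model is uncertain (exploration). The sign and scaling conventions of a particular implementation may differ from this textbook form.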

12.1 Example: Spot and the 1-dim Sphere Function

import numpy as np
from math import inf
from spotPython.fun.objectivefunctions import analytical
from spotPython.spot import spot
from spotPython.utils.init import fun_control_init, surrogate_control_init, design_control_init
import matplotlib.pyplot as plt

12.1.1 The Objective Function: 1-dim Sphere

  • The spotPython package provides several classes of objective functions.
  • We will use an analytical objective function, i.e., a function that can be described by a (closed) formula: \[f(x) = x^2 \]
fun = analytical().fun_sphere
  • The size of the lower bound vector determines the problem dimension.
  • Here we will use np.array([-1]), i.e., a one-dim function.
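
The two points above can be made concrete with a minimal NumPy sketch. The helper `sphere` is a stand-in written for this illustration and is not spotPython's `fun_sphere`; it assumes that every row of `X` holds one point and that the sphere function in \(k\) dimensions is the sum of the squared coordinates.

```python
import numpy as np


def sphere(X):
    """Illustrative sphere function: sum of squares of each row of X."""
    X = np.atleast_2d(X)
    return np.sum(X**2, axis=1)


lower = np.array([-1])
upper = np.array([1])
k = lower.size  # the problem dimension follows from the bound vector

X = np.array([[-1.0], [0.0], [0.5]])  # three points in one dimension
y = sphere(X)
print(k, y)
```

With `lower = np.array([-1, -1])` the same reasoning gives a two-dimensional problem, which is the setting used later in this chapter.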
TensorBoard

Similar to the one-dimensional case, which was introduced in Section 7.5, we can use TensorBoard to monitor the progress of the optimization. We will use the same code, only the prefix is different:

from spotPython.utils.init import fun_control_init
PREFIX = "07_Y"
fun_control = fun_control_init(
    PREFIX=PREFIX,
    fun_evals = 25,
    lower = np.array([-1]),
    upper = np.array([1]),
    tolerance_x = np.sqrt(np.spacing(1)),)
design_control = design_control_init(init_size=10)
Created spot_tensorboard_path: runs/spot_logs/07_Y_maans14_2024-04-22_00-26-25 for SummaryWriter()
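
The value passed as `tolerance_x` is the square root of the float64 machine epsilon. The short check below uses only NumPy calls and confirms the number that appears as `'tolerance_x'` in the control dictionary printed for this run. How spotPython uses the tolerance internally is not shown here.

```python
import numpy as np

eps = np.spacing(1)   # distance from 1.0 to the next larger float64
tol = np.sqrt(eps)    # the value passed as tolerance_x

print(eps == np.finfo(float).eps)  # both are the float64 machine epsilon
print(tol)                         # equals 2**-26
```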
spot_1 = spot.Spot(
            fun=fun,
            fun_control=fun_control,
            design_control=design_control)
spot_1.run()
spotPython tuning: 1.2026789271012512e-09 [####------] 44.00% 
spotPython tuning: 1.2026789271012512e-09 [#####-----] 48.00% 
spotPython tuning: 1.2026789271012512e-09 [#####-----] 52.00% 
spotPython tuning: 1.2026789271012512e-09 [######----] 56.00% 
spotPython tuning: 3.7010904275056666e-10 [######----] 60.00% 
spotPython tuning: 3.7010904275056666e-10 [######----] 64.00% 
spotPython tuning: 3.7010904275056666e-10 [#######---] 68.00% 
spotPython tuning: 3.7010904275056666e-10 [#######---] 72.00% 
spotPython tuning: 3.7010904275056666e-10 [########--] 76.00% 
spotPython tuning: 3.7010904275056666e-10 [########--] 80.00% 
spotPython tuning: 3.7010904275056666e-10 [########--] 84.00% 
spotPython tuning: 3.7010904275056666e-10 [#########-] 88.00% 
spotPython tuning: 2.802111689321758e-11 [#########-] 92.00% 
spotPython tuning: 2.802111689321758e-11 [##########] 96.00% 
spotPython tuning: 2.802111689321758e-11 [##########] 100.00% Done...

{'CHECKPOINT_PATH': 'runs/saved_models/',
 'DATASET_PATH': 'data/',
 'PREFIX': '07_Y',
 'RESULTS_PATH': 'results/',
 'TENSORBOARD_PATH': 'runs/',
 '_L_in': None,
 '_L_out': None,
 '_torchmetric': None,
 'accelerator': 'auto',
 'converters': None,
 'core_model': None,
 'core_model_name': None,
 'counter': 25,
 'data': None,
 'data_dir': './data',
 'data_module': None,
 'data_set': None,
 'data_set_name': None,
 'db_dict_name': None,
 'design': None,
 'device': None,
 'devices': 1,
 'enable_progress_bar': False,
 'eval': None,
 'fun_evals': 25,
 'fun_repeats': 1,
 'horizon': None,
 'infill_criterion': 'y',
 'k_folds': 3,
 'log_graph': False,
 'log_level': 50,
 'loss_function': None,
 'lower': array([-1]),
 'max_surrogate_points': 30,
 'max_time': 1,
 'metric_params': {},
 'metric_river': None,
 'metric_sklearn': None,
 'metric_sklearn_name': None,
 'metric_torch': None,
 'model_dict': {},
 'n_points': 1,
 'n_samples': None,
 'n_total': None,
 'noise': False,
 'num_workers': 0,
 'ocba_delta': 0,
 'oml_grace_period': None,
 'optimizer': None,
 'path': None,
 'prep_model': None,
 'prep_model_name': None,
 'progress_file': None,
 'save_model': False,
 'scenario': None,
 'seed': 123,
 'show_batch_interval': 1000000,
 'show_models': False,
 'show_progress': True,
 'shuffle': None,
 'sigma': 0.0,
 'spot_tensorboard_path': 'runs/spot_logs/07_Y_maans14_2024-04-22_00-26-25',
 'spot_writer': <torch.utils.tensorboard.writer.SummaryWriter object at 0x3c0b8dfd0>,
 'target_column': None,
 'target_type': None,
 'task': None,
 'test': None,
 'test_seed': 1234,
 'test_size': 0.4,
 'tolerance_x': 1.4901161193847656e-08,
 'train': None,
 'upper': array([1]),
 'var_name': None,
 'var_type': ['num'],
 'verbosity': 0,
 'weight_coeff': 0.0,
 'weights': 1.0,
 'weights_entry': None}
<spotPython.spot.spot.Spot at 0x3c0bb9250>

12.1.2 Results

spot_1.print_results()
min y: 2.802111689321758e-11
x0: -5.293497604912803e-06
[['x0', -5.293497604912803e-06]]
spot_1.plot_progress(log_y=True)

TensorBoard visualization of the spotPython optimization process and the surrogate model.

12.2 Same, but with EI as infill_criterion

PREFIX = "07_EI_ISO"
fun_control = fun_control_init(
    PREFIX=PREFIX,
    lower = np.array([-1]),
    upper = np.array([1]),
    fun_evals = 25,
    tolerance_x = np.sqrt(np.spacing(1)),
    infill_criterion = "ei")
Created spot_tensorboard_path: runs/spot_logs/07_EI_ISO_maans14_2024-04-22_00-26-27 for SummaryWriter()
spot_1_ei = spot.Spot(fun=fun,
                     fun_control=fun_control)
spot_1_ei.run()
spotPython tuning: 9.993558891826623e-09 [####------] 44.00% 
spotPython tuning: 9.993558891826623e-09 [#####-----] 48.00% 
spotPython tuning: 9.993558891826623e-09 [#####-----] 52.00% 
spotPython tuning: 9.993558891826623e-09 [######----] 56.00% 
spotPython tuning: 3.016921825539976e-12 [######----] 60.00% 
spotPython tuning: 3.016921825539976e-12 [######----] 64.00% 
spotPython tuning: 3.016921825539976e-12 [#######---] 68.00% 
spotPython tuning: 3.016921825539976e-12 [#######---] 72.00% 
spotPython tuning: 3.016921825539976e-12 [########--] 76.00% 
spotPython tuning: 3.016921825539976e-12 [########--] 80.00% 
spotPython tuning: 3.016921825539976e-12 [########--] 84.00% 
spotPython tuning: 3.016921825539976e-12 [#########-] 88.00% 
spotPython tuning: 3.016921825539976e-12 [#########-] 92.00% 
spotPython tuning: 3.016921825539976e-12 [##########] 96.00% 
spotPython tuning: 3.016921825539976e-12 [##########] 100.00% Done...

{'CHECKPOINT_PATH': 'runs/saved_models/',
 'DATASET_PATH': 'data/',
 'PREFIX': '07_EI_ISO',
 'RESULTS_PATH': 'results/',
 'TENSORBOARD_PATH': 'runs/',
 '_L_in': None,
 '_L_out': None,
 '_torchmetric': None,
 'accelerator': 'auto',
 'converters': None,
 'core_model': None,
 'core_model_name': None,
 'counter': 25,
 'data': None,
 'data_dir': './data',
 'data_module': None,
 'data_set': None,
 'data_set_name': None,
 'db_dict_name': None,
 'design': None,
 'device': None,
 'devices': 1,
 'enable_progress_bar': False,
 'eval': None,
 'fun_evals': 25,
 'fun_repeats': 1,
 'horizon': None,
 'infill_criterion': 'ei',
 'k_folds': 3,
 'log_graph': False,
 'log_level': 50,
 'loss_function': None,
 'lower': array([-1]),
 'max_surrogate_points': 30,
 'max_time': 1,
 'metric_params': {},
 'metric_river': None,
 'metric_sklearn': None,
 'metric_sklearn_name': None,
 'metric_torch': None,
 'model_dict': {},
 'n_points': 1,
 'n_samples': None,
 'n_total': None,
 'noise': False,
 'num_workers': 0,
 'ocba_delta': 0,
 'oml_grace_period': None,
 'optimizer': None,
 'path': None,
 'prep_model': None,
 'prep_model_name': None,
 'progress_file': None,
 'save_model': False,
 'scenario': None,
 'seed': 123,
 'show_batch_interval': 1000000,
 'show_models': False,
 'show_progress': True,
 'shuffle': None,
 'sigma': 0.0,
 'spot_tensorboard_path': 'runs/spot_logs/07_EI_ISO_maans14_2024-04-22_00-26-27',
 'spot_writer': <torch.utils.tensorboard.writer.SummaryWriter object at 0x3c8588d90>,
 'target_column': None,
 'target_type': None,
 'task': None,
 'test': None,
 'test_seed': 1234,
 'test_size': 0.4,
 'tolerance_x': 1.4901161193847656e-08,
 'train': None,
 'upper': array([1]),
 'var_name': None,
 'var_type': ['num'],
 'verbosity': 0,
 'weight_coeff': 0.0,
 'weights': 1.0,
 'weights_entry': None}
<spotPython.spot.spot.Spot at 0x3c0c52290>
spot_1_ei.plot_progress(log_y=True)

spot_1_ei.print_results()
min y: 3.016921825539976e-12
x0: 1.7369288487269638e-06
[['x0', 1.7369288487269638e-06]]

TensorBoard visualization of the spotPython optimization process and the surrogate model. Expected improvement, isotropic Kriging.

12.3 Non-isotropic Kriging

PREFIX = "07_EI_NONISO"
fun_control = fun_control_init(
    PREFIX=PREFIX,
    lower = np.array([-1, -1]),
    upper = np.array([1, 1]),
    fun_evals = 25,
    tolerance_x = np.sqrt(np.spacing(1)),
    infill_criterion = "ei")
surrogate_control = surrogate_control_init(
    n_theta=2,
    noise=False,
    )
Created spot_tensorboard_path: runs/spot_logs/07_EI_NONISO_maans14_2024-04-22_00-26-28 for SummaryWriter()
spot_2_ei_noniso = spot.Spot(fun=fun,
                   fun_control=fun_control,
                   surrogate_control=surrogate_control)
spot_2_ei_noniso.run()
spotPython tuning: 2.035369116580917e-05 [####------] 44.00% 
spotPython tuning: 2.035369116580917e-05 [#####-----] 48.00% 
spotPython tuning: 2.035369116580917e-05 [#####-----] 52.00% 
spotPython tuning: 1.0764759208059285e-05 [######----] 56.00% 
spotPython tuning: 1.0764759208059285e-05 [######----] 60.00% 
spotPython tuning: 1.2512039520452527e-07 [######----] 64.00% 
spotPython tuning: 1.2512039520452527e-07 [#######---] 68.00% 
spotPython tuning: 1.2512039520452527e-07 [#######---] 72.00% 
spotPython tuning: 1.2512039520452527e-07 [########--] 76.00% 
spotPython tuning: 1.2512039520452527e-07 [########--] 80.00% 
spotPython tuning: 1.2512039520452527e-07 [########--] 84.00% 
spotPython tuning: 1.2512039520452527e-07 [#########-] 88.00% 
spotPython tuning: 1.2512039520452527e-07 [#########-] 92.00% 
spotPython tuning: 1.2512039520452527e-07 [##########] 96.00% 
spotPython tuning: 1.2512039520452527e-07 [##########] 100.00% Done...

{'CHECKPOINT_PATH': 'runs/saved_models/',
 'DATASET_PATH': 'data/',
 'PREFIX': '07_EI_NONISO',
 'RESULTS_PATH': 'results/',
 'TENSORBOARD_PATH': 'runs/',
 '_L_in': None,
 '_L_out': None,
 '_torchmetric': None,
 'accelerator': 'auto',
 'converters': None,
 'core_model': None,
 'core_model_name': None,
 'counter': 25,
 'data': None,
 'data_dir': './data',
 'data_module': None,
 'data_set': None,
 'data_set_name': None,
 'db_dict_name': None,
 'design': None,
 'device': None,
 'devices': 1,
 'enable_progress_bar': False,
 'eval': None,
 'fun_evals': 25,
 'fun_repeats': 1,
 'horizon': None,
 'infill_criterion': 'ei',
 'k_folds': 3,
 'log_graph': False,
 'log_level': 50,
 'loss_function': None,
 'lower': array([-1, -1]),
 'max_surrogate_points': 30,
 'max_time': 1,
 'metric_params': {},
 'metric_river': None,
 'metric_sklearn': None,
 'metric_sklearn_name': None,
 'metric_torch': None,
 'model_dict': {},
 'n_points': 1,
 'n_samples': None,
 'n_total': None,
 'noise': False,
 'num_workers': 0,
 'ocba_delta': 0,
 'oml_grace_period': None,
 'optimizer': None,
 'path': None,
 'prep_model': None,
 'prep_model_name': None,
 'progress_file': None,
 'save_model': False,
 'scenario': None,
 'seed': 123,
 'show_batch_interval': 1000000,
 'show_models': False,
 'show_progress': True,
 'shuffle': None,
 'sigma': 0.0,
 'spot_tensorboard_path': 'runs/spot_logs/07_EI_NONISO_maans14_2024-04-22_00-26-28',
 'spot_writer': <torch.utils.tensorboard.writer.SummaryWriter object at 0x3c855d8d0>,
 'target_column': None,
 'target_type': None,
 'task': None,
 'test': None,
 'test_seed': 1234,
 'test_size': 0.4,
 'tolerance_x': 1.4901161193847656e-08,
 'train': None,
 'upper': array([1, 1]),
 'var_name': None,
 'var_type': ['num'],
 'verbosity': 0,
 'weight_coeff': 0.0,
 'weights': 1.0,
 'weights_entry': None}
<spotPython.spot.spot.Spot at 0x3c851cb90>
spot_2_ei_noniso.plot_progress(log_y=True)

spot_2_ei_noniso.print_results()
min y: 1.2512039520452527e-07
x0: -0.00023903776922459534
x1: 0.0002607323150065108
[['x0', -0.00023903776922459534], ['x1', 0.0002607323150065108]]
spot_2_ei_noniso.surrogate.plot()

TensorBoard visualization of the spotPython optimization process and the surrogate model. Expected improvement, non-isotropic Kriging.
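
The difference between an isotropic model (one shared kernel parameter) and a non-isotropic model (one parameter per input dimension, as requested with `n_theta=2`) can be sketched with a Gaussian correlation function. The function `gauss_corr` and the numerical values of `theta` are assumptions made for this illustration; spotPython's Kriging class may parameterize and scale its `theta` differently.

```python
import numpy as np


def gauss_corr(X, theta):
    """Correlation matrix with entries exp(-sum_d theta_d * (x_id - x_jd)^2)."""
    X = np.atleast_2d(X)
    # A scalar theta is shared by all dimensions (isotropic case).
    theta = np.broadcast_to(np.asarray(theta, dtype=float), (X.shape[1],))
    diff = X[:, None, :] - X[None, :, :]
    return np.exp(-np.sum(theta * diff**2, axis=2))


X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])

Psi_iso = gauss_corr(X, 1.0)            # same decay in x0 and x1
Psi_aniso = gauss_corr(X, [1.0, 0.1])   # slower decay along x1

print(Psi_iso[0, 1], Psi_iso[0, 2])
print(Psi_aniso[0, 1], Psi_aniso[0, 2])
```

In the isotropic case a unit step along either axis reduces the correlation by the same amount. In the non-isotropic case the model can learn that the objective varies more slowly along one axis than along the other.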

12.4 Using sklearn Surrogates

12.4.1 The spot Loop

The spot loop consists of the following steps:

  1. Init: Build initial design \(X\)
  2. Evaluate initial design on real objective \(f\): \(y = f(X)\)
  3. Build surrogate: \(S = S(X,y)\)
  4. Optimize on surrogate: \(X_0 = \text{optimize}(S)\)
  5. Evaluate on real objective: \(y_0 = f(X_0)\)
  6. Impute (Infill) new points: \(X = X \cup X_0\), \(y = y \cup y_0\).
  7. Go to 3.
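
The steps above can be sketched in a few lines of NumPy. This is a toy version under stated assumptions, not spotPython's implementation: the surrogate is a quadratic least-squares fit instead of Kriging, the surrogate is optimized by evaluating it on a fixed grid, and the initial design is plain uniform sampling. The names `f`, `grid`, and `coef` are chosen for this illustration.

```python
import numpy as np


def f(X):
    """Real objective: 1-dim sphere, one point per row."""
    return np.sum(X**2, axis=1)


rng = np.random.default_rng(123)
lower, upper = -1.0, 1.0

# 1. Init: build initial design X
X = rng.uniform(lower, upper, size=(5, 1))
# 2. Evaluate initial design on the real objective
y = f(X)

grid = np.linspace(lower, upper, 201).reshape(-1, 1)
for _ in range(10):
    # 3. Build surrogate S(X, y): quadratic polynomial fit
    coef = np.polyfit(X.ravel(), y, deg=2)
    # 4. Optimize on surrogate: pick the grid point with the lowest prediction
    X0 = grid[[np.argmin(np.polyval(coef, grid.ravel()))]]
    # 5. Evaluate on real objective
    y0 = f(X0)
    # 6. Infill: append the new point to the data
    X = np.vstack([X, X0])
    y = np.concatenate([y, y0])
    # 7. Go to 3 (next loop iteration)

print(X.shape, y.min())
```

Because the objective here is itself a quadratic, the toy surrogate is exact after the first fit. With a real objective the surrogate is only an approximation, which is why the choice of the infill criterion in step 4 matters.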

The spot loop is implemented in R as follows:

Visual representation of the model based search with SPOT. Taken from: Bartz-Beielstein, T., and Zaefferer, M. Hyperparameter tuning approaches. In Hyperparameter Tuning for Machine and Deep Learning with R - A Practical Guide, E. Bartz, T. Bartz-Beielstein, M. Zaefferer, and O. Mersmann, Eds. Springer, 2022, ch. 4, pp. 67–114.

12.4.2 spot: The Initial Model

12.4.2.1 Example: Modifying the initial design size

This is the “Example: Modifying the initial design size” from Chapter 4.5.1 in [bart21i].

spot_ei = spot.Spot(fun=fun,
                fun_control=fun_control_init(
                lower = np.array([-1,-1]),
                upper= np.array([1,1])), 
                design_control = design_control_init(init_size=5))
spot_ei.run()
spotPython tuning: 0.1377171852680486 [####------] 40.00% 
spotPython tuning: 0.008763557388693657 [#####-----] 46.67% 
spotPython tuning: 0.002832279071142736 [#####-----] 53.33% 
spotPython tuning: 0.0008138662965600185 [######----] 60.00% 
spotPython tuning: 0.00036637583790222027 [#######---] 66.67% 
spotPython tuning: 0.00036006945938022686 [#######---] 73.33% 
spotPython tuning: 0.0003591078890308837 [########--] 80.00% 
spotPython tuning: 0.00032713515580249373 [#########-] 86.67% 
spotPython tuning: 0.0002785854368057176 [#########-] 93.33% 
spotPython tuning: 0.0001638494253170647 [##########] 100.00% Done...

{'CHECKPOINT_PATH': 'runs/saved_models/',
 'DATASET_PATH': 'data/',
 'PREFIX': None,
 'RESULTS_PATH': 'results/',
 'TENSORBOARD_PATH': 'runs/',
 '_L_in': None,
 '_L_out': None,
 '_torchmetric': None,
 'accelerator': 'auto',
 'converters': None,
 'core_model': None,
 'core_model_name': None,
 'counter': 15,
 'data': None,
 'data_dir': './data',
 'data_module': None,
 'data_set': None,
 'data_set_name': None,
 'db_dict_name': None,
 'design': None,
 'device': None,
 'devices': 1,
 'enable_progress_bar': False,
 'eval': None,
 'fun_evals': 15,
 'fun_repeats': 1,
 'horizon': None,
 'infill_criterion': 'y',
 'k_folds': 3,
 'log_graph': False,
 'log_level': 50,
 'loss_function': None,
 'lower': array([-1, -1]),
 'max_surrogate_points': 30,
 'max_time': 1,
 'metric_params': {},
 'metric_river': None,
 'metric_sklearn': None,
 'metric_sklearn_name': None,
 'metric_torch': None,
 'model_dict': {},
 'n_points': 1,
 'n_samples': None,
 'n_total': None,
 'noise': False,
 'num_workers': 0,
 'ocba_delta': 0,
 'oml_grace_period': None,
 'optimizer': None,
 'path': None,
 'prep_model': None,
 'prep_model_name': None,
 'progress_file': None,
 'save_model': False,
 'scenario': None,
 'seed': 123,
 'show_batch_interval': 1000000,
 'show_models': False,
 'show_progress': True,
 'shuffle': None,
 'sigma': 0.0,
 'spot_tensorboard_path': None,
 'spot_writer': None,
 'target_column': None,
 'target_type': None,
 'task': None,
 'test': None,
 'test_seed': 1234,
 'test_size': 0.4,
 'tolerance_x': 0,
 'train': None,
 'upper': array([1, 1]),
 'var_name': None,
 'var_type': ['num'],
 'verbosity': 0,
 'weight_coeff': 0.0,
 'weights': 1.0,
 'weights_entry': None}
<spotPython.spot.spot.Spot at 0x3c11b13d0>
spot_ei.plot_progress()

np.min(spot_1.y), np.min(spot_ei.y)
(2.802111689321758e-11, 0.0001638494253170647)

12.4.3 Init: Build Initial Design

from spotPython.design.spacefilling import spacefilling
from spotPython.build.kriging import Kriging
from spotPython.fun.objectivefunctions import analytical
gen = spacefilling(2)
rng = np.random.RandomState(1)
lower = np.array([-5,-0])
upper = np.array([10,15])
fun = analytical().fun_branin

X = gen.scipy_lhd(10, lower=lower, upper = upper)
print(X)
y = fun(X, fun_control=fun_control)
print(y)
[[ 8.97647221 13.41926847]
 [ 0.66946019  1.22344228]
 [ 5.23614115 13.78185824]
 [ 5.6149825  11.5851384 ]
 [-1.72963184  1.66516096]
 [-4.26945568  7.1325531 ]
 [ 1.26363761 10.17935555]
 [ 2.88779942  8.05508969]
 [-3.39111089  4.15213772]
 [ 7.30131231  5.22275244]]
[128.95676449  31.73474356 172.89678121 126.71295908  64.34349975
  70.16178611  48.71407916  31.77322887  76.91788181  30.69410529]
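
For readers who want to check the printed objective values by hand, the Branin function in its commonly used form can be sketched as follows. The function `branin` is written for this illustration and is not spotPython's `fun_branin`; the constants are the usual textbook choices, which is an assumption about the library.

```python
import numpy as np


def branin(X):
    """Branin function in its common textbook form, one point per row."""
    X = np.atleast_2d(X)
    x1, x2 = X[:, 0], X[:, 1]
    a, b, c = 1.0, 5.1 / (4.0 * np.pi**2), 5.0 / np.pi
    r, s, t = 6.0, 10.0, 1.0 / (8.0 * np.pi)
    return a * (x2 - b * x1**2 + c * x1 - r) ** 2 + s * (1.0 - t) * np.cos(x1) + s


# First two design points as printed above (rounded to eight decimals).
X_check = np.array([[8.97647221, 13.41926847], [0.66946019, 1.22344228]])
print(branin(X_check))

# One of the known global minimizers of this form of the function.
print(branin(np.array([[np.pi, 2.275]])))
```

Under the stated assumption, the first two values should agree with the first two entries of the printed `y` vector up to rounding of the inputs. The value at \((\pi, 2.275)\) is \(10 / (8\pi) \approx 0.398\), which is a useful reference when judging the optimization results for this function.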
S = Kriging(name='kriging',  seed=123)
S.fit(X, y)
S.plot()

gen = spacefilling(2, seed=123)
X0 = gen.scipy_lhd(3)
gen = spacefilling(2, seed=345)
X1 = gen.scipy_lhd(3)
X2 = gen.scipy_lhd(3)
gen = spacefilling(2, seed=123)
X3 = gen.scipy_lhd(3)
X0, X1, X2, X3
(array([[0.77254938, 0.31539299],
        [0.59321338, 0.93854273],
        [0.27469803, 0.3959685 ]]),
 array([[0.78373509, 0.86811887],
        [0.06692621, 0.6058029 ],
        [0.41374778, 0.00525456]]),
 array([[0.121357  , 0.69043832],
        [0.41906219, 0.32838498],
        [0.86742658, 0.52910374]]),
 array([[0.77254938, 0.31539299],
        [0.59321338, 0.93854273],
        [0.27469803, 0.3959685 ]]))
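
The output above shows two things: re-creating the generator with the same seed reproduces the design (`X0` equals `X3`), while repeated calls on one generator advance its state (`X1` differs from `X2`). The NumPy sketch below reproduces this behaviour with a hand-written Latin hypercube sampler. The function `lhd` is hypothetical and is not what `scipy_lhd` does internally; it only illustrates the idea of one point per stratum and dimension.

```python
import numpy as np


def lhd(rng, n, k):
    """Simple Latin hypercube design with n points in [0, 1)^k."""
    # Each column uses every stratum 0..n-1 exactly once, in random order.
    strata = np.column_stack([rng.permutation(n) for _ in range(k)])
    # Place each point uniformly at random inside its stratum.
    return (strata + rng.uniform(size=(n, k))) / n


rng = np.random.default_rng(123)
X0 = lhd(rng, 3, 2)
rng = np.random.default_rng(345)
X1 = lhd(rng, 3, 2)
X2 = lhd(rng, 3, 2)   # same generator, advanced state
rng = np.random.default_rng(123)
X3 = lhd(rng, 3, 2)   # same seed as X0

print(np.array_equal(X0, X3), np.array_equal(X1, X2))
```

A design on \([0, 1)^k\) can be mapped to arbitrary bounds with `lower + X * (upper - lower)`.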

12.4.4 Evaluate

12.4.5 Build Surrogate

12.4.6 A Simple Predictor

The code below shows how to use a simple model for prediction.

  • Assume that only two (very costly) measurements are available:

    1. f(0) = 0.5
    2. f(2) = 2.5
  • We are interested in the value at \(x_0 = 1\), i.e., \(f(x_0 = 1)\), but cannot run an additional, third experiment.

from sklearn import linear_model
X = np.array([[0], [2]])
y = np.array([0.5, 2.5])
S_lm = linear_model.LinearRegression()
S_lm = S_lm.fit(X, y)
X0 = np.array([[1]])
y0 = S_lm.predict(X0)
print(y0)
[1.5]
  • Central Idea:
    • Evaluation of the surrogate model S_lm is much cheaper (and/or much faster) than running the real-world experiment \(f\).
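
With only two measurements, the linear model is the straight line through both points, so the prediction can be checked by hand. The following lines redo the arithmetic without scikit-learn; the variable names are chosen for this illustration.

```python
# The two available measurements: f(0) = 0.5 and f(2) = 2.5
x_a, y_a = 0.0, 0.5
x_b, y_b = 2.0, 2.5

slope = (y_b - y_a) / (x_b - x_a)   # (2.5 - 0.5) / (2 - 0) = 1.0
intercept = y_a - slope * x_a       # 0.5 - 1.0 * 0 = 0.5

x0 = 1.0
y0 = intercept + slope * x0         # 0.5 + 1.0 * 1 = 1.5
print(slope, intercept, y0)
```

The result agrees with the value `[1.5]` returned by `S_lm.predict(X0)`.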

12.5 Gaussian Processes regression: basic introductory example

This example was taken from scikit-learn. After fitting our model, we see that the hyperparameters of the kernel have been optimized. Now, we will use our kernel to compute the mean prediction of the full dataset and plot the 95% confidence interval.

import numpy as np
import matplotlib.pyplot as plt
import math as m
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

X = np.linspace(start=0, stop=10, num=1_000).reshape(-1, 1)
y = np.squeeze(X * np.sin(X))
rng = np.random.RandomState(1)
training_indices = rng.choice(np.arange(y.size), size=6, replace=False)
X_train, y_train = X[training_indices], y[training_indices]

kernel = 1 * RBF(length_scale=1.0, length_scale_bounds=(1e-2, 1e2))
gaussian_process = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=9)
gaussian_process.fit(X_train, y_train)
gaussian_process.kernel_

mean_prediction, std_prediction = gaussian_process.predict(X, return_std=True)

plt.plot(X, y, label=r"$f(x) = x \sin(x)$", linestyle="dotted")
plt.scatter(X_train, y_train, label="Observations")
plt.plot(X, mean_prediction, label="Mean prediction")
plt.fill_between(
    X.ravel(),
    mean_prediction - 1.96 * std_prediction,
    mean_prediction + 1.96 * std_prediction,
    alpha=0.5,
    label=r"95% confidence interval",
)
plt.legend()
plt.xlabel("$x$")
plt.ylabel("$f(x)$")
_ = plt.title("sk-learn Version: Gaussian process regression on noise-free dataset")
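
The factor 1.96 in the plotting code above is the 97.5% quantile of the standard normal distribution, so that mean ± 1.96 standard deviations covers 95% of a Gaussian predictive distribution. The check below uses only the Python standard library. Later plots in this chapter use the rounded factor 2, which corresponds to slightly more than 95% coverage.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

z_95 = std_normal.inv_cdf(0.975)                           # two-sided 95% factor
coverage_2 = std_normal.cdf(2.0) - std_normal.cdf(-2.0)    # coverage of mean +/- 2 std

print(round(z_95, 4), round(coverage_2, 4))
```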

from spotPython.build.kriging import Kriging
import numpy as np
import matplotlib.pyplot as plt
rng = np.random.RandomState(1)
X = np.linspace(start=0, stop=10, num=1_000).reshape(-1, 1)
y = np.squeeze(X * np.sin(X))
training_indices = rng.choice(np.arange(y.size), size=6, replace=False)
X_train, y_train = X[training_indices], y[training_indices]


S = Kriging(name='kriging',  seed=123, log_level=50, cod_type="norm")
S.fit(X_train, y_train)

mean_prediction, std_prediction, ei = S.predict(X, return_val="all")

std_prediction

plt.plot(X, y, label=r"$f(x) = x \sin(x)$", linestyle="dotted")
plt.scatter(X_train, y_train, label="Observations")
plt.plot(X, mean_prediction, label="Mean prediction")
plt.fill_between(
    X.ravel(),
    mean_prediction - 1.96 * std_prediction,
    mean_prediction + 1.96 * std_prediction,
    alpha=0.5,
    label=r"95% confidence interval",
)
plt.legend()
plt.xlabel("$x$")
plt.ylabel("$f(x)$")
_ = plt.title("spotPython Version: Gaussian process regression on noise-free dataset")

12.6 The Surrogate: Using scikit-learn models

The default is the internal Kriging surrogate.

S_0 = Kriging(name='kriging', seed=123)

Models from scikit-learn can be selected, e.g., Gaussian Process:

# Needed for the sklearn surrogates:
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn import linear_model
from sklearn import tree
import pandas as pd
kernel = 1 * RBF(length_scale=1.0, length_scale_bounds=(1e-2, 1e2))
S_GP = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=9)
  • and many more:
S_Tree = DecisionTreeRegressor(random_state=0)
S_LM = linear_model.LinearRegression()
S_Ridge = linear_model.Ridge()
S_RF = RandomForestRegressor(max_depth=2, random_state=0) 
  • The scikit-learn GP model S_GP is selected.
S = S_GP
isinstance(S, GaussianProcessRegressor)
True
from spotPython.fun.objectivefunctions import analytical
fun = analytical().fun_branin
fun_control = fun_control_init(
    lower = np.array([-5,-0]),
    upper = np.array([10,15]),
    fun_evals = 15)    
design_control = design_control_init(init_size=5)
spot_GP = spot.Spot(fun=fun, 
                    fun_control=fun_control,
                    surrogate=S, 
                    design_control=design_control)
spot_GP.run()
spotPython tuning: 24.51465459019188 [####------] 40.00% 
spotPython tuning: 11.003077541587748 [#####-----] 46.67% 
spotPython tuning: 11.003077541587748 [#####-----] 53.33% 
spotPython tuning: 7.281227279299504 [######----] 60.00% 
spotPython tuning: 7.281227279299504 [#######---] 66.67% 
spotPython tuning: 7.281227279299504 [#######---] 73.33% 
spotPython tuning: 2.9519489314482 [########--] 80.00% 
spotPython tuning: 2.9519489314482 [#########-] 86.67% 
spotPython tuning: 2.104972804244822 [#########-] 93.33% 
spotPython tuning: 1.9431600962086772 [##########] 100.00% Done...

{'CHECKPOINT_PATH': 'runs/saved_models/',
 'DATASET_PATH': 'data/',
 'PREFIX': None,
 'RESULTS_PATH': 'results/',
 'TENSORBOARD_PATH': 'runs/',
 '_L_in': None,
 '_L_out': None,
 '_torchmetric': None,
 'accelerator': 'auto',
 'converters': None,
 'core_model': None,
 'core_model_name': None,
 'counter': 15,
 'data': None,
 'data_dir': './data',
 'data_module': None,
 'data_set': None,
 'data_set_name': None,
 'db_dict_name': None,
 'design': None,
 'device': None,
 'devices': 1,
 'enable_progress_bar': False,
 'eval': None,
 'fun_evals': 15,
 'fun_repeats': 1,
 'horizon': None,
 'infill_criterion': 'y',
 'k_folds': 3,
 'log_graph': False,
 'log_level': 50,
 'loss_function': None,
 'lower': array([-5,  0]),
 'max_surrogate_points': 30,
 'max_time': 1,
 'metric_params': {},
 'metric_river': None,
 'metric_sklearn': None,
 'metric_sklearn_name': None,
 'metric_torch': None,
 'model_dict': {},
 'n_points': 1,
 'n_samples': None,
 'n_total': None,
 'noise': False,
 'num_workers': 0,
 'ocba_delta': 0,
 'oml_grace_period': None,
 'optimizer': None,
 'path': None,
 'prep_model': None,
 'prep_model_name': None,
 'progress_file': None,
 'save_model': False,
 'scenario': None,
 'seed': 123,
 'show_batch_interval': 1000000,
 'show_models': False,
 'show_progress': True,
 'shuffle': None,
 'sigma': 0.0,
 'spot_tensorboard_path': None,
 'spot_writer': None,
 'target_column': None,
 'target_type': None,
 'task': None,
 'test': None,
 'test_seed': 1234,
 'test_size': 0.4,
 'tolerance_x': 0,
 'train': None,
 'upper': array([10, 15]),
 'var_name': None,
 'var_type': ['num'],
 'verbosity': 0,
 'weight_coeff': 0.0,
 'weights': 1.0,
 'weights_entry': None}
<spotPython.spot.spot.Spot at 0x3c175a350>
spot_GP.y
array([ 69.32459936, 152.38491454, 107.92560483,  24.51465459,
        76.73500031,  86.30425303,  11.00307754,  16.11742138,
         7.28122728,  21.82317903,  10.96088904,   2.95194893,
         3.02910742,   2.1049728 ,   1.9431601 ])
spot_GP.plot_progress()

spot_GP.print_results()
min y: 1.9431600962086772
x0: 10.0
x1: 2.9985482809555464
[['x0', 10.0], ['x1', 2.9985482809555464]]

12.7 Additional Examples

# Needed for the sklearn surrogates:
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn import linear_model
from sklearn import tree
import pandas as pd
kernel = 1 * RBF(length_scale=1.0, length_scale_bounds=(1e-2, 1e2))
S_GP = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=9)
from spotPython.build.kriging import Kriging
import numpy as np
import spotPython
from spotPython.fun.objectivefunctions import analytical
from spotPython.spot import spot

S_K = Kriging(name='kriging',
              seed=123,
              log_level=50,
              infill_criterion = "y",
              n_theta=1,
              noise=False,
              cod_type="norm")
fun = analytical().fun_sphere

fun_control = fun_control_init(
    lower = np.array([-1,-1]),
    upper = np.array([1,1]),
    fun_evals = 25)

spot_S_K = spot.Spot(fun=fun,
                     fun_control=fun_control,
                     surrogate=S_K,
                     design_control=design_control,
                     surrogate_control=surrogate_control)
spot_S_K.run()
spotPython tuning: 0.13771718778810743 [##--------] 24.00% 
spotPython tuning: 0.008768000187888899 [###-------] 28.00% 
spotPython tuning: 0.0028300907437246053 [###-------] 32.00% 
spotPython tuning: 0.0008148020998531609 [####------] 36.00% 
spotPython tuning: 0.00036681248440550095 [####------] 40.00% 
spotPython tuning: 0.00035607605553701025 [####------] 44.00% 
spotPython tuning: 0.00035607605553701025 [#####-----] 48.00% 
spotPython tuning: 0.00033033596693814263 [#####-----] 52.00% 
spotPython tuning: 0.0002774179969789593 [######----] 56.00% 
spotPython tuning: 0.00016886412273302311 [######----] 60.00% 
spotPython tuning: 2.0349536932144563e-05 [######----] 64.00% 
spotPython tuning: 1.6621220007683266e-06 [#######---] 68.00% 
spotPython tuning: 4.905822935561126e-07 [#######---] 72.00% 
spotPython tuning: 4.7634545282279014e-07 [########--] 76.00% 
spotPython tuning: 3.966290585455581e-07 [########--] 80.00% 
spotPython tuning: 1.9602185212475464e-07 [########--] 84.00% 
spotPython tuning: 1.7115221726800905e-07 [#########-] 88.00% 
spotPython tuning: 1.7115221726800905e-07 [#########-] 92.00% 
spotPython tuning: 1.7115221726800905e-07 [##########] 96.00% 
spotPython tuning: 1.7115221726800905e-07 [##########] 100.00% Done...

{'CHECKPOINT_PATH': 'runs/saved_models/',
 'DATASET_PATH': 'data/',
 'PREFIX': None,
 'RESULTS_PATH': 'results/',
 'TENSORBOARD_PATH': 'runs/',
 '_L_in': None,
 '_L_out': None,
 '_torchmetric': None,
 'accelerator': 'auto',
 'converters': None,
 'core_model': None,
 'core_model_name': None,
 'counter': 25,
 'data': None,
 'data_dir': './data',
 'data_module': None,
 'data_set': None,
 'data_set_name': None,
 'db_dict_name': None,
 'design': None,
 'device': None,
 'devices': 1,
 'enable_progress_bar': False,
 'eval': None,
 'fun_evals': 25,
 'fun_repeats': 1,
 'horizon': None,
 'infill_criterion': 'y',
 'k_folds': 3,
 'log_graph': False,
 'log_level': 50,
 'loss_function': None,
 'lower': array([-1, -1]),
 'max_surrogate_points': 30,
 'max_time': 1,
 'metric_params': {},
 'metric_river': None,
 'metric_sklearn': None,
 'metric_sklearn_name': None,
 'metric_torch': None,
 'model_dict': {},
 'n_points': 1,
 'n_samples': None,
 'n_total': None,
 'noise': False,
 'num_workers': 0,
 'ocba_delta': 0,
 'oml_grace_period': None,
 'optimizer': None,
 'path': None,
 'prep_model': None,
 'prep_model_name': None,
 'progress_file': None,
 'save_model': False,
 'scenario': None,
 'seed': 123,
 'show_batch_interval': 1000000,
 'show_models': False,
 'show_progress': True,
 'shuffle': None,
 'sigma': 0.0,
 'spot_tensorboard_path': None,
 'spot_writer': None,
 'target_column': None,
 'target_type': None,
 'task': None,
 'test': None,
 'test_seed': 1234,
 'test_size': 0.4,
 'tolerance_x': 0,
 'train': None,
 'upper': array([1, 1]),
 'var_name': None,
 'var_type': ['num'],
 'verbosity': 0,
 'weight_coeff': 0.0,
 'weights': 1.0,
 'weights_entry': None}
<spotPython.spot.spot.Spot at 0x3c178a350>
spot_S_K.plot_progress(log_y=True)

spot_S_K.surrogate.plot()

spot_S_K.print_results()
min y: 1.7115221726800905e-07
x0: 0.0003105897139994429
x1: 0.0002732878460995902
[['x0', 0.0003105897139994429], ['x1', 0.0002732878460995902]]

12.7.1 Optimize on Surrogate

12.7.2 Evaluate on Real Objective

12.7.3 Impute / Infill new Points

12.8 Tests

import numpy as np
from spotPython.spot import spot
from spotPython.fun.objectivefunctions import analytical

fun_sphere = analytical().fun_sphere

fun_control = fun_control_init(
                    lower=np.array([-1, -1]),
                    upper=np.array([1, 1]),
                    n_points = 2)
spot_1 = spot.Spot(
    fun=fun_sphere,
    fun_control=fun_control,
)

# (S-2) Initial Design:
spot_1.X = spot_1.design.scipy_lhd(
    spot_1.design_control["init_size"], lower=spot_1.lower, upper=spot_1.upper
)
print(spot_1.X)

# (S-3): Eval initial design:
spot_1.y = spot_1.fun(spot_1.X)
print(spot_1.y)

spot_1.fit_surrogate()
X0 = spot_1.suggest_new_X()
print(X0)
assert X0.size == spot_1.n_points * spot_1.k
[[ 0.86352963  0.7892358 ]
 [-0.24407197 -0.83687436]
 [ 0.36481882  0.8375811 ]
 [ 0.415331    0.54468512]
 [-0.56395091 -0.77797854]
 [-0.90259409 -0.04899292]
 [-0.16484832  0.35724741]
 [ 0.05170659  0.07401196]
 [-0.78548145 -0.44638164]
 [ 0.64017497 -0.30363301]]
[1.36857656 0.75992983 0.83463487 0.46918172 0.92329124 0.8170764
 0.15480068 0.00815134 0.81623768 0.502017  ]
[[0.00159092 0.00410652]
 [0.00190779 0.00379162]]

12.9 EI: The Famous Schonlau Example

X_train0 = np.array([1, 2, 3, 4, 12]).reshape(-1,1)
X_train = np.linspace(start=0, stop=10, num=5).reshape(-1, 1)
from spotPython.build.kriging import Kriging
import numpy as np
import matplotlib.pyplot as plt

X_train = np.array([1., 2., 3., 4., 12.]).reshape(-1,1)
y_train = np.array([0., -1.75, -2, -0.5, 5.])

S = Kriging(name='kriging',  seed=123, log_level=50, n_theta=1, noise=False, cod_type="norm")
S.fit(X_train, y_train)

X = np.linspace(start=0, stop=13, num=1000).reshape(-1, 1)
mean_prediction, std_prediction, ei = S.predict(X, return_val="all")

plt.scatter(X_train, y_train, label="Observations")
plt.plot(X, mean_prediction, label="Mean prediction")
if True:
    plt.fill_between(
        X.ravel(),
        mean_prediction - 2 * std_prediction,
        mean_prediction + 2 * std_prediction,
        alpha=0.5,
        label=r"95% confidence interval",
    )
plt.legend()
plt.xlabel("$x$")
plt.ylabel("$f(x)$")
_ = plt.title("Gaussian process regression on noise-free dataset")

#plt.plot(X, y, label=r"$f(x) = x \sin(x)$", linestyle="dotted")
# plt.scatter(X_train, y_train, label="Observations")
plt.plot(X, -ei, label="Expected Improvement")
plt.legend()
plt.xlabel("$x$")
plt.ylabel("Expected improvement")
_ = plt.title("Expected improvement for the Schonlau example")

S.log
{'negLnLike': array([1.20788205]),
 'theta': array([-0.9900252]),
 'p': [],
 'Lambda': []}
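
To make explicit how a mean prediction, a standard deviation, and an expected improvement curve can be obtained from the five observations, here is a small NumPy sketch of ordinary Kriging with a Gaussian correlation function. It is not spotPython's `Kriging` class. The kernel parameter `theta = 0.1` is fixed by hand instead of being fitted by maximum likelihood, the variance uses the simplified form without the correction for the estimated mean, and all names are chosen for this illustration.

```python
from math import erf

import numpy as np

# Data from the example above.
X_train = np.array([1.0, 2.0, 3.0, 4.0, 12.0])
y_train = np.array([0.0, -1.75, -2.0, -0.5, 5.0])

theta = 0.1      # assumed kernel parameter, not fitted
nugget = 1e-10   # tiny diagonal term for numerical stability


def corr(a, b):
    """Gaussian correlation between two sets of 1-dim points."""
    return np.exp(-theta * (a[:, None] - b[None, :]) ** 2)


n = X_train.size
Psi = corr(X_train, X_train) + nugget * np.eye(n)
ones = np.ones(n)
mu = ones @ np.linalg.solve(Psi, y_train) / (ones @ np.linalg.solve(Psi, ones))
resid = y_train - mu
sigma2 = resid @ np.linalg.solve(Psi, resid) / n


def predict(x):
    """Return mean, standard deviation, and EI at the points x."""
    psi = corr(X_train, x)  # shape (n, m)
    mean = mu + psi.T @ np.linalg.solve(Psi, resid)
    var = sigma2 * (1.0 - np.sum(psi * np.linalg.solve(Psi, psi), axis=0))
    std = np.sqrt(np.clip(var, 0.0, None))
    imp = y_train.min() - mean  # improvement over the best observation
    z = np.divide(imp, std, out=np.zeros_like(imp), where=std > 0)
    cdf = 0.5 * (1.0 + np.vectorize(erf)(z / np.sqrt(2.0)))
    pdf = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    ei = np.where(std > 0, imp * cdf + std * pdf, np.maximum(imp, 0.0))
    return mean, std, ei


x_grid = np.linspace(0.0, 13.0, 1000)
mean, std, ei = predict(x_grid)
print("EI is largest at x =", x_grid[np.argmax(ei)])
```

In this sketch EI is non-negative and close to zero at the observed points, because the model interpolates the data and has almost no uncertainty there. The plotting code above draws `-ei`, which suggests that spotPython returns EI with a negative sign so that it can be minimized; this reading is an inference from the plot call, not a statement from the library documentation.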

12.10 EI: The Forrester Example

from spotPython.build.kriging import Kriging
import numpy as np
import matplotlib.pyplot as plt
import spotPython
from spotPython.fun.objectivefunctions import analytical
from spotPython.spot import spot
from spotPython.utils.init import fun_control_init

# exact x locations are unknown:
X_train = np.array([0.0, 0.175, 0.225, 0.3, 0.35, 0.375, 0.5, 1]).reshape(-1,1)

fun = analytical().fun_forrester
fun_control = fun_control_init(
    PREFIX="07_EI_FORRESTER",
    sigma=1.0,
    seed=123,)
y_train = fun(X_train, fun_control=fun_control)

S = Kriging(name='kriging',  seed=123, log_level=50, n_theta=1, noise=False, cod_type="norm")
S.fit(X_train, y_train)

X = np.linspace(start=0, stop=1, num=1000).reshape(-1, 1)
mean_prediction, std_prediction, ei = S.predict(X, return_val="all")

plt.scatter(X_train, y_train, label="Observations")
plt.plot(X, mean_prediction, label="Mean prediction")
# Shade the band of two predictive standard deviations around the mean.
plt.fill_between(
    X.ravel(),
    mean_prediction - 2 * std_prediction,
    mean_prediction + 2 * std_prediction,
    alpha=0.5,
    label=r"95% confidence interval",
)
plt.legend()
plt.xlabel("$x$")
plt.ylabel("$f(x)$")
_ = plt.title("Gaussian process regression on noise-free dataset")
Created spot_tensorboard_path: runs/spot_logs/07_EI_FORRESTER_maans14_2024-04-22_00-26-43 for SummaryWriter()

# The sign of ei is flipped before plotting, so that larger values are better.
plt.plot(X, -ei, label="Expected Improvement")
plt.legend()
plt.xlabel("$x$")
plt.ylabel("Expected improvement")
_ = plt.title("Expected improvement for the Forrester example")

12.11 Noise

import numpy as np
import spotPython
from spotPython.fun.objectivefunctions import analytical
from spotPython.spot import spot
from spotPython.design.spacefilling import spacefilling
from spotPython.build.kriging import Kriging
from spotPython.utils.init import fun_control_init
import matplotlib.pyplot as plt

gen = spacefilling(1)
rng = np.random.RandomState(1)
lower = np.array([-10])
upper = np.array([10])
fun = analytical().fun_sphere
fun_control = fun_control_init(
    PREFIX="07_Y",
    sigma=2.0,
    seed=123,)
X = gen.scipy_lhd(10, lower=lower, upper = upper)
print(X)
y = fun(X, fun_control=fun_control)
print(y)
y.shape
X_train = X.reshape(-1,1)
y_train = y

S = Kriging(name='kriging',
            seed=123,
            log_level=50,
            n_theta=1,
            noise=False)
S.fit(X_train, y_train)

X_axis = np.linspace(start=-13, stop=13, num=1000).reshape(-1, 1)
mean_prediction, std_prediction, ei = S.predict(X_axis, return_val="all")

plt.scatter(X_train, y_train, label="Observations")
plt.plot(X_axis, mean_prediction, label="Mean prediction")
plt.legend()
plt.xlabel("$x$")
plt.ylabel("$f(x)$")
_ = plt.title("Sphere: Gaussian process regression on noisy dataset")
Created spot_tensorboard_path: runs/spot_logs/07_Y_maans14_2024-04-22_00-26-44 for SummaryWriter()
[[ 0.63529627]
 [-4.10764204]
 [-0.44071975]
 [ 9.63125638]
 [-8.3518118 ]
 [-3.62418901]
 [ 4.15331   ]
 [ 3.4468512 ]
 [ 6.36049088]
 [-7.77978539]]
[-1.57464135 16.13714981  2.77008442 93.14904827 71.59322218 14.28895359
 15.9770567  12.96468767 39.82265329 59.88028242]

S.log
{'negLnLike': array([26.18505386]),
 'theta': array([-1.10547474]),
 'p': [],
 'Lambda': []}
S = Kriging(name='kriging',
            seed=123,
            log_level=50,
            n_theta=1,
            noise=True)
S.fit(X_train, y_train)

X_axis = np.linspace(start=-13, stop=13, num=1000).reshape(-1, 1)
mean_prediction, std_prediction, ei = S.predict(X_axis, return_val="all")

plt.scatter(X_train, y_train, label="Observations")
plt.plot(X_axis, mean_prediction, label="Mean prediction")
plt.legend()
plt.xlabel("$x$")
plt.ylabel("$f(x)$")
_ = plt.title("Sphere: Gaussian process regression with nugget on noisy dataset")

S.log
{'negLnLike': array([21.82059174]),
 'theta': array([-2.96946062]),
 'p': [],
 'Lambda': array([4.28985898e-05])}
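
The difference between the two fits above (`noise=False` versus `noise=True`) comes from the nugget, a constant added to the diagonal of the correlation matrix. The following minimal sketch shows the effect with a hand-written ordinary Kriging predictor in NumPy. It is an illustration under simplifying assumptions (Gaussian correlation, fixed `theta`, made-up data) and not the spotPython code; the helper `kriging_mean` and its parameters are hypothetical.

```python
import numpy as np

def kriging_mean(x_train, y_train, x_new, theta=1.0, nugget=0.0):
    """Ordinary Kriging mean prediction in 1-d (illustrative helper).

    theta: activity parameter of the Gaussian correlation exp(-theta * d^2).
    nugget: value added to the diagonal of the correlation matrix.
    """
    x_train = np.asarray(x_train, dtype=float).ravel()
    y_train = np.asarray(y_train, dtype=float).ravel()
    x_new = np.asarray(x_new, dtype=float).ravel()
    n = x_train.size
    # Correlation matrix of the training points, regularized by the nugget.
    Psi = np.exp(-theta * (x_train[:, None] - x_train[None, :]) ** 2)
    Psi += nugget * np.eye(n)
    ones = np.ones(n)
    # Generalized least-squares estimate of the constant mean.
    mu = (ones @ np.linalg.solve(Psi, y_train)) / (ones @ np.linalg.solve(Psi, ones))
    # Correlations between new and training points (no nugget here).
    psi = np.exp(-theta * (x_new[:, None] - x_train[None, :]) ** 2)
    return mu + psi @ np.linalg.solve(Psi, y_train - mu)

# Made-up noisy observations of a sphere-like function.
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = np.array([4.3, 0.8, 0.2, 1.1, 3.7])

fit_interp = kriging_mean(x, y, x, nugget=0.0)  # passes through the data
fit_smooth = kriging_mean(x, y, x, nugget=1.0)  # regresses, does not interpolate
print(np.abs(fit_interp - y).max())
print(np.abs(fit_smooth - y).max())
```

Without a nugget the predictor reproduces every observation, including its noise; with a positive nugget it no longer has to pass through the points, which is the behavior wanted for noisy data.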

12.12 Cubic Function

import numpy as np
import spotPython
from spotPython.fun.objectivefunctions import analytical
from spotPython.spot import spot
from spotPython.design.spacefilling import spacefilling
from spotPython.build.kriging import Kriging
from spotPython.utils.init import fun_control_init
import matplotlib.pyplot as plt

gen = spacefilling(1)
rng = np.random.RandomState(1)
lower = np.array([-10])
upper = np.array([10])
fun = analytical().fun_cubed
fun_control = fun_control_init(
    PREFIX="07_Y",
    sigma=10.0,
    seed=123,)

X = gen.scipy_lhd(10, lower=lower, upper = upper)
print(X)
y = fun(X, fun_control=fun_control)
print(y)
y.shape
X_train = X.reshape(-1,1)
y_train = y

S = Kriging(name='kriging',  seed=123, log_level=50, n_theta=1, noise=False)
S.fit(X_train, y_train)

X_axis = np.linspace(start=-13, stop=13, num=1000).reshape(-1, 1)
mean_prediction, std_prediction, ei = S.predict(X_axis, return_val="all")

plt.scatter(X_train, y_train, label="Observations")
plt.plot(X_axis, mean_prediction, label="Mean prediction")
plt.legend()
plt.xlabel("$x$")
plt.ylabel("$f(x)$")
_ = plt.title("Cubed: Gaussian process regression on noisy dataset")
Created spot_tensorboard_path: runs/spot_logs/07_Y_maans14_2024-04-22_00-26-44 for SummaryWriter()
[[ 0.63529627]
 [-4.10764204]
 [-0.44071975]
 [ 9.63125638]
 [-8.3518118 ]
 [-3.62418901]
 [ 4.15331   ]
 [ 3.4468512 ]
 [ 6.36049088]
 [-7.77978539]]
[ 2.56406437e-01 -6.93071067e+01 -8.56027124e-02  8.93405931e+02
 -5.82561927e+02 -4.76028022e+01  7.16445311e+01  4.09512920e+01
  2.57319028e+02 -4.70871982e+02]

S = Kriging(name='kriging',  seed=123, log_level=0, n_theta=1, noise=True)
S.fit(X_train, y_train)

X_axis = np.linspace(start=-13, stop=13, num=1000).reshape(-1, 1)
mean_prediction, std_prediction, ei = S.predict(X_axis, return_val="all")

plt.scatter(X_train, y_train, label="Observations")
plt.plot(X_axis, mean_prediction, label="Mean prediction")
plt.legend()
plt.xlabel("$x$")
plt.ylabel("$f(x)$")
_ = plt.title("Cubed: Gaussian process regression with nugget on noisy dataset")

# The same experiment, repeated with the Runge function.
import numpy as np
import spotPython
from spotPython.fun.objectivefunctions import analytical
from spotPython.spot import spot
from spotPython.design.spacefilling import spacefilling
from spotPython.build.kriging import Kriging
from spotPython.utils.init import fun_control_init
import matplotlib.pyplot as plt

gen = spacefilling(1)
rng = np.random.RandomState(1)
lower = np.array([-10])
upper = np.array([10])
fun = analytical().fun_runge
fun_control = fun_control_init(
    PREFIX="07_Y",
    sigma=0.25,
    seed=123,)

X = gen.scipy_lhd(10, lower=lower, upper = upper)
print(X)
y = fun(X, fun_control=fun_control)
print(y)
y.shape
X_train = X.reshape(-1,1)
y_train = y

S = Kriging(name='kriging',  seed=123, log_level=50, n_theta=1, noise=False)
S.fit(X_train, y_train)

X_axis = np.linspace(start=-13, stop=13, num=1000).reshape(-1, 1)
mean_prediction, std_prediction, ei = S.predict(X_axis, return_val="all")

plt.scatter(X_train, y_train, label="Observations")
plt.plot(X_axis, mean_prediction, label="Mean prediction")
plt.legend()
plt.xlabel("$x$")
plt.ylabel("$f(x)$")
_ = plt.title("Runge: Gaussian process regression on noisy dataset")
Created spot_tensorboard_path: runs/spot_logs/07_Y_maans14_2024-04-22_00-26-45 for SummaryWriter()
[[ 0.63529627]
 [-4.10764204]
 [-0.44071975]
 [ 9.63125638]
 [-8.3518118 ]
 [-3.62418901]
 [ 4.15331   ]
 [ 3.4468512 ]
 [ 6.36049088]
 [-7.77978539]]
[0.712453   0.05595118 0.83735691 0.0106654  0.01413372 0.07074765
 0.05479457 0.07763503 0.02412205 0.01625354]

S = Kriging(name='kriging',
            seed=123,
            log_level=50,
            n_theta=1,
            noise=True)
S.fit(X_train, y_train)

X_axis = np.linspace(start=-13, stop=13, num=1000).reshape(-1, 1)
mean_prediction, std_prediction, ei = S.predict(X_axis, return_val="all")

plt.scatter(X_train, y_train, label="Observations")
plt.plot(X_axis, mean_prediction, label="Mean prediction")
plt.legend()
plt.xlabel("$x$")
plt.ylabel("$f(x)$")
_ = plt.title("Runge: Gaussian process regression with nugget on noisy dataset")

12.13 Modifying Lambda Search Space

S = Kriging(name='kriging',
            seed=123,
            log_level=50,
            n_theta=1,
            noise=True,
            min_Lambda=0.1,
            max_Lambda=10)
S.fit(X_train, y_train)

print(f"Lambda: {S.Lambda}")
Lambda: 0.1
X_axis = np.linspace(start=-13, stop=13, num=1000).reshape(-1, 1)
mean_prediction, std_prediction, ei = S.predict(X_axis, return_val="all")

plt.scatter(X_train, y_train, label="Observations")
plt.plot(X_axis, mean_prediction, label="Mean prediction")
plt.legend()
plt.xlabel("$x$")
plt.ylabel("$f(x)$")
_ = plt.title("Gaussian process regression with nugget on noisy dataset. Modified Lambda search space.")
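
In the run above, the printed `Lambda` equals `min_Lambda`, which suggests that the estimate was pushed to the lower end of the allowed interval. The following sketch illustrates how search bounds restrict a likelihood-based nugget estimate. It uses a simple grid search over the concentrated negative log-likelihood of an ordinary Kriging model with fixed `theta`. This is an assumption-laden illustration, not the optimizer used by spotPython; `neg_log_likelihood`, `best_nugget`, and the data are hypothetical.

```python
import numpy as np

def neg_log_likelihood(x, y, theta, nugget):
    """Concentrated negative log-likelihood of a 1-d ordinary Kriging model."""
    n = x.size
    Psi = np.exp(-theta * (x[:, None] - x[None, :]) ** 2) + nugget * np.eye(n)
    L = np.linalg.cholesky(Psi)

    def solve(b):
        # Solve Psi @ v = b using the Cholesky factor.
        return np.linalg.solve(L.T, np.linalg.solve(L, b))

    ones = np.ones(n)
    mu = (ones @ solve(y)) / (ones @ solve(ones))
    resid = y - mu
    sigma2 = (resid @ solve(resid)) / n
    log_det = 2.0 * np.sum(np.log(np.diag(L)))
    return 0.5 * n * np.log(sigma2) + 0.5 * log_det

def best_nugget(x, y, theta, lower, upper, num=50):
    """Grid search for the nugget, restricted to [lower, upper]."""
    grid = np.logspace(np.log10(lower), np.log10(upper), num)
    values = [neg_log_likelihood(x, y, theta, lam) for lam in grid]
    return grid[int(np.argmin(values))]

# Made-up noisy observations.
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = np.array([4.3, 0.8, 0.2, 1.1, 3.7])

lam_wide = best_nugget(x, y, theta=1.0, lower=1e-6, upper=10.0)
lam_narrow = best_nugget(x, y, theta=1.0, lower=0.1, upper=10.0)
print(lam_wide, lam_narrow)
```

If the unconstrained optimum lies below the lower bound, the restricted search can only return a value inside the interval, typically the bound itself. Raising the lower bound therefore forces a minimum amount of smoothing.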