8  Multi-dimensional Functions

This chapter illustrates how high-dimensional functions can be optimized and analyzed.

8.1 Example: Spot and the 3-dim Sphere Function

import numpy as np
from spotPython.fun.objectivefunctions import analytical
from spotPython.utils.init import fun_control_init, surrogate_control_init
from spotPython.spot import spot

8.1.1 The Objective Function: 3-dim Sphere

The spotPython package provides several classes of objective functions. We will use an analytical objective function, i.e., a function that can be described by a (closed) formula: \[ f(x) = \sum_{i=1}^{k} x_i^2. \]

It is available as fun_sphere in the analytical class [SOURCE].

fun = analytical().fun_sphere
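
As a minimal sketch of what this objective computes, the following standard-library function evaluates the sphere formula row by row. The name `sphere` is illustrative and is not spotPython's implementation; the convention of one point per row is inferred from how design matrices are passed to objective functions later in this chapter.

```python
def sphere(X):
    """Return the sum of squares of each row of X (one point per row)."""
    return [sum(x_i ** 2 for x_i in x) for x in X]


# Two points in three dimensions: the origin (the optimum) and an arbitrary point.
points = [[0.0, 0.0, 0.0], [1.0, -1.0, 0.5]]
values = sphere(points)
print(values)  # [0.0, 2.25]
```

The minimum value 0 is attained only at the origin, which is why the tuning run below should drive all three inputs toward zero.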

Here we will use problem dimension \(k=3\), which can be specified by the lower bound arrays. The size of the lower bound array determines the problem dimension. If we select -1.0 * np.ones(3), a three-dimensional function is created. In contrast to the one-dimensional case (Section 7.5), where only one theta value was used, we will use three different theta values (one for each dimension), i.e., we set n_theta=3 in the surrogate_control. The prefix is set to "03" to distinguish the results from the one-dimensional case. Again, TensorBoard can be used to monitor the progress of the optimization.
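
To see what separate theta values provide, the sketch below assumes the Gaussian correlation that is commonly used in Kriging, \(\exp(-\sum_j \theta_j (x_j - z_j)^2)\). The exact parameterization inside spotPython may differ, and `gauss_corr` is a hypothetical helper introduced only for illustration.

```python
import math


def gauss_corr(x, z, theta):
    """Gaussian correlation exp(-sum_j theta_j * (x_j - z_j)**2) (assumed form)."""
    return math.exp(-sum(t * (a - b) ** 2 for t, a, b in zip(theta, x, z)))


x = [0.0, 0.0, 0.0]
z = [0.5, 0.5, 0.5]

# One shared theta: every dimension contributes equally to the correlation.
iso = gauss_corr(x, z, [1.0, 1.0, 1.0])

# Three thetas: a large theta_j makes the correlation decay faster along
# dimension j, i.e., the surrogate treats that dimension as more active.
aniso = gauss_corr(x, z, [1.0, 0.1, 10.0])

print(iso, aniso)
```

With a single theta, the surrogate cannot express that one input matters more than another; with one theta per dimension it can, which is also what makes a per-variable importance measure possible.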

We can also add interpretable labels to the dimensions, which will be used in the plots. Therefore, we set var_name=["Pressure", "Temp", "Lambda"] instead of the default var_name=None, which would result in the labels x_0, x_1, and x_2.

fun_control = fun_control_init(
              PREFIX="03",
              lower = -1.0*np.ones(3),
              upper = np.ones(3),
              var_name=["Pressure", "Temp", "Lambda"],
              show_progress=True)
surrogate_control = surrogate_control_init(n_theta=3)
spot_3 = spot.Spot(fun=fun,
                  fun_control=fun_control,
                  surrogate_control=surrogate_control)
spot_3.run()
Created spot_tensorboard_path: runs/spot_logs/03_maans14_2024-04-22_00-22-21 for SummaryWriter()
spotPython tuning: 0.03443324167631616 [#######---] 73.33% 
spotPython tuning: 0.03134655155643102 [########--] 80.00% 
spotPython tuning: 0.0009630181526749273 [#########-] 86.67% 
spotPython tuning: 8.570154459856623e-05 [#########-] 93.33% 
spotPython tuning: 6.496172516667557e-05 [##########] 100.00% Done...

{'CHECKPOINT_PATH': 'runs/saved_models/',
 'DATASET_PATH': 'data/',
 'PREFIX': '03',
 'RESULTS_PATH': 'results/',
 'TENSORBOARD_PATH': 'runs/',
 '_L_in': None,
 '_L_out': None,
 '_torchmetric': None,
 'accelerator': 'auto',
 'converters': None,
 'core_model': None,
 'core_model_name': None,
 'counter': 15,
 'data': None,
 'data_dir': './data',
 'data_module': None,
 'data_set': None,
 'data_set_name': None,
 'db_dict_name': None,
 'design': None,
 'device': None,
 'devices': 1,
 'enable_progress_bar': False,
 'eval': None,
 'fun_evals': 15,
 'fun_repeats': 1,
 'horizon': None,
 'infill_criterion': 'y',
 'k_folds': 3,
 'log_graph': False,
 'log_level': 50,
 'loss_function': None,
 'lower': array([-1., -1., -1.]),
 'max_surrogate_points': 30,
 'max_time': 1,
 'metric_params': {},
 'metric_river': None,
 'metric_sklearn': None,
 'metric_sklearn_name': None,
 'metric_torch': None,
 'model_dict': {},
 'n_points': 1,
 'n_samples': None,
 'n_total': None,
 'noise': False,
 'num_workers': 0,
 'ocba_delta': 0,
 'oml_grace_period': None,
 'optimizer': None,
 'path': None,
 'prep_model': None,
 'prep_model_name': None,
 'progress_file': None,
 'save_model': False,
 'scenario': None,
 'seed': 123,
 'show_batch_interval': 1000000,
 'show_models': False,
 'show_progress': True,
 'shuffle': None,
 'sigma': 0.0,
 'spot_tensorboard_path': 'runs/spot_logs/03_maans14_2024-04-22_00-22-21',
 'spot_writer': <torch.utils.tensorboard.writer.SummaryWriter object at 0x1045a3810>,
 'target_column': None,
 'target_type': None,
 'task': None,
 'test': None,
 'test_seed': 1234,
 'test_size': 0.4,
 'tolerance_x': 0,
 'train': None,
 'upper': array([1., 1., 1.]),
 'var_name': ['Pressure', 'Temp', 'Lambda'],
 'var_type': ['num'],
 'verbosity': 0,
 'weight_coeff': 0.0,
 'weights': 1.0,
 'weights_entry': None}
<spotPython.spot.spot.Spot at 0x395121fd0>
Note

Now we can start TensorBoard in the background with the following command:

tensorboard --logdir="./runs"

and can access the TensorBoard web server with the following URL:

http://localhost:6006/

8.1.2 Results

_ = spot_3.print_results()
min y: 6.496172516667557e-05
Pressure: 0.005280070995399376
Temp: 0.0019490323308060742
Lambda: 0.005769215581315232
spot_3.plot_progress()

8.1.3 A Contour Plot

We can select two dimensions, say \(i=0\) and \(j=1\), and generate a contour plot as follows.

Note:

We have specified identical min_z and max_z values to generate comparable plots.

spot_3.plot_contour(i=0, j=1, min_z=0, max_z=2.25)

  • In a similar manner, we can plot dimension \(i=0\) and \(j=2\):
spot_3.plot_contour(i=0, j=2, min_z=0, max_z=2.25)

  • The final combination is \(i=1\) and \(j=2\):
spot_3.plot_contour(i=1, j=2, min_z=0, max_z=2.25)

  • The three plots look very similar, because the fun_sphere is symmetric.
  • This can also be seen from the variable importance:
_ = spot_3.print_importance()
Pressure:  95.79368533570627
Temp:  99.99999999999999
Lambda:  87.19542775477797
spot_3.plot_importance()
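
The symmetry argument can be checked directly on the true function. The sketch below evaluates the sphere on a small grid over each pair of dimensions while holding the remaining variable at 0. That fixed value is an assumption made for this illustration; it is not taken from how plot_contour chooses the hidden variable.

```python
from itertools import combinations


def sphere(x):
    """Sum of squares of a single point."""
    return sum(v * v for v in x)


def slice_values(i, j, grid, fixed=0.0, k=3):
    """Evaluate the sphere on a grid over dimensions i and j, others held at `fixed`."""
    values = []
    for a in grid:
        for b in grid:
            x = [fixed] * k
            x[i], x[j] = a, b
            values.append(sphere(x))
    return values


grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
slices = {pair: slice_values(*pair, grid) for pair in combinations(range(3), 2)}

# All three two-dimensional slices contain exactly the same values.
n_distinct = len(set(tuple(v) for v in slices.values()))
print(n_distinct)  # 1
```

Any differences between the three contour plots above therefore come from the surrogate fit, not from the objective function.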

8.1.4 TensorBoard

TensorBoard visualization of the spotPython process. Objective function values plotted against wall time.

The second TensorBoard visualization shows the input values, i.e., \(x_0, \ldots, x_2\), plotted against the wall time. TensorBoard visualization of the spotPython process.

The third TensorBoard plot illustrates how spotPython can be used as a microscope for the internal mechanisms of the surrogate-based optimization process. Here, one important parameter, the correlation parameter \(\theta\) of the Kriging surrogate (one value per dimension), is plotted against the number of optimization steps.

TensorBoard visualization of the spotPython surrogate model.

8.1.5 Conclusion

Based on this quick analysis, we can conclude that all three dimensions are of similar importance: the importance values lie between about 87 and 100. This is expected, because the analytical function is known and treats every dimension identically.
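
One simple way to quantify how similar the importance values are is the ratio of the smallest to the largest value. The sketch below uses the values printed by print_importance() above; the ratio is an ad hoc summary chosen for this illustration, not a spotPython feature.

```python
# Importance values copied from the print_importance() output above.
importance = {
    "Pressure": 95.79368533570627,
    "Temp": 99.99999999999999,
    "Lambda": 87.19542775477797,
}

# A ratio close to 1 means the least important variable is nearly as
# important as the most important one.
ratio = min(importance.values()) / max(importance.values())
print(f"{ratio:.3f}")  # 0.872
```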

8.2 Factorial Variables

Until now, we have considered continuous variables. However, in many applications, the variables are not continuous, but rather discrete or categorical. For example, the number of layers in a neural network, the number of trees in a random forest, or the type of kernel in a support vector machine are all discrete variables. In the following, we will consider a simple example with two numerical variables and one categorical variable.

from spotPython.design.spacefilling import spacefilling
from spotPython.build.kriging import Kriging
from spotPython.fun.objectivefunctions import analytical
import numpy as np

First, we generate the data set for fitting the Kriging model. We use the spacefilling class to generate the first two dimensions of the \(n=30\) design points. The third dimension is a categorical variable, which can take the values \(0\), \(1\), or \(2\).

gen = spacefilling(2)
n = 30
rng = np.random.RandomState(1)
lower = np.array([-5,-0])
upper = np.array([10,15])
fun_orig = analytical().fun_branin
fun = analytical().fun_branin_factor

X0 = gen.scipy_lhd(n, lower=lower, upper = upper)
X1 = np.random.randint(low=0, high=3, size=(n,))
X = np.c_[X0, X1]
print(X[:5,:])
[[-2.84117593  5.97308949  2.        ]
 [-3.61017994  6.90781409  1.        ]
 [ 9.91204705  5.09395275  2.        ]
 [-4.4616725   1.3617128   2.        ]
 [-2.40987728  8.05505365  0.        ]]
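
The idea behind such a design can be sketched with the standard library alone. The function below implements a basic Latin hypercube: each dimension is split into n equal-width strata, one point is drawn in each stratum, and the strata are shuffled independently per dimension. It is a simplified stand-in written for this illustration, not the scipy_lhd method used above, and `latin_hypercube` is a hypothetical name.

```python
import random


def latin_hypercube(n, lower, upper, rng):
    """Basic Latin hypercube sample with n points inside [lower, upper]."""
    columns = []
    for lo, hi in zip(lower, upper):
        # One uniform draw inside each of the n equal-width strata of [0, 1).
        strata = [(i + rng.random()) / n for i in range(n)]
        rng.shuffle(strata)
        columns.append([lo + u * (hi - lo) for u in strata])
    return [list(row) for row in zip(*columns)]


rng = random.Random(1)
lower, upper = [-5.0, 0.0], [10.0, 15.0]
n = 30

X0 = latin_hypercube(n, lower, upper, rng)      # two numerical columns
X1 = [rng.randrange(3) for _ in range(n)]       # factor levels 0, 1, or 2
X = [row + [level] for row, level in zip(X0, X1)]
print(X[:3])
```

Drawing the factor levels from the same seeded generator keeps the whole design reproducible.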

The objective function is the fun_branin_factor in the analytical class [SOURCE]. It calculates the Branin function of \((x_1, x_2)\) with an additional factor based on the value of \(x_3\). If \(x_3 = 1\), the value of the Branin function is increased by 10. If \(x_3 = 2\), the value of the Branin function is decreased by 10. Otherwise, the value of the Branin function is not changed.

y = fun(X)
y_orig = fun_orig(X0)
data = np.c_[X, y_orig, y]
print(data[:5,:])
[[ -2.84117593   5.97308949   2.          32.09388125  22.09388125]
 [ -3.61017994   6.90781409   1.          43.965223    53.965223  ]
 [  9.91204705   5.09395275   2.           6.25588575  -3.74411425]
 [ -4.4616725    1.3617128    2.         212.41884106 202.41884106]
 [ -2.40987728   8.05505365   0.           9.25981051   9.25981051]]
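
The printed rows can be reproduced by hand. The sketch below assumes the standard Branin parameterization (\(a=1\), \(b=5.1/(4\pi^2)\), \(c=5/\pi\), \(r=6\), \(s=10\), \(t=1/(8\pi)\)) and applies the offsets described above. The function names are illustrative; this is not the spotPython source.

```python
import math


def branin(x1, x2):
    """Standard Branin function (assumed parameterization)."""
    a, b, c = 1.0, 5.1 / (4 * math.pi ** 2), 5.0 / math.pi
    r, s, t = 6.0, 10.0, 1.0 / (8 * math.pi)
    return a * (x2 - b * x1 ** 2 + c * x1 - r) ** 2 + s * (1 - t) * math.cos(x1) + s


def branin_factor(x1, x2, x3):
    """Branin value shifted by +10 for level 1, -10 for level 2, 0 otherwise."""
    offset = {1: 10.0, 2: -10.0}.get(int(x3), 0.0)
    return branin(x1, x2) + offset


# First row of the printed data: x3 = 2, so the value is decreased by 10.
y_orig = branin(-2.84117593, 5.97308949)
y = branin_factor(-2.84117593, 5.97308949, 2)
print(round(y_orig, 4), round(y, 4))
```

Up to the rounding of the printed inputs, these values agree with the y_orig and y columns of the first row above.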

We fit two Kriging models, one with three numerical variables and one with two numerical variables and one categorical variable. We then compare the predictions of the two models.

S = Kriging(name='kriging',  seed=123, log_level=50, n_theta=3, noise=False, var_type=["num", "num", "num"])
S.fit(X, y)
Sf = Kriging(name='kriging',  seed=123, log_level=50, n_theta=3, noise=False, var_type=["num", "num", "factor"])
Sf.fit(X, y)
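
Why the variable type can matter is easiest to see from the distance between levels. The sketch below contrasts a numerical difference with a simple mismatch distance. The mismatch distance is an assumption chosen for illustration; the distance spotPython actually uses for "factor" variables is not described in this chapter.

```python
def numeric_gap(a, b):
    """Distance when the levels are treated as ordinary numbers."""
    return abs(a - b)


def factor_gap(a, b):
    """Mismatch distance: 0 for equal levels, 1 for different levels (assumed)."""
    return 0 if a == b else 1


for a, b in [(0, 1), (0, 2), (1, 2)]:
    print(a, b, numeric_gap(a, b), factor_gap(a, b))
```

Treated as numbers, levels 0 and 2 are twice as far apart as levels 0 and 1, so a numerical surrogate expects level 1 to behave like something in between. For the objective used here this ordering is misleading, because level 1 adds 10 while level 2 subtracts 10.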

We can now compare the predictions of the two models. We generate a new test data set and, for each model, calculate the sum of the absolute differences between its predictions and the true values of the objective function. If treating the third variable as a factor matters, the sum for the factor-aware model Sf should be smaller than the sum for the purely numerical model S.

n = 100
k = 100
y_true = np.zeros(n*k)
y_pred= np.zeros(n*k)
y_factor_pred= np.zeros(n*k)
for i in range(k):
  X0 = gen.scipy_lhd(n, lower=lower, upper = upper)
  X1 = np.random.randint(low=0, high=3, size=(n,))
  X = np.c_[X0, X1]
  a = i*n
  b = (i+1)*n
  y_true[a:b] = fun(X)
  y_pred[a:b] = S.predict(X)
  y_factor_pred[a:b] = Sf.predict(X)
import pandas as pd
df = pd.DataFrame({"y":y_true, "Prediction":y_pred, "Prediction_factor":y_factor_pred})
df.head()
           y  Prediction  Prediction_factor
0   6.684749   17.660407           8.981754
1  95.865258   90.509493          94.789658
2  49.811774   31.120551          50.354570
3   8.177150    5.917583           8.441082
4  10.968377   14.164791           4.821081
df.tail()
              y  Prediction  Prediction_factor
9995  73.620503   82.887195          73.604479
9996  76.187178   92.365618          76.894174
9997  29.494401   27.820939          29.928223
9998  15.390268   15.671184           3.957893
9999  26.261264   13.626489          25.011506
s=np.sum(np.abs(y_pred - y_true))
sf=np.sum(np.abs(y_factor_pred - y_true))
res = (sf - s)
print(res)
-93783.8689147306
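
Because res is defined as sf - s, a negative value means that the factor-aware model has the smaller total error. The same computation can be followed by hand on the five rows shown by df.head(); the sketch below uses only those printed values and the printed res.

```python
# Rows copied from the df.head() output above: (y, Prediction, Prediction_factor).
rows = [
    (6.684749, 17.660407, 8.981754),
    (95.865258, 90.509493, 94.789658),
    (49.811774, 31.120551, 50.354570),
    (8.177150, 5.917583, 8.441082),
    (10.968377, 14.164791, 4.821081),
]

s = sum(abs(pred - y) for y, pred, _ in rows)      # numerical model
sf = sum(abs(pred_f - y) for y, _, pred_f in rows)  # factor model
print(round(s, 6), round(sf, 6))

# Average difference per prediction over all n * k = 100 * 100 test points.
mean_gap = -93783.8689147306 / (100 * 100)
print(round(mean_gap, 2))  # -9.38
```

On these five rows the factor model is closer to the true value in four cases and worse in one (row 4). Over the full test set, its absolute error is smaller by about 9.4 per prediction on average.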
from spotPython.plot.validation import plot_actual_vs_predicted
plot_actual_vs_predicted(y_test=df["y"], y_pred=df["Prediction"], title="Default")
plot_actual_vs_predicted(y_test=df["y"], y_pred=df["Prediction_factor"], title="Factor")

8.3 Exercises

8.3.1 The Three-Dimensional fun_cubed

  • The input dimension is 3. The search range is \(-1 \leq x \leq 1\) for all dimensions.
  • Generate contour plots
  • Calculate the variable importance.
  • Discuss the variable importance:
    • Are all variables equally important?
    • If not:
      • Which is the most important variable?
      • Which is the least important variable?

8.3.2 The Ten-Dimensional fun_wing_wt

  • The input dimension is 10. The search range is \(0 \leq x \leq 1\) for all dimensions.
  • Calculate the variable importance.
  • Discuss the variable importance:
    • Are all variables equally important?
    • If not:
      • Which is the most important variable?
      • Which is the least important variable?
    • Generate contour plots for the three most important variables. Do they confirm your selection?

8.3.3 The Three-Dimensional fun_runge

  • The input dimension is 3. The search range is \(-5 \leq x \leq 5\) for all dimensions.
  • Generate contour plots
  • Calculate the variable importance.
  • Discuss the variable importance:
    • Are all variables equally important?
    • If not:
      • Which is the most important variable?
      • Which is the least important variable?

8.3.4 The Three-Dimensional fun_linear

  • The input dimension is 3. The search range is \(-5 \leq x \leq 5\) for all dimensions.
  • Generate contour plots
  • Calculate the variable importance.
  • Discuss the variable importance:
    • Are all variables equally important?
    • If not:
      • Which is the most important variable?
      • Which is the least important variable?

8.3.5 The Two-Dimensional Rosenbrock Function fun_rosen

  • The input dimension is 2. The search range is \(-5 \leq x \leq 10\) for all dimensions.
  • See Rosenbrock function and Rosenbrock Function for details.
  • Generate contour plots
  • Calculate the variable importance.
  • Discuss the variable importance:
    • Are all variables equally important?
    • If not:
      • Which is the most important variable?
      • Which is the least important variable?

8.4 Selected Solutions

8.4.1 Solution to Exercise Section 8.3.5: The Two-dimensional Rosenbrock Function fun_rosen

import numpy as np
from spotPython.fun.objectivefunctions import analytical
from spotPython.utils.init import fun_control_init, surrogate_control_init
from spotPython.spot import spot

8.4.1.1 The Objective Function: 2-dim fun_rosen

The spotPython package provides several classes of objective functions. We will use the fun_rosen in the analytical class [SOURCE].

fun_rosen = analytical().fun_rosen

Here we will use problem dimension \(k=2\), which can be specified by the lower bound arrays. The size of the lower bound array determines the problem dimension. If we select -5.0 * np.ones(2), a two-dimensional function is created. In contrast to the one-dimensional case, where only one theta value is used, we will use \(k\) different theta values (one for each dimension), i.e., we set n_theta=2 in the surrogate_control. The prefix is set to "ROSEN". Again, TensorBoard can be used to monitor the progress of the optimization.

fun_control = fun_control_init(
              PREFIX="ROSEN",
              lower = -5.0*np.ones(2),
              upper = 10*np.ones(2),
              show_progress=True,
              fun_evals=25)
surrogate_control = surrogate_control_init(n_theta=2)
spot_rosen = spot.Spot(fun=fun_rosen,
                  fun_control=fun_control,
                  surrogate_control=surrogate_control)
spot_rosen.run()
Created spot_tensorboard_path: runs/spot_logs/ROSEN_maans14_2024-04-22_00-22-36 for SummaryWriter()
spotPython tuning: 90.7801015955818 [####------] 44.00% 
spotPython tuning: 1.0172832635943474 [#####-----] 48.00% 
spotPython tuning: 1.0172832635943474 [#####-----] 52.00% 
spotPython tuning: 1.0172832635943474 [######----] 56.00% 
spotPython tuning: 1.0172832635943474 [######----] 60.00% 
spotPython tuning: 1.0172832635943474 [######----] 64.00% 
spotPython tuning: 1.0172832635943474 [#######---] 68.00% 
spotPython tuning: 1.0172832635943474 [#######---] 72.00% 
spotPython tuning: 1.0172832635943474 [########--] 76.00% 
spotPython tuning: 1.0172832635943474 [########--] 80.00% 
spotPython tuning: 0.9921822630967522 [########--] 84.00% 
spotPython tuning: 0.7147779101762312 [#########-] 88.00% 
spotPython tuning: 0.7147779101762312 [#########-] 92.00% 
spotPython tuning: 0.7147779101762312 [##########] 96.00% 
spotPython tuning: 0.7147779101762312 [##########] 100.00% Done...

{'CHECKPOINT_PATH': 'runs/saved_models/',
 'DATASET_PATH': 'data/',
 'PREFIX': 'ROSEN',
 'RESULTS_PATH': 'results/',
 'TENSORBOARD_PATH': 'runs/',
 '_L_in': None,
 '_L_out': None,
 '_torchmetric': None,
 'accelerator': 'auto',
 'converters': None,
 'core_model': None,
 'core_model_name': None,
 'counter': 25,
 'data': None,
 'data_dir': './data',
 'data_module': None,
 'data_set': None,
 'data_set_name': None,
 'db_dict_name': None,
 'design': None,
 'device': None,
 'devices': 1,
 'enable_progress_bar': False,
 'eval': None,
 'fun_evals': 25,
 'fun_repeats': 1,
 'horizon': None,
 'infill_criterion': 'y',
 'k_folds': 3,
 'log_graph': False,
 'log_level': 50,
 'loss_function': None,
 'lower': array([-5., -5.]),
 'max_surrogate_points': 30,
 'max_time': 1,
 'metric_params': {},
 'metric_river': None,
 'metric_sklearn': None,
 'metric_sklearn_name': None,
 'metric_torch': None,
 'model_dict': {},
 'n_points': 1,
 'n_samples': None,
 'n_total': None,
 'noise': False,
 'num_workers': 0,
 'ocba_delta': 0,
 'oml_grace_period': None,
 'optimizer': None,
 'path': None,
 'prep_model': None,
 'prep_model_name': None,
 'progress_file': None,
 'save_model': False,
 'scenario': None,
 'seed': 123,
 'show_batch_interval': 1000000,
 'show_models': False,
 'show_progress': True,
 'shuffle': None,
 'sigma': 0.0,
 'spot_tensorboard_path': 'runs/spot_logs/ROSEN_maans14_2024-04-22_00-22-36',
 'spot_writer': <torch.utils.tensorboard.writer.SummaryWriter object at 0x397557410>,
 'target_column': None,
 'target_type': None,
 'task': None,
 'test': None,
 'test_seed': 1234,
 'test_size': 0.4,
 'tolerance_x': 0,
 'train': None,
 'upper': array([10., 10.]),
 'var_name': None,
 'var_type': ['num'],
 'verbosity': 0,
 'weight_coeff': 0.0,
 'weights': 1.0,
 'weights_entry': None}
<spotPython.spot.spot.Spot at 0x397766ad0>
Note

Now we can start TensorBoard in the background with the following command:

tensorboard --logdir="./runs"

and can access the TensorBoard web server with the following URL:

http://localhost:6006/

8.4.1.2 Results

_ = spot_rosen.print_results()
min y: 0.7147779101762312
x0: 0.19951670458655138
x1: 0.1258327277797004
spot_rosen.plot_progress(log_y=True)
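
The reported result can be checked by hand. The sketch below assumes the two-dimensional Rosenbrock form \((x_0 - 1)^2 + b\,(x_1 - x_0^2)^2\) with \(b = 10\). The value of \(b\) is inferred here from the printed minimum, not taken from the spotPython source, and `rosen` is an illustrative name.

```python
def rosen(x0, x1, b=10.0):
    """Two-dimensional Rosenbrock function; b = 10 is an inferred assumption."""
    return (x0 - 1.0) ** 2 + b * (x1 - x0 ** 2) ** 2


# Best point reported by print_results() above.
y_best = rosen(0.19951670458655138, 0.1258327277797004)
print(y_best)

# The global optimum of this function is at (1, 1) with value 0.
y_opt = rosen(1.0, 1.0)
print(y_opt)  # 0.0
```

Under this assumption the computed value agrees with the printed min y, and the comparison with the optimum at \((1, 1)\) shows that 25 function evaluations were not enough to reach the bottom of the valley.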

8.4.1.3 A Contour Plot

We can select two dimensions, say \(i=0\) and \(j=1\), and generate a contour plot as follows.

Note:

For higher dimensions, it might be useful to have identical min_z and max_z values to generate comparable plots. The default values are min_z=None and max_z=None, which will be replaced by the minimum and maximum values of the objective function.

min_z = None
max_z = None
spot_rosen.plot_contour(i=0, j=1, min_z=min_z, max_z=max_z)
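
The described default behavior can be sketched as follows. The helper `resolve_z_limits` is hypothetical and only mirrors the rule stated in the note above; it is not part of spotPython.

```python
def resolve_z_limits(z_values, min_z=None, max_z=None):
    """Replace missing limits by the smallest and largest value in z_values."""
    # Compare against None explicitly so that a user-supplied limit of 0 is kept.
    lo = min(z_values) if min_z is None else min_z
    hi = max(z_values) if max_z is None else max_z
    return lo, hi


z_values = [0.5, 3.0, 1.2]
print(resolve_z_limits(z_values))                       # (0.5, 3.0)
print(resolve_z_limits(z_values, min_z=0, max_z=2.25))  # (0, 2.25)
```

With data-driven limits, every plot gets its own color scale; fixed limits, as used for the sphere function above, make several plots directly comparable.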

  • The variable importance can be calculated as follows:
_ = spot_rosen.print_importance()
x0:  100.0
x1:  1.2641431841859785
spot_rosen.plot_importance()

8.4.1.4 TensorBoard

TBD

8.5 Jupyter Notebook

Note