[[15]{.chapter-number}  [Sequential Parameter Optimization: Using `scipy` Optimizers]{.chapter-title}]{#sec-scipy-optimizers .quarto-section-identifier}

doi:10.48550/arXiv.2307.10262


By default, spotpython uses differential_evolution from the scipy.optimize package to search on the surrogate. Any other optimizer from scipy.optimize can be used instead. This chapter describes how different optimizers from the scipy.optimize package can be applied to the surrogate. The optimization algorithms are documented at https://docs.scipy.org/doc/scipy/reference/optimize.html
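To make the phrase "optimizer on the surrogate" concrete, the following minimal sketch performs a single surrogate step by hand. It does not use spotpython: the surrogate is scipy's RBFInterpolator, chosen here only as a stand-in for the Kriging model, and the names branin, surrogate_prediction, x_new, and y_new are illustrative assumptions. The Branin function used as the expensive objective is the one introduced in the next section.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator
from scipy.optimize import differential_evolution


def branin(x):
    """Branin function for a point x = (x1, x2)."""
    x1, x2 = x
    b = 5.1 / (4.0 * np.pi**2)
    c = 5.0 / np.pi
    t = 1.0 / (8.0 * np.pi)
    return (x2 - b * x1**2 + c * x1 - 6.0) ** 2 + 10.0 * (1.0 - t) * np.cos(x1) + 10.0


lower = np.array([-5.0, 0.0])
upper = np.array([10.0, 15.0])

# Initial design: 20 random points, evaluated on the (expensive) objective.
rng = np.random.default_rng(0)
X = lower + (upper - lower) * rng.random((20, 2))
y = np.array([branin(x) for x in X])

# Stand-in surrogate (assumption): an RBF interpolator instead of Kriging.
surrogate = RBFInterpolator(X, y)


def surrogate_prediction(x):
    """Cheap prediction of the surrogate at a single point."""
    return float(surrogate(np.atleast_2d(x))[0])


# The scipy optimizer searches on the surrogate, not on the objective itself.
res = differential_evolution(surrogate_prediction, bounds=list(zip(lower, upper)), seed=0)
x_new = res.x              # candidate proposed by the surrogate search
y_new = branin(x_new)      # one new evaluation of the real objective
print("proposed point:", x_new, "objective value:", y_new)
```

In a sequential procedure the pair (x_new, y_new) would be added to the design, the surrogate refitted, and the search repeated. Replacing differential_evolution by another scipy optimizer changes only the search on the cheap surrogate.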

import numpy as np
from math import inf
from spotpython.fun.objectivefunctions import Analytical
from spotpython.spot import Spot
from scipy.optimize import shgo
from scipy.optimize import direct
from scipy.optimize import differential_evolution
from scipy.optimize import dual_annealing
from scipy.optimize import basinhopping
from spotpython.utils.init import fun_control_init, design_control_init, optimizer_control_init, surrogate_control_init

15.1 The Objective Function Branin

The spotpython package provides several classes of objective functions. We will use an analytical objective function, i.e., a function that can be described by a (closed) formula. Here we will use the Branin function. The 2-dim Branin function is \[ y = a (x_2 - b x_1^2 + c x_1 - r) ^2 + s (1 - t) \cos(x_1) + s, \] where values of \(a\), \(b\), \(c\), \(r\), \(s\) and \(t\) are: \(a = 1\), \(b = 5.1 / (4\pi^2)\), \(c = 5 / \pi\), \(r = 6\), \(s = 10\) and \(t = 1 / (8\pi)\).

It has three global minima: \(f(x) = 0.397887\) at \((-\pi, 12.275)\), \((\pi, 2.275)\), and \((9.42478, 2.475)\).

Input Domain: This function is usually evaluated on the square \((x_1, x_2) \in [-5, 10] \times [0, 15]\).
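The formula and the three minima can be checked with a few lines of NumPy. This is a sketch that is independent of spotpython; the function name branin and the array minima are chosen here for illustration only.

```python
import numpy as np


def branin(x1, x2):
    """Branin function with the constants given in the text."""
    a = 1.0
    b = 5.1 / (4.0 * np.pi**2)
    c = 5.0 / np.pi
    r = 6.0
    s = 10.0
    t = 1.0 / (8.0 * np.pi)
    return a * (x2 - b * x1**2 + c * x1 - r) ** 2 + s * (1.0 - t) * np.cos(x1) + s


# The three global minima listed above.
minima = np.array([[-np.pi, 12.275], [np.pi, 2.275], [9.42478, 2.475]])
values = branin(minima[:, 0], minima[:, 1])

for (x1, x2), y in zip(minima, values):
    print(f"f({x1:.5f}, {x2:.3f}) = {y:.6f}")
```

At each of the three points the squared term vanishes and \(\cos(x_1) = -1\), so the function value reduces to \(s\,t = 10 / (8\pi) \approx 0.397887\).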

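# Analytical was already imported above; lower and upper define the search box
lower = np.array([-5, 0])
upper = np.array([10, 15])
# Branin objective function from spotpython's collection of analytical functions
fun = Analytical(seed=123).fun_branin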

15.2 The Optimizer

Differential Evolution (DE) from the scipy.optimize package, see https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.differential_evolution.html#scipy.optimize.differential_evolution, is the default optimizer for the search on the surrogate. Other global optimizers from scipy.optimize can be used with spotpython as well, see https://docs.scipy.org/doc/scipy/reference/optimize.html#global-optimization. Examples are:

dual_annealing
direct
shgo
basinhopping

These optimizers can be selected as follows:

from scipy.optimize import differential_evolution
optimizer = differential_evolution

As noted above, we will use differential_evolution. The optimizer is given a budget of 1000. This value is passed to differential_evolution as the argument maxiter (int), which defines the maximum number of generations over which the entire differential evolution population is evolved. It therefore limits generations, not single function evaluations, see https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.differential_evolution.html#scipy.optimize.differential_evolution
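The meaning of maxiter can be seen by calling differential_evolution directly. The sketch below applies it to the Branin function from Section 15.1 rather than to a surrogate and does not show how spotpython forwards the value internally. The names branin, bounds, and result are illustrative, and the seed is an arbitrary choice.

```python
import numpy as np
from scipy.optimize import differential_evolution


def branin(x):
    """Branin function for a point x = (x1, x2)."""
    x1, x2 = x
    b = 5.1 / (4.0 * np.pi**2)
    c = 5.0 / np.pi
    t = 1.0 / (8.0 * np.pi)
    return (x2 - b * x1**2 + c * x1 - 6.0) ** 2 + 10.0 * (1.0 - t) * np.cos(x1) + 10.0


bounds = [(-5.0, 10.0), (0.0, 15.0)]

# maxiter limits the number of generations, not the number of evaluations.
result = differential_evolution(branin, bounds, maxiter=1000, seed=123)

print("generations used (nit):     ", result.nit)
print("function evaluations (nfev):", result.nfev)
print("best value found:           ", result.fun)
print("best point found:           ", result.x)
```

Because every generation evaluates a whole population, nfev is considerably larger than nit. The run usually stops well before 1000 generations, as soon as the convergence tolerance of the population is met.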

TensorBoard

As in the one-dimensional case, which is discussed in Section 13.8, we can use TensorBoard to monitor the progress of the optimization. The code is similar; only the prefix is different:

fun_control=fun_control_init(
                    lower = lower,
                    upper = upper,
                    fun_evals = 20,
                    PREFIX = "04_DE_"
                    )
surrogate_control=surrogate_control_init(
                    n_theta=len(lower))

spot_de = Spot(fun=fun,
                    fun_control=fun_control,
                    surrogate_control=surrogate_control)
spot_de.run()

spotpython tuning: 3.80045515981163 [######----] 55.00% 
spotpython tuning: 3.80045515981163 [######----] 60.00% 
spotpython tuning: 3.1595355074600704 [######----] 65.00% 
spotpython tuning: 3.134553168752926 [#######---] 70.00% 
spotpython tuning: 3.052359008241419 [########--] 75.00% 
spotpython tuning: 2.9065279131311197 [########--] 80.00% 
spotpython tuning: 0.4487806844392068 [########--] 85.00% 
spotpython tuning: 0.4193192118950613 [#########-] 90.00% 
spotpython tuning: 0.3992766654184763 [##########] 95.00% 
spotpython tuning: 0.3981104654592649 [##########] 100.00% Done...

Experiment saved to 04_DE__res.pkl

<spotpython.spot.spot.Spot at 0x118f3cfb0>

15.2.1 TensorBoard

If the prefix argument in fun_control_init() is not None (as above, where the prefix was set to 04_DE_), we can start TensorBoard in the background with the following command:

tensorboard --logdir="./runs"

We can access the TensorBoard web server with the following URL:

http://localhost:6006/

The TensorBoard plot illustrates how spotpython can be used as a microscope for the internal mechanisms of the surrogate-based optimization process. Here, one important parameter, \(\theta\) of the Kriging surrogate (the width, or activity, parameter of its correlation function), is plotted against the number of optimization steps.

TensorBoard visualization of the spotpython optimization process and the surrogate model.

15.3 Print the Results

spot_de.print_results()

min y: 0.3981104654592649
x0: 3.1354737819805276
x1: 2.273192326567295

[['x0', np.float64(3.1354737819805276)], ['x1', np.float64(2.273192326567295)]]
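To see which of the three global minima of the Branin function this result corresponds to, the distance to each known minimum can be computed. The helper nearest_minimum below is a hypothetical utility written for this illustration; it is not part of spotpython. The coordinates in x_best are copied from the output above.

```python
import numpy as np

# Known global minima of the Branin function (see Section 15.1).
known_minima = np.array([[-np.pi, 12.275], [np.pi, 2.275], [9.42478, 2.475]])


def nearest_minimum(x):
    """Return index of and Euclidean distance to the closest known minimum."""
    distances = np.linalg.norm(known_minima - np.asarray(x, dtype=float), axis=1)
    idx = int(np.argmin(distances))
    return idx, float(distances[idx])


# Best point reported by spot_de.print_results() above.
x_best = [3.1354737819805276, 2.273192326567295]
idx, dist = nearest_minimum(x_best)
print(f"closest minimum: {known_minima[idx]}, distance: {dist:.5f}")
```

For the run shown above, the closest minimum is the one at \((\pi, 2.275)\).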

15.4 Show the Progress

spot_de.plot_progress(log_y=True)

spot_de.surrogate.plot()

15.5 Exercises

15.5.1 `dual_annealing`

Describe the optimization algorithm, see scipy.optimize.dual_annealing.
Use the algorithm as an optimizer on the surrogate.

Tip: Selecting the Optimizer for the Surrogate

We can run spotpython with the dual_annealing optimizer as follows:

spot_da = Spot(fun=fun,
                    fun_control=fun_control,
                    optimizer=dual_annealing,
                    surrogate_control=surrogate_control)
spot_da.run()
spot_da.print_results()
spot_da.plot_progress(log_y=True)
spot_da.surrogate.plot()

spotpython tuning: 3.8004527658422695 [######----] 55.00% 
spotpython tuning: 3.8004527658422695 [######----] 60.00% 
spotpython tuning: 3.159739385950637 [######----] 65.00% 
spotpython tuning: 3.134737088109639 [#######---] 70.00% 
spotpython tuning: 3.0516422017960814 [########--] 75.00% 
spotpython tuning: 2.9054302378143575 [########--] 80.00% 
spotpython tuning: 0.4421664422471121 [########--] 85.00% 
spotpython tuning: 0.42372505893309587 [#########-] 90.00% 
spotpython tuning: 0.39983413422763014 [##########] 95.00% 
spotpython tuning: 0.398257665280795 [##########] 100.00% Done...

Experiment saved to 04_DE__res.pkl
min y: 0.398257665280795
x0: 3.1343284505688267
x1: 2.2698569095443624

15.5.2 `direct`

Describe the optimization algorithm
Use the algorithm as an optimizer on the surrogate

Tip: Selecting the Optimizer for the Surrogate

We can run spotpython with the direct optimizer as follows:

spot_di = Spot(fun=fun,
                    fun_control=fun_control,
                    optimizer=direct,
                    surrogate_control=surrogate_control)
spot_di.run()
spot_di.print_results()
spot_di.plot_progress(log_y=True)
spot_di.surrogate.plot()

spotpython tuning: 3.808603529901438 [######----] 55.00% 
spotpython tuning: 3.808603529901438 [######----] 60.00% 
spotpython tuning: 3.19804562480188 [######----] 65.00% 
spotpython tuning: 3.17767194117126 [#######---] 70.00% 
spotpython tuning: 3.165751373773567 [########--] 75.00% 
spotpython tuning: 3.133265047041581 [########--] 80.00% 
spotpython tuning: 3.1245953461467835 [########--] 85.00% 
spotpython tuning: 0.505142127626538 [#########-] 90.00% 
spotpython tuning: 0.45462335569409085 [##########] 95.00% 
spotpython tuning: 0.4026960069118921 [##########] 100.00% Done...

Experiment saved to 04_DE__res.pkl
min y: 0.4026960069118921
x0: 3.110425240054868
x1: 2.287379972565158

15.5.3 `shgo`

Describe the optimization algorithm
Use the algorithm as an optimizer on the surrogate

Tip: Selecting the Optimizer for the Surrogate

We can run spotpython with the shgo optimizer as follows:

spot_sh = Spot(fun=fun,
                    fun_control=fun_control,
                    optimizer=shgo,
                    surrogate_control=surrogate_control)
spot_sh.run()
spot_sh.print_results()
spot_sh.plot_progress(log_y=True)
spot_sh.surrogate.plot()

spotpython tuning: 3.8004579559249176 [######----] 55.00% 
spotpython tuning: 3.8004579559249176 [######----] 60.00% 
spotpython tuning: 3.1595754982621527 [######----] 65.00% 
spotpython tuning: 3.134470467449841 [#######---] 70.00% 
spotpython tuning: 3.0500504117227276 [########--] 75.00% 
spotpython tuning: 2.9036702793420908 [########--] 80.00% 
spotpython tuning: 0.4550529222929214 [########--] 85.00% 
spotpython tuning: 0.42255005357414355 [#########-] 90.00% 
spotpython tuning: 0.3994253165131507 [##########] 95.00% 
spotpython tuning: 0.398217967002898 [##########] 100.00% Done...

Experiment saved to 04_DE__res.pkl
min y: 0.398217967002898
x0: 3.134718511984561
x1: 2.270181590317817

15.5.4 `basinhopping`

Describe the optimization algorithm
Use the algorithm as an optimizer on the surrogate

Tip: Selecting the Optimizer for the Surrogate

We can run spotpython with the basinhopping optimizer as follows:

spot_bh = Spot(fun=fun,
                    fun_control=fun_control,
                    optimizer=basinhopping,
                    surrogate_control=surrogate_control)
spot_bh.run()
spot_bh.print_results()
spot_bh.plot_progress(log_y=True)
spot_bh.surrogate.plot()

spotpython tuning: 3.800515912225105 [######----] 55.00% 
spotpython tuning: 3.800515912225105 [######----] 60.00% 
spotpython tuning: 3.160076172435943 [######----] 65.00% 
spotpython tuning: 3.1345680452422 [#######---] 70.00% 
spotpython tuning: 3.0518661500352735 [########--] 75.00% 
spotpython tuning: 2.9063991714379007 [########--] 80.00% 
spotpython tuning: 0.4449573323802589 [########--] 85.00% 
spotpython tuning: 0.4217546937771459 [#########-] 90.00% 
spotpython tuning: 0.39939160635564264 [##########] 95.00% 
spotpython tuning: 0.3982273107606229 [##########] 100.00% Done...

Experiment saved to 04_DE__res.pkl
min y: 0.3982273107606229
x0: 3.1346952559844707
x1: 2.269823514736108

15.5.5 Performance Comparison

Compare the performance and run time of the 5 different optimizers:

differential_evolution
dual_annealing
direct
shgo
basinhopping.

The Branin function has three global minima with \(f(x) = 0.397887\), located at

- \((-\pi, 12.275)\),
- \((\pi, 2.275)\), and
- \((9.42478, 2.475)\).

Which optima are found by the optimizers?
Does the seed argument in fun = Analytical(seed=123).fun_branin change this behavior?
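As a possible starting point for the comparison, the sketch below times the five scipy optimizers on the Branin function itself. This is a simplification: inside spotpython the optimizers search on the surrogate, so run times and results there will differ. The start point x0, the value niter=50, the local method L-BFGS-B for basinhopping, and the seeds are assumptions made for this example.

```python
import time

import numpy as np
from scipy.optimize import (
    basinhopping,
    differential_evolution,
    direct,
    dual_annealing,
    shgo,
)


def branin(x):
    """Branin function for a point x = (x1, x2)."""
    x1, x2 = x
    b = 5.1 / (4.0 * np.pi**2)
    c = 5.0 / np.pi
    t = 1.0 / (8.0 * np.pi)
    return (x2 - b * x1**2 + c * x1 - 6.0) ** 2 + 10.0 * (1.0 - t) * np.cos(x1) + 10.0


bounds = [(-5.0, 10.0), (0.0, 15.0)]
x0 = np.array([2.5, 7.5])  # centre of the domain; basinhopping needs a start point

# basinhopping has no bounds argument, so the bounds go to the local minimizer.
runs = {
    "differential_evolution": lambda: differential_evolution(branin, bounds, seed=1),
    "dual_annealing": lambda: dual_annealing(branin, bounds, seed=1),
    "direct": lambda: direct(branin, bounds),
    "shgo": lambda: shgo(branin, bounds),
    "basinhopping": lambda: basinhopping(
        branin,
        x0,
        niter=50,
        seed=1,
        minimizer_kwargs={"method": "L-BFGS-B", "bounds": bounds},
    ),
}

results = {}
for name, run in runs.items():
    start = time.perf_counter()
    res = run()
    elapsed = time.perf_counter() - start
    results[name] = (np.asarray(res.x), float(res.fun), elapsed)
    print(f"{name:24s} f = {float(res.fun):.6f}  x = {np.round(res.x, 4)}  time = {elapsed:.3f} s")
```

All three minima have the same function value, so the optimizers can agree on the value of f while returning different points x. Comparing the returned points therefore answers the question of which optimum each optimizer finds.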

15.6 Jupyter Notebook

Note

The Jupyter notebook of this chapter is available on GitHub in the Hyperparameter-Tuning-Cookbook repository.
