In this section, we show how spotpython can be integrated into the PyTorch Lightning training workflow for a regression task. The example demonstrates how easy it is to use spotpython to tune the hyperparameters of a PyTorch Lightning model.
After importing the necessary libraries, the fun_control dictionary is set up via the fun_control_init function. The fun_control dictionary contains:

- PREFIX: a unique identifier for the experiment.
- fun_evals: the number of function evaluations.
- max_time: the maximum run time in minutes.
- data_set: the data set. Here we use the Diabetes data set that is provided by spotpython.
- core_model_name: the class name of the neural network model. This neural network model is provided by spotpython.
- hyperdict: the hyperparameter dictionary. This dictionary is used to define the hyperparameters of the neural network model. It is also provided by spotpython.
- _L_in: the number of input features. Since the Diabetes data set has 10 features, _L_in is set to 10.
- _L_out: the number of output features. Since we want to predict a single value, _L_out is set to 1.
The HyperLight class is used to define the objective function fun. It connects the PyTorch and the spotpython methods and is provided by spotpython.
from math import inf
from spotpython.data.diabetes import Diabetes
from spotpython.hyperdict.light_hyper_dict import LightHyperDict
from spotpython.fun.hyperlight import HyperLight
from spotpython.utils.init import (fun_control_init, surrogate_control_init, design_control_init)
from spotpython.utils.eda import print_exp_table
from spotpython.spot import Spot
from spotpython.utils.file import get_experiment_filename

PREFIX = "605"
data_set = Diabetes()
fun_control = fun_control_init(
    PREFIX=PREFIX,
    fun_evals=inf,
    max_time=1,
    data_set=data_set,
    core_model_name="light.regression.NNResNetRegressor",
    hyperdict=LightHyperDict,
    _L_in=10,
    _L_out=1)
fun = HyperLight().fun
module_name: light
submodule_name: regression
model_name: NNResNetRegressor
The method set_hyperparameter allows the user to modify default hyperparameter settings. Here we modify some hyperparameters to keep the model small and to decrease the tuning time.
from spotpython.hyperparameters.values import set_hyperparameter
set_hyperparameter(fun_control, "optimizer", ["Adadelta", "Adam", "Adamax"])
set_hyperparameter(fun_control, "l1", [3, 4])
set_hyperparameter(fun_control, "epochs", [3, 7])
set_hyperparameter(fun_control, "batch_size", [4, 11])
set_hyperparameter(fun_control, "dropout_prob", [0.0, 0.025])
set_hyperparameter(fun_control, "patience", [2, 3])
set_hyperparameter(fun_control, "lr_mult", [0.1, 20.0])
design_control = design_control_init(init_size=10)
print_exp_table(fun_control)
| name | type | default | lower | upper | transform |
|----------------|--------|-----------|---------|---------|-----------------------|
| l1 | int | 3 | 3 | 4 | transform_power_2_int |
| epochs | int | 4 | 3 | 7 | transform_power_2_int |
| batch_size | int | 4 | 4 | 11 | transform_power_2_int |
| act_fn | factor | ReLU | 0 | 5 | None |
| optimizer | factor | SGD | 0 | 2 | None |
| dropout_prob | float | 0.01 | 0 | 0.025 | None |
| lr_mult | float | 1.0 | 0.1 | 20 | None |
| patience | int | 2 | 2 | 3 | transform_power_2_int |
| initialization | factor | Default | 0 | 4 | None |
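The transform column indicates that l1, epochs, batch_size, and patience are tuned on a log2 scale: transform_power_2_int maps a tuned integer v to 2**v before it reaches the model. This is consistent with the tuned architecture shown later, where l1 = 3 becomes 8, epochs = 6 becomes 64, batch_size = 5 becomes 32, and patience = 2 becomes 4. A quick illustration:

# transform_power_2_int maps a tuned integer v to 2**v
# (consistent with the tuned architecture printed later: l1=3 -> 8, epochs=6 -> 64, ...)
for name, v in [("l1", 3), ("epochs", 6), ("batch_size", 5), ("patience", 2)]:
    print(f"{name}: {v} -> {2**v}")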
Finally, a Spot object is created. Calling the method run() starts the hyperparameter tuning process.
spot_tuner = Spot(fun=fun, fun_control=fun_control, design_control=design_control)
res = spot_tuner.run()
train_model result: {'val_loss': 24054.90625, 'hp_metric': 24054.90625}
train_model result: {'val_loss': 24432.75, 'hp_metric': 24432.75}
train_model result: {'val_loss': 24467.1171875, 'hp_metric': 24467.1171875}
train_model result: {'val_loss': 23441.474609375, 'hp_metric': 23441.474609375}
train_model result: {'val_loss': 22728.330078125, 'hp_metric': 22728.330078125}
train_model result: {'val_loss': 4730.1328125, 'hp_metric': 4730.1328125}
train_model result: {'val_loss': 24181.34375, 'hp_metric': 24181.34375}
train_model result: {'val_loss': 23692.3203125, 'hp_metric': 23692.3203125}
train_model result: {'val_loss': 21419.78125, 'hp_metric': 21419.78125}
train_model result: {'val_loss': 23542.013671875, 'hp_metric': 23542.013671875}
train_model result: {'val_loss': 5147.28515625, 'hp_metric': 5147.28515625}
spotpython tuning: 4730.1328125 [#---------] 5.29%
train_model result: {'val_loss': 4493.28173828125, 'hp_metric': 4493.28173828125}
spotpython tuning: 4493.28173828125 [#---------] 9.52%
train_model result: {'val_loss': 19665.021484375, 'hp_metric': 19665.021484375}
spotpython tuning: 4493.28173828125 [###-------] 26.18%
train_model result: {'val_loss': 4639.09423828125, 'hp_metric': 4639.09423828125}
spotpython tuning: 4493.28173828125 [###-------] 31.72%
train_model result: {'val_loss': 4680.62841796875, 'hp_metric': 4680.62841796875}
spotpython tuning: 4493.28173828125 [####------] 38.84%
train_model result: {'val_loss': 3852.15380859375, 'hp_metric': 3852.15380859375}
spotpython tuning: 3852.15380859375 [####------] 43.93%
train_model result: {'val_loss': 5720.23193359375, 'hp_metric': 5720.23193359375}
spotpython tuning: 3852.15380859375 [#####-----] 48.76%
train_model result: {'val_loss': 3629.165283203125, 'hp_metric': 3629.165283203125}
spotpython tuning: 3629.165283203125 [######----] 58.64%
train_model result: {'val_loss': 20231.388671875, 'hp_metric': 20231.388671875}
spotpython tuning: 3629.165283203125 [#######---] 65.61%
train_model result: {'val_loss': 4027.729248046875, 'hp_metric': 4027.729248046875}
spotpython tuning: 3629.165283203125 [#######---] 70.76%
train_model result: {'val_loss': 24181.111328125, 'hp_metric': 24181.111328125}
spotpython tuning: 3629.165283203125 [#######---] 73.92%
train_model result: {'val_loss': 24083.830078125, 'hp_metric': 24083.830078125}
spotpython tuning: 3629.165283203125 [########--] 78.18%
train_model result: {'val_loss': 5339.49365234375, 'hp_metric': 5339.49365234375}
spotpython tuning: 3629.165283203125 [########--] 82.09%
train_model result: {'val_loss': 5072.3544921875, 'hp_metric': 5072.3544921875}
spotpython tuning: 3629.165283203125 [#########-] 86.99%
train_model result: {'val_loss': 22432.462890625, 'hp_metric': 22432.462890625}
spotpython tuning: 3629.165283203125 [#########-] 90.28%
train_model result: {'val_loss': 22386.333984375, 'hp_metric': 22386.333984375}
spotpython tuning: 3629.165283203125 [#########-] 92.49%
train_model result: {'val_loss': 24093.95703125, 'hp_metric': 24093.95703125}
spotpython tuning: 3629.165283203125 [#########-] 94.30%
train_model result: {'val_loss': 3213.126220703125, 'hp_metric': 3213.126220703125}
spotpython tuning: 3213.126220703125 [##########] 100.00% Done...
Experiment saved to 605_res.pkl
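The run ends by pickling the results to 605_res.pkl. As a hedged sketch (assuming the file contains the pickled Spot object, as the log message suggests), it can be reloaded with Python's standard pickle module:

import pickle
# hypothetical reload of the experiment file named in the log message above
with open("605_res.pkl", "rb") as f:
    spot_tuner_reloaded = pickle.load(f)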
Looking at the Results
Tuning Progress
After the hyperparameter tuning run is finished, the progress of the hyperparameter tuning can be visualized with spotpython’s method plot_progress. The black points represent the performance values (score or metric) of hyperparameter configurations from the initial design, whereas the red points represent the hyperparameter configurations found by the surrogate model based optimization.
spot_tuner.plot_progress()
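Since the observed losses span almost an order of magnitude (from roughly 24,000 in the initial design down to about 3,200), a logarithmic y-axis can make the progress easier to read. A hedged variant, assuming plot_progress accepts the log_y flag used in other spotpython examples:

# assumption: plot_progress supports log_y, as shown in other spotpython examples
spot_tuner.plot_progress(log_y=True)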
Tuned Hyperparameters and Their Importance
Results can be printed in tabular form.
from spotpython.utils.eda import print_res_table
print_res_table(spot_tuner)
| name | type | default | lower | upper | tuned | transform | importance | stars |
|----------------|--------|-----------|---------|---------|----------------------|-----------------------|--------------|---------|
| l1 | int | 3 | 3.0 | 4.0 | 3.0 | transform_power_2_int | 0.00 | |
| epochs | int | 4 | 3.0 | 7.0 | 6.0 | transform_power_2_int | 0.00 | |
| batch_size | int | 4 | 4.0 | 11.0 | 5.0 | transform_power_2_int | 0.00 | |
| act_fn | factor | ReLU | 0.0 | 5.0 | ELU | None | 0.00 | |
| optimizer | factor | SGD | 0.0 | 2.0 | Adadelta | None | 25.28 | * |
| dropout_prob | float | 0.01 | 0.0 | 0.025 | 0.007463950384757653 | None | 100.00 | *** |
| lr_mult | float | 1.0 | 0.1 | 20.0 | 15.202481588806721 | None | 0.01 | |
| patience | int | 2 | 2.0 | 3.0 | 2.0 | transform_power_2_int | 0.00 | |
| initialization | factor | Default | 0.0 | 4.0 | xavier_uniform | None | 76.40 | ** |
A histogram can be used to visualize the most important hyperparameters.
spot_tuner.plot_importance(threshold=1.0)
spot_tuner.plot_important_hyperparameter_contour(max_imp=3)
l1: 0.0017576060538769414
epochs: 0.0017576060538769414
batch_size: 0.0017576060538769414
act_fn: 0.0017576060538769414
optimizer: 25.28359990634509
dropout_prob: 100.0
lr_mult: 0.00784501783523542
patience: 0.0017576060538769414
initialization: 76.40363221152083
Get the Tuned Architecture
import pprint
from spotpython.hyperparameters.values import get_tuned_architecture
config = get_tuned_architecture(spot_tuner)
pprint.pprint(config)
{'act_fn': ELU(),
'batch_size': 32,
'dropout_prob': 0.007463950384757653,
'epochs': 64,
'initialization': 'xavier_uniform',
'l1': 8,
'lr_mult': 15.202481588806721,
'optimizer': 'Adadelta',
'patience': 4}
Test on the full data set
# enable TensorBoard logging by setting the key "tensorboard_log" to True
# in the fun_control dictionary via the update() method
import os
# if the directory "./runs" exists, delete it to remove stale logs
if os.path.exists("./runs"):
    os.system("rm -r ./runs")
fun_control.update({"tensorboard_log": True})
from spotpython.light.testmodel import test_model
from spotpython.utils.init import get_feature_names
test_model(config, fun_control)
get_feature_names(fun_control)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric ┃ DataLoader 0 ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ hp_metric │ 3390.247802734375 │
│ val_loss │ 3390.247802734375 │
└───────────────────────────┴───────────────────────────┘
test_model result: {'val_loss': 3390.247802734375, 'hp_metric': 3390.247802734375}
['age',
'sex',
'bmi',
'bp',
's1_tc',
's2_ldl',
's3_hdl',
's4_tch',
's5_ltg',
's6_glu']
Cross Validation With Lightning
The KFold class from sklearn.model_selection is used to generate the folds for cross-validation. This mechanism is used to generate the folds for the final evaluation of the model.
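To make the mechanism concrete, here is a minimal, self-contained sketch of how KFold produces train/validation index splits (toy data, not the Diabetes set):

import numpy as np
from sklearn.model_selection import KFold

X = np.arange(20).reshape(10, 2)  # 10 toy samples with 2 features each
kf = KFold(n_splits=2, shuffle=True, random_state=42)
for k, (train_idx, val_idx) in enumerate(kf.split(X)):
    print(f"k: {k}, train: {train_idx}, val: {val_idx}")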
The CrossValidationDataModule class is used to generate the folds for the hyperparameter tuning process. It is called from the cv_model function. The tuned configuration obtained above is passed to cv_model:
{'l1': 8,
'epochs': 64,
'batch_size': 32,
'act_fn': ELU(),
'optimizer': 'Adadelta',
'dropout_prob': 0.007463950384757653,
'lr_mult': 15.202481588806721,
'patience': 4,
'initialization': 'xavier_uniform'}
from spotpython.light.cvmodel import cv_model
fun_control.update({"k_folds" : 2 })
fun_control.update({"test_size" : 0.6 })
cv_model(config, fun_control)
train_model result: {'val_loss': 2948.783935546875, 'hp_metric': 2948.783935546875}
k: 1
train_model result: {'val_loss': 3377.771728515625, 'hp_metric': 3377.771728515625}
Summary
This section presented an introduction to the basic setup of hyperparameter tuning with spotpython and PyTorch Lightning, using a ResNet model for the Diabetes data set.