55 Hyperparameter Tuning with spotpython and PyTorch Lightning for the Diabetes Data Set Using a ResNet Model

This section shows how spotpython can be integrated into the PyTorch Lightning training workflow for a regression task, and demonstrates how easily spotpython can be used to tune the hyperparameters of a PyTorch Lightning model.

After importing the necessary libraries, the fun_control dictionary is set up via the fun_control_init function. The fun_control dictionary contains the settings of the tuning run, e.g., the data set, the core model name, the hyperparameter dictionary, and the stopping criteria.

The HyperLight class is used to define the objective function fun. It connects the PyTorch and the spotpython methods and is provided by spotpython.
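For illustration, an objective function with the same call convention can be sketched as follows. This is a toy stand-in, not the spotpython implementation: a quadratic loss replaces the actual Lightning training run, and the signature is an assumption based on how Spot evaluates its objective (one configuration per row of X, one objective value per row).

```python
import numpy as np

# Toy stand-in for an objective such as HyperLight().fun:
# each row of X is one hyperparameter configuration, and one
# objective value is returned per row (here a quadratic loss
# instead of the val_loss of a trained Lightning model).
def toy_fun(X: np.ndarray, fun_control=None) -> np.ndarray:
    X = np.atleast_2d(X)
    return np.sum((X - 1.0) ** 2, axis=1)

print(toy_fun(np.array([[1.0, 1.0], [0.0, 2.0]])))  # [0. 2.]
```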

Note that the divergence_threshold is set to 25,000, a value chosen based on pre-experiments with the Diabetes data set.

from math import inf

from spotpython.data.diabetes import Diabetes
from spotpython.hyperdict.light_hyper_dict import LightHyperDict
from spotpython.fun.hyperlight import HyperLight
from spotpython.utils.init import (fun_control_init, surrogate_control_init, design_control_init)
from spotpython.utils.eda import print_exp_table
from spotpython.spot import Spot
from spotpython.utils.file import get_experiment_filename

PREFIX = "605"

data_set = Diabetes()

fun_control = fun_control_init(
    PREFIX=PREFIX,
    fun_evals=inf,
    max_time=1,
    data_set=data_set,
    core_model_name="light.regression.NNResNetRegressor",
    hyperdict=LightHyperDict,
    divergence_threshold=25_000,
    _L_in=10,
    _L_out=1)

fun = HyperLight().fun
module_name: light
submodule_name: regression
model_name: NNResNetRegressor

The method set_hyperparameter allows the user to modify default hyperparameter settings. Here we modify some hyperparameters to keep the model small and to decrease the tuning time.

from spotpython.hyperparameters.values import set_hyperparameter
set_hyperparameter(fun_control, "optimizer", ["Adadelta", "Adam", "Adamax"])
set_hyperparameter(fun_control, "l1", [3, 4])
set_hyperparameter(fun_control, "epochs", [3, 7])
set_hyperparameter(fun_control, "batch_size", [4, 11])
set_hyperparameter(fun_control, "dropout_prob", [0.0, 0.025])
set_hyperparameter(fun_control, "patience", [2, 3])
set_hyperparameter(fun_control, "lr_mult", [0.1, 20.0])

design_control = design_control_init(init_size=10)

print_exp_table(fun_control)
| name           | type   | default   |   lower |   upper | transform             |
|----------------|--------|-----------|---------|---------|-----------------------|
| l1             | int    | 3         |     3   |   4     | transform_power_2_int |
| epochs         | int    | 4         |     3   |   7     | transform_power_2_int |
| batch_size     | int    | 4         |     4   |  11     | transform_power_2_int |
| act_fn         | factor | ReLU      |     0   |   5     | None                  |
| optimizer      | factor | SGD       |     0   |   2     | None                  |
| dropout_prob   | float  | 0.01      |     0   |   0.025 | None                  |
| lr_mult        | float  | 1.0       |     0.1 |  20     | None                  |
| patience       | int    | 2         |     2   |   3     | transform_power_2_int |
| initialization | factor | Default   |     0   |   4     | None                  |
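The lower and upper bounds of the integer hyperparameters are given on a log2 scale: the transform_power_2_int transform maps a tuned integer x to 2**x before it reaches the model. A minimal sketch of this mapping (the one-line transform body is an assumption inferred from the table and the tuned values reported later):

```python
# Sketch of transform_power_2_int: the tuned integer is used as an
# exponent of two, so batch_size bounds [4, 11] correspond to
# batch sizes 16..2048, and l1 bounds [3, 4] to widths 8..16.
def transform_power_2_int(x: int) -> int:
    return 2 ** x

print(transform_power_2_int(4), transform_power_2_int(11))  # 16 2048
print(transform_power_2_int(3), transform_power_2_int(4))   # 8 16
```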

Finally, a Spot object is created. Calling the method run() starts the hyperparameter tuning process.

spot_tuner = Spot(fun=fun, fun_control=fun_control, design_control=design_control)
res = spot_tuner.run()
┏━━━┳━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━┓
┃   ┃ Name   ┃ Type       ┃ Params ┃ Mode  ┃ FLOPs  ┃ In sizes  ┃ Out sizes ┃
┡━━━╇━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━┩
│ 0 │ layers │ Sequential │    637 │ train │ 95.7 K │ [128, 10] │ [128, 1]  │
└───┴────────┴────────────┴────────┴───────┴────────┴───────────┴───────────┘
Trainable params: 637
Non-trainable params: 0
Total params: 637
Total estimated model params size (MB): 0
Modules in train mode: 63
Modules in eval mode: 0
Total FLOPs: 95.7 K
train_model result: {'val_loss': 23488.20703125, 'hp_metric': 23488.20703125}
train_model result: {'val_loss': 23382.1171875, 'hp_metric': 23382.1171875}
train_model result: {'val_loss': 23932.60546875, 'hp_metric': 23932.60546875}
train_model result: {'val_loss': 23544.123046875, 'hp_metric': 23544.123046875}
train_model result: {'val_loss': 23700.796875, 'hp_metric': 23700.796875}
train_model result: {'val_loss': 5643.2451171875, 'hp_metric': 5643.2451171875}
train_model result: {'val_loss': 24055.76953125, 'hp_metric': 24055.76953125}
train_model result: {'val_loss': 22876.900390625, 'hp_metric': 22876.900390625}
train_model result: {'val_loss': 23417.650390625, 'hp_metric': 23417.650390625}
train_model result: {'val_loss': 23713.25390625, 'hp_metric': 23713.25390625}
Anisotropic model: n_theta set to 9
train_model result: {'val_loss': 25599.056640625, 'hp_metric': 25599.056640625}
Anisotropic model: n_theta set to 9
spotpython tuning: 5643.2451171875 [----------] 0.75%
train_model result: {'val_loss': 6140.361328125, 'hp_metric': 6140.361328125}
Anisotropic model: n_theta set to 9
spotpython tuning: 5643.2451171875 [#---------] 5.54%
train_model result: {'val_loss': 24080.84375, 'hp_metric': 24080.84375}
Anisotropic model: n_theta set to 9
spotpython tuning: 5643.2451171875 [#---------] 7.44%
train_model result: {'val_loss': 21211.412109375, 'hp_metric': 21211.412109375}
Anisotropic model: n_theta set to 9
spotpython tuning: 5643.2451171875 [##--------] 21.88%
train_model result: {'val_loss': 4956.70556640625, 'hp_metric': 4956.70556640625}
Anisotropic model: n_theta set to 9
spotpython tuning: 4956.70556640625 [###-------] 30.88%
train_model result: {'val_loss': 4509.54052734375, 'hp_metric': 4509.54052734375}
Anisotropic model: n_theta set to 9
spotpython tuning: 4509.54052734375 [####------] 36.88%
train_model result: {'val_loss': 4590.9990234375, 'hp_metric': 4590.9990234375}
Anisotropic model: n_theta set to 9
spotpython tuning: 4509.54052734375 [#####-----] 50.24%
train_model result: {'val_loss': 23239.7265625, 'hp_metric': 23239.7265625}
Anisotropic model: n_theta set to 9
spotpython tuning: 4509.54052734375 [######----] 64.70%
train_model result: {'val_loss': 5191.73486328125, 'hp_metric': 5191.73486328125}
Anisotropic model: n_theta set to 9
spotpython tuning: 4509.54052734375 [########--] 80.88%
train_model result: {'val_loss': 25099.01953125, 'hp_metric': 25099.01953125}
Anisotropic model: n_theta set to 9
spotpython tuning: 4509.54052734375 [########--] 81.73%
train_model result: {'val_loss': 14591.408203125, 'hp_metric': 14591.408203125}
Anisotropic model: n_theta set to 9
spotpython tuning: 4509.54052734375 [##########] 100.00% Done...

Experiment saved to 605_res.pkl

55.1 Looking at the Results

55.1.1 Tuning Progress

After the hyperparameter tuning run has finished, its progress can be visualized with spotpython's method plot_progress. The black points represent the performance values (score or metric) of hyperparameter configurations from the initial design, whereas the red points represent the hyperparameter configurations found by the surrogate-model-based optimization.

spot_tuner.plot_progress()
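The curve drawn by plot_progress is essentially the running best (cumulative minimum) of the objective values, which can be reproduced directly from the val_loss results listed above (rounded to one decimal here):

```python
# val_loss values from the run above: the first ten stem from the
# initial design, the rest from the surrogate-model-based search.
losses = [
    23488.2, 23382.1, 23932.6, 23544.1, 23700.8, 5643.2, 24055.8,
    22876.9, 23417.7, 23713.3, 25599.1, 6140.4, 24080.8, 21211.4,
    4956.7, 4509.5, 4591.0, 23239.7, 5191.7, 25099.0, 14591.4,
]

# Running best: the value plot_progress tracks over the evaluations.
best = []
for v in losses:
    best.append(v if not best else min(best[-1], v))
print(best[-1])  # 4509.5, the best val_loss found in this run
```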

55.1.2 Tuned Hyperparameters and Their Importance

Results can be printed in tabular form.

from spotpython.utils.eda import print_res_table
print_res_table(spot_tuner)
| name           | type   | default   |   lower |   upper | tuned                | transform             |   importance | stars   |
|----------------|--------|-----------|---------|---------|----------------------|-----------------------|--------------|---------|
| l1             | int    | 3         |     3.0 |     4.0 | 4.0                  | transform_power_2_int |         0.00 |         |
| epochs         | int    | 4         |     3.0 |     7.0 | 6.0                  | transform_power_2_int |         1.29 | *       |
| batch_size     | int    | 4         |     4.0 |    11.0 | 7.0                  | transform_power_2_int |         0.00 |         |
| act_fn         | factor | ReLU      |     0.0 |     5.0 | LeakyReLU            | None                  |       100.00 | ***     |
| optimizer      | factor | SGD       |     0.0 |     2.0 | Adadelta             | None                  |       100.00 | ***     |
| dropout_prob   | float  | 0.01      |     0.0 |   0.025 | 0.007332827692081354 | None                  |         0.00 |         |
| lr_mult        | float  | 1.0       |     0.1 |    20.0 | 20.0                 | None                  |         0.00 |         |
| patience       | int    | 2         |     2.0 |     3.0 | 3.0                  | transform_power_2_int |        81.40 | **      |
| initialization | factor | Default   |     0.0 |     4.0 | kaiming_uniform      | None                  |        18.13 | *       |

A histogram can be used to visualize the most important hyperparameters.

spot_tuner.plot_importance(threshold=1.0)

spot_tuner.plot_important_hyperparameter_contour(max_imp=3)
l1:  0.001
epochs:  1.2854928558378267
batch_size:  0.001
act_fn:  100.0
optimizer:  100.0
dropout_prob:  0.001
lr_mult:  0.001
patience:  81.40368244932799
initialization:  18.133381659251647
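The max_imp=3 argument restricts the contour plots to the three most important hyperparameters. The following sketch is illustrative only (it assumes a simple top-k selection over the importance values printed above, not spotpython's internal logic):

```python
# Illustrative: select the k most important hyperparameters from the
# importance values printed above (assumes a plain top-k selection).
importance = {
    "l1": 0.001, "epochs": 1.2854928558378267, "batch_size": 0.001,
    "act_fn": 100.0, "optimizer": 100.0, "dropout_prob": 0.001,
    "lr_mult": 0.001, "patience": 81.40368244932799,
    "initialization": 18.133381659251647,
}
top3 = sorted(importance, key=importance.get, reverse=True)[:3]
print(top3)
# ['act_fn', 'optimizer', 'patience']
```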

55.1.3 Get the Tuned Architecture

import pprint
from spotpython.hyperparameters.values import get_tuned_architecture
config = get_tuned_architecture(spot_tuner)
pprint.pprint(config)
{'act_fn': LeakyReLU(),
 'batch_size': 128,
 'dropout_prob': 0.007332827692081354,
 'epochs': 64,
 'initialization': 'kaiming_uniform',
 'l1': 16,
 'lr_mult': 20.0,
 'optimizer': 'Adadelta',
 'patience': 8}
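The integer hyperparameters in the results table are tuned on a log2 scale: the transform_power_2_int transform maps a tuned value v to 2**v. The following sketch (illustrative, not spotpython's internal implementation) reproduces the transformed integer values shown in the config above:

```python
# Illustrative sketch of the transform_power_2_int mapping used for
# l1, epochs, batch_size, and patience in the results table above.
def transform_power_2_int(v: float) -> int:
    """Map a tuned value v to 2**v."""
    return 2 ** int(v)

tuned = {"l1": 4.0, "epochs": 6.0, "batch_size": 7.0, "patience": 3.0}
config_ints = {name: transform_power_2_int(v) for name, v in tuned.items()}
print(config_ints)
# {'l1': 16, 'epochs': 64, 'batch_size': 128, 'patience': 8}
```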

55.1.4 Test on the Full Data Set

# set the value of the key "tensorboard_log" to True in the fun_control dictionary and use the update() method to update the fun_control dictionary
import os
# if the directory "./runs" exists, delete it
if os.path.exists("./runs"):
    os.system("rm -r ./runs")
fun_control.update({"tensorboard_log": True})
from spotpython.light.testmodel import test_model
from spotpython.utils.init import get_feature_names

test_model(config, fun_control)
get_feature_names(fun_control)
┏━━━┳━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━┓
┃   ┃ Name   ┃ Type       ┃ Params ┃ Mode  ┃ FLOPs ┃ In sizes  ┃ Out sizes ┃
┡━━━╇━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━┩
│ 0 │ layers │ Sequential │  1.8 K │ train │ 310 K │ [128, 10] │ [128, 1]  │
└───┴────────┴────────────┴────────┴───────┴───────┴───────────┴───────────┘
Trainable params: 1.8 K                                                                                            
Non-trainable params: 0                                                                                            
Total params: 1.8 K                                                                                                
Total estimated model params size (MB): 0                                                                          
Modules in train mode: 101                                                                                         
Modules in eval mode: 0                                                                                            
Total FLOPs: 310 K                                                                                                 
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃        Test metric        ┃       DataLoader 0        ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│         hp_metric         │      31357.228515625      │
│         val_loss          │      31357.228515625      │
└───────────────────────────┴───────────────────────────┘
test_model result: {'val_loss': 31357.228515625, 'hp_metric': 31357.228515625}
['age',
 'sex',
 'bmi',
 'bp',
 's1_tc',
 's2_ldl',
 's3_hdl',
 's4_tch',
 's5_ltg',
 's6_glu']
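The feature names extend sklearn's original Diabetes feature names (age, sex, bmi, bp, s1 through s6) with descriptive suffixes for the serum measurements, e.g. _tc for total cholesterol. A small illustrative check (assuming the mapping is purely prefix-based) recovers the original sklearn names by stripping everything after the first underscore:

```python
# Illustrative: strip the descriptive suffixes from the spotpython
# feature names to recover sklearn's original Diabetes feature names.
feature_names = ['age', 'sex', 'bmi', 'bp', 's1_tc', 's2_ldl', 's3_hdl',
                 's4_tch', 's5_ltg', 's6_glu']
sklearn_names = [name.split('_')[0] for name in feature_names]
print(sklearn_names)
# ['age', 'sex', 'bmi', 'bp', 's1', 's2', 's3', 's4', 's5', 's6']
```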

55.1.5 Cross Validation With Lightning

  • The KFold class from sklearn.model_selection is used to generate the folds for cross-validation.
  • This mechanism is used to generate the folds for the final evaluation of the model.
  • The CrossValidationDataModule class [SOURCE] is used to generate the folds for the hyperparameter tuning process.
  • It is called from the cv_model function [SOURCE].
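The fold generation described above can be sketched in pure Python. This is a minimal illustration of how KFold partitions the sample indices without shuffling (the helper name k_fold_indices is hypothetical; sklearn's KFold additionally supports shuffling and random seeding):

```python
# Minimal sketch of k-fold index generation, mirroring what
# sklearn.model_selection.KFold does without shuffling.
# The helper name k_fold_indices is hypothetical, for illustration only.
def k_fold_indices(n_samples: int, k: int):
    """Yield (train_idx, val_idx) pairs for k folds over range(n_samples)."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    indices = list(range(n_samples))
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size

# With k_folds=2, each fold validates on half the data
# and trains on the other half.
folds = list(k_fold_indices(10, 2))
```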
config
{'l1': 16,
 'epochs': 64,
 'batch_size': 128,
 'act_fn': LeakyReLU(),
 'optimizer': 'Adadelta',
 'dropout_prob': 0.007332827692081354,
 'lr_mult': 20.0,
 'patience': 8,
 'initialization': 'kaiming_uniform'}
from spotpython.light.cvmodel import cv_model
fun_control.update({"k_folds": 2})
fun_control.update({"test_size": 0.6})
cv_model(config, fun_control)
k: 0
┏━━━┳━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━┓
┃   ┃ Name   ┃ Type       ┃ Params ┃ Mode  ┃ FLOPs ┃ In sizes  ┃ Out sizes ┃
┡━━━╇━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━┩
│ 0 │ layers │ Sequential │  1.8 K │ train │ 310 K │ [128, 10] │ [128, 1]  │
└───┴────────┴────────────┴────────┴───────┴───────┴───────────┴───────────┘
Trainable params: 1.8 K                                                                                            
Non-trainable params: 0                                                                                            
Total params: 1.8 K                                                                                                
Total estimated model params size (MB): 0                                                                          
Modules in train mode: 101                                                                                         
Modules in eval mode: 0                                                                                            
Total FLOPs: 310 K                                                                                                 
train_model result: {'val_loss': 5008.822265625, 'hp_metric': 5008.822265625}
k: 1
┏━━━┳━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━┓
┃   ┃ Name   ┃ Type       ┃ Params ┃ Mode  ┃ FLOPs ┃ In sizes  ┃ Out sizes ┃
┡━━━╇━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━┩
│ 0 │ layers │ Sequential │  1.8 K │ train │ 310 K │ [128, 10] │ [128, 1]  │
└───┴────────┴────────────┴────────┴───────┴───────┴───────────┴───────────┘
Trainable params: 1.8 K                                                                                            
Non-trainable params: 0                                                                                            
Total params: 1.8 K                                                                                                
Total estimated model params size (MB): 0                                                                          
Modules in train mode: 101                                                                                         
Modules in eval mode: 0                                                                                            
Total FLOPs: 310 K                                                                                                 
train_model result: {'val_loss': 6234.5859375, 'hp_metric': 6234.5859375}
5621.7041015625
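The last value printed is the average of the per-fold validation losses reported above, which is what the cross-validation run returns as its final score:

```python
# The final cv_model output equals the mean of the per-fold
# validation losses reported above (fold 0 and fold 1).
fold_losses = [5008.822265625, 6234.5859375]
mean_cv_loss = sum(fold_losses) / len(fold_losses)
print(mean_cv_loss)
# 5621.7041015625
```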

55.2 Summary

This section presented an introduction to the basic setup of hyperparameter tuning with spotpython and PyTorch Lightning using a ResNet model for the Diabetes data set.