SpotOptim provides an MLP class that implements a flexible Multi-Layer Perceptron (MLP) architecture using PyTorch. It is designed to work both as a standalone model and as a plug-and-play component for hyperparameter optimization with SpotOptim.
23.1 Overview
The MLP class extends torch.nn.Sequential and offers:
- Flexible Architecture: Define layers explicitly via hidden_channels, or compactly via hyperparameters such as l1 (width) and num_hidden_layers (depth).
- Integrated Components: Built-in support for normalization layers, activation functions, and dropout.
- Optimization Helpers: Includes a get_optimizer method to easily instantiate optimizers with unified learning rates.
- Tuning Ready: A get_default_parameters static method returns a ParameterSet ready for SpotOptim.
23.2 Basic Usage
23.2.1 Initialization
You can initialize an MLP by describing its architecture explicitly.
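A minimal sketch of the explicit form is shown below. The import path for MLP is an assumption and may differ in your installation; following the hidden_channels=[32, 1] usage later in this chapter, the last entry of hidden_channels is taken as the output width.

```python
from spotoptim import MLP  # assumed import path; adjust to your installation

# Explicit architecture: hidden layers of 32 and 16 neurons, then a single output
model_explicit = MLP(
    in_channels=10,               # number of input features
    hidden_channels=[32, 16, 1],  # widths of the stacked layers (last entry = output width)
)
print(model_explicit)
```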
For hyperparameter tuning, it is often easier to control the network’s size with just two numbers: width and depth.
```python
# Create a network with 3 hidden layers, each having 64 neurons
model_compact = MLP(
    in_channels=10,
    l1=64,                 # Width (neurons per hidden layer)
    num_hidden_layers=3,   # Depth (number of hidden layers)
    output_dim=1
)
print(model_compact)
```
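Because MLP extends torch.nn.Sequential, the resulting model can be called like any other PyTorch module. A quick sanity check (the batch size of 8 is arbitrary):

```python
import torch

# Forward pass with a batch of 8 samples, 10 features each
x = torch.randn(8, 10)
y = model_compact(x)
print(y.shape)  # expected: torch.Size([8, 1]), since output_dim=1
```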
The MLP class simplifies optimizer creation, specifically handling the “unified learning rate” concept used in SpotOptim (where different optimizers have their default learning rates mapped to a common scale).
```python
# Create model with a unified learning rate of 1.0 (default)
model = MLP(in_channels=10, hidden_channels=[32, 1], lr=1.0)

# Get Adam optimizer (lr=1.0 maps to 0.001)
opt_adam = model.get_optimizer("Adam")
print(f"Adam lr: {opt_adam.param_groups[0]['lr']}")

# Get SGD optimizer (lr=1.0 maps to 0.01)
opt_sgd = model.get_optimizer("SGD")
print(f"SGD lr: {opt_sgd.param_groups[0]['lr']}")
```
```
Adam lr: 0.001
SGD lr: 0.01
```
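Conceptually, the unified learning rate acts as a multiplier on each optimizer's reference learning rate. The standalone sketch below illustrates the idea only; it is not the library's implementation, and the base values are simply those implied by the output above:

```python
# Illustrative only: map a unified learning rate to optimizer-specific rates
BASE_LR = {"Adam": 1e-3, "SGD": 1e-2}  # assumed per-optimizer reference rates

def unified_to_actual_lr(optimizer_name: str, unified_lr: float) -> float:
    """Scale the optimizer's reference rate by the unified learning rate."""
    return BASE_LR[optimizer_name] * unified_lr

print(unified_to_actual_lr("Adam", 1.0))  # 0.001
print(unified_to_actual_lr("SGD", 0.5))   # 0.005
```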
You can also pass extra arguments to the optimizer:
```python
# SGD with momentum
opt_sgd_mom = model.get_optimizer("SGD", momentum=0.9)
print(opt_sgd_mom)
```
One of the key features of the MLP class is its ability to suggest a default ParameterSet for tuning. This provides a great starting point for finding the best architecture.
```python
from spotoptim.hyperparameters import ParameterSet

# Get default search space
params = MLP.get_default_parameters()
print("Default tunable parameters:", params.names())
```
This setup automatically tunes the architecture (l1, num_hidden_layers), the activation function (activation), the learning rate (lr), and the optimization method (optimizer), as long as these parameters are left in the parameter set.
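Once SpotOptim samples a configuration from this parameter set, the values can be passed back to the MLP constructor and get_optimizer. The sketch below uses a hand-written dictionary as a stand-in for a sampled point; the activation parameter is omitted because its expected value type is not shown here, and the exact way SpotOptim hands configurations to an objective function may differ.

```python
# Hypothetical sampled configuration; keys mirror the tunable parameter names
config = {"l1": 128, "num_hidden_layers": 2, "lr": 0.5, "optimizer": "Adam"}

model = MLP(
    in_channels=10,
    l1=config["l1"],
    num_hidden_layers=config["num_hidden_layers"],
    output_dim=1,
    lr=config["lr"],
)
optimizer = model.get_optimizer(config["optimizer"])
print(optimizer.param_groups[0]["lr"])  # 0.0005: unified lr 0.5 on Adam's scale
```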