22 The TorchObjective Class for Hyperparameter Tuning
SpotOptim provides a TorchObjective class that simplifies the process of hyperparameter tuning for PyTorch models. It acts as a bridge between your PyTorch code (models, data, training logic) and the SpotOptim optimizer.
This tutorial guides you through the entire workflow, from defining a model to running a comprehensive hyperparameter optimization experiment.
22.1 Overview
The TorchObjective class automates several tedious tasks:
- Translating Hyperparameters: Converts the vector of values provided by SpotOptim into a dictionary of named parameters with correct types (int, float, categorical); see the sketch after this list.
- Model Instantiation: Creates a new instance of your model for each evaluation with the specific hyperparameters.
- Data Loading: Handles creating PyTorch DataLoaders, including dynamic batch sizes.
- Training & Evaluation: Runs the training loop and returns validation metrics (e.g., validation loss, MSE).
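The translation step can be pictured with a hand-rolled sketch (conceptual only, not the library's internal code); the names var_name, var_type, and var_trans mirror the attributes used later in this tutorial:
# Conceptual sketch: one candidate vector from the optimizer becomes a
# dictionary of correctly typed, untransformed hyperparameters.
import numpy as np

var_name  = ["lr", "num_hidden_layers", "l1", "dropout"]
var_type  = ["float", "int", "int", "float"]
var_trans = ["log10", None, None, None]

x = np.array([-3.0, 2.2, 64.7, 0.1])  # candidate in the (possibly transformed) search space

hyperparams = {}
for name, typ, trans, value in zip(var_name, var_type, var_trans, x):
    if trans == "log10":
        value = 10.0 ** value          # undo the log10 transform
    if typ == "int":
        value = int(round(value))      # round integer-valued parameters
    hyperparams[name] = float(value) if typ == "float" else value

print(hyperparams)
# {'lr': 0.001, 'num_hidden_layers': 2, 'l1': 65, 'dropout': 0.1}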
22.2 Workflow Steps
The typical workflow involves 5 steps:
- Define Model: Create a standard PyTorch nn.Module.
- Prepare Data: Wrap your data in a SpotOptim data container.
- Define Hyperparameters: Specify what to tune using ParameterSet.
- Create Experiment: Bundle everything into an ExperimentControl object.
- Optimize: Use SpotOptim to find the best configuration.
22.3 Step-by-Step Example
Let’s walk through a complete example tuning a Multi-Layer Perceptron (MLP) on the California Housing dataset.
22.3.1 1. Define the Model
You can use any standard PyTorch model. For this example, we’ll use the built-in MLP class from spotoptim.nn.mlp, but you could define your own nn.Module.
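The corresponding imports are:
import torch
import torch.nn as nn
from spotoptim.nn.mlp import MLP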
22.3.2 2. Prepare the Data
Load your data (numpy arrays or tensors) and wrap it in SpotDataFromArray.
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from spotoptim.core.data import SpotDataFromArray
# Load data
data = fetch_california_housing()
X, y = data.data, data.target.reshape(-1, 1)
# Split and scale
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)
scaler_X = StandardScaler()
scaler_y = StandardScaler()
X_train = scaler_X.fit_transform(X_train)
X_val = scaler_X.transform(X_val)
y_train = scaler_y.fit_transform(y_train)
y_val = scaler_y.transform(y_val)
# Create SpotData container
spot_data = SpotDataFromArray(
x_train=X_train, y_train=y_train,
x_val=X_val, y_val=y_val
)
22.3.3 3. Define Hyperparameters
Use ParameterSet to define the search space. You can tune floats, integers, and categorical factors.
from spotoptim.hyperparameters import ParameterSet
params = ParameterSet()
# Tuning Learning Rate (log scale)
params.add_float("lr", 1e-4, 1e-1, default=1e-3, transform="log10")
# Tuning Model Architecture
params.add_int("num_hidden_layers", 1, 3, default=2)
params.add_int("l1", 16, 128, default=32)
params.add_float("dropout", 0.0, 0.5, default=0.0)
# Tuning Optimization Parameters
params.add_int("epochs", 10, 50, default=20)
params.add_int("batch_size", 32, 256, default=64) # Batch size can now be tuned!ParameterSet(
lr=Parameter(
name='lr',
var_name='lr',
bounds=Bounds(low=0.0001, high=0.1),
default=0.001,
transform='log10',
type='float'
),
num_hidden_layers=Parameter(
name='num_hidden_layers',
var_name='num_hidden_layers',
bounds=Bounds(low=1, high=3),
default=2,
transform=None,
type='int'
),
l1=Parameter(
name='l1',
var_name='l1',
bounds=Bounds(low=16, high=128),
default=32,
transform=None,
type='int'
),
dropout=Parameter(
name='dropout',
var_name='dropout',
bounds=Bounds(low=0.0, high=0.5),
default=0.0,
transform=None,
type='float'
),
epochs=Parameter(
name='epochs',
var_name='epochs',
bounds=Bounds(low=10, high=50),
default=20,
transform=None,
type='int'
),
batch_size=Parameter(
name='batch_size',
var_name='batch_size',
bounds=Bounds(low=32, high=256),
default=64,
transform=None,
type='int'
),
)
22.3.4 4. Create Experiment Control
The ExperimentControl object bundles the model class, dataset, hyperparameters, and training settings into a single configuration.
import torch
from spotoptim.core.experiment import ExperimentControl
experiment = ExperimentControl(
experiment_name="california_housing_tuning",
model_class=MLP,
dataset=spot_data,
hyperparameters=params,
metrics=["val_loss"],
device="cpu", # or "cuda" if available
loss_function=nn.MSELoss(),
num_workers=0,
batch_size=64 # Default fallback if not tuned
)
22.3.5 5. Initialize TorchObjective
This wraps the experiment into a callable function that SpotOptim can use.
from spotoptim.function.torch_objective import TorchObjective
objective = TorchObjective(experiment)
22.3.6 6. Run Optimization
Now use SpotOptim to find the best hyperparameters.
from spotoptim import SpotOptim
import numpy as np
optimizer = SpotOptim(
fun=objective,
bounds=objective.bounds,
var_type=objective.var_type,
var_name=objective.var_name,
var_trans=objective.var_trans, # Use defined transformations (e.g., log10 for lr)
n_initial=5, # Number of initial random points
max_iter=10, # Total evaluations (initial + sequential)
seed=42, # For reproducibility
verbose=True
)
result = optimizer.optimize()
print("\nBest Configuration Found:")
best_params = objective._get_hyperparameters(result.x)
for k, v in best_params.items():
    print(f"{k}: {v}")
print(f"\nBest Validation Loss: {result.fun:.5f}")
TensorBoard logging disabled
Initial best: f(x) = 0.629256
Iter 1 | Best: 0.625072 | Rate: 1.00 | Evals: 60.0%
Iter 2 | Best: 0.584372 | Rate: 1.00 | Evals: 70.0%
Iter 3 | Best: 0.584372 | Curr: 0.588897 | Rate: 0.67 | Evals: 80.0%
Iter 4 | Best: 0.584372 | Curr: 0.600478 | Rate: 0.50 | Evals: 90.0%
Iter 5 | Best: 0.584372 | Curr: 0.612876 | Rate: 0.40 | Evals: 100.0%
Best Configuration Found:
lr: 0.09933581768039416
num_hidden_layers: 2
l1: 64
dropout: 0.5
epochs: 26
batch_size: 239
Best Validation Loss: 0.58437
22.4 Advanced Features
22.4.1 Tuning Batch Size
TorchObjective supports dynamic batch size tuning. By adding batch_size to your ParameterSet (as shown above), the objective function automatically recreates the DataLoaders with the batch size proposed for each evaluation.
This allows you to optimize the trade-off between training speed (larger batches) and convergence quality (often better with smaller batches).
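Conceptually, recreating the loaders for one evaluation looks roughly like the following plain-PyTorch sketch (make_loader is a hypothetical helper, not part of SpotOptim):
import torch
from torch.utils.data import DataLoader, TensorDataset

def make_loader(X, y, batch_size):
    # Rebuild a DataLoader with the batch size proposed for this evaluation
    dataset = TensorDataset(torch.as_tensor(X, dtype=torch.float32),
                            torch.as_tensor(y, dtype=torch.float32))
    return DataLoader(dataset, batch_size=batch_size, shuffle=True)

train_loader = make_loader(X_train, y_train, batch_size=239)  # e.g., the tuned value from above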
22.4.2 Tuning Architecture
You can tune structural parameters like:
- num_hidden_layers: Depth of the network.
- l1: Width of the layers (first hidden layer size, propagated if others are not specified).
- dropout: Regularization strength.
- Optimization method: You can even tune the optimizer itself (e.g., “Adam” vs “SGD”) if you add it as a factor hyperparameter and your model class supports a get_optimizer method (like MLP does); see the sketch below.
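If your own model should support optimizer tuning, a get_optimizer method could look like the sketch below. TinyNet is a hypothetical example, and the exact signature TorchObjective expects is an assumption here; use the MLP implementation in spotoptim.nn.mlp as the reference.
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    def __init__(self, input_dim=8, output_dim=1, l1=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(input_dim, l1), nn.ReLU(), nn.Linear(l1, output_dim))

    def forward(self, x):
        return self.net(x)

    def get_optimizer(self, optimizer="Adam", lr=1e-3):
        # Look up the optimizer class by name in torch.optim and bind it to this model's parameters
        return getattr(torch.optim, optimizer)(self.parameters(), lr=lr)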
Example adding optimizer tuning:
# Add optimizer choice
params.add_factor("optimizer", ["Adam", "SGD", "RMSprop"], default="Adam")ParameterSet(
lr=Parameter(
name='lr',
var_name='lr',
bounds=Bounds(low=0.0001, high=0.1),
default=0.001,
transform='log10',
type='float'
),
num_hidden_layers=Parameter(
name='num_hidden_layers',
var_name='num_hidden_layers',
bounds=Bounds(low=1, high=3),
default=2,
transform=None,
type='int'
),
l1=Parameter(
name='l1',
var_name='l1',
bounds=Bounds(low=16, high=128),
default=32,
transform=None,
type='int'
),
dropout=Parameter(
name='dropout',
var_name='dropout',
bounds=Bounds(low=0.0, high=0.5),
default=0.0,
transform=None,
type='float'
),
epochs=Parameter(
name='epochs',
var_name='epochs',
bounds=Bounds(low=10, high=50),
default=20,
transform=None,
type='int'
),
batch_size=Parameter(
name='batch_size',
var_name='batch_size',
bounds=Bounds(low=32, high=256),
default=64,
transform=None,
type='int'
),
optimizer=Parameter(
name='optimizer',
var_name='optimizer',
bounds=['Adam', 'SGD', 'RMSprop'],
default='Adam',
transform=None,
type='factor'
),
)
22.4.3 Log-Scale Transformations
For parameters that span several orders of magnitude (like Learning Rate), it is highly recommended to use a log transformation. ParameterSet handles this easily:
# Search lr in [0.0001, 0.1] but optimizer sees [-4, -1]
params.add_float("lr", 1e-4, 1e-1, transform="log10")SpotOptim will propose values in the transformed space (e.g., -3.0), and TorchObjective will automatically untransform them (10^-3 = 0.001) before passing them to the model.
22.5 Custom Models
To use your own model, simply define a class inheriting from nn.Module whose __init__ accepts the hyperparameters you defined.
class MyCustomModel(nn.Module):
    def __init__(self, input_dim, output_dim, hidden_size, activation):
        super().__init__()
        # ... verify args match your ParameterSet names ...
        self.layer1 = nn.Linear(input_dim, hidden_size)
        self.act = getattr(nn, activation)()
        self.layer2 = nn.Linear(hidden_size, output_dim)

    def forward(self, x):
        return self.layer2(self.act(self.layer1(x)))
Ensure your ParameterSet includes hidden_size and activation (as a factor). input_dim and output_dim are automatically passed from the dataset info.
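A matching search space could then be defined as follows (a sketch; the activation levels are illustrative and must be valid class names in torch.nn):
from spotoptim.hyperparameters import ParameterSet

params = ParameterSet()
params.add_int("hidden_size", 16, 128, default=32)
params.add_factor("activation", ["ReLU", "Tanh", "Sigmoid"], default="ReLU")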