22  The TorchObjective Class for Hyperparameter Tuning

SpotOptim provides a TorchObjective class that simplifies the process of hyperparameter tuning for PyTorch models. It acts as a bridge between your PyTorch code (models, data, training logic) and the SpotOptim optimizer.

This tutorial guides you through the entire workflow, from defining a model to running a comprehensive hyperparameter optimization experiment.

22.1 Overview

The TorchObjective class automates several tedious tasks:

  1. Translating Hyperparameters: Converts the vector of values proposed by SpotOptim into a dictionary of named parameters with the correct types (int, float, categorical); a conceptual sketch follows this list.
  2. Model Instantiation: Creates a new instance of your model for each evaluation with the specific hyperparameters.
  3. Data Loading: Handles creating PyTorch DataLoaders, including dynamic batch sizes.
  4. Training & Evaluation: Runs the training loop and returns validation metrics (e.g., validation loss, MSE).
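
To make the first point concrete, here is a rough, library-independent sketch of what that translation step does conceptually. The variable names, ordering, and rounding rule below are illustrative assumptions, not the actual TorchObjective internals; the real class derives all of this from your ParameterSet.

# Illustrative sketch only -- not the actual TorchObjective implementation.
var_name = ["lr", "num_hidden_layers", "batch_size"]
var_type = ["float", "int", "int"]
var_trans = ["log10", None, None]

def to_named_params(x):
    """Map a raw vector from the optimizer to a typed, named dict."""
    params = {}
    for value, name, vtype, trans in zip(x, var_name, var_type, var_trans):
        if trans == "log10":
            value = 10.0 ** value      # undo the log10 transform
        if vtype == "int":
            value = int(round(value))  # round and cast integer parameters
        params[name] = value
    return params

print(to_named_params([-3.0, 2.2, 63.7]))
# {'lr': 0.001, 'num_hidden_layers': 2, 'batch_size': 64}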

22.2 Workflow Steps

The typical workflow involves 5 steps:

  1. Define Model: Create a standard PyTorch nn.Module.
  2. Prepare Data: Wrap your data in a SpotOptim data container.
  3. Define Hyperparameters: Specify what to tune using ParameterSet.
  4. Create Experiment: Bundle everything into an ExperimentControl object.
  5. Optimize: Wrap the experiment in a TorchObjective and use SpotOptim to find the best configuration.

22.3 Step-by-Step Example

Let’s walk through a complete example tuning a Multi-Layer Perceptron (MLP) on the California Housing dataset.

22.3.1 1. Define the Model

You can use any standard PyTorch model. For this example, we’ll use the built-in MLP class from spotoptim.nn.mlp, but you could define your own nn.Module (see Section 22.5).

import torch
import torch.nn as nn
from spotoptim.nn.mlp import MLP

22.3.2 2. Prepare the Data

Load your data (numpy arrays or tensors) and wrap it in SpotDataFromArray.

from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from spotoptim.core.data import SpotDataFromArray

# Load data
data = fetch_california_housing()
X, y = data.data, data.target.reshape(-1, 1)

# Split and scale
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)
scaler_X = StandardScaler()
scaler_y = StandardScaler()

X_train = scaler_X.fit_transform(X_train)
X_val = scaler_X.transform(X_val)
y_train = scaler_y.fit_transform(y_train)
y_val = scaler_y.transform(y_val)

# Create SpotData container
spot_data = SpotDataFromArray(
    x_train=X_train, y_train=y_train,
    x_val=X_val, y_val=y_val
)
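
An optional sanity check of the split (plain NumPy shapes, no SpotOptim API involved): the 8 features and 1 target column correspond to the input_dim and output_dim that are later passed to the model automatically (see Section 22.5).

print(X_train.shape, y_train.shape)  # (16512, 8) (16512, 1)
print(X_val.shape, y_val.shape)      # (4128, 8) (4128, 1)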

22.3.3 3. Define Hyperparameters

Use ParameterSet to define the search space. You can tune floats, integers, and categorical factors.

from spotoptim.hyperparameters import ParameterSet

params = ParameterSet()

# Tuning Learning Rate (log scale)
params.add_float("lr", 1e-4, 1e-1, default=1e-3, transform="log10")

# Tuning Model Architecture
params.add_int("num_hidden_layers", 1, 3, default=2)
params.add_int("l1", 16, 128, default=32)
params.add_float("dropout", 0.0, 0.5, default=0.0)

# Tuning Optimization Parameters
params.add_int("epochs", 10, 50, default=20)
params.add_int("batch_size", 32, 256, default=64)  # Batch size can now be tuned!
The resulting ParameterSet:

ParameterSet(
    lr=Parameter(
            name='lr',
            var_name='lr',
            bounds=Bounds(low=0.0001, high=0.1),
            default=0.001,
            transform='log10',
            type='float'
        ),
    num_hidden_layers=Parameter(
            name='num_hidden_layers',
            var_name='num_hidden_layers',
            bounds=Bounds(low=1, high=3),
            default=2,
            transform=None,
            type='int'
        ),
    l1=Parameter(
            name='l1',
            var_name='l1',
            bounds=Bounds(low=16, high=128),
            default=32,
            transform=None,
            type='int'
        ),
    dropout=Parameter(
            name='dropout',
            var_name='dropout',
            bounds=Bounds(low=0.0, high=0.5),
            default=0.0,
            transform=None,
            type='float'
        ),
    epochs=Parameter(
            name='epochs',
            var_name='epochs',
            bounds=Bounds(low=10, high=50),
            default=20,
            transform=None,
            type='int'
        ),
    batch_size=Parameter(
            name='batch_size',
            var_name='batch_size',
            bounds=Bounds(low=32, high=256),
            default=64,
            transform=None,
            type='int'
        ),
)

22.3.4 4. Create Experiment Control

The ExperimentControl object bundles the model class, data, hyperparameters, and training settings into a single configuration.

import torch
from spotoptim.core.experiment import ExperimentControl

experiment = ExperimentControl(
    experiment_name="california_housing_tuning",
    model_class=MLP,
    dataset=spot_data,
    hyperparameters=params,
    metrics=["val_loss"],
    device="cpu",  # or "cuda" if available
    loss_function=nn.MSELoss(),
    num_workers=0,
    batch_size=64 # Default fallback if not tuned
)

22.3.5 5. Initialize TorchObjective

This wraps the experiment into a callable objective function that SpotOptim can use. The objective also exposes the search-space metadata (bounds, var_type, var_name, var_trans) derived from the ParameterSet, which is passed to the optimizer in the next step.

from spotoptim.function.torch_objective import TorchObjective

objective = TorchObjective(experiment)

22.3.6 6. Run Optimization

Now use SpotOptim to find the best hyperparameters.

from spotoptim import SpotOptim
import numpy as np

optimizer = SpotOptim(
    fun=objective,
    bounds=objective.bounds,
    var_type=objective.var_type,
    var_name=objective.var_name,
    var_trans=objective.var_trans, # Use defined transformations (e.g., log10 for lr)
    n_initial=5,    # Number of initial random points
    max_iter=10,    # Total evaluations (initial + sequential)
    seed=42,        # For reproducibility
    verbose=True
)

result = optimizer.optimize()

print("\nBest Configuration Found:")
best_params = objective._get_hyperparameters(result.x)
for k, v in best_params.items():
    print(f"{k}: {v}")

print(f"\nBest Validation Loss: {result.fun:.5f}")

TensorBoard logging disabled
Initial best: f(x) = 0.629256
Iter 1 | Best: 0.625072 | Rate: 1.00 | Evals: 60.0%
Iter 2 | Best: 0.584372 | Rate: 1.00 | Evals: 70.0%
Iter 3 | Best: 0.584372 | Curr: 0.588897 | Rate: 0.67 | Evals: 80.0%
Iter 4 | Best: 0.584372 | Curr: 0.600478 | Rate: 0.50 | Evals: 90.0%
Iter 5 | Best: 0.584372 | Curr: 0.612876 | Rate: 0.40 | Evals: 100.0%

Best Configuration Found:
lr: 0.09933581768039416
num_hidden_layers: 2
l1: 64
dropout: 0.5
epochs: 26
batch_size: 239

Best Validation Loss: 0.58437

22.4 Advanced Features

22.4.1 Tuning Batch Size

TorchObjective supports dynamic batch size tuning. By adding batch_size to your ParameterSet (as shown above), the objective function automatically recreates the DataLoaders with the batch size proposed for each evaluation.

This allows you to optimize the trade-off between training speed (larger batches) and convergence quality (often better with smaller batches).

22.4.2 Tuning Architecture

You can tune structural parameters like:

  • num_hidden_layers: Depth of the network.
  • l1: Width of the layers (first hidden layer size, propagated if others are not specified).
  • dropout: Regularization strength.
  • Optimization method: You can even tune the optimizer itself (e.g., “Adam” vs “SGD”) if you add it as a factor hyperparameter and your model class supports a get_optimizer method (like MLP does).

Example adding optimizer tuning:

# Add optimizer choice
params.add_factor("optimizer", ["Adam", "SGD", "RMSprop"], default="Adam")
The parameter set now also contains the optimizer factor:

ParameterSet(
    lr=Parameter(
            name='lr',
            var_name='lr',
            bounds=Bounds(low=0.0001, high=0.1),
            default=0.001,
            transform='log10',
            type='float'
        ),
    num_hidden_layers=Parameter(
            name='num_hidden_layers',
            var_name='num_hidden_layers',
            bounds=Bounds(low=1, high=3),
            default=2,
            transform=None,
            type='int'
        ),
    l1=Parameter(
            name='l1',
            var_name='l1',
            bounds=Bounds(low=16, high=128),
            default=32,
            transform=None,
            type='int'
        ),
    dropout=Parameter(
            name='dropout',
            var_name='dropout',
            bounds=Bounds(low=0.0, high=0.5),
            default=0.0,
            transform=None,
            type='float'
        ),
    epochs=Parameter(
            name='epochs',
            var_name='epochs',
            bounds=Bounds(low=10, high=50),
            default=20,
            transform=None,
            type='int'
        ),
    batch_size=Parameter(
            name='batch_size',
            var_name='batch_size',
            bounds=Bounds(low=32, high=256),
            default=64,
            transform=None,
            type='int'
        ),
    optimizer=Parameter(
            name='optimizer',
            var_name='optimizer',
            bounds=['Adam', 'SGD', 'RMSprop'],
            default='Adam',
            transform=None,
            type='factor'
        ),
)

22.4.3 Log-Scale Transformations

For parameters that span several orders of magnitude (such as the learning rate), a log transformation is highly recommended. ParameterSet handles this easily:

# Search lr in [0.0001, 0.1] but optimizer sees [-4, -1]
params.add_float("lr", 1e-4, 1e-1, transform="log10")

SpotOptim will propose values in the transformed space (e.g., -3.0), and TorchObjective will automatically untransform them (10^-3 = 0.001) before passing them to the model.
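
A quick round trip in plain Python shows the mechanics (just the arithmetic, not the library internals):

import math

lr_low, lr_high = 1e-4, 1e-1

# What the optimizer searches over after the log10 transform
low_t, high_t = math.log10(lr_low), math.log10(lr_high)  # -4.0, -1.0

# A value proposed in the transformed space ...
proposed = -3.0

# ... is untransformed before it reaches the model
lr = 10.0 ** proposed  # 0.001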

22.5 Custom Models

To use your own model, simply define a class inheriting from nn.Module whose __init__ accepts the hyperparameters you defined.

class MyCustomModel(nn.Module):
    def __init__(self, input_dim, output_dim, hidden_size, activation):
        super().__init__()
        # ... verify args match your ParameterSet names ...
        self.layer1 = nn.Linear(input_dim, hidden_size)
        self.act = getattr(nn, activation)()
        self.layer2 = nn.Linear(hidden_size, output_dim)

    def forward(self, x):
        return self.layer2(self.act(self.layer1(x)))

Ensure your ParameterSet includes hidden_size and activation (as a factor). input_dim and output_dim are automatically passed from the dataset info.
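
A matching search space for MyCustomModel could look like the following sketch. It reuses the add_int and add_factor calls shown earlier; the bounds and activation choices are illustrative assumptions.

from spotoptim.hyperparameters import ParameterSet

params = ParameterSet()

# Width of the hidden layer (bounds are illustrative)
params.add_int("hidden_size", 16, 256, default=64)

# Activation as a categorical factor; names must match attributes of torch.nn
params.add_factor("activation", ["ReLU", "Tanh", "Sigmoid"], default="ReLU")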