nn.mlp.MLP
nn.mlp.MLP(
    in_channels,
    hidden_channels=None,
    norm_layer=None,
    activation_layer=torch.nn.ReLU,
    inplace=None,
    bias=True,
    dropout=0.0,
    lr=1.0,
    l1=64,
    num_hidden_layers=2,
    output_dim=1,
)

This block implements the multi-layer perceptron (MLP) module.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| in_channels | int | Number of channels of the input. | required |
| hidden_channels | List[int] | List of the hidden channel dimensions. Note that the last element of this list is the output dimension of the network. | None |
| norm_layer | Callable[…, torch.nn.Module] | Norm layer that will be stacked on top of the linear layer. If None, this layer won’t be used. | None |
| activation_layer | Callable[…, torch.nn.Module] | Activation function which will be stacked on top of the normalization layer (if not None), otherwise on top of the linear layer. If None, this layer won’t be used. | torch.nn.ReLU |
| inplace | bool | Parameter for the activation layer, which can optionally do the operation in-place. A value of None uses the respective default values of the activation_layer and Dropout layer. | None |
| bias | bool | Whether to use bias in the linear layer. | True |
| dropout | float | The probability for the dropout layer. | 0.0 |
| lr | float | Unified learning rate multiplier. This value is automatically scaled to optimizer-specific learning rates using the map_lr() function. A value of 1.0 corresponds to the optimizer’s default learning rate. | 1.0 |
| l1 | int | Number of neurons in each hidden layer. Will only be used if hidden_channels is None. | 64 |
| num_hidden_layers | int | Number of hidden layers. Will only be used if hidden_channels is None. | 2 |
| output_dim | int | Output dimension of the network. Will only be used if hidden_channels is None. | 1 |
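For illustration, the normalization, activation, and dropout options can be combined as sketched below (torch.nn.BatchNorm1d and torch.nn.GELU are illustrative choices, not the defaults):

>>> import torch
>>> from spotoptim.nn.mlp import MLP
>>> mlp = MLP(
...     in_channels=10,
...     hidden_channels=[20, 1],
...     norm_layer=torch.nn.BatchNorm1d,   # norm layer stacked on top of each hidden linear layer
...     activation_layer=torch.nn.GELU,    # activation stacked on top of the norm layer
...     dropout=0.1,                       # dropout probability applied after the activation
... )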
Note
Parameter Definitions:

- hidden_channels: Defines the explicit architecture of the MLP. It is a list where each element is the size of a layer; the last element is the output dimension. Example: [32, 32, 1] means two hidden layers of size 32 and an output layer of size 1.
- l1 and num_hidden_layers: Helper parameters often used in hyperparameter optimization (see get_default_parameters()). They will only be used if hidden_channels is None.
  - l1: The number of neurons in each hidden layer.
  - num_hidden_layers: The number of hidden layers before the output layer.

  They describe the architecture in a more compact way but are less flexible than hidden_channels.

Relationship: To convert l1 and num_hidden_layers to hidden_channels for a given output_dim: hidden_channels = [l1] * num_hidden_layers + [output_dim]
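For instance, with the defaults (l1=64, num_hidden_layers=2, output_dim=1) this yields:

>>> l1, num_hidden_layers, output_dim = 64, 2, 1
>>> [l1] * num_hidden_layers + [output_dim]
[64, 64, 1]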
Examples
Basic usage:
>>> import torch
>>> from spotoptim.nn.mlp import MLP
>>> # Input: 10 features. One hidden layer with 20 neurons. Output: 30 features (the last entry of hidden_channels).
>>> mlp = MLP(in_channels=10, hidden_channels=[20, 30])
>>> x = torch.randn(5, 10)
>>> output = mlp(x)
>>> print(output.shape)
torch.Size([5, 30])

Using get_optimizer:
>>> model = MLP(in_channels=10, hidden_channels=[32, 1], lr=0.5)
>>> optimizer = model.get_optimizer("Adam")  # effective lr: 0.5 * 0.001 (Adam default) = 0.0005
>>> print(optimizer)
Adam (
Parameter Group 0
amsgrad: False
betas: (0.9, 0.999)
capturable: False
differentiable: False
eps: 1e-08
foreach: None
fused: None
lr: 0.0005
maximize: False
weight_decay: 0
)

Using l1 and num_hidden_layers parameters: This example shows how the hyperparameters suggested by get_default_parameters() map to a hidden_channels list, and how they can be passed directly to the constructor.
>>> input_dim = 10
>>> output_dim = 1
>>>
>>> # Hyperparameters (e.g., from spotoptim tuning)
>>> l1 = 64
>>> num_hidden_layers = 2
>>>
>>> # Equivalent hidden_channels list: [l1] * num_hidden_layers + [output_dim]
>>> # -> [64, 64, 1]: two hidden layers of size 64 and an output layer of size 1.
>>> # Alternatively, pass l1, num_hidden_layers, and output_dim directly:
>>> model = MLP(in_channels=input_dim, l1=l1, num_hidden_layers=num_hidden_layers, output_dim=output_dim)
>>> print(model)
MLP(
(0): Linear(in_features=10, out_features=64, bias=True)
(1): ReLU()
(2): Dropout(p=0.0, inplace=False)
(3): Linear(in_features=64, out_features=64, bias=True)
(4): ReLU()
(5): Dropout(p=0.0, inplace=False)
(6): Linear(in_features=64, out_features=1, bias=True)
(7): Dropout(p=0.0, inplace=False)
)

Getting default parameters for tuning:
>>> params = MLP.get_default_parameters()
>>> print(params.names())
['l1', 'num_hidden_layers', 'activation', 'lr', 'optimizer']

Methods
| Name | Description |
|---|---|
| get_default_parameters | Returns a ParameterSet populated with default hyperparameters for this model. |
| get_optimizer | Get a PyTorch optimizer configured for this model. |
get_default_parameters
nn.mlp.MLP.get_default_parameters()

Returns a ParameterSet populated with default hyperparameters for this model.
Note
Since the MLP structure is generic (a list of hidden channels), the default parameters provided here are a starting point that assumes a simple structure similar to LinearRegressor (l1 units per hidden layer, num_hidden_layers hidden layers). This may need adjustment for specific architectures.
Returns
| Name | Type | Description |
|---|---|---|
| ParameterSet | ParameterSet | Default hyperparameters. |
Examples
>>> params = MLP.get_default_parameters()
>>> print(params.names())
['l1', 'num_hidden_layers', 'activation', 'lr', 'optimizer']

get_optimizer
nn.mlp.MLP.get_optimizer(optimizer_name='Adam', lr=None, **kwargs)

Get a PyTorch optimizer configured for this model.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| optimizer_name | str | Name of an optimizer class from torch.optim. | 'Adam' |
| lr | float | Unified learning rate multiplier. If None, uses self.lr. This value is automatically scaled to optimizer-specific learning rates. A value of 1.0 corresponds to the optimizer’s default learning rate. | None |
| **kwargs | Any | Additional optimizer-specific parameters. | {} |
Returns
| Name | Type | Description |
|---|---|---|
| | optim.Optimizer | Configured optimizer instance ready for training. |
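Examples

A minimal usage sketch, assuming additional keyword arguments are forwarded unchanged to the torch.optim class (here torch.optim.SGD; momentum is an illustrative optimizer-specific argument, not a spotoptim default):

>>> from spotoptim.nn.mlp import MLP
>>> model = MLP(in_channels=10, hidden_channels=[32, 1])
>>> optimizer = model.get_optimizer("SGD", momentum=0.9)  # momentum is passed via **kwargs
>>> optimizer.zero_grad()  # standard torch.optim.Optimizer methods are available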