positionalEncoding
PositionalEncoding
Bases: Module
The positional encoding module injects information about the relative or absolute position of the tokens in the sequence. The positional encodings have the same dimension as the embeddings, so that the two can be summed. Here, sine and cosine functions of different frequencies are used.
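The encodings follow the standard sinusoidal scheme, PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) and PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model)). A minimal sketch of how such a table can be built (the helper name `sinusoidal_encoding` is illustrative, not part of spotpython):

```python
import math
import torch

def sinusoidal_encoding(max_len: int, d_model: int) -> torch.Tensor:
    """Build a (max_len, d_model) table of sinusoidal positional encodings."""
    position = torch.arange(max_len).unsqueeze(1)  # shape (max_len, 1)
    # div_term[i] = 10000^(-2i / d_model), computed in log space for stability
    div_term = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
    pe = torch.zeros(max_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)  # even slots: sine
    pe[:, 1::2] = torch.cos(position * div_term)  # odd slots: cosine
    return pe

pe = sinusoidal_encoding(max_len=8, d_model=10)
print(pe.shape)  # torch.Size([8, 10])
print(pe[0])     # position 0: sin(0)=0 and cos(0)=1 alternate
```

Because neighboring dimensions share a frequency, each position gets a unique, smoothly varying fingerprint that downstream layers can learn to exploit.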
Parameters:

Name | Type | Description | Default
---|---|---|---
`d_model` | `int` | the embedding dimension; must be even | required
`dropout_prob` | `float` | the dropout probability | required
`max_len` | `int` | the maximum length of the incoming sequence. Usually related to the maximum batch size; it can be larger than the batch size, e.g., if prediction is done on a single test set. | `12552`
Shape
Input: x: Tensor, shape [seq_len, batch_size, embedding_dim]
Output: Tensor, shape [seq_len, batch_size, embedding_dim]
Notes
- No return value; torch's `register_buffer` method is used to register the positional encodings.
- Code adapted from the PyTorch "Transformer Tutorial".
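The note about `register_buffer` can be made concrete with a hedged re-implementation sketch (this mirrors the PyTorch tutorial pattern; it is not the spotpython source):

```python
import math
import torch
from torch import nn

class PESketch(nn.Module):
    """Illustrative positional-encoding module; not the spotpython implementation."""

    def __init__(self, d_model: int, dropout_prob: float = 0.0, max_len: int = 5000):
        super().__init__()
        self.dropout = nn.Dropout(p=dropout_prob)
        position = torch.arange(max_len).unsqueeze(1)
        div_term = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
        pe = torch.zeros(max_len, 1, d_model)
        pe[:, 0, 0::2] = torch.sin(position * div_term)
        pe[:, 0, 1::2] = torch.cos(position * div_term)
        # register_buffer: 'pe' is saved in state_dict and moved by .to(device),
        # but it is not a trainable parameter and receives no gradients
        self.register_buffer("pe", pe)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (seq_len, batch_size, d_model); broadcast the first seq_len encodings
        return self.dropout(x + self.pe[: x.size(0)])

m = PESketch(d_model=10)
print([name for name, _ in m.named_buffers()])  # ['pe']
print(len(list(m.parameters())))                # 0 -- the buffer is not trained
```

Registering the table as a buffer (rather than a parameter) is what keeps it out of the optimizer while still serializing and device-transferring it with the module.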
Reference
https://pytorch.org/tutorials/beginner/transformer_tutorial.html#positional-encoding
Examples:
from spotpython.light.transformer.positionalEncoding import PositionalEncoding
import torch

# number of tokens in the batch dimension
n = 3
# embedding dimension of each token; must be even
k = 10
pe = PositionalEncoding(d_model=k, dropout_prob=0)
x = torch.zeros(1, n, k)
# fill entry i with the constant value i, giving a tensor of shape (1, 3, 10)
for i in range(n):
    x[0, i, :] = i
print(f"Input shape: {x.shape}")
print(f"Input: {x}")
output = pe(x)
print(f"Output shape: {output.shape}")
print(f"Output: {output}")
position: tensor([[ 0],
[ 1],
[ 2],
...,
[99997],
[99998],
[99999]])
div_term: tensor([1.0000e+00, 1.5849e-01, 2.5119e-02, 3.9811e-03, 6.3096e-04])
Input shape: torch.Size([1, 3, 10])
Input: tensor([[[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
[2., 2., 2., 2., 2., 2., 2., 2., 2., 2.]]])
Output shape: torch.Size([1, 3, 10])
Output: tensor([[[0., 1., 0., 1., 0., 1., 0., 1., 0., 1.],
[1., 2., 1., 2., 1., 2., 1., 2., 1., 2.],
[2., 3., 2., 3., 2., 3., 2., 3., 2., 3.]]])
Source code in spotpython/light/transformer/positionalEncoding.py
forward(x)
Add positional encoding to the input tensor.
Parameters:

Name | Type | Description | Default
---|---|---|---
`x` | `Tensor` | the input tensor, shape [seq_len, batch_size, embedding_dim] | required
Returns:

Type | Description
---|---
`Tensor` | the input with positional encoding added, shape [seq_len, batch_size, embedding_dim]
Raises:

Type | Description
---|---
`IndexError` | if the positional encoding cannot be added to the input tensor
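The `IndexError` typically means the input sequence is longer than the `max_len` chosen at construction, so no encoding exists for the trailing positions. A hedged caller-side sketch of that failure mode (the guard below is illustrative; the real error is raised inside `forward`):

```python
import torch

max_len = 16                 # capacity chosen when the module was constructed
x = torch.zeros(32, 4, 10)   # seq_len=32 exceeds max_len
msg = None
try:
    if x.size(0) > max_len:
        # mimic the failure mode: no stored encoding for positions >= max_len
        raise IndexError(
            f"seq_len {x.size(0)} exceeds max_len {max_len}; "
            "rebuild PositionalEncoding with a larger max_len"
        )
except IndexError as err:
    msg = str(err)
print(msg)
```

Sizing `max_len` to the longest sequence expected at prediction time avoids this error entirely.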
Source code in spotpython/light/transformer/positionalEncoding.py