
Commit 1e3d29b

aymuos15 and ericspod authored
Fix GEGLU docstring: Sigmoid -> GELU (#8696)
## Summary

- Fixed the GEGLU docstring, which incorrectly stated the activation function was Sigmoid.
- The code correctly uses GELU, as specified in the original GEGLU paper.

## Details

- GLU uses Sigmoid: GLU(x) = σ(xW) ⊗ xV
- GEGLU uses GELU: GEGLU(x) = GELU(xW) ⊗ xV

Reference: https://arxiv.org/abs/2002.05202

Signed-off-by: Soumya Snigdha Kundu <soumya_snigdha.kundu@kcl.ac.uk>
Co-authored-by: Eric Kerfoot <17726042+ericspod@users.noreply.github.com>
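The two formulas above differ only in the gating nonlinearity. As a quick illustration, here is a dependency-free Python sketch of the GEGLU gating, using the exact error-function form of GELU; MONAI's actual layer operates on torch tensors, so treat this as a formula check rather than the library implementation:

```python
import math

def gelu(x: float) -> float:
    # Exact GELU: x * Phi(x), where Phi is the standard normal CDF.
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def geglu(x: list[float]) -> list[float]:
    # GEGLU(x) = x1 * GELU(x2): split the input in half along the
    # last dimension and gate one half with GELU of the other.
    assert len(x) % 2 == 0, "last dimension must be even"
    half = len(x) // 2
    x1, x2 = x[:half], x[half:]
    return [a * gelu(b) for a, b in zip(x1, x2)]

print(geglu([1.0, 2.0, 0.0, 1.0]))  # first output is 0.0 since GELU(0) = 0
```

Swapping `gelu` for the logistic sigmoid in the last line of `geglu` recovers plain GLU, which is exactly the distinction the docstring fix is about.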
1 parent 342bd7a commit 1e3d29b

File tree

1 file changed: +9 / -1 lines changed


monai/networks/blocks/activation.py

Lines changed: 9 additions & 1 deletion
@@ -168,7 +168,7 @@ class GEGLU(nn.Module):
     r"""Applies the element-wise function:
 
     .. math::
-        \text{GEGLU}(x) = x_1 * \text{Sigmoid}(x_2)
+        \text{GEGLU}(x) = x_1 * \text{GELU}(x_2)
 
     where :math:`x_1` and :math:`x_2` are split from the input tensor along the last dimension.
 
@@ -177,6 +177,14 @@ class GEGLU(nn.Module):
     Shape:
         - Input: :math:`(N, *, 2 * D)`
         - Output: :math:`(N, *, D)`, where `*` means, any number of additional dimensions
+
+    Examples::
+
+        >>> import torch
+        >>> from monai.networks.layers.factories import Act
+        >>> m = Act['geglu']()
+        >>> input = torch.randn(2, 8)  # last dim must be even
+        >>> output = m(input)
     """
 
     def forward(self, input: torch.Tensor):
