Commit 423118e

Document activation/aggregation behavior found by adversarial review
Add a scaling, clamping, and special-case reference table to docs/activation.rst covering the 9 activation functions that deviate from their canonical textbook forms. Previously the only prose in the file was the introductory sentence "some of these functions are scaled differently from the canonical versions", with no indication of which functions, what scaling factors, or what clamping ranges are used. The new table documents:

- sigmoid, tanh, sin, softplus input scaling (5x, 2.5x, 5x, and 5x in with 0.2x out, respectively) and ±60 input clamping
- gauss ±3.4 input clamp and -5 exponent coefficient
- exp ±60 input clamp
- log 1e-7 input floor (so non-positive inputs return log(1e-7) rather than raise ValueError)
- inv ArithmeticError -> 0.0 fallback on division by zero or overflow
- lelu 0.005 leak coefficient explicitly noted as non-standard, with a reference to the conventional 0.01 used by PyTorch nn.LeakyReLU
- The 9 remaining activations (relu, elu, selu, identity, clamped, abs, hat, square, cube) listed as canonical with no scaling

Add empty-input behavior notes to all 7 aggregation function docs in docs/module_summaries.rst. Previously these were documented as pure math formulas (\max(x), \min(x), etc.) with no mention of what the functions return for an empty input iterable. max, min, maxabs, median, and mean all have explicit "if x else 0.0" guards in the source; sum inherits Python's sum([]) = 0 behavior; product inherits reduce's 1.0 initializer. These are deliberate and address a real edge case (orphaned nodes with no incoming connections), but the behavior was invisible to anyone reading the docs.

Rewrite the validate_aggregation prose in docs/module_summaries.rst from the inaccurate "takes at least one argument" to the accurate "callable with exactly one positional argument", and document the builtin early-return fallback that mirrors validate_activation's new behavior. Expand the :raises: clause to enumerate the three conditions under which InvalidAggregationFunction is raised.

Sphinx build passes: make clean html produces 18 warnings, all of which are pre-existing in other files (academic_research.rst, xor_example.rst, genome-interface.rst, installation.rst, reproduction-interface.rst); none originate in activation.rst or module_summaries.rst.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 93d0164 commit 423118e

2 files changed: +66 -14 lines changed

docs/activation.rst

Lines changed: 37 additions & 0 deletions
@@ -11,6 +11,43 @@ more of the functions' "interesting" behavior in the region :math:`\left[-1, 1\r

 The implementation of these functions can be found in the :py:mod:`activations` module.

+The following table summarizes the scaling, clamping, and non-canonical
+behavior of the activation functions that differ from their textbook forms.
+Input ``z`` is clamped to the given range before any output transform is
+applied. Functions not listed below (``relu``, ``elu``, ``selu``, ``identity``,
+``clamped``, ``abs``, ``hat``, ``square``, ``cube``) apply their canonical
+transforms directly with no scaling or clamping.
+
++-------------+--------------------+----------------+------------------------------------------------+
+| Function    | Input clamp        | Scaling        | Transform                                      |
++=============+====================+================+================================================+
+| sigmoid     | ±60 after 5×z      | 5× input       | :math:`1 / (1 + e^{-5z})`                      |
++-------------+--------------------+----------------+------------------------------------------------+
+| tanh        | ±60 after 2.5×z    | 2.5× input     | :math:`\tanh(2.5\,z)`                          |
++-------------+--------------------+----------------+------------------------------------------------+
+| sin         | ±60 after 5×z      | 5× input       | :math:`\sin(5\,z)`                             |
++-------------+--------------------+----------------+------------------------------------------------+
+| gauss       | ±3.4               | −5 in exponent | :math:`e^{-5 z^2}`                             |
++-------------+--------------------+----------------+------------------------------------------------+
+| softplus    | ±60 after 5×z      | 5× in, 0.2× out| :math:`0.2 \log(1 + e^{5z})`                   |
++-------------+--------------------+----------------+------------------------------------------------+
+| exp         | ±60                | none           | :math:`e^{z}`                                  |
++-------------+--------------------+----------------+------------------------------------------------+
+| log         | floor at ``1e-7``  | none           | :math:`\log(\max(10^{-7}, z))`; non-positive   |
+|             |                    |                | inputs yield :math:`\log(10^{-7}) \approx      |
+|             |                    |                | -16.118` rather than ``ValueError``.           |
++-------------+--------------------+----------------+------------------------------------------------+
+| inv         | none               | none           | :math:`1/z`, returning ``0.0`` on              |
+|             |                    |                | ``ArithmeticError`` (e.g. division by zero     |
+|             |                    |                | or overflow).                                  |
++-------------+--------------------+----------------+------------------------------------------------+
+| lelu        | none               | none           | :math:`z` if :math:`z > 0`, otherwise          |
+|             |                    |                | :math:`0.005\,z`. **Note: non-standard leak    |
+|             |                    |                | coefficient**; the conventional leaky ReLU     |
+|             |                    |                | uses ``0.01`` (e.g. PyTorch's                  |
+|             |                    |                | ``nn.LeakyReLU`` default).                     |
++-------------+--------------------+----------------+------------------------------------------------+
+
 abs
 ---

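
For readers who want to check the table against concrete numbers, here is a minimal sketch of six of the documented transforms. It is reconstructed only from the table rows in this hunk, not copied from the ``activations`` module; the function names and the ``leaky`` variable are assumptions made for illustration.

```python
import math


def sigmoid_activation(z):
    # Scale the input by 5, then clamp to [-60, 60] so math.exp cannot overflow.
    z = max(-60.0, min(60.0, 5.0 * z))
    return 1.0 / (1.0 + math.exp(-z))


def gauss_activation(z):
    # Clamp the raw input to [-3.4, 3.4]; the -5 coefficient sits in the exponent.
    z = max(-3.4, min(3.4, z))
    return math.exp(-5.0 * z ** 2)


def softplus_activation(z):
    # 5x on the way in (then clamp), 0.2x on the way out.
    z = max(-60.0, min(60.0, 5.0 * z))
    return 0.2 * math.log(1.0 + math.exp(z))


def log_activation(z):
    # Floor the input at 1e-7 so non-positive inputs never reach math.log.
    return math.log(max(1e-7, z))


def inv_activation(z):
    # ZeroDivisionError is a subclass of ArithmeticError, so 1/0 falls back to 0.0.
    try:
        return 1.0 / z
    except ArithmeticError:
        return 0.0


def lelu_activation(z):
    leaky = 0.005  # documented leak coefficient (hypothetical variable name)
    return z if z > 0.0 else leaky * z
```

With these definitions, ``log_activation(-5.0)`` evaluates to ``math.log(1e-7)`` and ``inv_activation(0.0)`` returns ``0.0``, matching the special cases listed in the table.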

docs/module_summaries.rst

Lines changed: 29 additions & 14 deletions
@@ -76,57 +76,67 @@ Has the built-in :term:`aggregation functions <aggregation function>`, code for
 .. py:function:: product_aggregation(x)

 An adaptation of the multiplication function to take an :pygloss:`iterable`.
+Returns ``1.0`` for an empty input (the multiplicative identity, from
+``reduce``'s initializer).

 :param x: The numbers to be multiplied together; takes any ``iterable``.
 :type x: list(:pytypes:`float <typesnumeric>`) or tuple(:pytypes:`float <typesnumeric>`) or set(:pytypes:`float <typesnumeric>`)
-:return: :math:`\prod(x)`
+:return: :math:`\prod(x)` for nonempty ``x``, otherwise ``1.0``.
 :rtype: :pytypes:`float <typesnumeric>`

 .. py:function:: sum_aggregation(x)

-Probably the most commonly-used aggregation function.
+Probably the most commonly-used aggregation function. Returns ``0`` for an
+empty input (via Python's built-in ``sum``).

 :param x: The numbers to find the sum of; takes any :pygloss:`iterable`.
 :type x: list(:pytypes:`float <typesnumeric>`) or tuple(:pytypes:`float <typesnumeric>`) or set(:pytypes:`float <typesnumeric>`)
-:return: :math:`\sum(x)`
+:return: :math:`\sum(x)` for nonempty ``x``, otherwise ``0``.
 :rtype: :pytypes:`float <typesnumeric>`

 .. py:function:: max_aggregation(x)

-Returns the maximum of the inputs.
+Returns the maximum of the inputs, or ``0.0`` for an empty input (e.g.
+an orphaned node with no incoming connections).

 :param x: The numbers to find the greatest of; takes any :pygloss:`iterable`.
 :type x: list(:pytypes:`float <typesnumeric>`) or tuple(:pytypes:`float <typesnumeric>`) or set(:pytypes:`float <typesnumeric>`)
-:return: :math:`\max(x)`
+:return: :math:`\max(x)` for nonempty ``x``, otherwise ``0.0``.
 :rtype: :pytypes:`float <typesnumeric>`

 .. py:function:: min_aggregation(x)

-Returns the minimum of the inputs.
+Returns the minimum of the inputs, or ``0.0`` for an empty input (e.g.
+an orphaned node with no incoming connections).

 :param x: The numbers to find the least of; takes any :pygloss:`iterable`.
 :type x: list(:pytypes:`float <typesnumeric>`) or tuple(:pytypes:`float <typesnumeric>`) or set(:pytypes:`float <typesnumeric>`)
-:return: :math:`\min(x)`
+:return: :math:`\min(x)` for nonempty ``x``, otherwise ``0.0``.
 :rtype: :pytypes:`float <typesnumeric>`

 .. py:function:: maxabs_aggregation(x)

-Returns the maximum by absolute value, which may be positive or negative. Envisioned as suitable for neural network pooling operations.
+Returns the maximum by absolute value, which may be positive or negative.
+Envisioned as suitable for neural network pooling operations. Returns
+``0.0`` for an empty input (e.g. an orphaned node with no incoming
+connections).

 :param x: The numbers to find the absolute-value maximum of; takes any :pygloss:`iterable`.
 :type x: list(:pytypes:`float <typesnumeric>`) or tuple(:pytypes:`float <typesnumeric>`) or set(:pytypes:`float <typesnumeric>`)
-:return: :math:`x_i, i = \text{argmax}\lvert\mathbf{x}\rvert`
+:return: :math:`x_i, i = \text{argmax}\lvert\mathbf{x}\rvert` for nonempty ``x``, otherwise ``0.0``.
 :rtype: :pytypes:`float <typesnumeric>`

 .. versionadded:: 0.92

 .. py:function:: median_aggregation(x)

-Returns the :py:func:`median <math_util.median2>` of the inputs.
+Returns the :py:func:`median <math_util.median2>` of the inputs, or
+``0.0`` for an empty input (e.g. an orphaned node with no incoming
+connections).

 :param x: The numbers to find the median of; takes any :pygloss:`iterable`.
 :type x: list(:pytypes:`float <typesnumeric>`) or tuple(:pytypes:`float <typesnumeric>`) or set(:pytypes:`float <typesnumeric>`)
-:return: The median; if there are an even number of inputs, takes the mean of the middle two.
+:return: The median for nonempty ``x`` (if there are an even number of inputs, takes the mean of the middle two); otherwise ``0.0``.
 :rtype: :pytypes:`float <typesnumeric>`

 .. versionadded:: 0.92
@@ -135,10 +145,11 @@ Has the built-in :term:`aggregation functions <aggregation function>`, code for

 Returns the arithmetic mean. Potentially maintains a more stable result than ``sum`` for changing numbers of :term:`enabled`
 :term:`connections <connection>`, which may be good or bad depending on the circumstances; having both available to the algorithm is advised.
+Returns ``0.0`` for an empty input (e.g. an orphaned node with no incoming connections).

 :param x: The numbers to find the mean of; takes any :pygloss:`iterable`.
 :type x: list(:pytypes:`float <typesnumeric>`) or tuple(:pytypes:`float <typesnumeric>`) or set(:pytypes:`float <typesnumeric>`)
-:return: The arithmetic mean.
+:return: The arithmetic mean for nonempty ``x``, otherwise ``0.0``.
 :rtype: :pytypes:`float <typesnumeric>`

 .. versionadded:: 0.92
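
The empty-input notes added to the aggregation docs can be illustrated with a short sketch. It follows the commit message's description ("if x else 0.0" guards, Python's ``sum([])``, and ``reduce``'s ``1.0`` initializer) but is not the library source. In particular, ``statistics.median`` stands in here for ``math_util.median2``, which is an assumption.

```python
from functools import reduce
from operator import mul
from statistics import median  # stand-in for math_util.median2 (assumption)


def product_aggregation(x):
    # The 1.0 initializer is also what reduce returns for an empty input.
    return reduce(mul, x, 1.0)


def sum_aggregation(x):
    # Built-in sum returns the integer 0 for an empty input.
    return sum(x)


def max_aggregation(x):
    return max(x) if x else 0.0


def min_aggregation(x):
    return min(x) if x else 0.0


def maxabs_aggregation(x):
    # Returns the signed element with the largest absolute value.
    return max(x, key=abs) if x else 0.0


def median_aggregation(x):
    return median(x) if x else 0.0


def mean_aggregation(x):
    return sum(x) / len(x) if x else 0.0
```

One caveat of the ``if x`` guard as sketched: it relies on the truthiness of a sized container such as a list or tuple. A generator object is always truthy, so an exhausted or empty generator would bypass the guard.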
@@ -152,11 +163,15 @@ Has the built-in :term:`aggregation functions <aggregation function>`, code for

 .. py:function:: validate_aggregation(function)

-Checks to make sure its parameter is a function that takes at least one argument.
+Checks that ``function`` is callable with exactly one positional argument.
+Returns early (accepting the callable) for CPython builtins whose
+signatures cannot be inspected via ``inspect.signature``.

 :param function: Object to be checked.
 :type function: :datamodel:`object <objects-values-and-types>`
-:raises InvalidAggregationFunction: If the object does not pass the tests.
+:raises InvalidAggregationFunction: If the object is not callable, its
+signature cannot be inspected (and it is not a builtin), or it cannot
+be invoked with exactly one positional argument.

 .. versionadded:: 0.92

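
The three ``:raises:`` conditions and the builtin early return described in this hunk could be implemented along the following lines. This is a sketch under stated assumptions, not the function from the ``aggregations`` module: the exception class is a local stand-in (its real base class is not shown in the diff), and using ``Signature.bind`` for the arity check is one possible approach.

```python
import inspect
import types


class InvalidAggregationFunction(TypeError):
    """Local stand-in for the library's exception (base class is an assumption)."""


def validate_aggregation(function):
    # Condition 1: the object must be callable at all.
    if not callable(function):
        raise InvalidAggregationFunction("A callable object is required.")

    try:
        signature = inspect.signature(function)
    except (TypeError, ValueError):
        # Some CPython builtins expose no introspectable signature; accept them.
        if isinstance(function, types.BuiltinFunctionType):
            return
        # Condition 2: signature unavailable and the object is not a builtin.
        raise InvalidAggregationFunction("Signature cannot be inspected.") from None

    try:
        # Condition 3: binding fails unless exactly one positional argument fits.
        signature.bind(object())
    except TypeError:
        raise InvalidAggregationFunction(
            "Must be callable with exactly one positional argument."
        ) from None
```

Under this sketch ``validate_aggregation(lambda x: 0.0)`` returns ``None``, while a two-argument or zero-argument lambda raises ``InvalidAggregationFunction``. A callable with extra defaulted or variadic parameters still passes, since it can be invoked with exactly one positional argument.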
