content/axon.md
Lines changed: 3 additions & 3 deletions
@@ -15,7 +15,7 @@ A fundamental challenge for any theoretical understanding of something as comple
15
15
16
16
In this context, a primary goal in generating the extensive content here on [compcogneuro.org](https://compcogneuro.org) is to document the full extent to which the current version of Axon captures the existing scientific findings in neuroscience and cognitive psychology, so that the interested reader may form their own opinion about the extent to which the model provides an accurate picture of brain function. Feedback on any and all such issues is encouraged, using the [github discussions](https://github.com/compcogneuro/web/discussions) forum. The [github issues](https://github.com/compcogneuro/web/issues) can be used to report typos or other such "bugs", and pull requests for suggested fixes or other contributions are always welcome, and provide a way to document contributions.
17
17
18
-
Ultimately, the definitive step in the scientific method is direct empirical tests of the specific predictions from the model, of which there have been a large number over the years, as documented in the relevant places herein. Perhaps the most central such test is reported in [[Jiang et al 2025]], which directly tests the [[temporal derivative]] form of [[synaptic plasticity]] that drives learning in the Axon model.
18
+
Ultimately, the definitive step in the scientific method is direct empirical tests of the specific predictions from the model, of which there have been a large number over the years, as documented in the relevant places herein. Perhaps the most central such test is reported in [[Jiang et al 2026]], which directly tests the [[temporal derivative]] form of [[synaptic plasticity]] that drives learning in the Axon model.
19
19
20
20
## Computational motivation
21
21
@@ -57,7 +57,7 @@ The central elements of Axon in terms of neural and computational mechanisms are
57
57
58
58
The discrete spiking behavior of these neurons enables effective graded information integration over time in a way that continuous [[rate code activation]] communication does not, by allowing many different signals to be communicated over time, competing for the overall control of the network activation state as a function of the collective integration of spikes within the neurons in the network. As a result, Axon models are much more robust and well-behaved overall than their [[Leabra]] rate-code based counterparts, especially with respect to [[constraint satisfaction]] computation.
59
59
60
-
*[[Error-driven learning]] based on errors computed via a [[temporal derivative]] that naturally supports [[predictive learning]], as the difference over time of network activity states representing the prediction followed by the outcome. Local [[synaptic plasticity]] based on the competition between kinases updating at different rates, i.e., the [[kinase algorithm]], naturally computes the error gradient via the temporal derivative dynamic. The result is a fully biologically plausible form of the computationally powerful [[error backpropagation]] algorithm, as shown by the [[GeneRec]] algorithm. Initial empirical support for this mechanism is reported in [[Jiang et al 2025]], in electrophysiological measurements of synaptic plasticity in a rodent preparation.
60
+
*[[Error-driven learning]] based on errors computed via a [[temporal derivative]] that naturally supports [[predictive learning]], as the difference over time of network activity states representing the prediction followed by the outcome. Local [[synaptic plasticity]] based on the competition between kinases updating at different rates, i.e., the [[kinase algorithm]], naturally computes the error gradient via the temporal derivative dynamic. The result is a fully biologically plausible form of the computationally powerful [[error backpropagation]] algorithm, as shown by the [[GeneRec]] algorithm. Initial empirical support for this mechanism is reported in [[Jiang et al 2026]], in electrophysiological measurements of synaptic plasticity in a rodent preparation.
61
61
62
62
The combination of robust error-driven learning and biologically-detailed spiking neurons in Axon enables these neurons to learn to perform arbitrary computational and cognitive tasks. Furthermore, the availability of a clear computational measure of performance in terms of overall learning capability across a wide range of tasks has enabled the optimization of all the biological parameters to maximize learning performance. There is a consistent set of such parameters that generally works best across all the tasks investigated to date, and thus the additional degrees of freedom associated with these parameters are generally eliminated from consideration in constructing new models, greatly reducing the effective degrees of freedom of the model.
63
63
@@ -105,7 +105,7 @@ There will be many practical challenges associated with scaling up these models,
105
105
106
106
We can summarize the overall approach by way of answering several key questions that one might reasonably ask in evaluating a theoretical and computational model of the human brain:
107
107
108
-
1._Is it scientifically accurate?_ The Axon mechanisms are all grounded in detailed [[neuroscience]] data, and produce known [[cognition|cognitive]] and behavioral phenomena accurately. There are no significant errors of commission in the mechanisms included: each such mechanism has solid evidence in support of it, including critically the basis for powerful error-driven learning via a [[temporal derivative]] computed by the competition between a faster LTP-promoting CaMKII pathway and a slower opposing LTD-promoting DAPK1 pathway in the [[kinase algorithm]]. An initial experimental test of this hypothesis ([[Jiang et al 2025]]) shows consistent evidence.
108
+
1._Is it scientifically accurate?_ The Axon mechanisms are all grounded in detailed [[neuroscience]] data, and produce known [[cognition|cognitive]] and behavioral phenomena accurately. There are no significant errors of commission in the mechanisms included: each such mechanism has solid evidence in support of it, including critically the basis for powerful error-driven learning via a [[temporal derivative]] computed by the competition between a faster LTP-promoting CaMKII pathway and a slower opposing LTD-promoting DAPK1 pathway in the [[kinase algorithm]]. An initial experimental test of this hypothesis ([[Jiang et al 2026]]) shows consistent evidence.
109
109
110
110
2._Does it have a clear principled basis for effective computation?_ The principle of [[search]] through high-dimensional spaces unifies our understanding of both learning and online computation through [[optimized representations]] (via [[constraint satisfaction]] supported by [[bidirectional connectivity]]). The principal challenge is the [[curse of dimensionality]], which requires _dedicated-parallel_, _gradient-based_ search mechanisms. Most of the existing [[reinforcement learning#model-based]] reinforcement learning mechanisms do not scale well because they involve serial search of one form or another. By contrast, the [[Rubicon]] framework leverages the parallel mechanisms in Axon, along with a neuroscience-based [[computational-cognitive-neuroscience#reverse engineering]] of the results of millions of years of parallel [[evolution|evolutionary]] search to build in stronger [[bias-variance tradeoff|biases]] that shape and constrain goal-driven learning.
content/bidirectional-connectivity.md
Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@ Relative to most [[abstract neural network]] (ANN) models, [[Axon]] is unique in
9
9
10
10
Learning is also much more difficult in the context of complex activation dynamics, and interestingly there are surprisingly impressive results from [[reservoir computing]] networks that eschew learning within bidirectionally connected networks entirely, using them instead as "reservoirs" of complex dynamical activity states from which signals can be decoded via simpler feedforward [[error-driven learning]] mechanisms.
11
11
12
-
By contrast, the form of learning in [[Axon]] depends critically on bidirectional excitatory connectivity for propagating error signals throughout the network, and can tune large, complex bidirectional networks to develop effective [[predictive learning]] representations of the environment, leveraging the principle of learning based on a [[temporal derivative]]. There is now experimental evidence consistent with this form of learning in at least one specific, widely-studied pathway involving cortical pyramidal neurons and synaptic mechanisms that exist throughout the neocortex ([[Jiang et al 2025]]).
12
+
By contrast, the form of learning in [[Axon]] depends critically on bidirectional excitatory connectivity for propagating error signals throughout the network, and can tune large, complex bidirectional networks to develop effective [[predictive learning]] representations of the environment, leveraging the principle of learning based on a [[temporal derivative]]. There is now experimental evidence consistent with this form of learning in at least one specific, widely-studied pathway involving cortical pyramidal neurons and synaptic mechanisms that exist throughout the neocortex ([[Jiang et al 2026]]).
13
13
14
14
From a computational cost perspective, bidirectional connectivity is very expensive because it doubles the number of synaptic connections, and requires roughly 200x iterations through the network to process a single input. This significantly limits the ability to scale up the models, which has been the primary driver of impressive computational performance in current feedforward ANN models. Nevertheless, as parallel compute hardware continues to improve, this limitation will hopefully be overcome (and the current version of [[Axon]] runs efficiently on any GPU, using WebGPU so it works through the browser too). For the time being, the models do focus more on capturing the principles rather than the kinds of raw performance improvements that come with scaling (see [[bias-variance tradeoff]] for more discussion).
content/cerebellum.md
Lines changed: 6 additions & 4 deletions
@@ -465,13 +465,15 @@ The vestibulo-ocular reflex (VOR) is a thoroughly studied aspect of cerebellar f
465
465
466
466
The VOR is directly mediated by brainstem circuits driven by [[vestibular]] sensory inputs ([[#figure_vor-anatomy]]). In this case, which involves the vestibular system, the evolutionarily most ancient aspect of cerebellar function, the functional equivalents of the cerebellar nucleus neurons are located directly within the secondary vestibular nuclear complex (VNC), as shown in the figure. These neurons directly participate in the VOR, whereas in other cases the cerebellar nucleus neurons would typically be modulatory onto other brainstem circuits, e.g., in the MDJ and other areas.
467
467
468
-
The reflexive circuit shown in [[#figure_vor-anatomy]] is typical of many similar such circuits in different parts of the brainstem, where sensory signals directly drive associated motor behavior, via secondary neurons that can be modulated to enable or disable the reflex, and to tune its gain. The ability to tune the VOR in response to perturbations in the visual input (e.g., by wearing glasses) is critically dependent on the cerebellum.
468
+
The reflexive circuit shown in [[#figure_vor-anatomy]] is typical of many similar circuits in different parts of the brainstem, where sensory signals directly drive associated motor behavior, via secondary neurons that can be modulated to enable or disable the reflex.
469
469
470
-
This contribution of the cerebellum is summarized by its contributions to the _VOR gain_ factor: the ratio of head motion to compensatory eye motion in the opposite direction. For the normal, baseline case, this gain factor is 1. But experimental manipulations including changing the magnification in the eyes or stimulating the vestibular nerves causes the gain to change, in a cerebellum-dependent manner.
470
+
Although it is described as a reflex, the key cerebellar contribution to the VOR is to _anticipate_ (through downbound forward model pathways) the sensory effects of head motion, so that the compensatory eye movements are not always lagging behind, and instead can be anticipatory and more precisely negate the effects of head motion.
471
471
472
-
The VOR gain is a _multiplicative_ factor, and yet neurons generally can not directly drive multiplicative effects on other neurons. Instead, the way this is accomplished in these circuits is through an [[opponent]] organization, so that the modulatory neuron excites _both_ a _pull towards_ and a _push away_ from the target action. As the level of excitatory drive is modulated, the overall range of firing of these opponent neurons will vary as well, and it is this overall range that corresponds to the effective gain factor.
472
+
Logically, this forward model would be one that is based on the actual full-field visual motion signals from the retina that would otherwise be produced by the head motion, because those will have the relevant temporal envelope for the actual motion signals that are being canceled. However, there is a catch-22 associated with those signals: they will be zeroed out by the compensatory eye movements. Therefore, it is important that the specific driving sensory input that is being anticipated comes from the vestibular senses, which will always be activated by head motion even with the active VOR eye movements.
473
+
474
+
In experimental studies, the contribution of the cerebellum is summarized by its effect on the _VOR gain_ factor: the ratio of compensatory eye motion (in the opposite direction) to head motion. For the normal, baseline case, this gain factor is 1. But experimental manipulations, including changing the magnification in the eyes (via glasses) or stimulating the vestibular nerves, cause the gain to change in a cerebellum-dependent manner.
473
475
474
-
TODO: key point: eye motor commands must _anticipate_ the actual head movement, based on efferent copy, so this requires the forward model prediction -- not just a simple gain learning function, but actual anticipatory learning in the vestib -> motor pathways. Cerebellar cortex can be thought of modulating gain, by affecting learning in vestib cells.
476
+
The VOR gain is a _multiplicative_ factor, and yet neurons generally cannot directly drive multiplicative effects on other neurons. Instead, the way this is accomplished in these circuits is through an [[opponent]] organization, so that the modulatory neuron excites _both_ a _pull towards_ and a _push away_ from the target action. As the level of excitatory drive is modulated, the overall range of firing of these opponent neurons will vary as well, and it is this overall range that corresponds to the effective gain factor.
475
477
476
478
TODO: key question of how zero visual motion as an error signal translates into learning -- this zeroing out is probably true for many other cases.
content/kinase-algorithm.md
Lines changed: 2 additions & 2 deletions
@@ -13,7 +13,7 @@ The relevant background for this algorithm is presented in the following pages:
13
13
14
14
*[[GeneRec]] derives a concrete learning algorithm directly from the mathematics of [[error backpropagation]], which uses [[bidirectional connectivity]] to propagate error gradients throughout the [[neocortex]]. The kinase algorithm leverages the same principles at a computational level, while using more directly biologically based mechanisms that also have some important quantitative differences in the gradients computed.
15
15
16
-
*[[Jiang et al 2025]] presents initial direct evidence showing that the direction of synaptic plasticity in neurons recorded in the mouse CA1 area is consistent with the temporal derivative hypothesis.
16
+
*[[Jiang et al 2026]] presents initial direct evidence showing that the direction of synaptic plasticity in neurons recorded in the mouse CA1 area is consistent with the temporal derivative hypothesis.
17
17
18
18
Here, we build on these foundations to describe the detailed mechanisms that actually drive learning in the Axon models, which represent an attempt to satisfy constraints from neuroscience, computational efficacy, and computational cost.
19
19
@@ -23,7 +23,7 @@ At a big-picture level, the two central ideas behind the kinase algorithm are:
23
23
24
24
* Apply a cascade of simple [[exponential integration]] steps to simulate the complex biochemical processes that follow from this Ca++ influx, with time constants optimized based on computational performance across a wide range of tasks. The final two steps in this cascade implement the [[temporal derivative]] computation where the faster penultimate step drives LTP (weight increases) while the final slower step drives LTD (weight decreases).
25
25
26
-
This strategy leverages biophysically constrained mechanisms where they are well-established, while adopting a more abstracted computationally motivated approach to the complexities of the subsequent biochemical processes, which are not yet sufficiently specified to support a more bottom-up approach. The overall mechanism behind the [[temporal derivative]] is supported by the general properties of the CaMKII and DAPK1 kinases and related mechanisms, as described in [[synaptic plasticity]], and by the initial empirical results of [[Jiang et al 2025]].
26
+
This strategy leverages biophysically constrained mechanisms where they are well-established, while adopting a more abstracted computationally motivated approach to the complexities of the subsequent biochemical processes, which are not yet sufficiently specified to support a more bottom-up approach. The overall mechanism behind the [[temporal derivative]] is supported by the general properties of the CaMKII and DAPK1 kinases and related mechanisms, as described in [[synaptic plasticity]], and by the initial empirical results of [[Jiang et al 2026]].
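The final two steps of the cascade described above can be illustrated with a minimal sketch. It is not the Axon implementation: the function names, the time constants (`tau_fast`, `tau_slow`), and the Ca++ traces are hypothetical values chosen only to show how a faster integrator feeding a slower one yields a temporal-derivative signal whose sign tracks whether activity is rising or falling.

```python
def exp_integrate(value, target, tau):
    """One exponential-integration step: move value toward target at rate 1/tau."""
    return value + (target - value) / tau


def run_cascade(ca_trace, tau_fast=10.0, tau_slow=40.0):
    """Integrate a Ca++-like trace through a fast step and then a slower step.

    tau_fast and tau_slow are illustrative assumptions, not Axon's
    optimized time constants. Returns (fast, slow, fast - slow).
    """
    fast = 0.0  # faster penultimate step (LTP-promoting in the text above)
    slow = 0.0  # slower final step (LTD-promoting in the text above)
    for ca in ca_trace:
        fast = exp_integrate(fast, ca, tau_fast)
        slow = exp_integrate(slow, fast, tau_slow)  # cascade: slow integrates fast
    # Positive when recent activity exceeds the longer-running average,
    # negative when it has dropped below it.
    return fast, slow, fast - slow


if __name__ == "__main__":
    rising = [0.2] * 100 + [0.8] * 100   # weaker activity followed by stronger
    falling = [0.8] * 100 + [0.2] * 100  # stronger activity followed by weaker
    for name, trace in (("rising", rising), ("falling", falling)):
        fast, slow, delta = run_cascade(trace)
        print(f"{name}: fast={fast:.3f} slow={slow:.3f} delta={delta:+.3f}")
```

In this sketch a rising trace leaves the fast value above the slow one (a net weight increase), a falling trace does the reverse, and a steady trace drives the difference toward zero.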
27
27
28
28
However, at a pragmatic implementational level, it would be very expensive to compute the Ca++ influx based on the NMDA and VGCC biophysical equations for each synapse individually, given that synapses greatly outnumber neurons (e.g., $N^2$ in a fully connected model). Therefore, we instead break out the computation into two subcomponents:
content/learning.md
Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@ However, from a neuroscience perspective, the basic mechanisms of [[synaptic pla
10
10
11
11
Fortunately, it is possible to reconcile the computational imperative for error backpropagation with known synaptic plasticity mechanisms through the principle of the [[temporal derivative]], which naturally computes the error gradient at the heart of error backpropagation, as shown by the [[GeneRec]] algorithm. This algorithm uses [[bidirectional connectivity]] to communicate temporal derivatives throughout the network, allowing error signals arising anywhere to drive learning everywhere.
12
12
13
-
Biologically, a temporal derivative can be computed through the competition between two chemical processes with different time constants, and current [[synaptic plasticity]] research shows that the direction of synaptic weight change is determined by a competition between a faster process controlled by the _CaMKII_ kinase, versus a slower process controlled by the _DAPK1_ kinase. Initial empirical support for this [[kinase algorithm]] mechanism is reported in [[Jiang et al 2025]], in electrophysiological measurements of synaptic plasticity in a rodent preparation.
13
+
Biologically, a temporal derivative can be computed through the competition between two chemical processes with different time constants, and current [[synaptic plasticity]] research shows that the direction of synaptic weight change is determined by a competition between a faster process controlled by the _CaMKII_ kinase, versus a slower process controlled by the _DAPK1_ kinase. Initial empirical support for this [[kinase algorithm]] mechanism is reported in [[Jiang et al 2026]], in electrophysiological measurements of synaptic plasticity in a rodent preparation.
14
14
15
15
The temporal-derivative form of error-driven learning naturally supports [[predictive learning]], as the difference over time of network activity states representing the prediction followed by the outcome. This provides an ecologically-valid source of the error signals necessary for driving error-driven learning, and this same form of predictive learning is what drives LLMs, so we know it is capable of driving the formation of powerful [[cognitive]] representations.
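As a rough illustration of the prediction-then-outcome idea, the sketch below applies a simplified delta-style weight update in which each receiving unit's error is the difference between its outcome-state and prediction-state activity. This is an assumption-laden toy in the spirit of [[GeneRec]], not the Axon learning rule: the function name, the learning rate `lrate`, and the activity values are all hypothetical.

```python
def temporal_derivative_update(weights, sender_act, pred_act, outcome_act, lrate=0.1):
    """Return updated weights, where weights[i][j] connects sender i to receiver j.

    Each change is lrate * sender activity * (outcome - prediction), so the
    error signal is the difference over time between two activity states.
    """
    return [
        [w + lrate * s * (o - p) for w, p, o in zip(row, pred_act, outcome_act)]
        for row, s in zip(weights, sender_act)
    ]


if __name__ == "__main__":
    weights = [[0.0, 0.0], [0.0, 0.0]]
    sender = [1.0, 0.0]      # only the first sending unit is active
    prediction = [0.2, 0.9]  # receiver activity during the prediction state
    outcome = [0.7, 0.4]     # receiver activity during the outcome state
    print(temporal_derivative_update(weights, sender, prediction, outcome))
```

Weights from the active sender increase toward the under-predicted receiver and decrease toward the over-predicted one, while weights from the inactive sender are unchanged.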