50+ layers missing [TrainableParameter] annotations — breaks tape-based autodiff training #1082

@ooples

Bug Description

More than 50 layer types have trainable parameter fields that are registered via `RegisterTrainableParameter()` but are missing the `[TrainableParameter]` attribute. The source generator relies on that attribute to produce `GetTrainableParameters()` and `SetTrainableParameters()`, so unannotated fields are left out of the generated methods and the tape-based autodiff system cannot discover them. Result: zero gradients and failed training for every network using these layers.

Impact

Any network using `TrainWithTape()` with an affected layer will silently produce zero gradients. This affects ColBERT, Siamese, InstructorEmbedding, RBF networks, and many others.

Affected Layers (50+ files)

Layers where `RegisterTrainableParameter` call count exceeds `[TrainableParameter]` attribute count:

| Layer | `[TrainableParameter]` attributes | `RegisterTrainableParameter` calls |
| --- | --- | --- |
| AttentionLayer | 1 | 12 |
| GRULayer | 0 | 18 |
| LSTMLayer | 2 | 24 |
| ConvLSTMLayer | 2 | 24 |
| RecurrentLayer | 2 | 9 |
| GraphTransformerLayer | 0 | 13 |
| DirectionalGraphLayer | 0 | 10 |
| MessagePassingLayer | 0 | 11 |
| HeterogeneousGraphLayer | 0 | 5 |
| HighwayLayer | 2 | 8 |
| MemoryWriteLayer | 2 | 10 |
| MemoryReadLayer | 2 | 8 |
| BatchNormalizationLayer | 1 | 2 |
| GroupNormalizationLayer | 1 | 2 |
| InstanceNormalizationLayer | 1 | 2 |
| LayerNormalizationLayer | 1 | 2 |
| OctonionLinearLayer | 0 | 2 |
| HyperbolicLinearLayer | 0 | 3 |
| SparseLinearLayer | 0 | 1 |
| QuantumLayer | 0 | 2 |
| PrimaryCapsuleLayer | 0 | 2 |
| SoftTreeLayer | 0 | 2 |
| SpatialTransformerLayer | 0 | 4 |
| FeedForwardLayer | 0 | 4 |
| GraphIsomorphismLayer | 0 | 4 |

... and ~25 more
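
The issue does not say how these counts were produced. As a rough way to reproduce the comparison, the sketch below counts attribute usages and registration calls in a source string with two regular expressions and flags a file when calls outnumber attributes. The class name `TrainableParameterAudit`, its methods, and the `*Layer.cs` file pattern are assumptions made for illustration, not part of the library. It is a plain text scan, so it also counts occurrences inside comments and will miss attributes written in a combined list such as `[Foo, TrainableParameter]`.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: a crude text-based audit, not part of the library.
public static class TrainableParameterAudit
{
    // Matches "[TrainableParameter]" and "[TrainableParameter(...)]".
    private static readonly Regex AttributePattern =
        new Regex(@"\[\s*TrainableParameter\s*(\(|\])", RegexOptions.Compiled);

    // Matches "RegisterTrainableParameter(" call sites.
    private static readonly Regex RegisterPattern =
        new Regex(@"\bRegisterTrainableParameter\s*\(", RegexOptions.Compiled);

    // Returns (attribute count, registration call count) for one source text.
    public static (int Attributes, int RegisterCalls) Count(string source) =>
        (AttributePattern.Matches(source).Count, RegisterPattern.Matches(source).Count);

    // A file is suspect when it registers more parameters than it annotates.
    public static bool IsSuspect(string source)
    {
        var (attributes, registerCalls) = Count(source);
        return registerCalls > attributes;
    }

    // Assumes layer sources are named "*Layer.cs" somewhere under rootDirectory.
    public static IEnumerable<string> FindSuspects(string rootDirectory) =>
        Directory.EnumerateFiles(rootDirectory, "*Layer.cs", SearchOption.AllDirectories)
                 .Where(path => IsSuspect(File.ReadAllText(path)));
}
```

Calling `TrainableParameterAudit.FindSuspects(...)` on the layers directory would give a candidate list to review by hand; a file with equal counts can still be wrong if a field carries the wrong `Role`.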

Fix

Add `[TrainableParameter(Role = PersistentTensorRole.Weights)]` or `[TrainableParameter(Role = PersistentTensorRole.Biases)]` to every trainable field that's registered via `RegisterTrainableParameter()`. The source generator will then include them in `GetTrainableParameters()` / `SetTrainableParameters()`.

The change is mechanical but must be done carefully: each field's `Role` must match the role that its `RegisterTrainableParameter` call passes.
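
To make the shape of the fix concrete, here is a minimal, self-contained sketch. The `PersistentTensorRole` enum and `TrainableParameterAttribute` declared here are stand-ins so the snippet compiles on its own; the library's real definitions may differ. `ExampleLayer`, its fields, and `AnnotationCheck` are hypothetical. The reflection lookup is only a runtime substitute for what the source generator does at compile time, used here to show that an unannotated field is invisible to anything that discovers parameters through the attribute.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Stand-ins for illustration only; the library's real types may differ.
public enum PersistentTensorRole { Weights, Biases }

[AttributeUsage(AttributeTargets.Field)]
public sealed class TrainableParameterAttribute : Attribute
{
    public PersistentTensorRole Role { get; set; }
}

// Hypothetical layer showing both the fixed and the buggy pattern.
public sealed class ExampleLayer
{
    [TrainableParameter(Role = PersistentTensorRole.Weights)]
    private double[] _weights = new double[4];

    [TrainableParameter(Role = PersistentTensorRole.Biases)]
    private double[] _biases = new double[2];

    // Bug pattern: a trainable field with no attribute, so discovery skips it.
    private double[] _gateWeights = new double[4];
}

public static class AnnotationCheck
{
    // Lists the fields that attribute-based discovery would find, with their roles.
    public static (string Field, PersistentTensorRole Role)[] AnnotatedFields(Type layerType) =>
        layerType
            .GetFields(BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public)
            .Select(f => (Info: f, Attr: f.GetCustomAttribute<TrainableParameterAttribute>()))
            .Where(x => x.Attr != null)
            .Select(x => (Field: x.Info.Name, Role: x.Attr.Role))
            .ToArray();
}
```

`AnnotationCheck.AnnotatedFields(typeof(ExampleLayer))` returns `_weights` and `_biases` but not `_gateWeights`; adding the attribute with the matching role to `_gateWeights` is the per-field change this issue asks for.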
