feat: named input/output ports for multi-input layers #1058

@ooples

Summary

Add named input/output port declarations to LayerBase<T> so layers can declare what inputs they need (e.g., DiffusionResBlock needs input + time_embed). Add a multi-input Forward overload that receives inputs by name.

Depends on: #1057 (GradientTape rewrite)

Problem

Every layer currently has Forward(Tensor<T> input) — single input only. Multi-input layers like DiffusionResBlock, AttentionLayer (Q/K/V), and CrossAttentionLayer work around this by:

  • Storing extra inputs as fields set before Forward (fragile, stateful)
  • Concatenating inputs into a single tensor (lossy, can't differentiate w.r.t. each input)
  • Having custom Forward(input, timeEmbed) overloads that bypass the base class (breaks polymorphism)

The training loop in NeuralNetworkBase.ForwardWithMemory assumes single-input layers chained sequentially. It can't express skip connections, residual paths, or auxiliary inputs.

Design

Port declarations

public record LayerPort(string Name, int[] Shape, bool Required = true);

public abstract class LayerBase<T>
{
    // Default: single input/output (backward compatible)
    public virtual IReadOnlyList<LayerPort> InputPorts =>
        [new LayerPort("input", InputShape)];

    public virtual IReadOnlyList<LayerPort> OutputPorts =>
        [new LayerPort("output", OutputShape)];

    // Multi-input forward — default delegates to single-input
    public virtual Tensor<T> Forward(IReadOnlyDictionary<string, Tensor<T>> inputs)
    {
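        // Look up the "input" port by name: dictionary enumeration order is not
        // guaranteed, so Values.First() is only safe when there is exactly one entry
        if (inputs.TryGetValue("input", out var input))
            return Forward(input);
        if (inputs.Count == 1)
            return Forward(inputs.Values.First());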

        throw new NotSupportedException(
            $"{GetType().Name} does not override multi-input Forward. " +
            $"Required ports: {string.Join(", ", InputPorts.Select(p => p.Name))}");
    }

    // Existing single-input forward stays unchanged
    public abstract Tensor<T> Forward(Tensor<T> input);
}

Multi-input layer example

public class DiffusionResBlock<T> : LayerBase<T>
{
    public override IReadOnlyList<LayerPort> InputPorts =>
    [
        new("input", InputShape),
        new("time_embed", [_timeEmbedDim])
    ];

    public override Tensor<T> Forward(IReadOnlyDictionary<string, Tensor<T>> inputs)
    {
        var x = inputs["input"];
        var timeEmbed = inputs["time_embed"];

        // Full forward with time conditioning — autodiff records automatically
        // via GradientTape from issue #1057
        var h = _norm1.Forward(x);
        h = Engine.Swish(h);
        h = _conv1.Forward(h);
        h = Engine.TensorAdd(h, _timeMlp.Forward(timeEmbed));
        h = _norm2.Forward(h);
        h = Engine.Swish(h);
        h = _conv2.Forward(h);

        var skip = _skipConv is not null ? _skipConv.Forward(x) : x;
        return Engine.TensorAdd(h, skip);
    }

    // Single-input Forward throws — this layer requires time_embed
    public override Tensor<T> Forward(Tensor<T> input) =>
        throw new InvalidOperationException(
            "DiffusionResBlock requires time_embed. Use Forward(dict) or the training pipeline.");
}

Backward with named inputs

public abstract class LayerBase<T>
{
    // Multi-input backward returns gradient per named input
    public virtual IReadOnlyDictionary<string, Tensor<T>> Backward(
        IReadOnlyDictionary<string, Tensor<T>> outputGradients)
    {
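        // Default: delegate to single-input backward, preferring the "output"
        // port by name and falling back to the sole entry otherwise
        var outputGrad = outputGradients.TryGetValue("output", out var named)
            ? named
            : outputGradients.Values.First();
        var inputGrad = Backward(outputGrad);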
        return new Dictionary<string, Tensor<T>> { ["input"] = inputGrad };
    }

    // Existing single-input backward stays unchanged
    public abstract Tensor<T> Backward(Tensor<T> outputGradient);
}
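
To illustrate what a per-input gradient dictionary could look like, here is a minimal sketch of an element-wise add layer. It is not the proposed implementation: `AddLayerSketch` is a hypothetical name, plain `double[]` arrays stand in for `Tensor<T>`, and the port names `input_a`, `input_b`, and `output` are taken from the Step 2 list and the default `OutputPorts`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for AddLayer; double[] replaces Tensor<T> for brevity.
public sealed class AddLayerSketch
{
    public double[] Forward(IReadOnlyDictionary<string, double[]> inputs)
    {
        var a = inputs["input_a"];
        var b = inputs["input_b"];
        if (a.Length != b.Length)
            throw new ArgumentException("input_a and input_b must have the same length.");

        // Element-wise sum of the two named inputs.
        return a.Zip(b, (x, y) => x + y).ToArray();
    }

    public IReadOnlyDictionary<string, double[]> Backward(
        IReadOnlyDictionary<string, double[]> outputGradients)
    {
        var g = outputGradients["output"];

        // d(a + b)/da = d(a + b)/db = 1, so each input receives a copy of the
        // output gradient. Copies avoid aliasing between the two entries.
        return new Dictionary<string, double[]>
        {
            ["input_a"] = (double[])g.Clone(),
            ["input_b"] = (double[])g.Clone(),
        };
    }
}
```

A caller gets one gradient back per declared input port. Concatenating the inputs into a single tensor cannot provide this without extra bookkeeping.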

Implementation Steps

Step 1: LayerBase infrastructure

  • Add LayerPort record
  • Add InputPorts and OutputPorts virtual properties with single-input defaults
  • Add Forward(IReadOnlyDictionary<string, Tensor<T>>) with default delegation
  • Add Backward(IReadOnlyDictionary<string, Tensor<T>>) with default delegation
  • Verify all existing single-input layers work unchanged

Step 2: Update multi-input layers

  • DiffusionResBlock — ports: input, time_embed
  • NoisePredictorBase / UNet blocks — ports: noisy_sample, timestep, conditioning
  • AttentionLayer — ports: query, key, value (currently packed into single tensor)
  • CrossAttentionLayer — ports: query, context
  • VAEEncoder — ports: input (reparameterization is internal)
  • VAEDecoder — ports: latent
  • AddLayer / ConcatenateLayer — ports: input_a, input_b

Step 3: Update training pipeline

  • NeuralNetworkBase.ForwardWithMemory routes named outputs to named inputs
  • Network architecture defines wiring between layer ports (like a DAG)
  • Backward traversal follows the same DAG in reverse
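
The routing described in Step 3 could be sketched roughly as follows. Everything here is an assumption made for illustration: `Node`, `Edge`, and `GraphRunner` are hypothetical names, `double[]` stands in for `Tensor<T>`, each node has a single output, and the node list is assumed to already be in topological order. The real `ForwardWithMemory` may be structured quite differently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a layer: named inputs in, one tensor out.
public sealed record Node(
    string Name,
    Func<IReadOnlyDictionary<string, double[]>, double[]> Forward);

// Hypothetical wiring: the output of FromNode feeds port ToPort of ToNode.
public sealed record Edge(string FromNode, string ToNode, string ToPort);

public static class GraphRunner
{
    // Runs nodes in the given order, which is assumed to be topological.
    public static IReadOnlyDictionary<string, double[]> Run(
        IReadOnlyList<Node> nodesInTopologicalOrder,
        IReadOnlyList<Edge> edges,
        IReadOnlyDictionary<string, double[]> graphInputs)
    {
        // Values keyed by producer name; external inputs count as producers.
        var results = new Dictionary<string, double[]>(graphInputs);

        foreach (var node in nodesInTopologicalOrder)
        {
            // Gather this node's named inputs from upstream results.
            var inputs = new Dictionary<string, double[]>();
            foreach (var edge in edges.Where(e => e.ToNode == node.Name))
            {
                if (!results.TryGetValue(edge.FromNode, out var value))
                    throw new InvalidOperationException(
                        $"'{node.Name}' port '{edge.ToPort}' depends on " +
                        $"'{edge.FromNode}', which has not produced a value yet.");
                inputs[edge.ToPort] = value;
            }
            results[node.Name] = node.Forward(inputs);
        }
        return results;
    }
}

public static class GraphRunnerDemo
{
    // A residual path: x -> scale -> add, with x also wired straight into add.
    public static double[] ResidualExample()
    {
        var nodes = new List<Node>
        {
            new("scale", ins => ins["input"].Select(v => v * 2).ToArray()),
            new("add", ins => ins["input_a"]
                .Zip(ins["input_b"], (a, b) => a + b).ToArray()),
        };
        var edges = new List<Edge>
        {
            new("x", "scale", "input"),
            new("x", "add", "input_a"),
            new("scale", "add", "input_b"),
        };
        var inputs = new Dictionary<string, double[]> { ["x"] = new[] { 1.0, 2.0 } };
        return GraphRunner.Run(nodes, edges, inputs)["add"]; // [3, 6]
    }
}
```

The skip connection is an ordinary second edge out of `x`, which the sequential single-input chain cannot express. A backward pass over the same structure would visit the nodes in reverse order and sum the gradients arriving at any producer with more than one outgoing edge.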

Step 4: GPU path

  • ForwardGpu(IReadOnlyDictionary<string, IGpuTensor<T>>) overload
  • BackwardGpu(IReadOnlyDictionary<string, IGpuTensor<T>>) overload

Acceptance Criteria

  • All existing single-input layers pass tests unchanged (backward compat)
  • DiffusionResBlock forward includes time conditioning (previously dropped)
  • Attention layers receive Q/K/V as separate named inputs
  • Port mismatch (missing required input) throws at Forward time with a clear error message that names the missing port(s)
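
One way the port-mismatch criterion might be met is a shared check that runs before a layer reads its inputs. The sketch below is an assumption, not part of the proposal: `PortValidation.EnsureRequiredInputs` is a hypothetical helper, and it is generic over the tensor type so that it stays self-contained. `LayerPort` is repeated from the design section.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Mirrors the LayerPort record from the design section.
public record LayerPort(string Name, int[] Shape, bool Required = true);

public static class PortValidation
{
    // Hypothetical helper: throws if any required port has no supplied input.
    public static void EnsureRequiredInputs<TTensor>(
        string layerName,
        IReadOnlyList<LayerPort> ports,
        IReadOnlyDictionary<string, TTensor> inputs)
    {
        var missing = ports
            .Where(p => p.Required && !inputs.ContainsKey(p.Name))
            .Select(p => p.Name)
            .ToList();

        if (missing.Count > 0)
            throw new ArgumentException(
                $"{layerName} is missing required input port(s): " +
                $"{string.Join(", ", missing)}. " +
                $"Supplied: {string.Join(", ", inputs.Keys)}");
    }
}
```

Calling such a helper at the top of a multi-input Forward would replace the bare `KeyNotFoundException` that an indexer lookup like `inputs["time_embed"]` produces. Optional ports (`Required = false`) are skipped.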
