[Research/Enhancement] Compile-pipeline DX: converter-failure diagnostics + streaming tape API

# [Research / Enhancement] Compile-pipeline DX — converter-failure diagnostics + streaming tape API

**Component:** `skainet-compile-hlo` / `skainet-lang` tape · **Version:** SKaiNET 0.31.0 · **Type:** enhancement / research (NOT a correctness bug)

While lowering a Whisper model (`skainet-whisper-kmp`) through tape → `ComputeGraph` → StableHLO → IREE,
two developer-experience rough edges surfaced. The actual correctness fixes all belonged in *our* code
(see "Not bugs"); these two are DX/research asks.

## 1. Converter-failure diagnostics: one failing node cascades into many misleading "Unsupported arity" errors

When a single op has no registered converter, the failure is reported roughly 20 times rather than once:
the failed node produces no SSA value, so every downstream consumer then reports `Unsupported <op> arity`.

Concrete repro: `StableHloConverterFactory.createBasic()` does **not** register
`NeuralNetOperationsConverter`, so a `conv1d` front-end emits:

```
// Unsupported op 'conv1d' (type=trace) for node n1_conv1d. Known names: [..., no conv1d ...]
// Unsupported squeeze arity for node n2_squeeze
// Unsupported add arity for node n9_add
// Unsupported batch matmul arity for node n21_matmul
// Unsupported SDPA arity for node n34_scaledDotProductAttention
... (~20 lines)
```

The root cause is **one** missing converter registration (the fix is to use `createExtended()`); the other
~19 lines are cascade victims. The cascade masked the real cause and cost real debugging time.

**Ask (any of):**
- Distinguish *root* failures (`no converter for op X`) from *cascade* failures (`operand missing because a
  predecessor failed`) in the emitted comments / a summary.
- Emit a one-line summary: `N nodes failed; root causes: [conv1d]; M downstream skipped (missing operands)`.
- The `Known names: [...]` list is already helpful — pairing it with `op 'conv1d' is registered by
  createExtended()/createFast(), not createBasic()` would shortcut diagnosis to seconds.
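
To illustrate the first two asks, here is a minimal sketch of root-vs-cascade classification over a topologically ordered node list. Every type and function in it (`Node`, `Failure`, `classifyFailures`, `summarize`) is hypothetical and written for this issue; none of it is an existing SKaiNET API, and the real converter pipeline may track operands differently.

```kotlin
// Hypothetical types for illustration only; these are not SKaiNET APIs.
data class Node(val id: String, val op: String, val inputs: List<String>)

sealed interface Failure {
    val nodeId: String

    // The op itself has no registered converter.
    data class Root(override val nodeId: String, val op: String) : Failure

    // The op has a converter, but at least one operand was never produced.
    data class Cascade(override val nodeId: String, val missingOperands: List<String>) : Failure
}

// Assumes `nodes` is in topological order, so producers are seen before consumers.
fun classifyFailures(nodes: List<Node>, supportedOps: Set<String>): List<Failure> {
    val failed = mutableSetOf<String>()
    val failures = mutableListOf<Failure>()
    for (node in nodes) {
        val missing = node.inputs.filter { it in failed }
        val failure = when {
            node.op !in supportedOps -> Failure.Root(node.id, node.op)
            missing.isNotEmpty() -> Failure.Cascade(node.id, missing)
            else -> null
        }
        if (failure != null) {
            failed += node.id
            failures += failure
        }
    }
    return failures
}

// Builds the one-line summary proposed in the ask above.
fun summarize(failures: List<Failure>): String {
    val roots = failures.filterIsInstance<Failure.Root>().map { it.op }.distinct()
    val cascades = failures.count { it is Failure.Cascade }
    return "${failures.size} nodes failed; root causes: $roots; " +
        "$cascades downstream skipped (missing operands)"
}

fun main() {
    val nodes = listOf(
        Node("n1_conv1d", "conv1d", listOf("x")),
        Node("n2_squeeze", "squeeze", listOf("n1_conv1d")),
        Node("n3_mul", "mul", listOf("x")),
        Node("n9_add", "add", listOf("n2_squeeze", "bias")),
    )
    println(summarize(classifyFailures(nodes, setOf("squeeze", "add", "mul"))))
}
```

With that input, only `n1_conv1d` is a root failure; `n2_squeeze` and `n9_add` are tagged as cascades, and `n3_mul` is unaffected. The point of the design is that a cascade entry records *which* operand was missing, so the report can point back at the root instead of at arity.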

## 2. Streaming / incremental tape API (open question)

`DefaultGraphExecutionContext.tape(...).record { }` materialises the whole tape, then `toComputeGraph()`
builds the full graph in memory. Is there interest in a **streaming/incremental** tape API (emit nodes as
they are recorded, or chunk by subgraph) so that very large forwards can be lowered without buffering the
entire tape? This is not a blocker for Whisper (`tiny.en` records fine), but it is relevant for multi-GB models.
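
To make the question concrete, here is a rough sketch of the "emit nodes as recorded" shape using a plain Kotlin `Sequence`. `TapeNode`, `recordStreaming`, and `lowerInChunks` are hypothetical names invented for this issue and do not reflect SKaiNET's tape or graph types. The sketch also ignores the hard part: values that are live across a chunk boundary would have to be carried between chunks.

```kotlin
// Hypothetical stand-ins; SKaiNET's real tape and graph types are not used here.
data class TapeNode(val id: Int, val op: String)

// Lazily produces tape nodes: a node is only created when the consumer asks for it.
fun recordStreaming(layerCount: Int): Sequence<TapeNode> = sequence {
    var nextId = 0
    repeat(layerCount) {
        yield(TapeNode(nextId++, "matmul"))
        yield(TapeNode(nextId++, "add"))
    }
}

// Lowers the tape chunk by chunk, so only one chunk of nodes is held at a time.
// Returns the number of chunks that were emitted.
fun lowerInChunks(tape: Sequence<TapeNode>, chunkSize: Int, emit: (String) -> Unit): Int {
    var chunks = 0
    for (chunk in tape.chunked(chunkSize)) {
        emit(chunk.joinToString(separator = "\n") { "// lowered n${it.id}_${it.op}" })
        chunks++
    }
    return chunks
}

fun main() {
    // 3 layers produce 6 nodes; with chunkSize = 4 they are lowered as 4 + 2.
    val chunkCount = lowerInChunks(recordStreaming(layerCount = 3), chunkSize = 4) { println(it) }
    println("chunks: $chunkCount")
}
```

Chunking by a fixed node count is the simplest possible policy. Chunking by subgraph, as suggested above, would need the recorder to mark subgraph boundaries, which this sketch does not attempt.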

## Not bugs (fixed in our code)
- `conv1d` reported as "Unsupported": fixed by using `createExtended()`, which registers
  `NeuralNetOperationsConverter`. The converter was always present; `createBasic()` just does not register it.
- A build-time `ctx.ops.narrow(posEmb,…)` call under `VoidTensorOps` baked in **zeros**, because its
  `narrow` returns zeros. We moved the slice *inside* `forward` so that it is a traced op. Arguably,
  `VoidTensorOps` silently returning zeros for value-producing ops is a footgun, but the behaviour is
  correct for a shape-only trace backend; we note it only for awareness.
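
For readers who have not hit the second item, here is a toy model of the pitfall. `Ops`, `EagerOps`, `ShapeOnlyOps`, `BakedSlice`, and `TracedSlice` are illustrative stand-ins written for this issue; they are not SKaiNET's `TensorOps` or `VoidTensorOps`, and the real `narrow` signature differs.

```kotlin
// Toy stand-ins for illustration only; not SKaiNET types.
interface Ops {
    fun narrow(data: FloatArray, start: Int, length: Int): FloatArray
}

// Value-producing backend: copies the requested slice.
object EagerOps : Ops {
    override fun narrow(data: FloatArray, start: Int, length: Int): FloatArray =
        data.copyOfRange(start, start + length)
}

// Shape-only backend: the result has the right length, but every value is zero.
object ShapeOnlyOps : Ops {
    override fun narrow(data: FloatArray, start: Int, length: Int): FloatArray =
        FloatArray(length)
}

// Slices once at construction time, so the build-time backend decides the values for good.
class BakedSlice(buildOps: Ops, posEmb: FloatArray) {
    private val sliced = buildOps.narrow(posEmb, 0, 2)

    // runOps is deliberately ignored: the slice was already computed at build time.
    fun forward(runOps: Ops): FloatArray = sliced
}

// Slices inside forward, so the op is issued against whichever backend runs the forward pass.
class TracedSlice(private val posEmb: FloatArray) {
    fun forward(runOps: Ops): FloatArray = runOps.narrow(posEmb, 0, 2)
}

fun main() {
    val posEmb = floatArrayOf(1f, 2f, 3f)
    println(BakedSlice(ShapeOnlyOps, posEmb).forward(EagerOps).toList())
    println(TracedSlice(posEmb).forward(EagerOps).toList())
}
```

The baked variant keeps the zeros it received at build time even when a real backend later runs `forward`, while the traced variant picks up the real values.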



[Research/Enhancement] Compile-pipeline DX: converter-failure diagnostics + streaming tape API #740
