Commit a5a3518
ssjia
Update base for Update on "[ET-VK] Deduplicate transition clone nodes in TagMemoryMetaPass"
When the same tensor is consumed by multiple ops that need a different
storage representation, the pass previously inserted a separate clone
transition for each consumer. Now it caches transition clones keyed by
(source_node, target_storage_type, target_layout) and reuses existing
clones when the same transition is needed again.
For Qwen3 0.6B (8da4w fp16), the embedding output (BUFFER due to
vocab_size exceeding texture limits) feeds both rms_norm and add, which
both need TEXTURE. Previously 2 clones were inserted; now 1 clone is
shared.
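
The caching scheme described above can be sketched as follows. This is a
minimal illustration, not the pass's actual code: `Node`, `TransitionCache`,
`retarget_inputs`, and the `WIDTH_PACKED` layout name are hypothetical
stand-ins, and `add` is simplified to a single input.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    """Hypothetical stand-in for a graph node (not the real graph node type)."""
    name: str
    storage: str
    layout: str
    inputs: List["Node"] = field(default_factory=list)


class TransitionCache:
    """Keeps one clone per (source node, target storage, target layout)."""

    def __init__(self) -> None:
        self._clones: Dict[Tuple[int, str, str], Node] = {}
        self.created: List[Node] = []

    def get_or_create(self, src: Node, storage: str, layout: str) -> Node:
        # Key on node identity plus the requested representation.
        key = (id(src), storage, layout)
        clone = self._clones.get(key)
        if clone is None:
            # First request for this transition: insert a new clone.
            name = f"{src.name}_clone_{len(self.created)}"
            clone = Node(name, storage, layout, [src])
            self._clones[key] = clone
            self.created.append(clone)
        return clone


def retarget_inputs(consumers: List[Node], storage: str, layout: str,
                    cache: TransitionCache) -> None:
    """Route every mismatched input of each consumer through a cached clone."""
    for op in consumers:
        op.inputs = [
            arg if (arg.storage, arg.layout) == (storage, layout)
            else cache.get_or_create(arg, storage, layout)
            for arg in op.inputs
        ]


# Assumed setup mirroring the Qwen3 case: a BUFFER embedding output
# consumed by two ops that need TEXTURE.
embedding = Node("embedding", "BUFFER", "WIDTH_PACKED")
rms_norm = Node("rms_norm", "TEXTURE", "WIDTH_PACKED", [embedding])
add = Node("add", "TEXTURE", "WIDTH_PACKED", [embedding])

cache = TransitionCache()
retarget_inputs([rms_norm, add], "TEXTURE", "WIDTH_PACKED", cache)
```

In this sketch both consumers end up reading the same clone node, so
only one transition is inserted. A request for a different storage type
or layout from the same source would produce a separate cache entry.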
Authored by Claude.
Differential Revision: [D100004700](https://our.internmc.facebook.com/intern/diff/D100004700/)
[ghstack-poisoned]
1 parent 8e1640f commit a5a3518
0 file changed