[ImportVerilog] Capture analysis should skip reprocessing identical module instances #10338
Merged
Conversation
rocallahan
added a commit
to rocallahan/circt
that referenced
this pull request
Apr 28, 2026
…odule instances (llvm#10338) Currently capture analysis unconditionally descends into all module instance bodies. This forces slang to instantiate complete ASTs for all module instances, transitively. For large designs this does not scale. For example with one 7M LOC Verilog example, ImportVerilog consumes more than 200GB of memory before crashing with OOM. Slang lets clients avoid this problem via "canonical instance bodies" that express deduplication. Setting `VisitCanonical` to true in the `ASTVisitor` template causes the `InstanceBodySymbol` visit to use the deduplicated canonical instance body (when available). For this analysis we don't need to re-analyze duplicate instance so we can just keep track of which bodies we've visited and avoid descending into already-visited bodies.
37701f3 to
e9d6aed
Compare
e9d6aed to
c206baf
Compare
jpienaar
reviewed
Apr 30, 2026
Member
jpienaar
left a comment
Oh very nice! I think this can help me drop a lot of bookkeeping I added in another PR too. (seems I can't trigger circt-tests from https://github.com/circt/circt-tests/ which would rerun the larger verilog ones, I think that would help here)
fabianschuiki
approved these changes
Apr 30, 2026
Contributor
fabianschuiki
left a comment
This is fantastic @rocallahan 🥳. I've been hoping to get rid of that ad-hoc deduplication in the importer at some point. Thanks for doing that! LGTM, modulo @jpienaar's comments.
slang already provides this for us. We should just use it. Also, by using slang's canonical module bodies, analysis passes that walk the slang AST can produce results that refer to specific slang AST objects in canonical module bodies, and we know those are the exact AST objects we're going to use to generate MLIR.
rocallahan
added a commit
to rocallahan/circt
that referenced
this pull request
Apr 30, 2026
…odule instances (llvm#10338) Currently capture analysis unconditionally descends into all module instance bodies. This forces slang to instantiate complete ASTs for all module instances, transitively. For large designs this does not scale. For example with one 7M LOC Verilog example, ImportVerilog consumes more than 200GB of memory before crashing with OOM. Slang lets clients avoid this problem via "canonical instance bodies" that express deduplication. Setting `VisitCanonical` to true in the `ASTVisitor` template causes the `InstanceBodySymbol` visit to use the deduplicated canonical instance body (when available). For this analysis we don't need to re-analyze duplicate instance so we can just keep track of which bodies we've visited and avoid descending into already-visited bodies.
c206baf to
e5d7501
Compare
…odule instances (llvm#10338) Currently capture analysis unconditionally descends into all module instance bodies. This forces slang to instantiate complete ASTs for all module instances, transitively. For large designs this does not scale. For example with one 7M LOC Verilog example, ImportVerilog consumes more than 200GB of memory before crashing with OOM. Slang lets clients avoid this problem via "canonical instance bodies" that express deduplication. Setting `VisitCanonical` to true in the `ASTVisitor` template causes the `InstanceBodySymbol` visit to use the deduplicated canonical instance body (when available). For this analysis we don't need to re-analyze duplicate instances, so we can just keep track of which bodies we've visited and avoid descending into already-visited bodies.
e5d7501 to
5776404
Compare
rocallahan
added a commit
to rocallahan/circt
that referenced
this pull request
May 5, 2026
…l module instances Currently `HierarchicalNames` unconditionally descends into all module instance bodies. This forces slang to instantiate complete ASTs for all module instances, transitively. For large designs this does not scale. For example with one 7M LOC Verilog example, ImportVerilog consumes more than 200GB of memory before crashing with OOM. So, use slang's canonical module bodies to avoid re-traversing identical modules. With this fixed on top of PR llvm#10338, the aforementioned example is imported quite quickly, consuming no more than 30GB of memory.
jpienaar
pushed a commit
to rocallahan/circt
that referenced
this pull request
May 14, 2026
…l module instances Currently `HierarchicalNames` unconditionally descends into all module instance bodies. This forces slang to instantiate complete ASTs for all module instances, transitively. For large designs this does not scale. For example with one 7M LOC Verilog example, ImportVerilog consumes more than 200GB of memory before crashing with OOM. So, use slang's canonical module bodies to avoid re-traversing identical modules. With this fixed on top of PR llvm#10338, the aforementioned example is imported quite quickly, consuming no more than 30GB of memory.
rocallahan
added a commit
to rocallahan/circt
that referenced
this pull request
May 15, 2026
…l module instances Currently `HierarchicalNames` unconditionally descends into all module instance bodies. This forces slang to instantiate complete ASTs for all module instances, transitively. For large designs this does not scale. For example with one 7M LOC Verilog example, ImportVerilog consumes more than 200GB of memory before crashing with OOM. So, use slang's canonical module bodies to avoid re-traversing identical modules. With this fixed on top of PR llvm#10338, the aforementioned example is imported quite quickly, consuming no more than 30GB of memory.
Currently, capture analysis unconditionally descends into all module instance bodies. This forces slang to instantiate complete ASTs for all module instances, transitively, which does not scale to large designs. For example, on one 7M LOC Verilog design, ImportVerilog consumes more than 200GB of memory before crashing with OOM.

Slang lets clients avoid this problem via "canonical instance bodies" that express deduplication. Setting `VisitCanonical` to true in the `ASTVisitor` template causes the `InstanceBodySymbol` visit to use the deduplicated canonical instance body (when available). For this analysis we don't need to re-analyze duplicate instances, so we can just keep track of which bodies we've visited and avoid descending into already-visited bodies.

To make this work correctly we also have to ensure that MLIR generation uses the same canonical module bodies, so that e.g. the `SubroutineSymbol`s encountered during MLIR generation are the same `SubroutineSymbol`s that `CaptureAnalysis` recorded. Currently the importer does its own module deduplication, which might choose different slang AST module bodies. So, replace the importer's deduplication with slang's canonical module bodies.
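For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of the "resolve to the canonical body, then skip anything already visited" idea. It does not use slang: `Body`, `DedupWalker`, and `demoAnalyzedBodyCount` are hypothetical stand-ins invented for illustration, and the real change relies on slang's `ASTVisitor` and `InstanceBodySymbol` instead.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an instance body (NOT slang's InstanceBodySymbol).
struct Body {
  // Deduplicated body to use instead of this one, if one is available.
  const Body *canonical = nullptr;
  // Bodies of the instances nested inside this body.
  std::vector<const Body *> children;
};

// Hypothetical walker that analyzes each canonical body at most once.
class DedupWalker {
public:
  void visit(const Body &body) {
    // Prefer the canonical body when the instance has one.
    const Body *target = body.canonical ? body.canonical : &body;
    // insert() reports whether the pointer was newly added; if it was
    // already present, this body has been analyzed and we do not descend.
    if (!visited.insert(target).second)
      return;
    ++analyzed;
    for (const Body *child : target->children)
      visit(*child);
  }

  int analyzedCount() const { return analyzed; }

private:
  std::unordered_set<const Body *> visited;
  int analyzed = 0;
};

// Builds a tiny design: `top` instantiates the same leaf module twice, and
// each leaf instantiates the same sub module once.
int demoAnalyzedBodyCount() {
  Body sub1, sub2, leaf1, leaf2, top;
  sub2.canonical = &sub1;   // sub2 is a duplicate of sub1
  leaf1.children = {&sub1};
  leaf2.children = {&sub2};
  leaf2.canonical = &leaf1; // leaf2 is a duplicate of leaf1
  top.children = {&leaf1, &leaf2};

  DedupWalker walker;
  walker.visit(top);
  // Five bodies exist, but only top, leaf1, and sub1 are analyzed.
  assert(walker.analyzedCount() == 3);
  return walker.analyzedCount();
}
```

Because the visited set is keyed on the canonical pointer, anything the analysis records refers to objects inside canonical bodies. That is why a later phase has to walk the same canonical bodies for pointer-keyed lookups to match.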