Introduce Rewriter abstraction to cache use lookups#235

Merged
maleadt merged 3 commits into main from tb/optimize_rewriter on May 28, 2026

maleadt commented May 28, 2026

Member

As observed in #233, compilation of large kernels is slow. The culprit turned out to be the user/use lookup, which walks the entire IR every time since our IR doesn't encode use-def chains (as opposed to LLVM or MLIR). To avoid this cost, introduce a Rewriter abstraction that caches these lookups while invalidating them upon insertion, mutation, etc. Once again modeled after MLIR.
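To illustrate the idea, here is a minimal Python sketch of a rewriter that caches user lookups and drops the cache on every mutation. It is not the code from this PR: `Op`, `Block`, `Rewriter`, and all method names are hypothetical, and the invalidation strategy (discarding the whole cache rather than updating it incrementally) is an assumption chosen for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(eq=False)  # eq=False keeps identity-based equality and hashing
class Op:
    """Hypothetical IR operation; the op also stands for the value it produces."""
    name: str
    operands: List["Op"] = field(default_factory=list)


@dataclass
class Block:
    """Hypothetical flat list of operations with no stored use-def chains."""
    ops: List[Op] = field(default_factory=list)


def find_users_uncached(block: Block, value: Op) -> List[Op]:
    """Baseline lookup: scans the whole block on every query."""
    return [op for op in block.ops if value in op.operands]


class Rewriter:
    """Routes all mutations through one object so the user cache stays valid."""

    def __init__(self, block: Block) -> None:
        self.block = block
        self._users: Optional[Dict[Op, List[Op]]] = None  # None means "stale"
        self.walks = 0  # counts full IR walks, for illustration only

    def _ensure_cache(self) -> Dict[Op, List[Op]]:
        if self._users is None:
            self.walks += 1
            users: Dict[Op, List[Op]] = {}
            for op in self.block.ops:
                # dict.fromkeys drops repeated operands, so each user is listed once
                for operand in dict.fromkeys(op.operands):
                    users.setdefault(operand, []).append(op)
            self._users = users
        return self._users

    def users(self, value: Op) -> List[Op]:
        """Returns the ops that use `value`, walking the IR only if the cache is stale."""
        return list(self._ensure_cache().get(value, []))

    def insert_before(self, anchor: Op, new_op: Op) -> None:
        self.block.ops.insert(self.block.ops.index(anchor), new_op)
        self._users = None  # insertion invalidates the cache

    def replace_all_uses(self, old: Op, new: Op) -> None:
        for user in self.users(old):
            user.operands = [new if o is old else o for o in user.operands]
        self._users = None  # operand mutation invalidates the cache

    def erase(self, op: Op) -> None:
        if self.users(op):
            raise ValueError(f"cannot erase {op.name}: it still has users")
        self.block.ops.remove(op)
        self._users = None  # removal invalidates the cache


# Usage: two lookups share one walk; a mutation forces exactly one more.
a, b = Op("a"), Op("b")
c = Op("add", [a, b])
d = Op("mul", [c, c])
rw = Rewriter(Block([a, b, c, d]))
rw.users(c)                # one walk, returns [d]
rw.users(a)                # served from the cache, returns [c]
rw.replace_all_uses(c, a)  # d now reads a twice; the cache is dropped
rw.erase(c)                # allowed, because c has no users left
```

The cache is only trustworthy if every insertion, replacement, and erasure goes through the rewriter; code that edits `block.ops` directly would leave stale entries behind.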

randn with Float64 goes down from 13s to 1.3s on my system. Test time goes from 1m50 to 1m05.

maleadt commented May 28, 2026

Member Author

CI goes from 7m50 to 5m14, with the random tests going from 298s to 57s... I think this obviates #233 then.

maleadt merged commit 4a3cc60 into main on May 28, 2026
1 check passed
maleadt deleted the tb/optimize_rewriter branch on May 28, 2026 14:08
@AntonOresten

Collaborator

Great, thanks!
