perf: replace O(n) name/varname list scan with O(1) dict lookup#199
Conversation
_ConvertBytecodeToConcrete.add() used list.index() to find whether a name or varname was already registered, making each lookup O(n) in the number of names accumulated so far. For functions with many locals or names this becomes a non-trivial cost on every HAS_LOCAL / HAS_NAME instruction. Replace the generic static add() method with two instance methods, add_varname() and add_name(), each backed by a parallel dict (varnames_map / names_map). Lookups are now O(1). The argnames pre-population loop is also updated to go through add_varname() so the map stays in sync. We also move some more instruction validation checks from iteration to insertion, which also contributes to the total uplift (~+3%). Benchmark (round-trip test): Baseline: 204.1 r/s (stdev 4.2) This change: 221.3 r/s (stdev 2.9) Uplift: +8.4%
5caa36b to
c7234a7
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #199 +/- ##
==========================================
+ Coverage 95.24% 95.39% +0.14%
==========================================
Files 7 7
Lines 2083 2107 +24
Branches 449 456 +7
==========================================
+ Hits 1984 2010 +26
+ Misses 55 54 -1
+ Partials 44 43 -1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
In what sense? The only change in CFG is the time when validation happens, but no API has been changed ( |
|
The fact the validation of the content now occurs earlier could break some usages (extend node and then break it into sub nodes). It is not a large breakage but worth documenting. |
|
Looking at codecov report we are actually missing tests for the error paths of append and insert. Would you mind adding those ? |
MatthieuDartiailh
left a comment
There was a problem hiding this comment.
I promise those are the last comments. Recently I only manage to review from my phone which makes me miss the broader context quite often sorry for this.
No worries! Changes made. |
_ConvertBytecodeToConcrete.add() used list.index() to find whether a name or varname was already registered, making each lookup O(n) in the number of names accumulated so far. For functions with many locals or names this becomes a non-trivial cost on every HAS_LOCAL / HAS_NAME instruction.
Replace the generic static add() method with two instance methods, add_varname() and add_name(), each backed by a parallel dict (varnames_map / names_map). Lookups are now O(1). The argnames pre-population loop is also updated to go through add_varname() so the map stays in sync.
Benchmark (round-trip test):
So a +8.4% improvement overall. We also move some more instruction validation checks from iteration to insertion, which also contributes to the total uplift (~ +3% of the +8.4% total).