Commit 25bf1bc
perf: replace dis.get_instructions with direct co_code parsing in from_code (#194)
* perf: replace dis.get_instructions with direct co_code parsing in from_code
dis.get_instructions performs two full passes over the bytecode:
- _make_labels_map → findlabels → _unpack_opargs (to build a jump-label map)
- _get_instructions_bytes (to iterate instructions with full metadata)
Neither pass is needed here. ConcreteBytecode.from_code only needs the
opname, raw arg byte, and source positions for each instruction word —
all of which are directly available from co_code and co_positions().
CACHE entries are already inline in co_code on all supported Python
versions, so direct 2-byte iteration handles them naturally without the
per-version cache_info loop that 3.13 previously required.
Throughput (round-trips of Bytecode.from_code().to_code() on the dis
module's own code object, timed over 1 second, 3 runs each):
Before: 92–94 round-trips/s
After: 107–111 round-trips/s (about 17% faster)
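
A throughput figure of this kind could be collected with a harness along these lines. It is a sketch under assumptions, not the script used for the numbers above: `round_trips_per_second` is a hypothetical helper, and the workload is passed in as a plain callable.

```python
import time


def round_trips_per_second(func, window=1.0):
    """Call func repeatedly for about `window` seconds; return calls/second."""
    count = 0
    start = time.perf_counter()
    deadline = start + window
    while time.perf_counter() < deadline:
        func()
        count += 1
    elapsed = time.perf_counter() - start
    return count / elapsed


# Assumed usage with the bytecode library (requires it to be installed):
#   from bytecode import Bytecode
#   rate = round_trips_per_second(lambda: Bytecode.from_code(code).to_code())
rate = round_trips_per_second(lambda: None, window=0.05)
print(f"{rate:.0f} calls/s")
```

Repeating the measurement several times and reporting the range, as done above, gives a rough sense of run-to-run noise.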
Austin CPU profile figures:
dis._unpack_opargs: 5.98% own → eliminated
dis._get_instructions_bytes: 3.45% own → eliminated
ConcreteBytecode.from_code: 3.63% own → 4.91% own
* Update src/bytecode/concrete.py
Co-authored-by: Matthieu Dartiailh <marul@laposte.net>
* assume co_positions always available
* perf: bypass validation for trusted InstrLocation and Instr construction
Add two fast-path factory methods that skip validation by using
object.__new__ + direct slot assignment, for call sites where the
inputs are already known to be valid:
**InstrLocation._from_tuple** — replaces InstrLocation(...) at four
internal sites where positions come from trusted sources (existing
InstrLocation.lineno, SetLineno.lineno, first_lineno):
- ConcreteBytecode.to_bytecode (fallback lineno-only location)
- ConcreteBytecode._pack_location (propagated from existing location)
- _ConvertBytecodeToConcrete.concrete_instructions (first_lineno seed
and SetLineno-derived locations)
**BaseInstr._from_trusted** — replaces Instr(name, arg, location=loc)
in ConcreteBytecode.to_bytecode, where name/opcode/arg/location are all
derived from already-validated ConcreteInstr objects.
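
The fast-path pattern can be illustrated with a small stand-in class. This is a sketch only: `Location` is a hypothetical class, not the library's `InstrLocation`, and its validation rules are invented for the example.

```python
class Location:
    """Hypothetical slotted value object with a validating constructor."""

    __slots__ = ("lineno", "end_lineno", "col_offset", "end_col_offset")

    def __init__(self, lineno, end_lineno=None, col_offset=None, end_col_offset=None):
        # Public path: check every field before storing it.
        for name, value in (
            ("lineno", lineno),
            ("end_lineno", end_lineno),
            ("col_offset", col_offset),
            ("end_col_offset", end_col_offset),
        ):
            if value is not None and (not isinstance(value, int) or value < 0):
                raise ValueError(f"invalid {name}: {value!r}")
        self.lineno = lineno
        self.end_lineno = end_lineno
        self.col_offset = col_offset
        self.end_col_offset = end_col_offset

    @classmethod
    def _from_tuple(cls, position):
        # Trusted path: allocate without running __init__, then assign the
        # slots directly. The caller guarantees the values are valid.
        self = object.__new__(cls)
        (self.lineno, self.end_lineno,
         self.col_offset, self.end_col_offset) = position
        return self


checked = Location(3, 3, 0, 8)
trusted = Location._from_tuple((3, 3, 0, 8))
```

The trade-off is that `_from_tuple` will store invalid values without complaint, which is why such a method is kept private and used only where the inputs come from objects that were validated earlier.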
CPU own-time profile data:
| Hotspot | Before | After |
|---|---|---|
| `ConcreteBytecode.to_bytecode` | 5.98% | 5.07% |
| `Instr._check_arg` | 2.87% | eliminated |
| `BaseInstr._set` (via to_bytecode) | 1.48% | eliminated |
| `BaseInstr._from_trusted` | — | <1% (not in top 20) |
Throughput (Bytecode.from_code().to_code() on dis module's code object,
1 second timed window, 5 runs):
| Run | Round-trips/s (range) |
|---|---|
| Before | 103–108 |
| After | 109–114 |
* use faster _from_tuple
* undo walrus
---------
Co-authored-by: Matthieu Dartiailh <marul@laposte.net>

1 parent 9de3e78 commit 25bf1bc
1 file changed
Lines changed: 17 additions & 14 deletions