Commit c48ecea
feat(test-tools): t8n file streaming optimizations (#2751)
* feat(test-tools): file-based t8n stream; replacing in-memory stdout
Replaces Python's stdout-buffered, multi-pass parse of t8n output
(which held up to ~5 copies of the alloc in flight: raw stdout bytes,
parsed dict, re-serialized string, re-parsed dict, validated Alloc)
with a file-based LazyAllocFile that streams {address: account_dict}
entries incrementally via ``ijson`` and validates each ``Account`` one
at a time, preserving every existing Pydantic validator.
This cuts per-test peak RSS on the unchunkified benchmark from
5.38 GB → 3.40 GB (measured, fixtures byte-identical) — ~2 GB of
Python-heap pressure removed per xdist worker.
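A minimal sketch of the entry-by-entry streaming pattern described above, assuming `ijson` is installed. `stream_alloc` and `validate_account` are hypothetical stand-ins for illustration, not the project's `LazyAllocFile` or `Account` model:

```python
from pathlib import Path
from typing import Any, Callable, Dict


def validate_account(raw: Any) -> Dict[str, Any]:
    """Hypothetical stand-in for per-account Pydantic validation."""
    if not isinstance(raw, dict):
        raise ValueError("account must be a JSON object")
    return raw


def stream_alloc(
    path: Path,
    validate: Callable[[Any], Dict[str, Any]] = validate_account,
) -> Dict[str, Any]:
    """Validate {address: account} entries one at a time from a file."""
    import ijson  # third-party; imported lazily so the module loads without it

    alloc: Dict[str, Any] = {}
    with path.open("rb") as f:
        # kvitems(f, "") yields (key, value) pairs of the top-level object,
        # so only one raw account dict is materialized per iteration.
        for address, account in ijson.kvitems(f, ""):
            alloc[address] = validate(account)
    return alloc
```

Only the validated result accumulates; the raw file contents are never held in memory as a single string or dict.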
* fix(test-tools): fix issues with evm-dump-dir and ijson updates
* fix: updates from comments on PR #2751
* feat(test-tools): stream chained-block t8n input via file path
* fix(test-client-clis): ensure output dir exists in evm-dump script
* fix(test-tools): force GC in `LazyAllocFile` keepalive test under PyPy
`test_lazy_alloc_file_keepalive_pins_temp_dir` asserted that `del lazy`
immediately wipes the temp directory pinned by `LazyAllocFile._keepalive`.
That holds on CPython, where reference-counted finalization runs
`TemporaryDirectory.__del__` -> `cleanup()` synchronously when the last
reference drops. PyPy uses a generational garbage collector with no
refcounting, so after `del lazy` the `TemporaryDirectory` is unreachable
but not yet finalized, and the directory is still on disk when the next
assertion runs. Add an explicit `gc.collect()` to trigger the finalizer
deterministically on both interpreters.
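The shape of that test can be sketched as follows. `LazyHolder` is a hypothetical stand-in for an object that pins a `TemporaryDirectory` through a `_keepalive` attribute; it is not the real `LazyAllocFile`:

```python
import gc
import os
import tempfile


class LazyHolder:
    """Hypothetical stand-in for an object pinning a temp dir via `_keepalive`."""

    def __init__(self) -> None:
        self._keepalive = tempfile.TemporaryDirectory()
        self.path = self._keepalive.name


def temp_dir_removed_after_del() -> bool:
    """Return True if dropping the holder removes its temp directory."""
    lazy = LazyHolder()
    path = lazy.path
    assert os.path.isdir(path)
    del lazy
    # CPython finalizes as soon as the last reference drops; PyPy needs a
    # collection to run the finalizer, so force one before checking.
    gc.collect()
    return not os.path.exists(path)
```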
* fix(test-client-clis): release `LazyAllocFile` keepalive on `OutputCache` insert
The PR introducing the file-based t8n alloc streaming path pins each producing call's `TemporaryDirectory` onto the resulting `LazyAllocFile` via `_keepalive`, so the next chained block can read `output/alloc.json` directly without re-serialization. That lifetime is correct for the live, chained-handoff path: the predecessor's temp dir is dropped as soon as the consumer call returns and the previous `TransitionToolOutput` falls out of scope.
`OutputCache`, however, keeps every `TransitionToolOutput` produced during a single test alive for the duration of that test (single-key cache, cleared on key change). When chained-block tests run with caching enabled, every cached subcall retains its own `output/alloc.json` plus the entire surrounding `TemporaryDirectory` on `/tmp`. Disk usage is therefore `O(N)` for an `N`-block chained test, where each `alloc.json` can be hundreds of MB to several GB; for the unchunkified benchmark scenario this could easily exhaust `/tmp` on shared CI runners.
Materialize the streamed alloc into the cached `Alloc` and clear `_keepalive` before storing the result. By the time we cache, the chained handoff has already happened (the consumer block has finished its `subprocess.run`), so the on-disk file is no longer needed for zero-copy chaining: the only future readers are cache replays from another fixture format, which want a parsed `Alloc` anyway.
Tradeoff:
- Cache replays now pay the `ijson` parse cost on insert rather than on first `.get()`. This is the same parse the live path performs lazily; doing it eagerly at cache-set is fine because we know the caller has just finished using the result, so front-loading the parse here avoids a later surprise during replay.
- The cached entry retains the parsed `Alloc` in Python heap rather than the `alloc.json` on disk. For a typical chained benchmark this is a strict improvement: one parsed `Alloc` is smaller than the equivalent JSON text plus the `TemporaryDirectory` overhead, and the peak-RSS savings of the original streaming PR are preserved during the live run (the materialization happens after the producing call has already released its working buffers).
- The streaming benefit during the first run is fully preserved; only the cached-replay path materializes.
A move-the-keepalive-to-the-consumer alternative was considered and rejected: the consumer block currently sees only the predecessor's `Path`, not the `LazyAllocFile` instance, so threading explicit cleanup back to the predecessor would be invasive and tangled with the `TransitionToolData` flow. Releasing at cache-insertion is the lowest-blast-radius point: it's the moment we know the file isn't needed for chained handoff anymore, and it's localized to `OutputCache.set` rather than spread across the two `_evaluate_*` paths.
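A rough sketch of the release-at-insert idea, using only the standard library. `LazyAllocSketch` and `OutputCacheSketch` are hypothetical; the real `OutputCache.set` and `LazyAllocFile` may differ, and `json.load` stands in for the `ijson` streaming parse. The sketch calls `cleanup()` explicitly (an assumption made for determinism) rather than relying on dropping the last reference:

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, Optional


class LazyAllocSketch:
    """Hypothetical lazy alloc: reads alloc.json from a pinned temp dir."""

    def __init__(self, keepalive: tempfile.TemporaryDirectory) -> None:
        self._keepalive: Optional[tempfile.TemporaryDirectory] = keepalive
        self.path = Path(keepalive.name) / "alloc.json"
        self._parsed: Optional[Dict[str, Any]] = None

    def get(self) -> Dict[str, Any]:
        """Parse the file on first access and reuse the result afterwards."""
        if self._parsed is None:
            with self.path.open("rb") as f:
                self._parsed = json.load(f)
        return self._parsed

    def release(self) -> None:
        """Materialize the alloc, then delete the temp dir from disk."""
        self.get()
        if self._keepalive is not None:
            self._keepalive.cleanup()
            self._keepalive = None


class OutputCacheSketch:
    """Hypothetical cache that never retains on-disk allocs."""

    def __init__(self) -> None:
        self._entries: Dict[str, LazyAllocSketch] = {}

    def set(self, key: str, alloc: LazyAllocSketch) -> None:
        alloc.release()  # parse eagerly and free the temp dir before caching
        self._entries[key] = alloc

    def get(self, key: str) -> Dict[str, Any]:
        return self._entries[key].get()
```

With this shape, a cached entry costs heap for one parsed alloc and no disk, regardless of how many blocks the test chains.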
* fix(test-client-clis): reject non-object top-level JSON in `LazyAllocFile.validate`
`ijson.kvitems(f, "")` only yields key-value pairs when the top-level JSON value is an object: for `null`, `[]`, scalars, or other valid-but-non-object inputs, the iterator silently yields zero pairs. With the streaming alloc parser landed in this branch, that means `Alloc.model_validate({})` quietly succeeds and the t8n caller receives an empty post-state instead of an error.
The pre-streaming path (`LazyAllocStr.validate`, which calls `Alloc.model_validate_json`) raised on those same inputs, so the streaming PR is a regression in error fidelity even though success-path behavior is unchanged. The t8n binaries we drive (geth, evmone, execution-specs) all produce object-shaped `alloc.json` by contract, so this failure mode does not fire under normal operation. The risk is the failure shape: a t8n that crashes after redirecting stdout, an output path that gets overwritten with an error JSON, or any future divergence from the contract would silently zero the post-state, and the test would only fail downstream with a misleading consensus mismatch rather than a clean "alloc.json is malformed" at the point of corruption. Silent corruption that surfaces far from its source is the costliest kind to debug.
Probe the first parse event from `ijson.parse` and raise `ValueError` if it is not `start_map`, then seek back to zero and let `ijson.kvitems` consume the stream as before.
Tradeoffs:
- One extra parse event per call. `ijson.parse` is event-driven; pulling a single event then re-seeking is effectively free relative to the streamed body parse, and we keep the `kvitems`-based hot path so the entry-by-entry validation pattern is unchanged.
- `ValueError` rather than reusing `ijson.IncompleteJSONError`: the input here is well-formed JSON (just the wrong shape), so reporting it as an incomplete-stream error would itself be misleading. The existing malformed-JSON tests continue to surface as `IncompleteJSONError` from inside `ijson.parse` because those inputs raise before we ever read a complete first event: the new guard only takes effect for valid JSON of the wrong type.
- The legitimately empty alloc (`{}`) is preserved: the guard accepts `start_map` and the rest of the function returns an empty `Alloc` exactly as before. New regression test covers this.
Three new parametrized cases (`null`, `[]`, `42`) cover the silent-zero scenarios, and a positive test for `{}` pins the empty-object behavior so future tightening cannot accidentally reject it.
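The guard can be sketched as below, assuming `ijson` is installed. `read_top_level_object` is a hypothetical helper, not the actual `LazyAllocFile.validate`, and it collects entries into a plain `dict` instead of validating each `Account`:

```python
from typing import Any, BinaryIO, Dict


def read_top_level_object(f: BinaryIO) -> Dict[str, Any]:
    """Stream a JSON object's entries, rejecting non-object top-level values."""
    import ijson  # third-party; imported lazily so the module loads without it

    # Peek at the first parse event only; each event is (prefix, event, value).
    _prefix, event, _value = next(ijson.parse(f))
    if event != "start_map":
        raise ValueError(f"expected a top-level JSON object, got {event!r}")
    f.seek(0)  # rewind so kvitems sees the stream from the beginning
    return dict(ijson.kvitems(f, ""))
```

Passing a stream containing `[]` raises `ValueError`, while `{}` still returns an empty mapping.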
---------
Co-authored-by: danceratopz <danceratopz@gmail.com>
1 parent 1fdd7e3 commit c48ecea
6 files changed
Lines changed: 644 additions & 87 deletions
File tree
- packages/testing
- src/execution_testing/client_clis
- tests
Lines changed: 118 additions & 36 deletions
Lines changed: 36 additions & 20 deletions