fix cleaning mixed tests#542
Merged
Merged
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #542 +/- ##
==========================================
+ Coverage 47.28% 47.53% +0.25%
==========================================
Files 74 74
Lines 8868 8907 +39
==========================================
+ Hits 4193 4234 +41
+ Misses 4675 4673 -2 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
paulorsousa
approved these changes
May 20, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes a latent bug where two
benchmark_type: "mixed"benchmarks sharing the same(setup, dataset_name)crashed onassert benchmark_type == "read-only"insidero_benchmark_reuse, because the second test inherited the first one's populatedsetup_details["env"]and got routed through the reuse branch.The fix is two-part: a refactor of the post-spin-up publish to
shared_env, plus a new pre-iteration teardown for mixed groups. (An initial one-line fix avoided the assertion crash but broke the cross-type handoff — see "Why the bug report's one-liner isn't enough" below.)The bug
mixedsemantically means "this workload may mutate state, so each test in the group must get a fresh Redis env." The framework respected that for the cross-type handoff (mixed → read-only on the same dataset) but not for the intra-group case.Triggers when a plan satisfies all of:
read-onlyand at least onemixedbenchmark → setsreuse_mixed = True.mixedbenchmarks sharing the same(setup, dataset_name).Symptom:
Real-world repro on RediSearch master:
search-expire-doc-1000-seconds.ymlandsearch-expire-write-read-concurrent.ymlare bothmixed, both targetdataset_name: "5200K-docs-union-iterators.idx10-expire-doc-1000s", share the same 8-setup list, and the surroundingsearch*.ymlglob includes plenty of read-only tests — whichever of the two mixed tests runs second in its group hits the assertion.Why the bug report's one-liner isn't enough
The original proposal was a single line: after publishing the mixed env to
shared_env, clearsetup_details["env"]so the next iteration takes the spin-up branch. That avoids the assertion crash — butrun_local.py:736's per-iteration teardown usessetup_details["env"] is Noneas the "tear down Redis" signal:So the one-liner also causes the per-iteration teardown to kill the Redis whose env was just published to
shared_env. The subsequent read-only group inherits a dict pointing to dead processes and crashes withConnectionRefusedError. The integration test in this PR caught that regression on the existingtest_run_local_dataset_reuse_memtierscenario (one mixed + read-only on the same dataset, which works on master today). The bug report's "the handoff still works becauseshared_envholds the env" assertion didn't account for the per-iteration teardown also killing the processes the env references.The same loop in
run_remote.py:1391already has anis_sharedcheck that protects the handoff there, but the inter-iteration teardown story is still needed for cleanliness; the fix is applied symmetrically.The fix (two parts)
1.
save_env_for_cross_type_reusejust publishes, no clearingPulled the publish into a named helper in both files. It writes
setup_details["env"]intoshared_env[env_key]whenreuse_mixed=Trueand leavessetup_details["env"]populated. The env stays alive across the inner loop iteration (and the existing end-of-group logic preserves it for the cross-type handoff).2. New
tear_down_previous_mixed_env_if_neededhelper, called at the top of each inner-loop iterationFor mixed groups with
reuse_mixed=True, this helper looks atsetup_details["env"]from the previous iteration. If populated, it tears down that Redis (viateardown_local_setup/shutdown_remote_redis), removes the corresponding entry fromshared_env(the Redis it references is now dead), and clearssetup_details["env"]— so the dispatcher takes the spin-up branch.No-op cases (all important):
setup_details["env"]is None, nothing to tear down.reuse_mixed).--keep-env-and-topo— user explicitly opted out of teardown.The result is the correct semantics:
run_local.py:752, equivalent inrun_remote.py) sees the env inshared_env, marksis_shared=True, and preserves it.(setup, dataset)picks it up via the existingshared_env[env_key]handoff at the top of the next outer-loop iteration.ro_benchmark_reusecall site is no longer reached on the mixed path, so theassert benchmark_type == "read-only"invariant is preserved as a belt-and-braces check.Applied symmetrically to
run_remote.py(shutdown_remote_redisfor the teardown, with the extraskip_remote_db_setupno-op case for the remote-specific opt-out).Tests
Integration tests (run in CI alongside the existing dataset-reuse tests)
tests/test_run_local.py::test_run_local_mixed_env_no_leak— drivesrun_local_command_logicend-to-end with a new plan (tests/test_data/mixed_env_leak/*.yml) containing twomixedbenchmarks plus oneread-onlybenchmark, all on the same(setup="oss-standalone", dataset_name="mixed-env-leak"). Pre-fix this would crash on the assertion. Post-fix the run completes with exit code 0; each mixed test gets a fresh spin-up and the read-only test reuses the env from the last mixed test. Usesmemtier_benchmark, which CI already installs from apt.tests/test_run_remote.py::test_run_remote_mixed_env_no_leak— same plan, driven throughrun_remote_command_logic. Gated behindRUN_REMOTE_TESTS=1like the existingtest_run_remote_dataset_reuse_memtier.Unit tests (fast feedback for the helpers)
test_save_env_for_cross_type_reuse_publishes_to_shared_env/_disabled_is_noop/_read_only_is_noop(remote-specific) /_mixed_without_reuse_mixed_is_noop— lock in the publish semantics.test_tear_down_previous_mixed_env_clears_and_removes_from_shared_env/_noop_on_first_iteration/_noop_for_read_only/_noop_when_keep_env_and_topo— lock in the four key invariants of the teardown helper, usingmonkeypatchto intercept the actual teardown calls.Verification done locally
memtier_benchmarkavailable locally).vanilla-memtier-load.yml+vanilla-memtier-query{,2}.ymlend-to-end → exit code 0.test_run_local_dataset_reuse_memtierfails locally onvanilla-memtier-monitor-input.ymlbecause the apt-installedmemtier_benchmark 2.1.4on my machine doesn't support--monitor-input; CI uses a newer build and the test passes there (last master run: 206 passed, 1 skipped).black --checkclean on all four touched files.Out of scope
Two mixed benchmarks sharing
(setup, dataset)each pay their own fresh spin-up — that's the contract ofmixed. If a benchmark author wants to share state between two mixed tests (e.g. warm-up + measured-run), a separate explicit opt-in (e.g.share_env_with: <prev-test-name>inclientconfig) would be cleaner than the implicit state-leak the old code produced. Flagged for future work.