Skip to content

fix(ci): pact-memory CLI subprocess tests flake — every save/update cold-installs deps via pip, racing the 60s timeout #1047

Description

@michael-wojcik

Summary

TestCliSubprocess / TestCliSaveVerificationE2E in pact-plugin/tests/test_memory_cli.py flake red on main on cold CI runners. Run 28275162571 on main failed with 12 subprocess.TimeoutExpired after 60s on cli.py save/update (zero assertion failures); an immediate re-run passed clean (9688 passed, ~118s). The identical content had passed on the PR branch minutes earlier — the tell is the duration: 816s (failed) vs ~118s (re-run) ≈ 12 × 60s of stacked timeouts.

Root cause (confirmed)

Every save/updatePACTMemory.save()/.update()_ensure_ready()ensure_memory_ready()check_and_install_dependencies() (pact-plugin/skills/pact-memory/scripts/memory_init.py:41-86), which runs subprocess.run([python, '-m', 'pip', 'install', '-q', pkg], timeout=60) for each missing dep (pysqlite3, sqlite-vec, model2vec).

CI installs none of these — .github/workflows/tests.yml:94 installs only pip pytest hypothesis httpx pytest-asyncio — so each fresh CLI subprocess cold-installs the full model2vec dependency tree (numpy, huggingface_hub, tokenizers, safetensors, …). Under network variance the cumulative install exceeds the test's outer timeout=60 (test_memory_cli.py:~1402+) → TimeoutExpired. _initialized is per-process, so every subprocess re-pays the cold cost (it self-heals once the shared pip cache is warm — hence the green re-run).

The model2vec HuggingFace download (embeddings.py:54-55) is a secondary cost, gated behind SQLITE_EXTENSIONS_ENABLED (memory_api.py:459-465), which is False in CI when deps are absent — so it isn't even reached. There is no offline/cache handling (HF_HUB_OFFLINE / HF_HOME / vendored model path are absent from code and workflows).

The in-process equivalents (TestCliSaveCommand) pass because they mock scripts.cli.PACTMemory — only the black-box subprocess tests hit the real auto-install path.

Relationship to #98 — SAME ROOT CAUSE

#98 ("change pact-memory auto-install to warn-only") targets the exact same function (check_and_install_dependencies). #98 was filed for a consent / invasive-pip reason, but implementing it (warn-only auto-install, or gating install behind an opt-in env var) eliminates this flake as a side effect: no auto-install → save/update return immediately, embedding silently skipped via the existing SQLITE_EXTENSIONS_ENABLED early-return. This issue tracks the CI-reliability symptom; the fix is #98. It also raises #98's priority — the auto-install is now actively reddening main, not just a UX concern.

Affected tests (12)

All in tests/test_memory_cli.py (TestCliSubprocess, TestCliSaveVerificationE2E); the hung command is always save or update (list/search/delete appear because their setup save hung):

  • test_update_unique_prefix_resolves_and_mutates
  • test_update_full_hash_unchanged
  • test_update_success_envelope_returns_resolved_full_id
  • test_uppercase_full_id_resolves_to_lowercase_stored_update
  • test_update_and_verify
  • test_save_via_stdin
  • test_save_roundtrip_confirms_verification_passed
  • test_save_via_stdin_confirms_verification_passed
  • test_list_returns_saved_memory
  • test_list_limit_flag
  • test_search_returns_results
  • test_delete_and_verify

Fix options

  1. Warn-only auto-install (implements refactor: change pact-memory auto-install to warn-only #98) — RECOMMENDED. memory_init.py:41-86 → report missing deps instead of pip-installing (or gate install behind opt-in PACT_AUTO_INSTALL=1). CI subprocesses then never pip-install → save/update return fast and deterministically. Also update the auto-install behavior tests in tests/test_memory_init.py (they assert install/timeout semantics and must move to the warn-only contract). Minimal root-cause fix; closes refactor: change pact-memory auto-install to warn-only #98.
  2. Pre-install + cache deps and the model in CI. tests.yml: add pysqlite3-binary sqlite-vec model2vec + actions/cache for ~/.cache/pip and ~/.cache/huggingface + a model warm-up step. Heavier; adds a network dependency (the workflow header already flags the native-dep path as a known flake source) and doesn't fix the invasive auto-install for real consumers.
  3. Test-scoped stopgap. Env-skip the auto-install in the subprocess tests (requires the Option-1 env guard), or raise/remove the 60s timeout, or skipif when deps aren't importable. Interim mute only — fragile.

Acceptance

  • The subprocess CLI tests are deterministic on a cold CI runner (no pip install in the hot path).
  • main no longer flakes red on a cold-cache run.

Discovered while verifying CI after merging the live-probe ledger PR #1046 (which itself is unrelated — its content passed identically on the PR branch).

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions