WIP feat(coverage): Add coverage.py code coverage collection by sohil-kshirsagar · Pull Request #87 · Use-Tusk/drift-python-sdk

sohil-kshirsagar · 2026-04-03T21:42:17Z

Add code coverage collection using coverage.py to the Python SDK. When the CLI enables coverage, the SDK manages coverage.py programmatically and returns per-file line/branch coverage via protobuf.

How it works

CLI sets TUSK_COVERAGE=true env var
SDK starts coverage.py with branch=True during initialization
CLI sends CoverageSnapshotRequest via protobuf channel
SDK: stop() → get_data() / analysis2() → erase() → start() cycle
SDK returns CoverageSnapshotResponse with per-file data

Key design decisions

SDK-driven coverage.py: Python has no equivalent of NODE_V8_COVERAGE. The SDK runs inside the app process and drives coverage.py directly.
Branch coverage via _analyze(): Uses coverage.py's private _analyze() API for arc-based branch tracking. The public API only provides aggregate branch counts. Documented as a known limitation.
Thread-safe: All coverage operations protected by threading.Lock(). Safe for concurrent protobuf handler access.

Files changed

drift/core/coverage_server.py — Coverage lifecycle management (new file)
drift/core/communication/communicator.py — Coverage snapshot handler + betterproto fixes
drift/core/communication/types.py — Import new proto types
drift/drift_sdk.py — Hook coverage start/stop into SDK lifecycle
docs/coverage.md — SDK coverage internals documentation
docs/environment-variables.md — Coverage env vars section

Also includes (non-coverage)

betterproto getattr() fix: SetTimeTravelRequest handler used if not request: which fails for messages with all-default values (betterproto falsy bug). Fixed with getattr(cli_message, "field", None). Applied to both time travel and coverage handlers.
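The falsy-message pitfall described above can be reproduced without betterproto. The sketch below uses a hypothetical stand-in class (`FakeSnapshotRequest`) whose `__bool__` mimics the reported behavior; the field name `coverage_snapshot_request` is also an assumption for illustration, not taken from the schema.

```python
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Optional


@dataclass
class FakeSnapshotRequest:
    """Hypothetical stand-in for a generated message; not betterproto itself."""

    baseline: bool = False

    def __bool__(self) -> bool:
        # Mimics the reported behavior: a message with only default values is falsy.
        return self.baseline


def extract_request(cli_message: object) -> Optional[FakeSnapshotRequest]:
    # getattr with a default avoids AttributeError when the field is absent.
    return getattr(cli_message, "coverage_snapshot_request", None)


def should_handle_buggy(cli_message: object) -> bool:
    # Truthiness check: also rejects a present request whose fields are all default.
    return bool(extract_request(cli_message))


def should_handle_fixed(cli_message: object) -> bool:
    # Identity check: only rejects a request that is genuinely missing.
    return extract_request(cli_message) is not None


per_test_message = SimpleNamespace(
    coverage_snapshot_request=FakeSnapshotRequest(baseline=False)
)
empty_message = SimpleNamespace()
```

With the truthiness check, `per_test_message` would be skipped even though a request is present; the identity check handles it and still ignores `empty_message`.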

TODOs before merge

Blocked on WIP feat(proto): Add coverage snapshot messages tusk-drift-schemas#47 — update tusk-drift-schemas dependency once schemas are published
Add unit tests for coverage_server.py
Guard coverage start with REPLAY mode check (avoid overhead in RECORD/DISABLED modes)
Consider pinning minimum coverage.py version for _analyze() API stability

Edge cases / gotchas

_is_user_file uses os.sep trailing separator to prevent /app matching /application
os.path.realpath() resolves symlinks for consistent path comparison
Double-init guard: calling start_coverage_collection() twice stops the existing instance first
*/test* omit was too broad (matched testimony.py), narrowed to */tests/* and */test_*.py
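As a rough illustration of the first two gotchas, a path check along these lines avoids the `/app` vs `/application` prefix collision. This is a minimal sketch and not the PR's actual `_is_user_file`; the function name and signature here are assumptions.

```python
import os


def is_user_file(path: str, source_root: str) -> bool:
    """Return True if `path` lies inside `source_root` (hypothetical helper)."""
    # Resolve symlinks so both sides are compared in the same canonical form.
    real_path = os.path.realpath(path)
    real_root = os.path.realpath(source_root)
    # Force exactly one trailing separator so "/app" cannot prefix-match "/application".
    prefix = real_root.rstrip(os.sep) + os.sep
    return real_path.startswith(prefix)
```

Without the trailing separator, a plain `startswith(real_root)` would treat `/application/main.py` as living under `/app`.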

When TUSK_COVERAGE_PORT env var is set, the SDK starts a tiny HTTP server that manages coverage.py. On each /snapshot request:
- Stop coverage, get data, erase (reset), restart
- Returns per-file line counts with clean per-test data
- No diffing needed (coverage.py supports stop/erase/start cycle)

Works with Flask, FastAPI, Django, gunicorn, uvicorn - any framework, because the SDK runs inside the app process.

Requires: pip install coverage (or pip install tusk-drift[coverage])

When /snapshot?baseline=true is called, uses coverage.analysis2() to get ALL coverable statements (including uncovered) for the denominator. Regular /snapshot calls only return executed lines (for per-test data).
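The difference between the two snapshot modes can be sketched as follows. This assumes `cov` behaves like a `coverage.Coverage` instance, where `analysis2()` returns a 5-tuple of (filename, executable lines, excluded lines, missing lines, formatted missing lines) and `get_data().lines()` returns the executed line numbers or `None`. The helper name and the `{line: 0 or 1}` return shape are assumptions for illustration, not the PR's response format.

```python
from typing import Any, Dict


def snapshot_lines(cov: Any, filename: str, baseline: bool) -> Dict[int, int]:
    """Map line numbers to 1 (executed) or 0 (coverable but not executed)."""
    if baseline:
        # Baseline: include every coverable statement so the denominator is complete.
        _, statements, _, missing, _ = cov.analysis2(filename)
        missing_set = set(missing)
        return {line: (0 if line in missing_set else 1) for line in statements}
    # Per-test: only the lines that actually ran since the last erase().
    executed = cov.get_data().lines(filename) or []
    return {line: 1 for line in executed}
```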

- Threading lock protects stop/get_data/erase/start sequence
- stop_coverage_server() for clean shutdown, integrated into SDK shutdown()
- Module-level server reference for proper cleanup

- Enable branch=True in coverage.py initialization
- Extract branch data via cov._analyze(filename) API:
  - numbers.n_branches, n_missing_branches for totals
  - missing_branch_arcs() for per-line branch detail
- Return branch data in /snapshot response alongside line coverage
- Python shows accurate branch coverage (93.3% for demo app)
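For reference, a branch percentage can be derived from the two totals named above. This is a minimal sketch; the helper name is hypothetical, and treating a file with no branches as 100% covered is an assumption rather than something the PR states.

```python
def branch_percent(n_branches: int, n_missing_branches: int) -> float:
    """Percentage of branches taken, rounded to one decimal place."""
    if n_branches == 0:
        # Assumption: nothing to cover counts as fully covered.
        return 100.0
    covered = n_branches - n_missing_branches
    return round(100.0 * covered / n_branches, 1)
```

For example, 8 branches with 2 missing gives 75.0.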


betterproto treats messages with all default values as falsy. CoverageSnapshotRequest(baseline=False) was falsy, causing per-test snapshots to be skipped. Changed 'if not request' to 'if request is None'. Also separated coverage initialization from HTTP server so coverage.py starts via start_coverage_collection() for the protobuf channel. Extracted take_coverage_snapshot() as reusable function.

- Remove HTTP server code (CoverageSnapshotHandler, start_coverage_server, _coverage_server global, HTTPServer import)
- Replace with clean module-level state (_cov_instance, _source_root, _lock)
- Extract _is_user_file() helper
- stop_coverage_server() -> stop_coverage_collection()
- Update module docstring to reflect protobuf-only architecture

Add _lock protection to stop_coverage_collection() to prevent race condition where shutdown sets _cov_instance=None while a snapshot is in progress on the background reader thread.

Add docs/coverage.md explaining coverage.py integration, branch coverage via arc tracking, thread safety, and limitations. Update environment-variables.md with coverage env vars section.

- Use getattr() for betterproto oneof field access (prevents AttributeError)
- Fix _is_user_file path prefix collision (/app matching /application)
- Add os.path.realpath() for symlink-safe path comparison
- Add thread lock to start_coverage_collection()
- Add double-init guard (stop existing instance before creating new)
- Narrow */test* omit pattern to */tests/* and */test_*.py
- Log failed file analysis at debug level instead of silent swallow
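The omit-pattern narrowing can be illustrated with the standard library's `fnmatch` as an approximation. coverage.py uses its own pattern matcher, whose rules may differ between versions, so this only shows why the broad pattern catches too much. The file paths are made up for the example.

```python
from fnmatch import fnmatchcase
from typing import Iterable

BROAD_OMIT = ["*/test*"]
NARROW_OMIT = ["*/tests/*", "*/test_*.py"]


def is_omitted(path: str, patterns: Iterable[str]) -> bool:
    """Return True if `path` matches any glob pattern (fnmatch semantics)."""
    return any(fnmatchcase(path, pattern) for pattern in patterns)


# Hypothetical paths: one application module and two real test files.
app_module = "/app/testimony.py"
test_in_dir = "/app/tests/test_api.py"
test_by_name = "/app/test_utils.py"
```

Under these semantics the broad pattern omits `testimony.py` along with the tests, while the narrowed patterns omit only the two test files.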

drift/core/coverage_server.py

cursor · 2026-04-03T21:49:38Z

drift/core/drift_sdk.py

+        # Start coverage collection early (before any SDK mode checks that might return early).
+        # Coverage data is accessed via protobuf channel (communicator handles requests).
+        from .coverage_server import start_coverage_collection
+        start_coverage_collection()


Coverage silently restarts and loses data on re-initialization

Medium Severity

start_coverage_collection() is called at line 166, before the cls._initialized early-return guard at line 187. When initialize() is called a second time, coverage.py is stopped and a fresh instance is created (the double-init guard restarts rather than skipping), silently erasing all accumulated coverage data. Then the _initialized check returns early, making the restart entirely unnecessary. Moving the call after the _initialized check, or changing the double-init guard to return early when already running, would prevent the data loss.
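One of the two fixes suggested above, returning early when collection is already running, might look like the sketch below. The `factory` parameter is a hypothetical injection point added for illustration; in the SDK it would presumably construct something like `coverage.Coverage(branch=True)`.

```python
import threading
from typing import Any, Callable, Optional

_lock = threading.Lock()
_cov_instance: Optional[Any] = None


def start_coverage_collection(factory: Callable[[], Any]) -> bool:
    """Start collection once; return True only if a new instance was started."""
    global _cov_instance
    with _lock:
        if _cov_instance is not None:
            # Already running: keep the accumulated data instead of restarting.
            return False
        cov = factory()
        cov.start()
        _cov_instance = cov
        return True
```

A second `initialize()` call then becomes a no-op for coverage rather than a silent reset.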

Additional Locations (1)

drift/core/coverage_server.py#L51-L74

Reviewed by Cursor Bugbot for commit 2a383b0.

- Move _cov_instance None check inside lock (TOCTOU race fix)
- Fix branch counting to only include actual branch points, not all arcs

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 2 total unresolved issues (including 1 from previous review).


Reviewed by Cursor Bugbot for commit 4d156c4.

cursor · 2026-04-03T22:36:03Z

drift/core/coverage_server.py

+        _cov_instance.erase()
+        _cov_instance.start()
+
+    return coverage


Missing try/finally leaves coverage permanently stopped on error

Medium Severity

In take_coverage_snapshot, _cov_instance.stop() is called at the top, but _cov_instance.erase() and _cov_instance.start() at the bottom are not protected by a try/finally. If any exception occurs during processing (e.g., get_data(), measured_files(), or data.lines() in the non-baseline path which has no per-file try/except), coverage is left permanently stopped. All subsequent snapshot requests will operate on a stopped instance that never restarts, producing empty or broken coverage data for the remainder of the session.
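A try/finally shape that addresses this could look like the sketch below. It is not the PR's code: the signature is simplified, `collect` is a hypothetical callback standing in for the per-file processing, and `cov` is assumed to expose `stop()`, `erase()`, and `start()` as `coverage.Coverage` does.

```python
import threading
from typing import Any, Callable


def take_coverage_snapshot(
    cov: Any,
    lock: threading.Lock,
    collect: Callable[[Any], Any],
) -> Any:
    """Stop, collect, then always reset and restart, even if collection fails."""
    with lock:
        cov.stop()
        try:
            return collect(cov)
        finally:
            try:
                # Reset so the next snapshot contains only new data.
                cov.erase()
            finally:
                # Restart even if erase() itself raised.
                cov.start()
```

Whether to erase after a failed collection is a design choice; this version favors getting back to a clean, running state over preserving the data from the failed snapshot.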

Reviewed by Cursor Bugbot for commit 4d156c4.

sohil-kshirsagar added 11 commits April 1, 2026 00:45

feat: add ?baseline=true parameter using coverage.py analysis2

3261a40


chore: add thread safety lock and clean shutdown for coverage server

2a54664


wip: migrate coverage to protobuf channel (Python handler timing out …

020e2af

…- needs debug)

fix: prod readiness - thread-safe coverage shutdown

5e4aa97


feat: use TUSK_COVERAGE instead of NODE_V8_COVERAGE for Python

18e8349

docs: add code coverage documentation

5d438df


sohil-kshirsagar mentioned this pull request Apr 3, 2026

WIP feat(coverage): Add code coverage collection and export Use-Tusk/tusk-cli#216

Draft

5 tasks

cursor bot reviewed Apr 3, 2026


sohil-kshirsagar added 2 commits April 3, 2026 15:02

docs: clean up AI writing patterns in coverage doc

5b3354b

fix: address bugbot review feedback

4d156c4


cursor bot reviewed Apr 3, 2026


sohil-kshirsagar wants to merge 13 commits into main from feat/code-coverage-tracking-poc
