[tests] Stream on-device test results to MTP as they finish#11833
[tests] Stream on-device test results to MTP as they finish#11833simonrozsival wants to merge 8 commits into
Conversation
The MTP test adapter (`Microsoft.Android.Run`) reported results all-or-nothing at the very end of a run: the on-device instrumentation buffered every result and only wrote a TRX + reported `resultsPath` in its final `Finish()` call, and the host adapter blocked on `am instrument -w`, parsed only the final bundle, then published every `TestNodeUpdateMessage` in one batch. If the app process crashed mid-run (e.g. a native SIGSEGV), `Finish()` never ran, so no TRX was written and no `resultsPath` was reported. The host threw "Instrumentation did not report a resultsPath in the bundle." and MTP showed **Zero tests ran** — discarding the hundreds of tests that had actually passed, with no indication a crash occurred. Stream results instead: * Device (`TestInstrumentation.TestListener`): send an enriched `INSTRUMENTATION_STATUS` block per test on start and finish, carrying the class/name/outcome and (Base64-encoded, so they stay single-line) failure message and stack trace. * Host (`AndroidTestAdapter`): run `am instrument -w -r`, parse the streamed status blocks line-by-line, and publish each test to MTP as it finishes. A `LineWriter` splits the process output into lines and a `StatusStreamParser` feeds completed blocks to an async consumer via a channel (so the sync read loop never blocks on `PublishAsync`). Now a mid-run crash only loses the in-flight test: every test completed beforehand is already reported. When the run doesn't finish cleanly, a synthetic failed test node is published so the run is clearly marked failed rather than silently empty. The TRX-pull path is kept as a fallback for instrumentation that doesn't stream. Also `[Ignore]` the `AndroidMessageHandlerTests .ServerCertificateCustomValidationCallback_*` tests, which crash the test process with a native SIGSEGV (dotnet#8608); the root cause will be diagnosed separately. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR updates the on-device NUnit instrumentation runner and the Microsoft.Android.Run MTP adapter to stream per-test results live (via am instrument -r) instead of buffering and only reporting at the end, improving resilience to mid-run crashes. It also disables a set of AndroidMessageHandler tests that are known to crash the process with SIGSEGV (linked to #8608).
Changes:
- Device runner: emit enriched
INSTRUMENTATION_STATUSblocks on test start/finish, including Base64-encoded message/stack for single-line transport. - Host adapter: switch to
am instrument -w -r, parse streamed status blocks line-by-line, and publishTestNodeUpdateMessages as tests finish; publish a synthetic failure node on detected mid-run crash. - Tests: mark multiple
AndroidMessageHandlerTests.ServerCertificateCustomValidationCallback_*tests as[Ignore]due to native crashes.
Show a summary per file
| File | Description |
|---|---|
| tests/TestRunner.Core/TestInstrumentation.cs | Adds per-test start/finish status streaming with Base64-encoded failure details for line-based instrumentation output. |
| src/Microsoft.Android.Run/AndroidTestAdapter.cs | Implements streamed parsing/publishing pipeline for am instrument -r, keeps TRX fallback, and adds a synthetic crash node. |
| tests/Mono.Android-Tests/Mono.Android-Tests/Xamarin.Android.Net/AndroidMessageHandlerTests.cs | Ignores several crash-prone tests with a link to the tracking issue. |
Copilot's findings
- Files reviewed: 3/3 changed files
- Comments generated: 1
The [Ignore] for the crashing AndroidMessageHandlerTests .ServerCertificateCustomValidationCallback_* tests (dotnet#8608) belongs with the trimmable-typemap work; move it to dotnet#11801. This PR is now scoped to the MTP streaming-results change only. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Address review feedback: PublishStatusAsync treated any INSTRUMENTATION_STATUS block with a "test" key as a finished test (defaulting a missing outcome to Passed) and added it to reportedFinal. For instrumentation that doesn't use this streaming protocol (e.g. AndroidJUnitRunner), that would mis-report tests as passed and, by making reportedFinal non-empty, suppress the TRX fallback. Now only blocks with event=="start"/"finish" are handled, and a finish must carry a non-empty outcome; anything else is ignored so ReportFromTrxAsync can handle the run. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Research finding: MTP's built-in crash handling (ITestHostProcessLifetimeHandler, CrashDump/hang-dump) only covers the *test host* process — i.e. Microsoft.Android.Run itself — not the app-under-test on the device. A device-side native crash is invisible to MTP, so a synthetic report is the only option. MTP does, however, provide ErrorTestNodeStateProperty for infrastructure errors (distinct from an assertion Failed), which is the right state for a crash. Improvements: * Track the in-flight test (the one whose "start" arrived without a "finish"). On a crash, resolve *that* test from InProgress to ErrorTestNodeStateProperty so it isn't left dangling and the crash is attributed to the offending test. Only when no test was in flight (crash between tests) do we publish the synthetic "<instrumentation>.ProcessCrashed" error node. Either way the run is clearly marked failed, and we never emit two failures for one crash. * Use ErrorTestNodeStateProperty (not FailedTestNodeStateProperty) for crash and unrecognized-outcome nodes — these are infrastructure errors, not assertion failures. * Error handling: a single unreportable status block is logged and skipped instead of aborting the whole run; OperationCanceledException is rethrown, not swallowed. * Always observe the consumer task: `await consumer` moved into the `finally` so it runs even when StartProcess throws or is cancelled (previously it was skipped on the exception path, leaving the task unobserved). No unit tests are added: Microsoft.Android.Run has no test project, and the adapter is exercised end-to-end by every on-device test run in CI. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
/review |
|
✅ Android PR Reviewer completed successfully! |
There was a problem hiding this comment.
🤖 Code Review — ⚠️ Needs Changes (non-blocking)
Solid, well-reasoned fix. The root-cause analysis (buffered-until-Finish() reporting + host ignoring per-test INSTRUMENTATION_STATUS) is convincing, and the streaming design is genuinely defensive:
- Single-reader/single-writer channel keeps the synchronous stdout read loop off the async
PublishAsyncpath. - In-flight tracking resolves a crashed test to an
ErrorTestNodeStatePropertyinstead of leaving a dangling in-progress node; a synthetic node is published when no test was in flight. - The multi-signal
Crashedheuristic is safe against normal failing runs, since the device still callsFinish(Result.Ok, ...)(soexitCode == 0+resultsPathpresent) when tests merely fail. - Base64-encoding the message/stack on the device to keep the line-based protocol single-line is the right call, and the TRX pull is correctly retained as a fallback.
I verified the threading: ProcessUtils.StartProcess awaits all read tasks before returning, so the post-return stdout.Flush() in the finally can't race the reader, and all MessageBus publishes are serialized through the single consumer (crash/TRX publishes only run after await consumer). No correctness bug found there. The added using Xamarin.Android.Tools; correctly resolves the external ProcessUtils (same 4-arg overload AdbHelper.RunAsync already uses).
Findings
| Severity | Category | Location |
|---|---|---|
| Testing | StatusStreamParser / LineWriter — no host-side unit tests for new pure parsing logic |
|
| 💡 suggestion | Documentation | StreamingState.ReportedFinal — comment describes TRX dedup that never runs |
| 💡 suggestion | Formatting | Double blank line before enum TrxOutcome |
0 errors · 1 warning · 2 suggestions. No merge-blocking bugs. The warning is the one worth acting on: the new parser/line-splitter is exactly the kind of device-independent logic that should have off-device regression tests locking in this crash-recovery fix.
CI
At review time the checks on add9f4f were still pending (0 reported), so I can't confirm the build/on-device matrix is green — please make sure it goes green before merging.
Automated review via the android-reviewer skill. Verify suggestions before applying.
Generated by Android PR Reviewer for #11833 · 299.4 AIC · ⌖ 23.5 AIC · ⊞ 6.8K
Comment /review to run again
Instrumentation.SendStatus's resultCode parameter is enumified to Android.App.Result (like Finish), so the raw int status codes could not be passed implicitly. Cast explicitly to preserve the intended instrumentation report codes (1/0/-2/-3). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Update the ReportedFinal comments to describe its real role (a streamed-anything / authoritative-streaming signal) instead of TRX dedup, which never actually runs. - Remove a double blank line before enum TrxOutcome. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Thanks for the review. Addressed in `3e7d39c12` + `c25c09ee0`:
Also folded in the build fix for the `SendStatus` status codes (enumified to `Android.App.Result`). |
Echo every finished test (including passes) to the console as it streams in, with a running count, so a long device run's log grows steadily and progress stays visible instead of appearing to hang. MTP captures the test host's console output by default, so this stays silent for local runs. CI opts in via -p:TestingPlatformCaptureOutput=false on the dotnet test invocation, which lets the lines stream live into the build log. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary
The MTP test adapter (
Microsoft.Android.Run) reports on-device test results all-or-nothing at the very end of a run, which means a mid-run crash discards the entire run.The bug
While investigating a CoreCLRTrimmable CI failure (build 1487907, PR #11802) that reported
Zero tests ranafter 52s, I found the app had actually run 263 tests (251 passed) and then crashed with a native SIGSEGV in the CoreCLR GC. Despite that, MTP reported nothing.Root cause — reporting is buffered until the end:
TestInstrumentation.OnStart): runs the whole suite, buffers every result, and only writes the TRX + reportsresultsPathin the finalFinish()call.AndroidTestAdapter): blocks onam instrument -w, parses only the final bundle, then publishes allTestNodeUpdateMessages in one batch.So when the process crashes mid-run,
Finish()never runs → no TRX, noresultsPath→ host throws "Instrumentation did not report a resultsPath in the bundle." → Zero tests ran, with every passed test discarded and no sign a crash happened.An
INSTRUMENTATION_STATUSwas already sent per test (that's the[PASSED]/[FAILED]in logcat), but the host used-wwithout-rand ignored it.The fix — stream results live
TestInstrumentation.TestListener): emit an enrichedINSTRUMENTATION_STATUSblock per test on start and finish, carrying class/name/outcome and (Base64-encoded, to stay single-line) failure message + stack trace.AndroidTestAdapter): runam instrument -w -r, parse status blocks line-by-line, and publish each test to MTP as it finishes. ALineWritersplits process output into lines; aStatusStreamParserhands completed blocks to an async consumer via a channel (the synchronous read loop never blocks onPublishAsync).Now a mid-run crash only loses the in-flight test — every test completed beforehand is already reported. When the run doesn't finish cleanly, a synthetic failed test node is published so the run is clearly marked failed instead of silently empty. The TRX-pull path is retained as a fallback for instrumentation that doesn't stream.
Also
[Ignore]theAndroidMessageHandlerTests.ServerCertificateCustomValidationCallback_*tests, which crash the test process with a native SIGSEGV (#8608). The root cause will be diagnosed separately; this unblocks the CoreCLR/Trimmable device test runs.Testing
Host adapter builds clean. Full validation requires the on-device CI matrix (device build not run locally).