Add telemetry for task execution reasons in TaskHost#14211
Conversation
There was a problem hiding this comment.
Pull request overview
Adds finer-grained telemetry for out-of-proc task execution by classifying why a task ran in a TaskHost, enabling analysis of task-host routing drivers (explicit request vs MT isolation vs runtime/architecture mismatch) via the existing build/tasks telemetry event.
Changes:
- Introduces
TaskHostReasonand threads it from routing decisions intoProjectTelemetry.AddTaskExecution(...). - Emits per-reason TaskHost execution counters on
build/tasks. - Adds unit tests and documentation describing the new telemetry fields.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| src/Build/Instance/TaskFactories/AssemblyTaskFactory.cs | Determines a TaskHost routing reason for AssemblyTaskFactory-created tasks and passes it to telemetry. |
| src/Build/BackEnd/TaskExecutionHost/TaskExecutionHost.cs | Tracks TaskHost routing reason for non-AssemblyTaskFactory task factories executed out-of-proc. |
| src/Build/BackEnd/Components/RequestBuilder/TaskRouter.cs | Defines TaskHostReason enum used to classify TaskHost routing. |
| src/Build/BackEnd/Components/Logging/ProjectTelemetry.cs | Extends telemetry to record per-reason TaskHost counts on build/tasks. |
| src/Build.UnitTests/BackEnd/ProjectTelemetry_Tests.cs | Adds unit coverage for the new reason-count telemetry behavior. |
| documentation/VS-Telemetry-Data.md | Documents the new build/tasks TaskHost reason properties. |
| // Determine and track the reason a task is routed to a TaskHost for telemetry. | ||
| // If useTaskFactory is already true at this point, the task is forced out-of-process | ||
| // either because TaskHostFactory was explicitly requested, or because a specific | ||
| // MSBuildRuntime/MSBuildArchitecture was requested that doesn't match the current process. | ||
| TaskHostReason taskHostReason = TaskHostReason.None; |
There was a problem hiding this comment.
The runtime mismatch is actually causes _loadedType.LoadedViaMetadataLoadContext to be true. Worrisome that forced via env. var. seems not fail into correct bucket.
|
Caution agentic threat detected Test Coverage gaps — MODERATEThe new 1.
|
There was a problem hiding this comment.
Caution
agentic threat detected
Threat detection flagged this output in warn mode. Manual review is REQUIRED before any follow-up automation.
24-Dimension MSBuild Code Review — PR #14211
20 / 24 dimensions clean. Four MODERATE findings, no BLOCKING issues.
Findings Summary
| Dimension | Severity | Finding |
|---|---|---|
| ☐ Correctness | MODERATE | LoadedViaMetadataLoadContext=true + empty identity params → silently classified as RuntimeOrArchitectureMismatch instead of the actual MetadataLoadContext reason, inflating that telemetry counter in design-time builds |
| ☐ Test Coverage | MODERATE | Clean() reset path for the three new counters has no test; counts would silently bleed across projects if any reset line were dropped |
| ☐ Test Coverage | MODERATE | The TaskExecutionHost non-AssemblyTaskFactory reason ternary (MultiThreaded ? NonMultiThreadedTask : ExplicitlyRequested) has zero test coverage |
| ☐ Documentation Accuracy | MODERATE | TaskHostNonMultiThreadedTaskCount description (both in VS-Telemetry-Data.md and the TaskHostReason.NonMultiThreadedTask XML doc) restricts the counter to "multi-threaded mode only", but it also fires in MSBuild server mode (IsLongLivedHost=true) via the RequiresTransientTaskHost path |
Inline comments with concrete scenarios and suggested fixes have been attached above.
Clean Dimensions (20/24)
Concurrency & Thread Safety, Backwards Compatibility, ChangeWave Discipline, Performance & Allocations, Error Message Quality, Logging, String Comparison, API Surface, Cross-Platform Correctness, Code Simplification, Naming, SDK Integration, Null Safety, Error Handling, Design & Architecture, Evaluation Model Integrity, Security, Build Infrastructure, Target Authoring, Dependency Management, Code Style & Formatting.
Overall: The approach is sound — telemetry is purely observational and does not alter task routing. The Correctness finding is the most substantive: it inflates RuntimeOrArchitectureMismatch with a genuinely distinct cause, undermining the data quality this PR is designed to improve. The documentation and test gaps are straightforward to address.
Generated by Expert Code Review (on open) for #14211 · 1.8K AIC · ⊞ 30.3K
| | `TaskHostTasksExecutedCount` | int | Tasks executed out-of-process in a TaskHost | | ||
| | `TaskHostExplicitlyRequestedCount` | int | TaskHost tasks routed out-of-proc because the task/project explicitly opted in (e.g. `TaskHostFactory`, or force-out-of-proc knobs for inline task factories) | | ||
| | `TaskHostRuntimeOrArchitectureMismatchCount` | int | TaskHost tasks routed out-of-proc because they requested an `MSBuildRuntime`/`MSBuildArchitecture` that does not match the current process | | ||
| | `TaskHostNonMultiThreadedTaskCount` | int | TaskHost tasks routed out-of-proc in multi-threaded mode for isolation: tasks that are not MT-enlightened (lack `MSBuildMultiThreadableTaskAttribute`), including those force-routed to a transient TaskHost as a temporary workaround (e.g. NuGet's `RestoreTask`) | |
There was a problem hiding this comment.
Documentation Accuracy — MODERATE: Description restricts TaskHostNonMultiThreadedTaskCount to multi-threaded mode only, but it also fires in MSBuild server mode
Current description: "TaskHost tasks routed out-of-proc in multi-threaded mode for isolation..."
The counter is also incremented when BuildParameters.IsLongLivedHost == true (MSBuild server / long-lived host) even with MultiThreaded == false, via the RequiresTransientTaskHost code path in AssemblyTaskFactory.cs:
// fires when (isMultiThreaded || isLongLivedHost)
taskHostReason = TaskHostReason.NonMultiThreadedTask;Suggested description:
TaskHostNonMultiThreadedTaskCount— Tasks executed out-of-process for isolation: tasks that are not MT-enlightened (lackMSBuildMultiThreadableTaskAttribute) in multi-threaded mode, plus tasks force-routed to a transient TaskHost (e.g. NuGet'sRestoreTask) in multi-threaded mode or MSBuild server mode.
| { | ||
| taskHostReason = (_factoryIdentityParameters.TaskHostFactoryExplicitlyRequested ?? false) | ||
| ? TaskHostReason.ExplicitlyRequested | ||
| : TaskHostReason.RuntimeOrArchitectureMismatch; |
There was a problem hiding this comment.
Correctness — MODERATE: LoadedViaMetadataLoadContext is silently misclassified as RuntimeOrArchitectureMismatch
The comment above this block states the two cases are "TaskHostFactory explicitly requested, or a specific MSBuildRuntime/MSBuildArchitecture was requested that doesn't match the current process" — but it misses a third case.
Concrete trigger: _loadedType.LoadedViaMetadataLoadContext = true, _factoryIdentityParameters.IsEmpty = true, taskIdentityParameters.IsEmpty = true.
- Line 312:
useTaskFactory = true(MetadataLoadContext) - Lines 320–326:
(!IsEmpty || !IsEmpty)→false— skipped entirely - Line 333:
if (useTaskFactory)→ enters block - Line 335:
_factoryIdentityParameters.TaskHostFactoryExplicitlyRequestedisnull(becauseIsEmpty == truemeans all fields are null — confirmed by theEquals(new())implementation ofIsEmpty) null ?? false→false→RuntimeOrArchitectureMismatchassigned, but the true reason isLoadedViaMetadataLoadContext
This scenario is real: tasks targeting .NET Framework loaded via MetadataLoadContext when MSBuild runs on .NET Core, with no explicit identity parameters. The telemetry will inflate TaskHostRuntimeOrArchitectureMismatchCount with design-time build executions that aren't architecture mismatches at all.
Recommended fix — guard the MetadataLoadContext case before the ternary:
if (useTaskFactory)
{
taskHostReason = (_factoryIdentityParameters.TaskHostFactoryExplicitlyRequested ?? false)
? TaskHostReason.ExplicitlyRequested
: (_loadedType.LoadedViaMetadataLoadContext
&& (_factoryIdentityParameters.IsEmpty && taskIdentityParameters.IsEmpty
|| TaskHostParametersMatchCurrentProcess(mergedParameters)))
? TaskHostReason.RuntimeOrArchitectureMismatch // or a new MetadataLoadContext value
: TaskHostReason.RuntimeOrArchitectureMismatch;
}Or more directly, add a TaskHostReason.LoadedViaMetadataLoadContext enum value and check _loadedType.LoadedViaMetadataLoadContext first. The simplest minimal fix is an ordering change in the ternary to ensure MetadataLoadContext-only cases are handled explicitly.
| /// TaskHost for isolation. This covers tasks that are not MT-enlightened (they lack the | ||
| /// <c>MSBuildMultiThreadableTaskAttribute</c>), as well as tasks force-routed to a | ||
| /// transient TaskHost as a temporary workaround (e.g. NuGet's <c>RestoreTask</c>). | ||
| /// </summary> |
There was a problem hiding this comment.
Documentation Accuracy — MODERATE: XML doc omits the IsLongLivedHost (server-mode) trigger
The summary says "The build is running in multi-threaded mode..." but the code in AssemblyTaskFactory.cs assigns NonMultiThreadedTask in two distinct branches:
- MT mode:
TaskRouter.NeedsTaskHostInMultiThreadedModereturnstrue - Server mode (no MT required):
TaskRouter.RequiresTransientTaskHostreturnstrueANDBuildParameters.IsLongLivedHost == true— even whenMultiThreaded == false
Telemetry analysts reading the telemetry dashboard will see nonzero TaskHostNonMultiThreadedTaskCount on builds with MSBuild server active but not in multi-threaded mode, and the current doc gives them no way to explain that.
Suggested replacement:
/// <summary>
/// The task is routed to an out-of-proc TaskHost for isolation. This covers:
/// (a) Tasks that are not MT-enlightened (lack <c>MSBuildMultiThreadableTaskAttribute</c>)
/// when running in multi-threaded mode.
/// (b) Tasks force-routed to a transient TaskHost as a temporary workaround
/// (e.g. NuGet's <c>RestoreTask</c>) when running in multi-threaded mode
/// <em>or</em> in MSBuild server / long-lived-host mode.
/// </summary>| |----------|------|-------------| | ||
| | `TasksExecutedCount` | int | Total number of tasks executed | | ||
| | `TaskHostTasksExecutedCount` | int | Tasks executed out-of-process in a TaskHost | | ||
| | `TaskHostExplicitlyRequestedCount` | int | TaskHost tasks routed out-of-proc because the task/project explicitly opted in (e.g. `TaskHostFactory`, or force-out-of-proc knobs for inline task factories) | |
There was a problem hiding this comment.
conflating taskhostfactory and inline task factories sounds suspect
Fixes #13882
Context
When MSBuild runs a task out-of-process in a TaskHost, we previously only recorded that it happened ( TaskHostTasksExecutedCount ), not why. The team wants to understand the distribution of task-host routing reasons across real-world builds to inform multi-threaded-mode work.
Changes Made
Introduces a TaskHostReason enum and threads it from the routing decision into ProjectTelemetry , which emits per-reason integer counts on the existing build/tasks telemetry event.
Testing
Added unit tests