Reconcile bare/rich component Ids in DependencyGraphs#1784
Conversation
After PR #1760 dropped bare entries from ComponentsFound when rich counterparts exist, DependencyGraphs still referenced bare Ids as graph nodes — breaking the contract between the two outputs. This adds a post-processing step in DefaultGraphTranslationService that merges bare graph nodes into their rich counterparts: - Merges outbound edges (bare→rich targets) - Rewrites inbound edges from other nodes - Migrates metadata set membership (Explicit/DevDep/Dep) - Filters self-edges introduced by rewriting - Handles multiple rich variants (cross-product merge) - Leaves bare-only nodes (no rich counterpart) unchanged Includes 10 unit tests covering Paul's scenario, multi-rich variants, edge rewriting, self-edge prevention, leaf preservation, null/empty collections, and multi-location independence. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR extends the bare-vs-rich component identity reconciliation (previously applied to ComponentsFound) to also reconcile node IDs inside DependencyGraphs, ensuring graph keys and metadata sets reference IDs that still exist after component merging.
Changes:
- Adds a post-processing reconciliation step in
DefaultGraphTranslationServiceto merge bare graph nodes into rich counterparts (per-location), rewriting edges and metadata sets. - Introduces unit tests validating bare→rich merging behavior across multiple scenarios (multi-rich variants, inbound rewrite, self-edge prevention, no-op cases, multi-location).
Show a summary per file
| File | Description |
|---|---|
src/Microsoft.ComponentDetection.Orchestrator/Services/GraphTranslation/DefaultGraphTranslationService.cs |
Builds dependency graphs once, then reconciles bare IDs to rich IDs by rewriting nodes/edges and metadata sets. |
test/Microsoft.ComponentDetection.Orchestrator.Tests/Services/ReconcileDependencyGraphIdsTests.cs |
Adds targeted unit tests covering reconciliation behavior and edge cases. |
Copilot's findings
Comments suppressed due to low confidence (1)
src/Microsoft.ComponentDetection.Orchestrator/Services/GraphTranslation/DefaultGraphTranslationService.cs:212
- Merging bare outbound edges into rich nodes can create an empty edge-set after rewrite/self-edge filtering, but the graph currently keeps an empty HashSet. This differs from the existing convention where leaf nodes serialize with a null value (not an empty array). After merging, consider setting the value back to null when the resulting edge set is empty.
if (bareEdges != null)
{
newGraph[richId] ??= [];
foreach (var edge in bareEdges)
{
foreach (var rewritten in RewriteId(edge))
{
if (rewritten != richId)
{
newGraph[richId].Add(rewritten);
}
}
}
}
- Files reviewed: 2/2 changed files
- Comments generated: 2
- Change RewriteId to return IEnumerable<string> using Enumerable.Repeat instead of allocating a new HashSet for every non-bare id - Normalize empty edge sets to null after self-edge filtering to match the existing serialization convention in GraphTranslationUtility Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…cy-graph-identity-reconciliation
|
👋 Hi! It looks like you modified some files in the
If none of the above scenarios apply, feel free to ignore this comment 🙂 |
|
The Local scan times on a clean build:
Both well within the 75% drift threshold (28.87 + 21.65 = 50.52s). CI runners just had higher variance. |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #1784 +/- ##
============================
============================
☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Summary
Follows up on PR #1760 which reconciled bare/rich component identities in \ComponentsFound. This PR applies the same reconciliation to \DependencyGraphs.
Problem
After #1760, when a detector produces rich Ids (with DownloadUrl/SourceUrl) and another detector produces bare Ids for the same package at the same location, \ComponentsFound\ correctly drops the bare entries. However, \DependencyGraphs\ still contains both bare and rich nodes as separate graph keys — referencing Ids that no longer exist in \ComponentsFound.
Solution
Adds a post-processing step in \DefaultGraphTranslationService\ (after \AccumulateAndConvertToContract) that reconciles bare graph nodes into their rich counterparts:
Testing
Related