Skip to content

Reconcile bare/rich component Ids in DependencyGraphs#1784

Merged
AMaini503 merged 5 commits intomainfrom
user/aamaini/dependency-graph-identity-reconciliation
Apr 21, 2026
Merged

Reconcile bare/rich component Ids in DependencyGraphs#1784
AMaini503 merged 5 commits intomainfrom
user/aamaini/dependency-graph-identity-reconciliation

Conversation

@AMaini503
Copy link
Copy Markdown
Contributor

Summary

Follows up on PR #1760 which reconciled bare/rich component identities in \ComponentsFound. This PR applies the same reconciliation to \DependencyGraphs.

Problem

After #1760, when a detector produces rich Ids (with DownloadUrl/SourceUrl) and another detector produces bare Ids for the same package at the same location, \ComponentsFound\ correctly drops the bare entries. However, \DependencyGraphs\ still contains both bare and rich nodes as separate graph keys — referencing Ids that no longer exist in \ComponentsFound.

Solution

Adds a post-processing step in \DefaultGraphTranslationService\ (after \AccumulateAndConvertToContract) that reconciles bare graph nodes into their rich counterparts:

  • Merges outbound edges: bare node's edges are unioned into all rich counterparts
  • Rewrites inbound edges: other nodes' edges pointing to bare Ids are rewritten to rich Ids
  • Migrates metadata sets: ExplicitlyReferenced, DevDependencies, Dependencies
  • Filters self-edges introduced by rewriting
  • Handles multiple rich variants (e.g., same package from different registries)
  • Leaves bare-only nodes unchanged (no rich counterpart = no reconciliation needed)

Testing

  • 10 unit tests covering: Paul's duplication scenario, multi-rich variants, bare-only (no-op), rich-only (no-op), inbound edge rewriting, self-edge prevention, leaf preservation, null/empty collections, multiple locations
  • All 151 Orchestrator tests pass
  • Benchmark scan on GregTestRepo: identical \ComponentsFound\ and \DependencyGraphs\ vs main (safe no-op when no detectors produce rich Ids yet)

Related

After PR #1760 dropped bare entries from ComponentsFound when rich
counterparts exist, DependencyGraphs still referenced bare Ids as
graph nodes — breaking the contract between the two outputs.

This adds a post-processing step in DefaultGraphTranslationService
that merges bare graph nodes into their rich counterparts:
- Merges outbound edges (bare→rich targets)
- Rewrites inbound edges from other nodes
- Migrates metadata set membership (Explicit/DevDep/Dep)
- Filters self-edges introduced by rewriting
- Handles multiple rich variants (cross-product merge)
- Leaves bare-only nodes (no rich counterpart) unchanged

Includes 10 unit tests covering Paul's scenario, multi-rich variants,
edge rewriting, self-edge prevention, leaf preservation, null/empty
collections, and multi-location independence.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR extends the bare-vs-rich component identity reconciliation (previously applied to ComponentsFound) to also reconcile node IDs inside DependencyGraphs, ensuring graph keys and metadata sets reference IDs that still exist after component merging.

Changes:

  • Adds a post-processing reconciliation step in DefaultGraphTranslationService to merge bare graph nodes into rich counterparts (per-location), rewriting edges and metadata sets.
  • Introduces unit tests validating bare→rich merging behavior across multiple scenarios (multi-rich variants, inbound rewrite, self-edge prevention, no-op cases, multi-location).
Show a summary per file
File Description
src/Microsoft.ComponentDetection.Orchestrator/Services/GraphTranslation/DefaultGraphTranslationService.cs Builds dependency graphs once, then reconciles bare IDs to rich IDs by rewriting nodes/edges and metadata sets.
test/Microsoft.ComponentDetection.Orchestrator.Tests/Services/ReconcileDependencyGraphIdsTests.cs Adds targeted unit tests covering reconciliation behavior and edge cases.

Copilot's findings

Comments suppressed due to low confidence (1)

src/Microsoft.ComponentDetection.Orchestrator/Services/GraphTranslation/DefaultGraphTranslationService.cs:212

  • Merging bare outbound edges into rich nodes can create an empty edge-set after rewrite/self-edge filtering, but the graph currently keeps an empty HashSet. This differs from the existing convention where leaf nodes serialize with a null value (not an empty array). After merging, consider setting the value back to null when the resulting edge set is empty.
                if (bareEdges != null)
                {
                    newGraph[richId] ??= [];
                    foreach (var edge in bareEdges)
                    {
                        foreach (var rewritten in RewriteId(edge))
                        {
                            if (rewritten != richId)
                            {
                                newGraph[richId].Add(rewritten);
                            }
                        }
                    }
                }
  • Files reviewed: 2/2 changed files
  • Comments generated: 2

Aayush Maini and others added 2 commits April 17, 2026 10:11
- Change RewriteId to return IEnumerable<string> using Enumerable.Repeat
  instead of allocating a new HashSet for every non-bare id
- Normalize empty edge sets to null after self-edge filtering to match
  the existing serialization convention in GraphTranslationUtility

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 17, 2026 17:14
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 17, 2026

👋 Hi! It looks like you modified some files in the Detectors folder.
You may need to bump the detector versions if any of the following scenarios apply:

  • The detector detects more or fewer components than before
  • The detector generates different parent/child graph relationships than before
  • The detector generates different devDependencies values than before

If none of the above scenarios apply, feel free to ignore this comment 🙂

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot's findings

  • Files reviewed: 2/2 changed files
  • Comments generated: 1

@AMaini503
Copy link
Copy Markdown
Contributor Author

The CheckDetectorsRunTimesAndCounts verification test failure is a flaky timing assertion — it passed on a previous run of the same commit (run 24577433955).

Local scan times on a clean build:

  • Main: 28.87s
  • This branch: 39.53s

Both well within the 75% drift threshold (28.87 + 21.65 = 50.52s). CI runners just had higher variance.

Copilot AI review requested due to automatic review settings April 20, 2026 20:39
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot's findings

  • Files reviewed: 2/2 changed files
  • Comments generated: 1

@AMaini503 AMaini503 merged commit 7ce401b into main Apr 21, 2026
19 checks passed
@AMaini503 AMaini503 deleted the user/aamaini/dependency-graph-identity-reconciliation branch April 21, 2026 18:29
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 21, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 0.0%. Comparing base (4eb4071) to head (62bca0d).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@     Coverage Diff      @@
##   main   #1784   +/-   ##
============================
============================

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants