Reconcile bare/rich component identities in ComponentsFound#1760

Merged
AMaini503 merged 5 commits into main from user/aamaini/component-identity-merging-level1
Apr 14, 2026

Conversation

@AMaini503
Contributor

Within a single detector, components registered under bare Ids (no DownloadUrl/SourceUrl) are now merged into their rich counterparts sharing the same BaseId.

Changes:

  • ComponentRecorder.GetDetectedComponents(): group by BaseId, merge bare metadata (licenses, suppliers, containers) into all rich entries
  • DefaultGraphTranslationService.GatherSetOfDetectedComponentsUnmerged(): extend graph lookup to match on BaseId so rich components absorb graph data (roots, ancestors, devDep, scope, file paths) from bare-Id graphs
  • Tests for both reconciliation levels

Addresses work item #2372676.
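
The recorder-level reconciliation described above can be sketched roughly as follows. This is an illustrative Python sketch, not the C# code in `ComponentRecorder.cs`: the `Component` class, its field names, the sample Id strings, and the rule "an entry is bare when its Id equals its BaseId" are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Component:
    """Hypothetical stand-in for a detected component (not the repo's C# type)."""
    id: str        # full identity; assumed equal to base_id for a bare entry
    base_id: str   # identity without DownloadUrl/SourceUrl
    licenses: set = field(default_factory=set)
    suppliers: set = field(default_factory=set)
    containers: set = field(default_factory=set)

    @property
    def is_bare(self) -> bool:
        # Assumption for this sketch: a bare entry carries no provenance in its id.
        return self.id == self.base_id


def reconcile(components):
    """Group by base_id, fold bare metadata into every rich entry, drop the bare ones."""
    groups = defaultdict(list)
    for component in components:
        groups[component.base_id].append(component)

    result = []
    for group in groups.values():
        rich = [c for c in group if not c.is_bare]
        bare = [c for c in group if c.is_bare]
        if not rich:
            # No rich counterpart exists, so the bare entries are kept as they are.
            result.extend(bare)
            continue
        for r in rich:
            for b in bare:
                # Rich entries are updated in place with the bare metadata.
                r.licenses |= b.licenses
                r.suppliers |= b.suppliers
                r.containers |= b.containers
        result.extend(rich)
    return result


# Illustrative ids only; the real Id format may differ.
bare = Component("lodash 4.17.21 - Npm", "lodash 4.17.21 - Npm", licenses={"MIT"})
rich = Component("lodash 4.17.21 - Npm [DownloadUrl:https://example.invalid/lodash.tgz]",
                 "lodash 4.17.21 - Npm")
alone = Component("left-pad 1.3.0 - Npm", "left-pad 1.3.0 - Npm")
merged = reconcile([bare, rich, alone])
```

In this sketch the bare `lodash` entry disappears from the result and its license lands on the rich entry, while `left-pad`, which has no rich counterpart, is returned unchanged.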

Copilot AI review requested due to automatic review settings April 2, 2026 22:37
@AMaini503 AMaini503 requested a review from a team as a code owner April 2, 2026 22:37
@AMaini503 AMaini503 requested a review from schmittjoseph April 2, 2026 22:37
@AMaini503 AMaini503 force-pushed the user/aamaini/component-identity-merging-level1 branch from eefbb60 to 0ec6835 Compare April 2, 2026 22:41
@github-actions

github-actions bot commented Apr 2, 2026

👋 Hi! It looks like you modified some files in the Detectors folder.
You may need to bump the detector versions if any of the following scenarios apply:

  • The detector detects more or fewer components than before
  • The detector generates different parent/child graph relationships than before
  • The detector generates different devDependencies values than before

If none of the above scenarios apply, feel free to ignore this comment 🙂

Contributor

Copilot AI left a comment

Pull request overview

This PR updates the orchestrator’s component aggregation so that, within a single detector run, components registered under a “bare” identity (no provenance URLs) are reconciled into their corresponding “rich” identities (same BaseId, but with DownloadUrl/SourceUrl). This ensures ComponentsFound reflects a single coherent identity while retaining important metadata and graph-derived fields.

Changes:

  • Reconcile detected components by BaseId, merging bare metadata into rich entries and dropping the bare entries.
  • Extend graph translation to associate graph data with components by either Id or BaseId, so rich components pick up graph info from bare-id graphs.
  • Add unit tests covering reconciliation behavior at both the ComponentRecorder and graph translation layers.
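
The Id-or-BaseId graph matching summarized above might look like the following. This is a hedged Python sketch rather than the code in `DefaultGraphTranslationService.cs`: the `Graph` class, `resolve_graph_id`, `roots_for`, and the sample Id strings are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """Hypothetical dependency graph keyed by component id strings."""
    roots_by_id: dict = field(default_factory=dict)  # node id -> set of root ids

    def contains(self, node_id: str) -> bool:
        return node_id in self.roots_by_id


def resolve_graph_id(graph, component_id, base_id):
    """Return the id under which the graph knows the component, or None."""
    if graph.contains(component_id):
        return component_id          # exact match on the full Id
    if graph.contains(base_id):
        return base_id               # fallback: graph recorded the bare Id
    return None


def roots_for(graph, component_id, base_id):
    """Query graph data using the resolved id, not the component's own id."""
    graph_id = resolve_graph_id(graph, component_id, base_id)
    if graph_id is None:
        return set()
    return graph.roots_by_id[graph_id]


# Illustrative ids only; the real Id format may differ.
BASE_ID = "lodash 4.17.21 - Npm"
RICH_ID = BASE_ID + " [DownloadUrl:https://example.invalid/lodash.tgz]"
graph = Graph(roots_by_id={BASE_ID: {"my-app 1.0.0 - Npm"}})
rich_roots = roots_for(graph, RICH_ID, BASE_ID)
```

Resolving the id once and passing it to every later query keeps roots, ancestors, devDep, scope, and file-path lookups consistent with whichever key the graph actually used.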

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| src/Microsoft.ComponentDetection.Common/DependencyGraph/ComponentRecorder.cs | Groups detected components by BaseId and merges bare metadata into rich components. |
| src/Microsoft.ComponentDetection.Orchestrator/Services/GraphTranslation/DefaultGraphTranslationService.cs | Applies graph roots/ancestors/dev-dep/scope/locations using Id or BaseId matching. |
| test/Microsoft.ComponentDetection.Common.Tests/ComponentRecorderTests.cs | Adds tests validating bare/rich reconciliation behavior for component aggregation. |
| test/Microsoft.ComponentDetection.Orchestrator.Tests/Services/DefaultGraphTranslationServiceTests.cs | Adds tests validating rich components absorb graph data from bare-id graphs. |

Comment thread src/Microsoft.ComponentDetection.Common/DependencyGraph/ComponentRecorder.cs Outdated
@AMaini503 AMaini503 force-pushed the user/aamaini/component-identity-merging-level1 branch from 0ec6835 to f1ee3e5 Compare April 2, 2026 23:03
Copilot AI review requested due to automatic review settings April 6, 2026 18:33
@AMaini503 AMaini503 force-pushed the user/aamaini/component-identity-merging-level1 branch from f1ee3e5 to 699abfc Compare April 6, 2026 18:33
Contributor

Copilot AI left a comment

Pull request overview

This PR addresses identity fragmentation in ComponentsFound when the same package is registered with both a bare Id (no DownloadUrl/SourceUrl) and a rich Id (with URL data) within a single detector, by reconciling on BaseId so rich components absorb metadata/graph enrichment from their bare counterparts.

Changes:

  • Reconcile ComponentRecorder.GetDetectedComponents() output by grouping on BaseId and merging bare component metadata into all rich entries.
  • Extend DefaultGraphTranslationService enrichment to match dependency graphs by either full Id or BaseId, allowing rich entries to pick up roots/ancestors/scope/devDep/locations from bare-Id graphs.
  • Add unit tests and supporting spike/sample artifacts documenting npm lockfile behavior and identity reconciliation rationale.

| File | Description |
| --- | --- |
| src/Microsoft.ComponentDetection.Common/DependencyGraph/ComponentRecorder.cs | Reconciles bare vs rich detected components by BaseId and merges selected metadata into rich entries. |
| src/Microsoft.ComponentDetection.Orchestrator/Services/GraphTranslation/DefaultGraphTranslationService.cs | Updates graph enrichment to treat rich components as present when graphs contain the bare BaseId. |
| test/Microsoft.ComponentDetection.Common.Tests/ComponentRecorderTests.cs | Adds tests validating bare→rich subsumption and metadata merge semantics in GetDetectedComponents(). |
| test/Microsoft.ComponentDetection.Orchestrator.Tests/Services/DefaultGraphTranslationServiceTests.cs | Adds tests ensuring graph-derived data is transferred from bare-Id graphs to rich components. |
| docs/component-identity-reconciliation-design.md | Design doc describing reconciliation points and semantics for ComponentsFound vs DependencyGraphs. |
| docs/component-identity-merging.md | Expanded design discussion and scenarios for bare/rich merging behavior. |
| docs/npm-detector-spike-plan.md | Work breakdown and analysis plan for npm detector metadata population (spike). |
| docs/npm-detector-spike-findings.md | Spike findings summarizing lockfile field availability and detector gaps. |
| docs/npm-lockfile-samples/README.md | Documentation of npm lockfile structural differences and detector read paths. |
| docs/npm-lockfile-samples/v1-lockfile-sample.json | Trimmed v1 lockfile excerpt used as documentation reference. |
| docs/npm-lockfile-samples/v2-lockfile-sample.json | Trimmed v2 lockfile excerpt used as documentation reference. |
| docs/npm-lockfile-samples/v3-lockfile-sample.json | Trimmed v3 lockfile excerpt used as documentation reference. |
| test-npm-spike/package.json | Spike project input for npm v3 lockfile generation/repro. |
| test-npm-spike/package-lock.json | Spike npm v3 lockfile (full) captured for repro. |
| test-npm-spike/baseline-output/GovCompDisc_Log_20260313102052505_25872.log | Captured run log artifact for the spike. |
| test-npm-spike/baseline-output/ScanManifest_20260313102052512.json | Captured scan manifest artifact for the spike run. |
| test-npm-spike-v1/package.json | Spike project input for npm v2 lockfile generation/repro. |
| test-npm-spike-v1/package-lock.json | Spike npm v2 lockfile (full) captured for repro. |
| test-npm-spike-v1/baseline-output/GovCompDisc_Log_20260313102109400_18928.log | Captured run log artifact for the spike. |
| test-npm-spike-v1/baseline-output/ScanManifest_20260313102109408.json | Captured scan manifest artifact for the spike run. |
| test-npm-spike-v1-only/package.json | Spike project input for npm v1 lockfile generation/repro. |
| test-npm-spike-v1-only/package-lock.json | Spike npm v1 lockfile (full) captured for repro. |
| test-npm-spike-v1-only/baseline-output/GovCompDisc_Log_20260313102122587_32472.log | Captured run log artifact for the spike. |
| test-npm-spike-v1-only/baseline-output/ScanManifest_20260313102122594.json | Captured scan manifest artifact for the spike run. |

Copilot's findings

Files not reviewed (3)
  • test-npm-spike-v1-only/package-lock.json: Language not supported
  • test-npm-spike-v1/package-lock.json: Language not supported
  • test-npm-spike/package-lock.json: Language not supported
  • Files reviewed: 16/24 changed files
  • Comments generated: 1

Within a single detector, components registered under bare Ids (no
DownloadUrl/SourceUrl) are now merged into their rich counterparts
sharing the same BaseId.

Changes:
- ComponentRecorder.GetDetectedComponents(): group by BaseId, merge
  bare metadata (licenses, suppliers, containers) into all rich entries
- DefaultGraphTranslationService.GatherSetOfDetectedComponentsUnmerged():
  extend graph lookup to match on BaseId so rich components absorb
  graph data (roots, ancestors, devDep, scope, file paths) from
  bare-Id graphs
- Tests for both reconciliation levels

Addresses work item #2372676.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@AMaini503 AMaini503 force-pushed the user/aamaini/component-identity-merging-level1 branch from 699abfc to b06a7d2 Compare April 6, 2026 18:43
Copilot AI review requested due to automatic review settings April 13, 2026 20:40
Contributor

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR reconciles “bare” and “rich” component identities so that bare-Id registrations are merged into rich counterparts (and rich components can still pick up dependency-graph data recorded under bare Ids).

Changes:

  • Update ComponentRecorder.GetDetectedComponents() to group by BaseId and merge bare metadata into all rich entries sharing that BaseId.
  • Update DefaultGraphTranslationService to apply dependency-graph data to rich components even when the graph stored the component under BaseId.
  • Add tests covering reconciliation behavior at both the recorder and graph-translation layers.

| File | Description |
| --- | --- |
| test/Microsoft.ComponentDetection.Orchestrator.Tests/Services/DefaultGraphTranslationServiceTests.cs | Adds tests ensuring rich components absorb graph data recorded under bare Ids. |
| test/Microsoft.ComponentDetection.Common.Tests/ComponentRecorderTests.cs | Adds tests verifying bare/rich reconciliation and metadata merging in GetDetectedComponents(). |
| src/Microsoft.ComponentDetection.Orchestrator/Services/GraphTranslation/DefaultGraphTranslationService.cs | Extends graph lookup to match on BaseId and plumbs the resolved graph id into graph queries. |
| src/Microsoft.ComponentDetection.Common/DependencyGraph/ComponentRecorder.cs | Implements grouping by BaseId, merges bare metadata into rich components, and refactors merge helpers. |

Copilot's findings

  • Files reviewed: 4/4 changed files
  • Comments generated: 4

Aayush Maini and others added 3 commits April 13, 2026 16:53
Copilot AI review requested due to automatic review settings April 14, 2026 00:10
Contributor

Copilot AI left a comment

Pull request overview

This PR reconciles component identities when the same package is registered under both a “bare” identity (BaseId only) and one or more “rich” identities (Id includes provenance like DownloadUrl/SourceUrl), ensuring ComponentsFound and graph-derived metadata are attributed to the rich component(s).
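
To make the bare/rich distinction concrete, here is a small Python sketch of how such identities could relate. It is an assumption-laden illustration, not the repository's C# component types: the `NpmIdentity` class and the string formats used for `base_id` and `id` are hypothetical; only the relationship (same BaseId, with the Id extended by DownloadUrl/SourceUrl when present) comes from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NpmIdentity:
    """Hypothetical identity model; the string formats below are invented."""
    name: str
    version: str
    download_url: Optional[str] = None
    source_url: Optional[str] = None

    @property
    def base_id(self) -> str:
        # Identity without provenance: shared by bare and rich registrations.
        return f"{self.name} {self.version} - Npm"

    @property
    def id(self) -> str:
        # Full identity: the base id plus any provenance URLs that are set.
        parts = []
        if self.download_url:
            parts.append(f"DownloadUrl:{self.download_url}")
        if self.source_url:
            parts.append(f"SourceUrl:{self.source_url}")
        if not parts:
            return self.base_id
        return f"{self.base_id} [{' '.join(parts)}]"


bare_identity = NpmIdentity("lodash", "4.17.21")
rich_a = NpmIdentity("lodash", "4.17.21", download_url="https://example.invalid/a.tgz")
rich_b = NpmIdentity("lodash", "4.17.21", download_url="https://example.invalid/b.tgz")
```

Here `rich_a` and `rich_b` share a BaseId with `bare_identity` but have distinct Ids, which is the situation where several rich entries each absorb the bare entry's metadata and all remain in the output.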

Changes:

  • Update ComponentRecorder.GetDetectedComponents() to group by BaseId, merge bare metadata into rich entries, and drop bare entries when rich exists.
  • Update DefaultGraphTranslationService.GatherSetOfDetectedComponentsUnmerged() to match dependency-graph nodes by Id or BaseId so rich components can absorb graph data from bare-Id graphs.
  • Add unit tests covering reconciliation in both the recorder and graph translation layers.

| File | Description |
| --- | --- |
| src/Microsoft.ComponentDetection.Common/DependencyGraph/ComponentRecorder.cs | Reconciles detected components by BaseId, merging bare metadata into rich entries and preserving multiple distinct rich identities. |
| src/Microsoft.ComponentDetection.Orchestrator/Services/GraphTranslation/DefaultGraphTranslationService.cs | Extends graph membership checks to fall back to BaseId for rich components so they inherit roots/ancestors/devDep/scope/locations from bare-Id graphs. |
| test/Microsoft.ComponentDetection.Common.Tests/ComponentRecorderTests.cs | Adds tests validating bare↔rich reconciliation behavior and metadata merging in GetDetectedComponents(). |
| test/Microsoft.ComponentDetection.Orchestrator.Tests/Services/DefaultGraphTranslationServiceTests.cs | Adds tests validating that rich components pick up graph-derived data from graphs keyed by bare Ids. |

Copilot's findings

  • Files reviewed: 4/4 changed files
  • Comments generated: 0 new

@AMaini503 AMaini503 merged commit 445a1a2 into main Apr 14, 2026
27 of 29 checks passed
@AMaini503 AMaini503 deleted the user/aamaini/component-identity-merging-level1 branch April 14, 2026 00:50
@codecov
Copy link
Copy Markdown

codecov bot commented Apr 14, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 0.0%. Comparing base (d8663f2) to head (4fb67ce).
⚠️ Report is 1 commit behind head on main.

3 participants