Skip to content

update SQL query in GetByAssetVersionPaged#1636

Merged
timbastin merged 6 commits into
mainfrom
fix-dependency-vulns-list-sql-query
Feb 5, 2026
Merged

update SQL query in GetByAssetVersionPaged#1636
timbastin merged 6 commits into
mainfrom
fix-dependency-vulns-list-sql-query

Conversation

@refoo0
Copy link
Copy Markdown
Member

@refoo0 refoo0 commented Feb 4, 2026

No description provided.

Signed-off-by: rafi <refaei.shikho@hotmail.com>
Copilot AI review requested due to automatic review settings February 4, 2026 14:36
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates the SQL/GORM query used by GetByAssetVersionPaged to derive the paged/ordered list of component PURLs (packages) for an asset version.

Changes:

  • Switch package_name selection/grouping from components.id to dependency_vulns.component_purl.
  • Change the join between components and dependency_vulns from INNER JOIN to RIGHT JOIN to include dependency vulns even when a matching component row is missing.
  • Enable GORM query debugging on the packageNameQuery scan.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 247 to 250
Select("SUM(dependency_vulns.raw_risk_assessment) as total_risk, AVG(dependency_vulns.raw_risk_assessment) as avg_risk, MAX(dependency_vulns.raw_risk_assessment) as max_risk, MAX(\"CVE\".cvss) as max_cvss, COUNT(dependency_vulns.id) as dependency_vuln_count, dependency_vulns.component_purl as package_name").
Joins("RIGHT JOIN dependency_vulns ON components.id = dependency_vulns.component_purl AND dependency_vulns.asset_id = ? AND dependency_vulns.asset_version_name = ?", assetID, assetVersionName).
Joins("LEFT JOIN artifact_dependency_vulns ON artifact_dependency_vulns.dependency_vuln_id = dependency_vulns.id").
Joins("INNER JOIN cves \"CVE\" ON dependency_vulns.cve_id = \"CVE\".cve").
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The aggregates in this grouped query can be inflated because of the LEFT JOIN artifact_dependency_vulns (a dependency vuln can have multiple artifacts, duplicating rows). This affects SUM, AVG, and COUNT(dependency_vulns.id) and can change sort order when clients sort by total_risk, avg_risk, or dependency_vuln_count. Consider aggregating over a de-duplicated set of dependency_vulns (e.g., via a DISTINCT subquery keyed by dependency_vulns.id) and/or using COUNT(DISTINCT dependency_vulns.id) for the count.

Copilot uses AI. Check for mistakes.
Comment on lines 269 to 270
if err := packageNameQuery.Debug().Scan(&res).Error; err != nil {
return shared.Paged[models.DependencyVuln]{}, map[string]int{}, err
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Debug() should not be left enabled in repository code; it can significantly increase log volume and may expose SQL/parameters if logging is enabled in some environments. Please remove Debug() or gate it behind an explicit, environment-driven debug flag used consistently across the codebase.

Copilot uses AI. Check for mistakes.
Comment on lines 246 to 249
packageNameQuery := repository.GetDB(tx).Table("components").
Select("SUM(dependency_vulns.raw_risk_assessment) as total_risk, AVG(dependency_vulns.raw_risk_assessment) as avg_risk, MAX(dependency_vulns.raw_risk_assessment) as max_risk, MAX(\"CVE\".cvss) as max_cvss, COUNT(dependency_vulns.id) as dependency_vuln_count, components.id as package_name").
Joins("INNER JOIN dependency_vulns ON components.id = dependency_vulns.component_purl AND dependency_vulns.asset_id = ? AND dependency_vulns.asset_version_name = ?", assetID, assetVersionName).
Select("SUM(dependency_vulns.raw_risk_assessment) as total_risk, AVG(dependency_vulns.raw_risk_assessment) as avg_risk, MAX(dependency_vulns.raw_risk_assessment) as max_risk, MAX(\"CVE\".cvss) as max_cvss, COUNT(dependency_vulns.id) as dependency_vuln_count, dependency_vulns.component_purl as package_name").
Joins("RIGHT JOIN dependency_vulns ON components.id = dependency_vulns.component_purl AND dependency_vulns.asset_id = ? AND dependency_vulns.asset_version_name = ?", assetID, assetVersionName).
Joins("LEFT JOIN artifact_dependency_vulns ON artifact_dependency_vulns.dependency_vuln_id = dependency_vulns.id").
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

packageNameQuery uses RIGHT JOIN while starting from Table("components"). This is harder to reason about and can lead to less optimal plans; the query is fundamentally driven by dependency_vulns (all WHERE predicates are on that table). Consider rewriting it to start from dependency_vulns and LEFT JOIN components instead, which preserves the "include vulns even if component is missing" behavior without relying on a RIGHT JOIN.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Go ahead

Signed-off-by: rafi <refaei.shikho@hotmail.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread normalize/sbom_graph.go
Comment on lines +225 to +231
if comp.PackageURL != "" {
// Unescape URL-encoded characters (e.g., %2B -> +) to match the format stored in the database
unescapedPurl, err := url.PathUnescape(comp.PackageURL)
if err == nil {
comp.PackageURL = unescapedPurl
}
}
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unescaping comp.PackageURL while keeping comp.BOMRef unchanged can create multiple component nodes (distinct BOMRefs) that collapse to the same Component.PackageURL after unescaping (e.g., one added as ...%2B... and another as ...+...). Downstream code uses Component.PackageURL as an identifier/key (e.g., ToMinimalTree builds dependencies keyed by parentNode.Component.PackageURL), so collisions will overwrite edges and produce incorrect/minimally-lossy output. Consider canonicalizing the node ID (BOMRef) to the same normalized PURL (and updating edges/dep refs accordingly), or keep the raw PackageURL and store a separate normalized/canonical PURL field used only for DB matching/export.

Suggested change
if comp.PackageURL != "" {
// Unescape URL-encoded characters (e.g., %2B -> +) to match the format stored in the database
unescapedPurl, err := url.PathUnescape(comp.PackageURL)
if err == nil {
comp.PackageURL = unescapedPurl
}
}

Copilot uses AI. Check for mistakes.
Comment on lines +1261 to +1263
node := g.nodes[encodedPurl]
assert.NotNil(t, node)
assert.Equal(t, expectedPurl, node.Component.PackageURL)
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests reach into the unexported g.nodes map. This makes the tests brittle to internal refactors and also bypasses graph scoping/reachability rules (e.g., g.Node(id) would return nil unless the component is linked). Prefer asserting via public behavior (e.g., link the component into the graph with AddEdge and then assert via Components()/Node()), or iterate g.Components() and match on BOMRef as done later in this test file.

Copilot uses AI. Check for mistakes.
Comment thread normalize/sbom_graph.go
Comment on lines +225 to +229
if comp.PackageURL != "" {
// Unescape URL-encoded characters (e.g., %2B -> +) to match the format stored in the database
unescapedPurl, err := url.PathUnescape(comp.PackageURL)
if err == nil {
comp.PackageURL = unescapedPurl
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR title mentions "update SQL query in GetByAssetVersionPaged", but the changes here are focused on normalizing/unescaping component PURLs in SBOMGraph.AddComponent plus adding tests. Either the PR title/description should be updated to reflect the actual change, or the intended SQL/query changes are missing from this PR.

Copilot uses AI. Check for mistakes.
…s interface declaration

Signed-off-by: rafi <refaei.shikho@hotmail.com>
…move unused code

Signed-off-by: rafi <refaei.shikho@hotmail.com>
…r improved handling of package URLs

Signed-off-by: rafi <refaei.shikho@hotmail.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 5 comments.

Comments suppressed due to low confidence (1)

integrations/commonint/integration_helper.go:417

  • FindAllComponentOnlyPathsToPURL returns component-only paths (no ROOT/artifact/info-source nodes), but the code below/comment expects to render a Mermaid graph that includes those structural nodes. With the current call, the rendered graph will omit root/artifact/info-source nodes.

Consider using a full-path function for visualization (or add an option/method that returns full paths including structural nodes).

	paths := bom.FindAllComponentOnlyPathsToPURL(pURL, 0)
	// we want to show fake nodes in the mermaid graph (root, artifact, info sources)
	pathWithFakeNodes := make([][]string, 0, len(paths))
	for _, path := range paths {
		pathWithFakeNodes = append(pathWithFakeNodes, path.ToStringSlice())

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 33 to 35
dependencyVulnRouter := assetVersionGroup.Group.Group("/dependency-vulns")
dependencyVulnRouter.GET("/", dependencyVulnController.ListPaged)
dependencyVulnRouter.GET("/sync/", dependencyVulnController.ListByAssetIDWithoutHandledExternalEventsPaged)
dependencyVulnRouter.GET("/:dependencyVulnID/", dependencyVulnController.Read)
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change removes the GET /dependency-vulns/sync/ route entirely. If this endpoint is consumed by the UI, integrations, or external clients, this is a breaking API change.

If removal is intended, consider leaving a deprecated alias/redirect for at least one release cycle (or ensure all callers are migrated) and update any API/docs references accordingly.

Copilot uses AI. Check for mistakes.

if componentPurl.String() != "" {
pathPattern = dtos.PathPattern{componentPurl.String(), dtos.PathPatternWildcard, purl.ToString()}
pathPattern = dtos.PathPattern{componentPurl.String(), dtos.PathPatternWildcard, purlString}
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

purlString is normalized/unescaped, but the metadata/root componentPurl is still inserted into the PathPattern via componentPurl.String(). If the root PURL contains URL-escaped characters (e.g. %2B), this can prevent PathPattern.MatchesSuffix from matching DependencyVuln.VulnerabilityPath (which now appears to use unescaped PURLs).

Normalize componentPurl the same way (or otherwise ensure all path elements use one canonical string form).

Suggested change
pathPattern = dtos.PathPattern{componentPurl.String(), dtos.PathPatternWildcard, purlString}
componentPurlString, err := normalize.PURLToString(componentPurl)
if err != nil {
slog.Info("failed to unescape component purl for path pattern, continuing anyway", "componentPurl", componentPurl.String(), "error", err)
componentPurlString = componentPurl.String()
}
pathPattern = dtos.PathPattern{componentPurlString, dtos.PathPatternWildcard, purlString}

Copilot uses AI. Check for mistakes.

func TestFindAllPathsToPURL(t *testing.T) {
func TestFindAllComponentOnlyPathsToPURL(t *testing.T) {
t.Run("multiple information sources pointing to same component - should return multiple paths", func(t *testing.T) {
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The t.Run description says "should return multiple paths", but the test now asserts a single deduplicated path (Len(paths, 1)). Please rename the test case (and the assertion message at line 628) to match the intended behavior so failures are not misleading.

Copilot uses AI. Check for mistakes.
assert.Equal(t, paths[0].ToStringSliceComponentOnly(), paths[1].ToStringSliceComponentOnly())
assert.Equal(t, []string{"pkg:npm/lodash@4.17.21"}, paths[0].ToStringSliceComponentOnly())
// Should return a single path (reachable through different info sources)
assert.Len(t, paths, 1, "Should return separate paths for each info source")
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assertion message is inconsistent with the expectation: it asserts Len(paths, 1) but the message says "separate paths for each info source". Please update the message to reflect that deduplication is expected.

Copilot uses AI. Check for mistakes.
Comment on lines 659 to +660
// Should return two paths (one through each artifact)
assert.Len(t, paths, 2, "Should return separate paths through different artifacts")

// Both paths should have the same component-only representation
assert.Equal(t, paths[0].ToStringSliceComponentOnly(), paths[1].ToStringSliceComponentOnly())
assert.Equal(t, []string{"pkg:npm/express@4.18.0"}, paths[0].ToStringSliceComponentOnly())
assert.Len(t, paths, 1, "Should return separate paths through different artifacts")
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment and assertion message still refer to multiple/separate paths, but the test expects a single deduplicated path (Len(paths, 1)). Please update the comment/message (and optionally the t.Run name) to match the new behavior.

Copilot uses AI. Check for mistakes.
@timbastin timbastin merged commit f01c8b7 into main Feb 5, 2026
18 of 19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants