feat: port search, deletion, lineage, and observability features#244
Conversation
…and ES improvements - Add pagination support (offset) and included fields to SearchAssets API - Enable empty text search with MatchAllQuery for listing/filtering without keywords - Add disable fuzziness flag and highlight support via SearchFlags - Improve search relevance with phrase match, AND operator, and exact match boosting - Add English stemmer to ES analyzer for better token matching - Add ES password and username config support - Add ignore_malformed to ES index settings - Fix delete queries to use universe alias instead of _all with ignore_unavailable - Handle duplicate create index race condition gracefully - Simplify search query building using olivere/elastic SearchSource
- Add refreshed_at and is_deleted columns to assets table (migration 000017) - Add SoftDeleteByID and SoftDeleteByURN to asset repository - Add HardDeleteByURNs for permanent cleanup of soft-deleted assets - Add GetCountByIsDeleted for counting deleted/active assets - Add SoftDeleteByURN to ES discovery repository using UpdateByQuery - Add DeleteByURNs batch lineage deletion method - Reset is_deleted to false on asset upsert (re-sync restores deleted assets) - Set refreshed_at on insert and update operations - Update mocks for new interface methods
… DB write - Deduplicate owners by resolved user ID in createOrFetchUsers to prevent duplicate owner entries when different identifiers resolve to same user - Fix version history sort order by casting version to int[] instead of string comparison (0.10 now correctly sorts after 0.9) - Set created_at and updated_at on asset struct before DB write so downstream consumers (version history, ES) get consistent timestamps - Preserve created_at from old asset during updates
- Order asset probes by timestamp DESC instead of created_at ASC for correct chronological ordering - Add indexes on asset_probes for asset_urn and timestamp columns to improve query performance
…zation When WithAttributes is false, skip fetching node attributes (probes) from the database, significantly reducing the cost of lineage queries that only need edge data. Default to true for backward compatibility.
Remove the StatsD metrics client and all references. StatsD was used for basic counter/histogram metrics in user service and gRPC interceptor. This clears the path for adopting OpenTelemetry as the observability standard.
…s_deleted, and group assets - Update PROTON_COMMIT to include new proto fields - Wire SearchFlags (disable_fuzzy, enable_highlight, is_column_search) from proto to handler - Wire with_attributes (from proto optional bool) and include_deleted in lineage handler - Add update_only support to UpsertAsset and UpsertPatchAsset handlers - Add is_deleted filter to GetAllAssets and is_deleted field to Asset proto - Add IsDeleted to asset.Filter with builder method and postgres filter query - Add GroupAssets RPC definition (handler implementation pending) - Remove remaining StatsD references from handler layer
- Add GroupConfig, GroupResult, GroupField types to core/asset/discovery.go - Add GroupAssets to DiscoveryRepository interface - Implement GroupAssets in ES using composite aggregation with top_hits - Add GroupAssets gRPC handler with filter and include_fields support - Add GroupAssets to AssetService interface and service delegate - Update mocks for new interface methods
…porters - Create pkg/telemetry package with OTLP gRPC trace and metric exporters - Add configurable trace sampling and periodic metric read interval - Add otelgrpc.UnaryServerInterceptor to gRPC middleware chain - Add Telemetry config section to CLI config - Initialize OTel on server start with graceful shutdown cleanup
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 15 minutes and 12 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (5)
📝 WalkthroughWalkthroughThis PR replaces StatsD with OpenTelemetry and adds telemetry config and init/cleanup plumbing. It implements asset soft-delete across DB, search index, repositories, service, and APIs, adds GroupAssets (asset grouping) with Elasticsearch composite aggregations, enhances search with pagination/field selection/flags, extends lineage queries with attribute/include-deleted flags, adds asset types (query, metric, experiment), and includes related DB migrations and generated proto/validation updates. Sequence Diagram(s)sequenceDiagram
participant Client
participant APIServer
participant AssetService
participant DiscoveryRepo
participant Elasticsearch
Client->>APIServer: GroupAssetsRequest(groupby, filters, include_fields, size)
APIServer->>AssetService: GroupAssets(ctx, GroupConfig)
AssetService->>DiscoveryRepo: GroupAssets(ctx, cfg)
DiscoveryRepo->>Elasticsearch: composite aggregation (group_by, top_hits, include_fields, size)
Elasticsearch-->>DiscoveryRepo: aggregation buckets
DiscoveryRepo-->>AssetService: []GroupResult
AssetService-->>APIServer: []GroupResult
APIServer-->>Client: GroupAssetsResponse(asset_groups)
sequenceDiagram
participant Server
participant TelemetryPkg
participant OTLPCollector
Server->>TelemetryPkg: Init(ctx, cfg.Telemetry, logger)
alt OpenTelemetry enabled
TelemetryPkg->>OTLPCollector: create OTLP trace & metric exporters
TelemetryPkg-->>Server: return cleanup func, nil
else disabled
TelemetryPkg-->>Server: return no-op cleanup, nil
end
Server->>Server: defer cleanup() on shutdown
Server->>TelemetryPkg: use otelgrpc.UnaryServerInterceptor() in gRPC interceptors
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 12
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/server/v1beta1/asset.go (1)
295-306:⚠️ Potential issue | 🟠 MajorPotential nil dereference after NotFoundError when
update_onlyis false.When
GetAssetByIDreturns aNotFoundErrorandupdate_onlyis false, the code falls through to line 305-306 whereast.Patch(patchAssetMap)is called. However,astwould be the zero-valueasset.Asset{}since the error path doesn't return early. This will proceed with patching an empty asset, which may not be the intended behavior.Also,
errors.As(err, &asset.NotFoundError{})should use a pointer to a variable, not a pointer to a literal.🐛 Proposed fix
ast, err := server.assetService.GetAssetByID(ctx, urn) - if err != nil { - if errors.As(err, &asset.NotFoundError{}) { - if req.GetUpdateOnly() { - return &compassv1beta1.UpsertPatchAssetResponse{Id: ""}, nil - } - } else { - return nil, internalServerError(server.logger, err.Error()) - } + if err != nil { + if errors.As(err, new(asset.NotFoundError)) { + if req.GetUpdateOnly() { + return &compassv1beta1.UpsertPatchAssetResponse{Id: ""}, nil + } + // For patch, we cannot create new assets - the asset must exist + return nil, status.Error(codes.NotFound, err.Error()) + } + return nil, internalServerError(server.logger, err.Error()) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/server/v1beta1/asset.go` around lines 295 - 306, The code can dereference or operate on a zero-value ast when GetAssetByID returns a NotFoundError and req.GetUpdateOnly() is false; also errors.As is misused by passing a pointer to a literal. Fix by changing the errors.As call to use a typed variable (e.g., var nf asset.NotFoundError; if errors.As(err, &nf) { ... }) and explicitly handle the NotFoundError branch: if update_only is true return empty response, otherwise return a clear not-found error (via internalServerError or a dedicated NotFound gRPC error) instead of falling through to compute patchAssetMap and calling ast.Patch; ensure decodePatchAssetToMap(baseAsset) and ast.Patch are only invoked when ast (from GetAssetByID) was successfully retrieved.
🧹 Nitpick comments (9)
pkg/telemetry/telemetry.go (1)
36-42: Consider making TLS configurable for production environments.The exporter uses
WithInsecure()which is fine for internal collectors but may not be appropriate for all deployment scenarios. Consider adding a TLS configuration option toConfig.OpenTelemetryfor production use.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/telemetry/telemetry.go` around lines 36 - 42, The code currently always calls otlptracegrpc.New(...) with otlptracegrpc.WithInsecure(), which disables TLS; update the OpenTelemetry config (cfg.OpenTelemetry) to include TLS settings (e.g., enableTLS, CACertPath, serverName or use system certs) and change the otlptracegrpc.New call to choose credentials based on that config: when enableTLS is true, create appropriate gRPC transport credentials (e.g., via credentials.NewClientTLSFromCert or grpc.WithTransportCredentials) and pass them to the exporter instead of WithInsecure(); when false, keep the existing otlptracegrpc.WithInsecure() behavior. Ensure references to otlptracegrpc.New and otlptracegrpc.WithInsecure() are updated to use the conditional credential path tied to cfg.OpenTelemetry.Makefile (1)
6-6: Use the full commit SHA for reproducibility.Replace the short hash
409f146with the full SHA409f146db954c2654b8d24488011685913c458e9to ensure reproducible builds and avoid potential issues as the repository grows.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@Makefile` at line 6, The Makefile currently sets PROTON_COMMIT to a short hash ("409f146"); replace that value with the full commit SHA "409f146db954c2654b8d24488011685913c458e9" so PROTON_COMMIT uses the complete commit identifier for reproducible builds—update the PROTON_COMMIT assignment in the Makefile accordingly.internal/store/postgres/migrations/000017_add_soft_deletion.up.sql (1)
2-3: Enforce two-state soft-delete flags withNOT NULL.Line 2 and Line 3 add defaults, but nullable booleans still allow tri-state values. Make these columns
NOT NULLto prevent ambiguousNULLdeletion state.Suggested migration adjustment
-ALTER TABLE assets ADD COLUMN IF NOT EXISTS is_deleted BOOLEAN DEFAULT false; -ALTER TABLE assets_versions ADD COLUMN IF NOT EXISTS is_deleted BOOLEAN DEFAULT false; +ALTER TABLE assets ADD COLUMN IF NOT EXISTS is_deleted BOOLEAN NOT NULL DEFAULT false; +ALTER TABLE assets_versions ADD COLUMN IF NOT EXISTS is_deleted BOOLEAN NOT NULL DEFAULT false;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/store/postgres/migrations/000017_add_soft_deletion.up.sql` around lines 2 - 3, The new soft-delete flags on tables assets.is_deleted and assets_versions.is_deleted must be enforced as two-state booleans; update the migration so the added columns are declared NOT NULL (e.g., ADD COLUMN ... BOOLEAN NOT NULL DEFAULT false) or perform an explicit ALTER TABLE ... ALTER COLUMN ... SET NOT NULL after populating defaults, ensuring no NULLs remain before adding the constraint; target the statements that add is_deleted on assets and assets_versions.core/asset/service_test.go (1)
811-812: Add explicit coverage forWithAttributes: falsepath.Line 811 validates only the enriched-path invocation. Add one test case with
asset.LineageQuery{WithAttributes: false}and assert probe fetch is skipped andNodeAttrsstays empty to protect the performance optimization.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/asset/service_test.go` around lines 811 - 812, Add a test that exercises the non-enriched path by calling svc.GetLineage(ctx, "urn-source-1", asset.LineageQuery{WithAttributes: false}) and asserting no probe attribute fetch occurs and that returned nodes have empty NodeAttrs; update the existing table-driven test or add a new case next to the existing enriched case, verify the probe/mock used to fetch attributes was not invoked (assert call count or expectation), and assert the returned lineage nodes’ NodeAttrs remain empty to protect the performance optimization.core/asset/mocks/asset_repository.go (1)
756-829: Regenerate this mockery file.These new methods don't have the matching
EXPECT()helpers/call-wrapper types that the rest of this--with-expectermock exposes, so the generated API is still incomplete for the newly added repository surface.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/asset/mocks/asset_repository.go` around lines 756 - 829, The mock file is out-of-date: new methods (AssetRepository.SoftDeleteByID, SoftDeleteByURN, GetCountByIsDeleted, HardDeleteByURNs) were added but the generated expecter helpers are missing; regenerate the mock with mockery using the same flags you used originally (include --with-expecter) so the EXPECT() call-wrappers and typed return helpers for these functions are produced; replace core/asset/mocks/asset_repository.go with the newly generated file and verify the new methods have matching EXPECT() helper types and call wrappers.core/asset/mocks/discovery_repository.go (1)
270-303: Regenerate this mockery file.The new methods were added without the matching
EXPECT()helpers/call-wrapper types that the rest of this--with-expectermock exposes, so the generated API is inconsistent and a future mockery run will rewrite these manual additions.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/asset/mocks/discovery_repository.go` around lines 270 - 303, The mock file is out of sync: new methods GroupAssets and SoftDeleteByURN were added but the corresponding EXPECT() helpers/call-wrapper types are missing; regenerate the mock for DiscoveryRepository with mockery (including the --with-expecter flag used by the project) so the generated file adds the EXPECT() helpers and proper call-wrapper types for GroupAssets and SoftDeleteByURN, or re-run the project's mock generation command to replace this manual file; ensure the regenerated mock preserves the DiscoveryRepository type and includes the Expecter/EXPECT helpers for those two methods.proto/compass.swagger.yaml (1)
900-906: Consider addingmaxItemsto thegroupbyarray parameter.The static analysis tool flagged that arrays should have a maximum number of items to prevent abuse. Adding a reasonable
maxItemsconstraint would improve API robustness.Proposed addition
- name: groupby in: query required: false type: array items: type: string collectionFormat: multi + maxItems: 10🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@proto/compass.swagger.yaml` around lines 900 - 906, The groupby query parameter currently allows an unbounded array; add a sensible maxItems constraint to its definition to prevent abuse. Update the OpenAPI schema for the parameter named "groupby" (type: array, items.type: string, collectionFormat: multi) by adding a "maxItems" property (e.g., 10 or another agreed limit) so the parameter becomes: name: groupby, in: query, required: false, type: array, items: { type: string }, collectionFormat: multi, maxItems: <n>.internal/store/elasticsearch/discovery_search_repository.go (2)
381-390:sizeparameter controls both group count and assets-per-group.The same
sizevalue is used for both the composite aggregation (number of groups) andtop_hits(assets per group). This may be confusing for API consumers who might expect separate controls. Consider whether this is the intended UX or if separate parameters should be introduced.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/store/elasticsearch/discovery_search_repository.go` around lines 381 - 390, The composite aggregation uses the same variable size for both the number of groups and the number of assets per group (see "top_assets" -> "top_hits" and the composite aggregation "size"), which conflates group count and assets-per-group; change the API and code to accept two distinct parameters (e.g., groupSize and assetsPerGroup), update the call sites that currently pass size to supply both new params, and replace usages of size in the composite aggregation and in the "top_hits" section (and any references to includedFields remain unchanged) so groupSize controls the composite bucket count and assetsPerGroup controls the top_hits size.
249-277: Consider the exact match boost placement.The exact match boost on
name.keywordis added as ashouldclause withBoost(100). However, since this is added to the bool query that already hasMinimumShouldMatch("1")set bybuildTextQuery, this boost will only contribute to scoring when text is provided. This is likely the intended behavior, but worth noting for clarity.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/store/elasticsearch/discovery_search_repository.go` around lines 249 - 277, The exact-match boost on name.keyword is currently added to the incoming bool (via buildFunctionScoreQuery) and will only affect scoring when buildTextQuery's MinimumShouldMatch allows it; to make the boost behavior explicit and not gated by the original bool's should/minimum settings, wrap the existing query into a new bool that includes the boosted term as a separate should and then use that wrapped bool as the Query passed into elastic.NewFunctionScoreQuery in buildFunctionScoreQuery; reference buildFunctionScoreQuery and buildTextQuery when locating where to stop mutating the incoming bool and instead construct a new BoolQuery that Always includes elastic.NewTermQuery("name.keyword", text).Boost(100) as a should before creating the FunctionScoreQuery.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@core/asset/lineage.go`:
- Line 37: The DeleteByURNs implementation currently builds SQL with fmt.Sprintf
and is vulnerable to injection; replace the string-interpolation predicate in
the DeleteByURNs method with parameterized predicates using squirrel's Eq/Or
(e.g. build a slice of predicates like sq.Or{sq.Eq{"source": urn},
sq.Eq{"target": urn}} for each urn or accumulate with sq.Or across urns) and
pass that predicate into the repository delete/query builder instead of
concatenating strings; locate the DeleteByURNs implementation in
lineage_repository.go and remove use of fmt.Sprintf(...) so all urn values are
bound via squirrel parameters (Eq/Or) and executed safely.
In `@core/asset/service.go`:
- Around line 161-165: The current check treats query.WithAttributes as false
when the pointer is nil, dropping NodeAttrs for callers that omitted the field;
update the conditional so a nil WithAttributes defaults to true. Specifically,
in the function returning Lineage (where you currently have "if
!query.WithAttributes { ... }"), change it to check the pointer: "if
query.WithAttributes != nil && !*query.WithAttributes { return Lineage{Edges:
edges}, nil }" so that nil means include attributes; this preserves the previous
response shape for callers that don't set WithAttributes (see WithAttributes,
Lineage, and NodeAttrs and the request construction in
internal/server/v1beta1/lineage.go).
- Around line 101-113: The current flow calls s.assetRepository.SoftDeleteByID /
SoftDeleteByURN and then calls s.discoveryRepository.SoftDeleteByURN, which can
leave Postgres changed but ES unchanged if the discovery call fails; that causes
future retries by ID to hit ErrAssetAlreadyDeleted and prevents repairing
search. Change the handler so DB soft-delete is authoritative and the discovery
update is done asynchronously or via an outbox/queue: after successful
s.assetRepository.SoftDeleteByID / SoftDeleteByURN, enqueue a compensating
discovery delete job (e.g., via an outbox service or
s.outbox.EnqueueAssetDelete(ns, urn)) instead of returning an error when
s.discoveryRepository.SoftDeleteByURN fails; if you don’t have outbox
infrastructure, spawn a background retry worker/goroutine with exponential
backoff to call s.discoveryRepository.SoftDeleteByURN and log or persist
failures for retry, and ensure the API returns success for the DB soft-delete
rather than propagating ErrAssetAlreadyDeleted.
In `@go.mod`:
- Around line 39-47: Update the vulnerable dependency versions in go.mod: bump
google.golang.org/grpc to at least v1.79.3,
go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc to
at least v0.46.0, and google.golang.org/protobuf to at least v1.33.0 (also
ensure go.opentelemetry.io/otel and related otlp/otel packages remain mutually
compatible); run go mod tidy and go test to verify module resolution and
compatibility after changing the versions for grpc, otelgrpc, and protobuf.
In `@internal/store/elasticsearch/discovery_repository.go`:
- Around line 120-125: The UpdateByQuery call in discovery_repository.go is
using defaultSearchIndexAlias and lacks a namespace filter, causing soft-deletes
to affect the same URN across tenants; modify the call to scope by tenant by
replacing defaultSearchIndexAlias with BuildAliasNameFromNamespace(ns) (the same
alias used by writes) or ensure the query body includes a term filter on
namespace_id (e.g., add a must term namespace_id: ns) before calling
repo.cli.client.UpdateByQuery; update references to defaultSearchIndexAlias and
the request body construction to guarantee the operation only targets the
intended namespace.
- Around line 138-145: deleteWithQuery currently deletes from the global alias
(defaultSearchIndexAlias) and has no namespace parameter, allowing cross-tenant
hard deletes; change deleteWithQuery to accept a namespace parameter (e.g., ns
string) and use either the namespaced index/alias or add a namespace filter to
the DeleteByQuery body so the query only targets documents where namespace ==
ns, then update callers DeleteByID and DeleteByURN to pass the ns through to
deleteWithQuery; reference functions: deleteWithQuery, DeleteByID, DeleteByURN
and the global defaultSearchIndexAlias when making these changes.
In `@internal/store/elasticsearch/es.go`:
- Around line 140-153: The RequestTimeout configuration is read but not applied
to the Elasticsearch client because elasticsearch.Config.Transport is set to
nrelasticsearch.NewRoundTripper(nil) without timeout settings; update the
transport used by esCfg (the elasticsearch.Config used to create esClient) to
include an http.Transport with timeouts derived from RequestTimeout (e.g., set
DialContext/IdleConnTimeout/ResponseHeaderTimeout/Timeout as appropriate) or
replace NewRoundTripper(nil) with a configured round tripper that wraps an
http.Transport honoring RequestTimeout, and ensure RequestTimeout is passed into
that transport before calling elasticsearch.NewClient.
In `@internal/store/postgres/asset_model.go`:
- Around line 50-55: toVersionedAsset() is missing propagation of the new
IsDeleted and RefreshedAt fields, causing GetByVersion* to disagree with
GetByID; update the toVersionedAsset() function to set the returned versioned
asset's IsDeleted = a.IsDeleted and RefreshedAt = a.RefreshedAt (and ensure any
related fields like UpdatedBy are already mapped the same way as in toAsset())
so versioned assets carry the same deletion/refresh state as the non-versioned
mapping.
In `@internal/store/postgres/lineage_repository.go`:
- Around line 82-93: The code builds a SQL WHERE by interpolating urns
(orClauses/whereClause) which is SQL-injectable and fails on quotes; instead
build a parameterized predicate using squirrel (e.g. create a predicate like
sq.Or{sq.Eq{"source": urns}, sq.Eq{"target": urns}} or equivalent per-urn sq.Eq
entries), use sq.Delete("lineage_graph").Where(predicate).ToSql() to get (query,
args, err) and then call repo.client.ExecContext(ctx, query, args...) so the
URNs are passed as arguments rather than interpolated; also handle the
empty-urns case before building the query.
In `@pkg/telemetry/telemetry.go`:
- Around line 61-67: If otlpmetricgrpc.New fails, the already-created
traceExporter and tracerProvider must be shut down to avoid leaking resources;
update the error path after calling otlpmetricgrpc.New to call
traceExporter.Shutdown(ctx) and tracerProvider.Shutdown(ctx)
(handling/aggregating any returned errors) before returning the wrapped
fmt.Errorf. Locate the variables traceExporter, tracerProvider and the
otlpmetricgrpc.New call in the telemetry initialization function and perform the
shutdown/cleanup there on metric exporter creation failure.
In `@proto/raystack/compass/v1beta1/service.pb.validate.go`:
- Around line 2794-2810: The SuggestAssetsRequest.validate currently allows
empty queries because the Text field has no non-empty rule; update the service
proto by adding a non-empty validation rule for the text field (e.g., make text
required or add a "min_len" / "not_blank" validator) in the SuggestAssetsRequest
message in service.proto, then re-run the protobuf code generation to regenerate
service.pb.validate.go so that SuggestAssetsRequest.validate (and any generated
helper) will return an error when Text is empty instead of allowing empty
suggest requests to pass through.
- Around line 3002-3035: The GroupAssetsRequest.validate method currently only
checks Groupby length and skips IncludeFields, allowing empty selectors (e.g.,
groupby: [""]) to pass; add item-level non-empty string validation for both
Groupby and IncludeFields by updating the service.proto validation rules for
GroupAssetsRequest (add required/non-empty constraints on repeated string items
for groupby and include_fields), then regenerate the protobuf Go files so the
validate method enforces that each element of Groupby and IncludeFields is
non-empty and returns a GroupAssetsRequestValidationError with field names
"Groupby" or "IncludeFields" when violated.
---
Outside diff comments:
In `@internal/server/v1beta1/asset.go`:
- Around line 295-306: The code can dereference or operate on a zero-value ast
when GetAssetByID returns a NotFoundError and req.GetUpdateOnly() is false; also
errors.As is misused by passing a pointer to a literal. Fix by changing the
errors.As call to use a typed variable (e.g., var nf asset.NotFoundError; if
errors.As(err, &nf) { ... }) and explicitly handle the NotFoundError branch: if
update_only is true return empty response, otherwise return a clear not-found
error (via internalServerError or a dedicated NotFound gRPC error) instead of
falling through to compute patchAssetMap and calling ast.Patch; ensure
decodePatchAssetToMap(baseAsset) and ast.Patch are only invoked when ast (from
GetAssetByID) was successfully retrieved.
---
Nitpick comments:
In `@core/asset/mocks/asset_repository.go`:
- Around line 756-829: The mock file is out-of-date: new methods
(AssetRepository.SoftDeleteByID, SoftDeleteByURN, GetCountByIsDeleted,
HardDeleteByURNs) were added but the generated expecter helpers are missing;
regenerate the mock with mockery using the same flags you used originally
(include --with-expecter) so the EXPECT() call-wrappers and typed return helpers
for these functions are produced; replace core/asset/mocks/asset_repository.go
with the newly generated file and verify the new methods have matching EXPECT()
helper types and call wrappers.
In `@core/asset/mocks/discovery_repository.go`:
- Around line 270-303: The mock file is out of sync: new methods GroupAssets and
SoftDeleteByURN were added but the corresponding EXPECT() helpers/call-wrapper
types are missing; regenerate the mock for DiscoveryRepository with mockery
(including the --with-expecter flag used by the project) so the generated file
adds the EXPECT() helpers and proper call-wrapper types for GroupAssets and
SoftDeleteByURN, or re-run the project's mock generation command to replace this
manual file; ensure the regenerated mock preserves the DiscoveryRepository type
and includes the Expecter/EXPECT helpers for those two methods.
In `@core/asset/service_test.go`:
- Around line 811-812: Add a test that exercises the non-enriched path by
calling svc.GetLineage(ctx, "urn-source-1", asset.LineageQuery{WithAttributes:
false}) and asserting no probe attribute fetch occurs and that returned nodes
have empty NodeAttrs; update the existing table-driven test or add a new case
next to the existing enriched case, verify the probe/mock used to fetch
attributes was not invoked (assert call count or expectation), and assert the
returned lineage nodes’ NodeAttrs remain empty to protect the performance
optimization.
In `@internal/store/elasticsearch/discovery_search_repository.go`:
- Around line 381-390: The composite aggregation uses the same variable size for
both the number of groups and the number of assets per group (see "top_assets"
-> "top_hits" and the composite aggregation "size"), which conflates group count
and assets-per-group; change the API and code to accept two distinct parameters
(e.g., groupSize and assetsPerGroup), update the call sites that currently pass
size to supply both new params, and replace usages of size in the composite
aggregation and in the "top_hits" section (and any references to includedFields
remain unchanged) so groupSize controls the composite bucket count and
assetsPerGroup controls the top_hits size.
- Around line 249-277: The exact-match boost on name.keyword is currently added
to the incoming bool (via buildFunctionScoreQuery) and will only affect scoring
when buildTextQuery's MinimumShouldMatch allows it; to make the boost behavior
explicit and not gated by the original bool's should/minimum settings, wrap the
existing query into a new bool that includes the boosted term as a separate
should and then use that wrapped bool as the Query passed into
elastic.NewFunctionScoreQuery in buildFunctionScoreQuery; reference
buildFunctionScoreQuery and buildTextQuery when locating where to stop mutating
the incoming bool and instead construct a new BoolQuery that Always includes
elastic.NewTermQuery("name.keyword", text).Boost(100) as a should before
creating the FunctionScoreQuery.
In `@internal/store/postgres/migrations/000017_add_soft_deletion.up.sql`:
- Around line 2-3: The new soft-delete flags on tables assets.is_deleted and
assets_versions.is_deleted must be enforced as two-state booleans; update the
migration so the added columns are declared NOT NULL (e.g., ADD COLUMN ...
BOOLEAN NOT NULL DEFAULT false) or perform an explicit ALTER TABLE ... ALTER
COLUMN ... SET NOT NULL after populating defaults, ensuring no NULLs remain
before adding the constraint; target the statements that add is_deleted on
assets and assets_versions.
In `@Makefile`:
- Line 6: The Makefile currently sets PROTON_COMMIT to a short hash ("409f146");
replace that value with the full commit SHA
"409f146db954c2654b8d24488011685913c458e9" so PROTON_COMMIT uses the complete
commit identifier for reproducible builds—update the PROTON_COMMIT assignment in
the Makefile accordingly.
In `@pkg/telemetry/telemetry.go`:
- Around line 36-42: The code currently always calls otlptracegrpc.New(...) with
otlptracegrpc.WithInsecure(), which disables TLS; update the OpenTelemetry
config (cfg.OpenTelemetry) to include TLS settings (e.g., enableTLS, CACertPath,
serverName or use system certs) and change the otlptracegrpc.New call to choose
credentials based on that config: when enableTLS is true, create appropriate
gRPC transport credentials (e.g., via credentials.NewClientTLSFromCert or
grpc.WithTransportCredentials) and pass them to the exporter instead of
WithInsecure(); when false, keep the existing otlptracegrpc.WithInsecure()
behavior. Ensure references to otlptracegrpc.New and
otlptracegrpc.WithInsecure() are updated to use the conditional credential path
tied to cfg.OpenTelemetry.
In `@proto/compass.swagger.yaml`:
- Around line 900-906: The groupby query parameter currently allows an unbounded
array; add a sensible maxItems constraint to its definition to prevent abuse.
Update the OpenAPI schema for the parameter named "groupby" (type: array,
items.type: string, collectionFormat: multi) by adding a "maxItems" property
(e.g., 10 or another agreed limit) so the parameter becomes: name: groupby, in:
query, required: false, type: array, items: { type: string }, collectionFormat:
multi, maxItems: <n>.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 2e68fb52-a816-4336-86a0-b2963961c46c
⛔ Files ignored due to path filters (4)
go.sumis excluded by!**/*.sumproto/raystack/compass/v1beta1/service.pb.gois excluded by!**/*.pb.goproto/raystack/compass/v1beta1/service.pb.gw.gois excluded by!**/*.pb.gw.goproto/raystack/compass/v1beta1/service_grpc.pb.gois excluded by!**/*.pb.go
📒 Files selected for processing (48)
Makefilebuf.gen.yamlcli/config.gocli/server.gocore/asset/asset.gocore/asset/discovery.gocore/asset/discovery_test.gocore/asset/filter.gocore/asset/lineage.gocore/asset/mocks/asset_repository.gocore/asset/mocks/discovery_repository.gocore/asset/mocks/lineage_repository.gocore/asset/service.gocore/asset/service_test.gocore/asset/type.gocore/user/service.gocore/user/service_test.gogo.modinternal/server/server.gointernal/server/v1beta1/asset.gointernal/server/v1beta1/lineage.gointernal/server/v1beta1/lineage_test.gointernal/server/v1beta1/mocks/asset_service.gointernal/server/v1beta1/mocks/statsd_monitor.gointernal/server/v1beta1/option.gointernal/server/v1beta1/search.gointernal/server/v1beta1/search_test.gointernal/server/v1beta1/server.gointernal/server/v1beta1/type_test.gointernal/store/elasticsearch/discovery_repository.gointernal/store/elasticsearch/discovery_search_repository.gointernal/store/elasticsearch/discovery_search_repository_test.gointernal/store/elasticsearch/es.gointernal/store/elasticsearch/schema.gointernal/store/postgres/asset_model.gointernal/store/postgres/asset_repository.gointernal/store/postgres/lineage_repository.gointernal/store/postgres/migrations/000017_add_soft_deletion.down.sqlinternal/store/postgres/migrations/000017_add_soft_deletion.up.sqlinternal/store/postgres/migrations/000018_index_asset_probes.down.sqlinternal/store/postgres/migrations/000018_index_asset_probes.up.sqlpkg/grpc_interceptor/mocks/statsd_monitor.gopkg/grpc_interceptor/statsd.gopkg/grpc_interceptor/statsd_test.gopkg/telemetry/config.gopkg/telemetry/telemetry.goproto/compass.swagger.yamlproto/raystack/compass/v1beta1/service.pb.validate.go
💤 Files with no reviewable changes (9)
- buf.gen.yaml
- internal/server/v1beta1/server.go
- internal/store/elasticsearch/discovery_search_repository_test.go
- core/user/service_test.go
- pkg/grpc_interceptor/statsd.go
- pkg/grpc_interceptor/statsd_test.go
- internal/server/v1beta1/option.go
- internal/server/v1beta1/mocks/statsd_monitor.go
- pkg/grpc_interceptor/mocks/statsd_monitor.go
| GetGraph(ctx context.Context, urn string, query LineageQuery) (LineageGraph, error) | ||
| Upsert(ctx context.Context, ns *namespace.Namespace, urn string, upstreams, downstreams []string) error | ||
| DeleteByURN(ctx context.Context, urn string) error | ||
| DeleteByURNs(ctx context.Context, urns []string) error |
There was a problem hiding this comment.
Critical: DeleteByURNs implementation path currently enables SQL injection.
This new interface method is implemented in internal/store/postgres/lineage_repository.go using string interpolation (fmt.Sprintf("source='%s' or target='%s'", urn, urn) around Line 82), which allows crafted URNs to break query boundaries. Switch to parameterized predicates via squirrel Eq/Or instead of raw string assembly.
Suggested safe query pattern (in internal/store/postgres/lineage_repository.go)
- orClauses := make([]string, 0, len(urns))
- for _, urn := range urns {
- orClauses = append(orClauses, fmt.Sprintf("source='%s' or target='%s'", urn, urn))
- }
- whereClause := strings.Join(orClauses, " or ")
- deleteQuery, _, err := sq.Delete("lineage_graph").Where(whereClause).ToSql()
+ ors := sq.Or{}
+ for _, urn := range urns {
+ ors = append(ors, sq.Eq{"source": urn})
+ ors = append(ors, sq.Eq{"target": urn})
+ }
+ deleteQuery, args, err := sq.Delete("lineage_graph").Where(ors).ToSql()
if err != nil {
return fmt.Errorf("error building delete query: %w", err)
}
- _, err = repo.client.ExecContext(ctx, deleteQuery)
+ _, err = repo.client.ExecContext(ctx, deleteQuery, args...)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@core/asset/lineage.go` at line 37, The DeleteByURNs implementation currently
builds SQL with fmt.Sprintf and is vulnerable to injection; replace the
string-interpolation predicate in the DeleteByURNs method with parameterized
predicates using squirrel's Eq/Or (e.g. build a slice of predicates like
sq.Or{sq.Eq{"source": urn}, sq.Eq{"target": urn}} for each urn or accumulate
with sq.Or across urns) and pass that predicate into the repository delete/query
builder instead of concatenating strings; locate the DeleteByURNs implementation
in lineage_repository.go and remove use of fmt.Sprintf(...) so all urn values
are bound via squirrel parameters (Eq/Or) and executed safely.
| if isValidUUID(id) { | ||
| urn, err = s.assetRepository.SoftDeleteByID(ctx, id) | ||
| } else { | ||
| urn = id | ||
| _, err = s.assetRepository.SoftDeleteByURN(ctx, id) | ||
| } | ||
| if err != nil { | ||
| return err | ||
| } | ||
|
|
||
| if err := s.discoveryRepository.SoftDeleteByURN(ctx, ns, urn); err != nil { | ||
| return fmt.Errorf("error soft deleting asset in discovery: %w", err) | ||
| } |
There was a problem hiding this comment.
This can return an error after the source-of-truth delete already committed.
If Postgres succeeds and the discovery update fails, the asset is soft-deleted in DB but still searchable in ES. A retry by ID will then hit ErrAssetAlreadyDeleted instead of repairing search. This needs compensation or an async retry/outbox path.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@core/asset/service.go` around lines 101 - 113, The current flow calls
s.assetRepository.SoftDeleteByID / SoftDeleteByURN and then calls
s.discoveryRepository.SoftDeleteByURN, which can leave Postgres changed but ES
unchanged if the discovery call fails; that causes future retries by ID to hit
ErrAssetAlreadyDeleted and prevents repairing search. Change the handler so DB
soft-delete is authoritative and the discovery update is done asynchronously or
via an outbox/queue: after successful s.assetRepository.SoftDeleteByID /
SoftDeleteByURN, enqueue a compensating discovery delete job (e.g., via an
outbox service or s.outbox.EnqueueAssetDelete(ns, urn)) instead of returning an
error when s.discoveryRepository.SoftDeleteByURN fails; if you don’t have outbox
infrastructure, spawn a background retry worker/goroutine with exponential
backoff to call s.discoveryRepository.SoftDeleteByURN and log or persist
failures for retry, and ensure the API returns success for the DB soft-delete
rather than propagating ErrAssetAlreadyDeleted.
| if !query.WithAttributes { | ||
| return Lineage{ | ||
| Edges: edges, | ||
| }, nil | ||
| } |
There was a problem hiding this comment.
This makes with_attributes opt-in for existing callers.
internal/server/v1beta1/lineage.go only sets WithAttributes when the request field is non-nil, so omitted requests now land here with false and lose NodeAttrs entirely. Default the nil case to true if you want to preserve the prior response shape.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@core/asset/service.go` around lines 161 - 165, The current check treats
query.WithAttributes as false when the pointer is nil, dropping NodeAttrs for
callers that omitted the field; update the conditional so a nil WithAttributes
defaults to true. Specifically, in the function returning Lineage (where you
currently have "if !query.WithAttributes { ... }"), change it to check the
pointer: "if query.WithAttributes != nil && !*query.WithAttributes { return
Lineage{Edges: edges}, nil }" so that nil means include attributes; this
preserves the previous response shape for callers that don't set WithAttributes
(see WithAttributes, Lineage, and NodeAttrs and the request construction in
internal/server/v1beta1/lineage.go).
| go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.42.0 | ||
| go.opentelemetry.io/otel v1.16.0 | ||
| go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc v0.39.0 | ||
| go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc v1.16.0 | ||
| go.opentelemetry.io/otel/sdk v1.16.0 | ||
| go.opentelemetry.io/otel/sdk/metric v0.39.0 | ||
| google.golang.org/genproto v0.0.0-20230306155012-7f2fa6fef1f4 | ||
| google.golang.org/grpc v1.55.0 | ||
| google.golang.org/protobuf v1.30.0 |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
What are the latest versions of google.golang.org/grpc go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc google.golang.org/protobuf March 2026
💡 Result:
Latest versions as of March 2026 (2026-03-27): - google.golang.org/grpc: v1.79.3 - go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc: v0.67.0 - google.golang.org/protobuf: v1.36.11
Citations:
- 1: https://pkg.go.dev/google.golang.org/grpc
- 2: https://pkg.go.dev/go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc
- 3: https://pkg.go.dev/google.golang.org/protobuf
🏁 Script executed:
# Check for CVE/advisory information about the specific versions
python3 << 'EOF'
import json
# Define the versions and CVEs to verify
vulnerabilities = {
"grpc v1.55.0": [
"GHSA-p77j-4mvh-x3m3", # authorization bypass
"GHSA-m425-mq94-257g", # HTTP/2 Rapid Reset DoS
"GO-2023-2153",
"GO-2026-4762"
],
"otelgrpc v0.42.0": [
"GO-2023-2331", # DoS vulnerability
"GHSA-8pgv-569h-w5rw" # unbounded cardinality metrics
],
"protobuf v1.30.0": [
"GHSA-8r3f-844c-mc37", # Infinite loop in JSON unmarshaling
"GO-2024-2611"
]
}
print("CVEs/Advisories mentioned in review:")
for dep, cves in vulnerabilities.items():
print(f"\n{dep}:")
for cve in cves:
print(f" - {cve}")
print("\n\nVersion gaps to latest (as of March 2026):")
gaps = {
"grpc": ("v1.55.0", "v1.79.3"),
"otelgrpc": ("v0.42.0", "v0.67.0"),
"protobuf": ("v1.30.0", "v1.36.11")
}
for lib, (current, latest) in gaps.items():
print(f"{lib}: {current} → {latest}")
EOFRepository: raystack/compass
Length of output: 439
🏁 Script executed:
# Look for security-related files
fd -i 'security\|cve\|vulnerability\|advisory' -t f 2>/dev/null | head -20
# Check for SECURITY.md or similar files
find . -maxdepth 2 -iname '*security*' -o -iname '*advisory*' 2>/dev/null | head -10
# Check the go.mod file to see the exact dependency versions
head -50 go.modRepository: raystack/compass
Length of output: 2050
🌐 Web query:
GHSA-p77j-4mvh-x3m3 grpc authorization bypass vulnerability
💡 Result:
GHSA-p77j-4mvh-x3m3 is a critical authorization bypass vulnerability (CVE-2026-33186, CVSS 9.1) in gRPC-Go (google.golang.org/grpc) versions prior to 1.79.3. The vulnerability stems from improper input validation of the HTTP/2 :path pseudo-header. The gRPC-Go server accepts requests missing the mandatory leading slash in :path (e.g., "Service/Method" instead of "/Service/Method"). It routes these to the correct handler, but authorization interceptors (like google.golang.org/grpc/authz or custom ones using info.FullMethod) check the raw non-canonical path. This causes "deny" rules for canonical paths (starting with "/") to fail matching, allowing bypass if there's a fallback "allow" rule. Impacted servers use path-based authorization with specific deny rules and fallback allow. Exploitation requires sending raw HTTP/2 frames with malformed :path, needing network access. Patched in v1.79.3: rejects invalid :path with codes.Unimplemented before interceptors/handlers. Upgrade to 1.79.3 or later. Workarounds: validating interceptor checking FullMethod starts with "/", infrastructure-level HTTP/2 path normalization.
Citations:
- 1: GHSA-p77j-4mvh-x3m3
- 2: https://nvd.nist.gov/vuln/detail/CVE-2026-33186
- 3: GHSA-p77j-4mvh-x3m3
- 4: https://www.resolvedsecurity.com/vulnerability-catalog/GHSA-p77j-4mvh-x3m3
- 5: https://advisories.gitlab.com/pkg/golang/google.golang.org/grpc/CVE-2026-33186/
🌐 Web query:
GHSA-m425-mq94-257g HTTP/2 Rapid Reset DoS grpc
💡 Result:
GHSA-m425-mq94-257g is a high-severity (CVSS 7.5) vulnerability in the gRPC-Go library (package google.golang.org/grpc) related to the HTTP/2 Rapid Reset Denial of Service attack (CVE-2023-44487). The HTTP/2 Rapid Reset attack exploits the protocol's RST_STREAM frame, allowing attackers to send HTTP/2 requests, immediately cancel them with RST_STREAM, and repeat rapidly. This is valid per HTTP/2 spec (RFC 7540) but overwhelms servers by bypassing concurrency limits like MAX_CONCURRENT_STREAMS, leading to excessive resource consumption and DoS. In gRPC-Go specifically, affected versions allow attackers to launch more concurrent method handlers than the configured grpc.MaxConcurrentStreams limit per connection, exhausting server resources. Affected versions: - < 1.56.3 - >= 1.57.0 and < 1.57.1 - >= 1.58.0 and < 1.58.3 Patched versions: 1.56.3, 1.57.1, 1.58.3 (and later, including 1.59.0+). The fix (PR #6703) prohibits more than MaxConcurrentStreams handlers from running simultaneously per connection. Users should also explicitly set grpc.MaxConcurrentStreams (default is unlimited) and ensure handlers clean up resources promptly. No workarounds exist besides patching. As of 2026-03-27, latest gRPC-Go releases (e.g., v1.79+) include the fix. This gRPC-Go issue is a specific manifestation of the broader HTTP/2 Rapid Reset vulnerability affecting many servers (e.g., nginx, Apache), disclosed October 2023 after attacks peaking at 398M rps.
Citations:
- 1: GHSA-m425-mq94-257g
- 2: https://osv.dev/vulnerability/GHSA-m425-mq94-257g
- 3: https://blog.cloudflare.com/technical-breakdown-http2-rapid-reset-ddos-attack/
- 4: https://cloud.google.com/blog/products/identity-security/how-it-works-the-novel-http2-rapid-reset-ddos-attack
- 5: GHSA-m425-mq94-257g
- 6: server: prohibit more than MaxConcurrentStreams handlers from running at once grpc/grpc-go#6703
🌐 Web query:
GO-2023-2331 otelgrpc DoS cardinality vulnerability
💡 Result:
GO-2023-2331 is a denial-of-service (DoS) vulnerability in the otelgrpc package (go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc). The grpc Unary Server Interceptor adds labels net.peer.sock.addr and net.peer.sock.port with unbounded cardinality to metrics, leading to potential memory exhaustion when handling many malicious requests from different peers. Aliases: CVE-2023-47108, GHSA-8pgv-569h-w5rw. Affected versions: v0.37.0 through v0.45.x (fixed in v0.46.0). To be vulnerable, the application must: configure a metrics pipeline, use UnaryServerInterceptor (or similar), and not filter client IPs/ports via proxies/middleware. Workarounds: Use a view to remove these attributes, or disable metrics with otelgrpc.WithMeterProvider(noop.NewMeterProvider). Fix: Upgrade to v0.46.0 or later, which removes these high-cardinality attributes (via open-telemetry/opentelemetry-go-contrib#4322). CVSS score: 7.5 (High) per NVD. Published: 2023-11-10 (GHSA), Go vuln DB entry: 2024-06-27. No evidence of active exploitation as of 2026-03-27.
Citations:
- 1: https://pkg.go.dev/vuln/GO-2023-2331
- 2: https://osv.dev/vulnerability/GO-2023-2331
- 3: GHSA-8pgv-569h-w5rw
- 4: https://nvd.nist.gov/vuln/detail/cve-2023-47108
- 5: https://advisories.gitlab.com/pkg/golang/go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc/CVE-2023-47108
🌐 Web query:
GHSA-8r3f-844c-mc37 protobuf infinite loop JSON unmarshaling
💡 Result:
GHSA-8r3f-844c-mc37 is a security vulnerability (CVE-2024-24786, GO-2024-2611) in the google.golang.org/protobuf Go package, affecting versions before 1.33.0, including subpackages google.golang.org/protobuf/encoding/protojson and google.golang.org/protobuf/internal/encoding/json. The protojson.Unmarshal function enters an infinite loop when unmarshaling certain invalid JSON inputs, specifically when the target message contains a google.protobuf.Any field or when UnmarshalOptions.DiscardUnknown is enabled. This causes a denial-of-service (DoS) via CPU exhaustion. Severity: Moderate (CVSS 6.6, availability impact high). Patched in v1.33.0 (released March 2024). Upgrade to 1.33.0 or later. Official sources: GitHub Advisory Database (published Mar 6, 2024, updated Nov 7, 2024), OSV.dev, Go vuln database.
Citations:
- 1: GHSA-8r3f-844c-mc37
- 2: https://osv.dev/vulnerability/GHSA-8r3f-844c-mc37
- 3: https://securityvulnerability.io/vulnerability/CVE-2024-24786
Upgrade gRPC, otelgrpc, and protobuf dependencies to address critical security vulnerabilities.
Multiple high-severity and critical vulnerabilities exist in the current dependencies:
-
google.golang.org/grpc v1.55.0:
- Authorization bypass (GHSA-p77j-4mvh-x3m3, CVSS 9.1): Missing validation of HTTP/2 :path pseudo-header allows bypass of path-based authorization rules
- HTTP/2 Rapid Reset DoS (GHSA-m425-mq94-257g, CVSS 7.5): Unlimited concurrent stream resets exhaust server resources
-
go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.42.0:
- DoS via unbounded cardinality metrics (GO-2023-2331, CVE-2023-47108, CVSS 7.5): Unfiltered peer socket attributes cause memory exhaustion
-
google.golang.org/protobuf v1.30.0:
- Infinite loop in JSON unmarshaling (GHSA-8r3f-844c-mc37, CVE-2024-24786, CVSS 6.6): Malformed JSON triggers CPU exhaustion
Upgrade to:
- google.golang.org/grpc ≥ v1.79.3
- go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc ≥ v0.46.0
- google.golang.org/protobuf ≥ v1.33.0
🧰 Tools
🪛 OSV Scanner (2.3.5)
[HIGH] 39-39: go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc 0.42.0: Denial of service in go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc
(GO-2023-2331)
[HIGH] 39-39: go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc 0.42.0: otelgrpc DoS vulnerability due to unbound cardinality metrics
[CRITICAL] 39-39: google.golang.org/grpc 1.55.0: Denial of service from HTTP/2 Rapid Reset in google.golang.org/grpc
(GO-2023-2153)
[CRITICAL] 39-39: google.golang.org/grpc 1.55.0: Authorization bypass in gRPC-Go via missing leading slash in :path in google.golang.org/grpc
(GO-2026-4762)
[CRITICAL] 39-39: google.golang.org/grpc 1.55.0: gRPC-Go HTTP/2 Rapid Reset vulnerability
[CRITICAL] 39-39: google.golang.org/grpc 1.55.0: gRPC-Go has an authorization bypass via missing leading slash in :path
[HIGH] 47-47: google.golang.org/protobuf 1.30.0: Infinite loop in JSON unmarshaling in google.golang.org/protobuf
(GO-2024-2611)
[HIGH] 47-47: google.golang.org/protobuf 1.30.0: Golang protojson.Unmarshal function infinite loop when unmarshaling certain forms of invalid JSON
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@go.mod` around lines 39 - 47, Update the vulnerable dependency versions in
go.mod: bump google.golang.org/grpc to at least v1.79.3,
go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc to
at least v0.46.0, and google.golang.org/protobuf to at least v1.33.0 (also
ensure go.opentelemetry.io/otel and related otlp/otel packages remain mutually
compatible); run go mod tidy and go test to verify module resolution and
compatibility after changing the versions for grpc, otelgrpc, and protobuf.
| res, err := repo.cli.client.UpdateByQuery( | ||
| []string{defaultSearchIndexAlias}, | ||
| repo.cli.client.UpdateByQuery.WithBody(strings.NewReader(body)), | ||
| repo.cli.client.UpdateByQuery.WithContext(ctx), | ||
| repo.cli.client.UpdateByQuery.WithRefresh(true), | ||
| repo.cli.client.UpdateByQuery.WithIgnoreUnavailable(true), |
There was a problem hiding this comment.
Soft deletes need tenant scoping.
This UpdateByQuery hits defaultSearchIndexAlias and never constrains by ns, so the same URN in another namespace gets soft-deleted too. Writes already use BuildAliasNameFromNamespace(ns), so this path should hit that alias or add a namespace_id filter.
Minimal fix
- []string{defaultSearchIndexAlias},
+ []string{BuildAliasNameFromNamespace(ns)},🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/store/elasticsearch/discovery_repository.go` around lines 120 - 125,
The UpdateByQuery call in discovery_repository.go is using
defaultSearchIndexAlias and lacks a namespace filter, causing soft-deletes to
affect the same URN across tenants; modify the call to scope by tenant by
replacing defaultSearchIndexAlias with BuildAliasNameFromNamespace(ns) (the same
alias used by writes) or ensure the query body includes a term filter on
namespace_id (e.g., add a must term namespace_id: ns) before calling
repo.cli.client.UpdateByQuery; update references to defaultSearchIndexAlias and
the request body construction to guarantee the operation only targets the
intended namespace.
| IsDeleted: a.IsDeleted, | ||
| Version: a.Version, | ||
| UpdatedBy: a.UpdatedBy.toUser(), | ||
| CreatedAt: a.CreatedAt, | ||
| UpdatedAt: a.UpdatedAt, | ||
| RefreshedAt: a.RefreshedAt, |
There was a problem hiding this comment.
Propagate the new fields in versioned assets too.
toAsset() now sets IsDeleted and RefreshedAt, but toVersionedAsset() still returns the zero values. GetByVersion* will disagree with GetByID for deleted or freshly refreshed assets.
Minimal follow-up in `toVersionedAsset()`
return asset.Asset{
ID: latestAssetVersion.ID,
URN: latestAssetVersion.URN,
Type: latestAssetVersion.Type,
+ IsDeleted: latestAssetVersion.IsDeleted,
+ RefreshedAt: latestAssetVersion.RefreshedAt,
Name: a.Name,
Service: latestAssetVersion.Service,📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| IsDeleted: a.IsDeleted, | |
| Version: a.Version, | |
| UpdatedBy: a.UpdatedBy.toUser(), | |
| CreatedAt: a.CreatedAt, | |
| UpdatedAt: a.UpdatedAt, | |
| RefreshedAt: a.RefreshedAt, | |
| return asset.Asset{ | |
| ID: latestAssetVersion.ID, | |
| URN: latestAssetVersion.URN, | |
| Type: latestAssetVersion.Type, | |
| IsDeleted: latestAssetVersion.IsDeleted, | |
| RefreshedAt: latestAssetVersion.RefreshedAt, | |
| Name: a.Name, | |
| Service: latestAssetVersion.Service, |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/store/postgres/asset_model.go` around lines 50 - 55,
toVersionedAsset() is missing propagation of the new IsDeleted and RefreshedAt
fields, causing GetByVersion* to disagree with GetByID; update the
toVersionedAsset() function to set the returned versioned asset's IsDeleted =
a.IsDeleted and RefreshedAt = a.RefreshedAt (and ensure any related fields like
UpdatedBy are already mapped the same way as in toAsset()) so versioned assets
carry the same deletion/refresh state as the non-versioned mapping.
| orClauses := make([]string, 0, len(urns)) | ||
| for _, urn := range urns { | ||
| orClauses = append(orClauses, fmt.Sprintf("source='%s' or target='%s'", urn, urn)) | ||
| } | ||
| whereClause := strings.Join(orClauses, " or ") | ||
|
|
||
| deleteQuery, _, err := sq.Delete("lineage_graph").Where(whereClause).ToSql() | ||
| if err != nil { | ||
| return fmt.Errorf("error building delete query: %w", err) | ||
| } | ||
|
|
||
| _, err = repo.client.ExecContext(ctx, deleteQuery) |
There was a problem hiding this comment.
Parameterize the batch delete predicate.
Interpolating urn into whereClause makes this SQL-injectable and also breaks on URNs containing '. Use squirrel predicates/args here and pass them into ExecContext.
Safe shape
- orClauses := make([]string, 0, len(urns))
- for _, urn := range urns {
- orClauses = append(orClauses, fmt.Sprintf("source='%s' or target='%s'", urn, urn))
- }
- whereClause := strings.Join(orClauses, " or ")
-
- deleteQuery, _, err := sq.Delete("lineage_graph").Where(whereClause).ToSql()
+ conditions := sq.Or{}
+ for _, urn := range urns {
+ conditions = append(conditions, sq.Or{
+ sq.Eq{"source": urn},
+ sq.Eq{"target": urn},
+ })
+ }
+
+ deleteQuery, args, err := sq.Delete("lineage_graph").
+ PlaceholderFormat(sq.Dollar).
+ Where(conditions).
+ ToSql()
if err != nil {
return fmt.Errorf("error building delete query: %w", err)
}
- _, err = repo.client.ExecContext(ctx, deleteQuery)
+ _, err = repo.client.ExecContext(ctx, deleteQuery, args...)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/store/postgres/lineage_repository.go` around lines 82 - 93, The code
builds a SQL WHERE by interpolating urns (orClauses/whereClause) which is
SQL-injectable and fails on quotes; instead build a parameterized predicate
using squirrel (e.g. create a predicate like sq.Or{sq.Eq{"source": urns},
sq.Eq{"target": urns}} or equivalent per-urn sq.Eq entries), use
sq.Delete("lineage_graph").Where(predicate).ToSql() to get (query, args, err)
and then call repo.client.ExecContext(ctx, query, args...) so the URNs are
passed as arguments rather than interpolated; also handle the empty-urns case
before building the query.
| metricExporter, err := otlpmetricgrpc.New(ctx, | ||
| otlpmetricgrpc.WithEndpoint(cfg.OpenTelemetry.CollectorAddr), | ||
| otlpmetricgrpc.WithInsecure(), | ||
| ) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("failed to create metric exporter: %w", err) | ||
| } |
There was a problem hiding this comment.
Resource leak if metric exporter creation fails.
If otlpmetricgrpc.New fails, the previously created traceExporter and tracerProvider are not cleaned up, leading to a resource leak on this error path.
Proposed fix
metricExporter, err := otlpmetricgrpc.New(ctx,
otlpmetricgrpc.WithEndpoint(cfg.OpenTelemetry.CollectorAddr),
otlpmetricgrpc.WithInsecure(),
)
if err != nil {
+ // Clean up trace provider before returning
+ shutdownCtx, cancel := context.WithTimeout(ctx, 5*time.Second)
+ defer cancel()
+ _ = tracerProvider.Shutdown(shutdownCtx)
return nil, fmt.Errorf("failed to create metric exporter: %w", err)
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| metricExporter, err := otlpmetricgrpc.New(ctx, | |
| otlpmetricgrpc.WithEndpoint(cfg.OpenTelemetry.CollectorAddr), | |
| otlpmetricgrpc.WithInsecure(), | |
| ) | |
| if err != nil { | |
| return nil, fmt.Errorf("failed to create metric exporter: %w", err) | |
| } | |
| metricExporter, err := otlpmetricgrpc.New(ctx, | |
| otlpmetricgrpc.WithEndpoint(cfg.OpenTelemetry.CollectorAddr), | |
| otlpmetricgrpc.WithInsecure(), | |
| ) | |
| if err != nil { | |
| // Clean up trace provider before returning | |
| shutdownCtx, cancel := context.WithTimeout(ctx, 5*time.Second) | |
| defer cancel() | |
| _ = tracerProvider.Shutdown(shutdownCtx) | |
| return nil, fmt.Errorf("failed to create metric exporter: %w", err) | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/telemetry/telemetry.go` around lines 61 - 67, If otlpmetricgrpc.New
fails, the already-created traceExporter and tracerProvider must be shut down to
avoid leaking resources; update the error path after calling otlpmetricgrpc.New
to call traceExporter.Shutdown(ctx) and tracerProvider.Shutdown(ctx)
(handling/aggregating any returned errors) before returning the wrapped
fmt.Errorf. Locate the variables traceExporter, tracerProvider and the
otlpmetricgrpc.New call in the telemetry initialization function and perform the
shutdown/cleanup there on metric exporter creation failure.
| func (m *SuggestAssetsRequest) validate(all bool) error { | ||
| if m == nil { | ||
| return nil | ||
| } | ||
|
|
||
| var errors []error | ||
|
|
||
| if m.GetText() != "" { | ||
|
|
||
| } | ||
|
|
||
| if len(errors) > 0 { | ||
| return SuggestAssetsRequestMultiError(errors) | ||
| } | ||
|
|
||
| return nil | ||
| } |
There was a problem hiding this comment.
Blank suggest queries still validate successfully.
SuggestAssetsRequest{} is valid here because Text never produces an error. The empty-query fallback in this PR is scoped to search, not suggest, so this should fail fast at the API boundary instead of being pushed downstream. Please add a non-empty rule for text in service.proto and regenerate this file.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@proto/raystack/compass/v1beta1/service.pb.validate.go` around lines 2794 -
2810, The SuggestAssetsRequest.validate currently allows empty queries because
the Text field has no non-empty rule; update the service proto by adding a
non-empty validation rule for the text field (e.g., make text required or add a
"min_len" / "not_blank" validator) in the SuggestAssetsRequest message in
service.proto, then re-run the protobuf code generation to regenerate
service.pb.validate.go so that SuggestAssetsRequest.validate (and any generated
helper) will return an error when Text is empty instead of allowing empty
suggest requests to pass through.
| func (m *GroupAssetsRequest) validate(all bool) error { | ||
| if m == nil { | ||
| return nil | ||
| } | ||
|
|
||
| var errors []error | ||
|
|
||
| if len(m.GetGroupby()) < 1 { | ||
| err := GroupAssetsRequestValidationError{ | ||
| field: "Groupby", | ||
| reason: "value must contain at least 1 item(s)", | ||
| } | ||
| if !all { | ||
| return err | ||
| } | ||
| errors = append(errors, err) | ||
| } | ||
|
|
||
| // no validation rules for Filter | ||
|
|
||
| if m.GetSize() != 0 { | ||
|
|
||
| if m.GetSize() < 0 { | ||
| err := GroupAssetsRequestValidationError{ | ||
| field: "Size", | ||
| reason: "value must be greater than or equal to 0", | ||
| } | ||
| if !all { | ||
| return err | ||
| } | ||
| errors = append(errors, err) | ||
| } | ||
|
|
||
| } |
There was a problem hiding this comment.
Add item-level validation for Groupby and IncludeFields.
This only checks that Groupby has at least one element, so groupby: [""] still passes, and IncludeFields is skipped entirely in this request path. Both are caller-supplied field selectors and should be rejected when empty before the request reaches the grouping flow. Please add the rules in service.proto and regenerate this file.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@proto/raystack/compass/v1beta1/service.pb.validate.go` around lines 3002 -
3035, The GroupAssetsRequest.validate method currently only checks Groupby
length and skips IncludeFields, allowing empty selectors (e.g., groupby: [""])
to pass; add item-level non-empty string validation for both Groupby and
IncludeFields by updating the service.proto validation rules for
GroupAssetsRequest (add required/non-empty constraints on repeated string items
for groupby and include_fields), then regenerate the protobuf Go files so the
validate method enforces that each element of Groupby and IncludeFields is
non-empty and returns a GroupAssetsRequestValidationError with field names
"Groupby" or "IncludeFields" when violated.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/server/v1beta1/asset.go`:
- Around line 295-303: The code path in GetAssetByID mishandles
asset.NotFoundError: when asset.NotFoundError is returned and
req.GetUpdateOnly() is false the function falls through and operates on a
zero-value ast (used later via ast.Patch(patchAssetMap)), causing incorrect
upsert behavior; change the error handling in the GetAssetByID error branch
inside the upsert flow to explicitly return a gRPC NotFound error (or an
appropriate error response) when errors.As(err, new(asset.NotFoundError)) is
true and req.GetUpdateOnly() is false, and keep the existing early-return for
update-only true; also normalize the errors.As pattern to use
new(asset.NotFoundError) (matching the other occurrence) so the type check is
consistent.
- Around line 63-65: The current filter only sets fb.IsDeleted(true) when
req.GetIsDeleted() is true, so by default soft-deleted assets are returned;
update the filtering in the GetAllAssets handler so that when the request does
not explicitly request deleted assets you apply fb = fb.IsDeleted(false) (i.e.,
call fb.IsDeleted(false) when req.GetIsDeleted() is false or absent) to return
non-deleted assets by default; if you need a tri-state later, switch
req.GetIsDeleted() to an explicit enum/wrapper, but for now change the logic
around req.GetIsDeleted() and fb.IsDeleted(...) to default to non-deleted.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 21a445bf-4af8-4f2e-91fa-ceae6ff88f3e
📒 Files selected for processing (1)
internal/server/v1beta1/asset.go
| if req.GetIsDeleted() { | ||
| fb = fb.IsDeleted(true) | ||
| } |
There was a problem hiding this comment.
Default behavior may expose soft-deleted assets unexpectedly.
When req.GetIsDeleted() is false (the default), no IsDeleted filter is applied, causing GetAllAssets to return all assets including soft-deleted ones. This may break existing clients expecting only non-deleted assets by default.
Consider defaulting to non-deleted assets:
🛡️ Proposed fix to default to non-deleted assets
Data(req.GetData())
-if req.GetIsDeleted() {
- fb = fb.IsDeleted(true)
-}
+// Default to non-deleted assets unless explicitly requested
+fb = fb.IsDeleted(req.GetIsDeleted())If you need a tri-state (deleted-only, non-deleted-only, all), consider using a proto enum or wrapper type instead of a bare bool.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if req.GetIsDeleted() { | |
| fb = fb.IsDeleted(true) | |
| } | |
| // Default to non-deleted assets unless explicitly requested | |
| fb = fb.IsDeleted(req.GetIsDeleted()) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/server/v1beta1/asset.go` around lines 63 - 65, The current filter
only sets fb.IsDeleted(true) when req.GetIsDeleted() is true, so by default
soft-deleted assets are returned; update the filtering in the GetAllAssets
handler so that when the request does not explicitly request deleted assets you
apply fb = fb.IsDeleted(false) (i.e., call fb.IsDeleted(false) when
req.GetIsDeleted() is false or absent) to return non-deleted assets by default;
if you need a tri-state later, switch req.GetIsDeleted() to an explicit
enum/wrapper, but for now change the logic around req.GetIsDeleted() and
fb.IsDeleted(...) to default to non-deleted.
| if err != nil { | ||
| if errors.As(err, &asset.NotFoundError{}) { | ||
| if req.GetUpdateOnly() { | ||
| return &compassv1beta1.UpsertPatchAssetResponse{Id: ""}, nil | ||
| } | ||
| } else { | ||
| return nil, internalServerError(server.logger, err.Error()) | ||
| } | ||
| } |
There was a problem hiding this comment.
Logic bug: unhandled fall-through when asset not found and update_only is false.
When GetAssetByID returns NotFoundError and update_only is false, the code falls through without returning, continuing execution with a zero-value ast. The subsequent ast.Patch(patchAssetMap) operates on an empty asset, which may cause incorrect data to be upserted.
Additionally, line 296 uses &asset.NotFoundError{} while line 262 uses new(asset.NotFoundError) — prefer the consistent pattern.
🐛 Proposed fix
ast, err := server.assetService.GetAssetByID(ctx, urn)
if err != nil {
- if errors.As(err, &asset.NotFoundError{}) {
+ if errors.As(err, new(asset.NotFoundError)) {
if req.GetUpdateOnly() {
return &compassv1beta1.UpsertPatchAssetResponse{Id: ""}, nil
}
+ // Asset not found and not update_only: initialize empty asset for creation
+ ast = asset.Asset{
+ URN: baseAsset.GetUrn(),
+ Type: asset.Type(baseAsset.GetType()),
+ Service: baseAsset.GetService(),
+ }
} else {
return nil, internalServerError(server.logger, err.Error())
}
}Alternatively, if creation is not intended here, return a NotFound error when update_only is false.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/server/v1beta1/asset.go` around lines 295 - 303, The code path in
GetAssetByID mishandles asset.NotFoundError: when asset.NotFoundError is
returned and req.GetUpdateOnly() is false the function falls through and
operates on a zero-value ast (used later via ast.Patch(patchAssetMap)), causing
incorrect upsert behavior; change the error handling in the GetAssetByID error
branch inside the upsert flow to explicitly return a gRPC NotFound error (or an
appropriate error response) when errors.As(err, new(asset.NotFoundError)) is
true and req.GetUpdateOnly() is false, and keep the existing early-return for
update-only true; also normalize the errors.As pattern to use
new(asset.NotFoundError) (matching the other occurrence) so the type check is
consistent.
- Add .golangci.yml with errcheck and staticcheck enabled - Exclude Close() errcheck via text pattern and exclude-functions - Exclude QF1008 (embedded field selector) staticcheck suggestion - Fix S1009: remove redundant nil checks before len() in discussion.go - Fix QF1001: apply De Morgan's law in user.go - Fix QF1008: remove embedded API field from ES test selectors
Upgrade from golangci-lint v1.53 to v2.1 in CI workflow and use v2 config format. Also update checkout and setup-go actions to latest.
- actions/checkout: v2/v3 -> v4 - actions/setup-go: v2/v4 -> v5 - docker/login-action: v1 -> v3 - docker/build-push-action: v2 -> v6 - goreleaser/goreleaser-action: v4 -> v6 - golangci/golangci-lint-action: already at v6
- actions/checkout: v4 -> v6 - actions/setup-go: v5 -> v6 - docker/login-action: v3 -> v4 - docker/build-push-action: v6 -> v7 - goreleaser/goreleaser-action: v6 -> v7 - golangci/golangci-lint-action: v7 -> v9
Summary
Ports key features and fixes from the goto/compass fork back to the main repo. Changes are adapted to work with the main repo's multi-tenancy (RLS/namespaces) architecture.
Search Enhancements
Asset Deletion Lifecycle
Upsert/Patch Fixes
Lineage Enhancements
Observability
Proto & Other
Test plan
go build ./...passesgo testpasses for all non-integration test packages