Skip to content

feat: Add debug output to MaxRecursionDepthError when debug header is set#3070

Merged
tstirrat15 merged 13 commits intomainfrom
tstirrat/continuing-lookup-cycle-debug-work
Apr 29, 2026
Merged

feat: Add debug output to MaxRecursionDepthError when debug header is set#3070
tstirrat15 merged 13 commits intomainfrom
tstirrat/continuing-lookup-cycle-debug-work

Conversation

@tstirrat15
Copy link
Copy Markdown
Contributor

@tstirrat15 tstirrat15 commented Apr 23, 2026

Description

Opening as an alternative/continuation to #3036. This takes a slightly different approach, which is to wait until a max depth error is encountered and then build the trace as the dispatch stack is unwound. It tracks across dispatch boundaries as well.

Notes

This code is a no-op if the request flag isn't present or if a max recursion depth error isn't encountered.

Changes

Will annotate.

Testing

Review. See that tests pass. See authzed/zed#682 and its testing section for an example of code that drives this.

@tstirrat15 tstirrat15 requested a review from a team as a code owner April 23, 2026 20:36
@github-actions github-actions Bot added area/api v1 Affects the v1 API area/tooling Affects the dev or user toolchain (e.g. tests, ci, build tools) area/dispatch Affects dispatching of requests labels Apr 23, 2026
@tstirrat15 tstirrat15 force-pushed the tstirrat/continuing-lookup-cycle-debug-work branch 3 times, most recently from f710fda to 301e035 Compare April 23, 2026 20:42
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 23, 2026

Codecov Report

❌ Patch coverage is 80.00000% with 23 lines in your changes missing coverage. Please review.
✅ Project coverage is 75.49%. Comparing base (e677c6a) to head (b3baf82).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
internal/dispatch/tracing.go 68.63% 12 Missing and 4 partials ⚠️
internal/dispatch/errors.go 73.34% 3 Missing and 1 partial ⚠️
internal/graph/lookupresources2.go 78.58% 2 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##             main    #3070       +/-   ##
===========================================
+ Coverage   50.15%   75.49%   +25.34%     
===========================================
  Files         460      488       +28     
  Lines       57073    59629     +2556     
===========================================
+ Hits        28618    45008    +16390     
+ Misses      25305    11329    -13976     
- Partials     3150     3292      +142     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@tstirrat15 tstirrat15 force-pushed the tstirrat/continuing-lookup-cycle-debug-work branch 15 times, most recently from 27a7069 to 3af61b7 Compare April 24, 2026 00:01
@tstirrat15 tstirrat15 changed the title Tstirrat/continuing lookup cycle debug work Add debug output to MaxRecursionDepthError when debug header is set Apr 24, 2026
@tstirrat15 tstirrat15 changed the title Add debug output to MaxRecursionDepthError when debug header is set feat: Add debug output to MaxRecursionDepthError when debug header is set Apr 24, 2026
@tstirrat15 tstirrat15 force-pushed the tstirrat/continuing-lookup-cycle-debug-work branch from 3af61b7 to 157d239 Compare April 24, 2026 21:21
Comment thread internal/dispatch/tracing.go Outdated
Copy link
Copy Markdown
Contributor

@miparnisari miparnisari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please include in the testing section of the PR description:

  • the schema + relationships that would produce a cycle.
  • the CURL requests we can use to test out the thing e2e for LR and LS.

With this command and this data (plus another rel group:secondgroup#member@user:maria), I get an error as expected (400), but I don't get any debug info. Can you look into why it doesn't work with http?

curl --location --request POST 'http://localhost:8443/v1/permissions/resources' \
--header 'Content-Type: application/json' \
--header 'Accept: application/json' \
--header 'io.spicedb.requestdebuginfo: true' \
--header 'Authorization: Bearer super-secret-key' \
--data '{
  "resourceObjectType": "resource",
  "permission": "view",
  "subject": {
    "object": {
      "objectType": "user",
      "objectId": "maria"
    }
  }
}'

Comment thread internal/services/v1/permissions.go
Comment thread internal/services/v1/permissions.go Outdated
Comment thread pkg/datastore/test/relationships.go
@tstirrat15 tstirrat15 force-pushed the tstirrat/continuing-lookup-cycle-debug-work branch from f0d3fd9 to 109bd9e Compare April 28, 2026 18:42
@github-actions github-actions Bot added the area/schema Affects the Schema Language label Apr 28, 2026
@tstirrat15 tstirrat15 force-pushed the tstirrat/continuing-lookup-cycle-debug-work branch 4 times, most recently from c0b79ee to 8f50a91 Compare April 28, 2026 19:47
Comment thread CHANGELOG.md Outdated
debugInfo := &dispatch.LookupDebugInfo{}
req.NoError(prototext.Unmarshal([]byte(traceStr), debugInfo), "trace must parse as LookupDebugTrace")
req.Len(debugInfo.CycleMembers, 1, "there should be one cycle member found at the final depth")
// NOTE: this value will change if the max recursion depth changes, since that determines
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

// We'll only attach debug info if cyclemembers is non-empty. Otherwise we're
// in a normal recursion-too-deep scenario and the debuginfo doesn't help.
if len(debugInfo.CycleMembers) > 0 {
logging.Warn().Strs("cycle-members", debugInfo.CycleMembers).Msg("The following resource/relation pairs were found in a cycle in LookupResources:")
Copy link
Copy Markdown
Contributor

@miparnisari miparnisari Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: instead of creating a new log line can we not enrich the finished call log line with this information?

image

that log is very easy to get lost in the forest because it's not linked to a particular request id.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think our logging is currently set up for that - we'd need to pass the event through the tree in a way that we don't currently do. It's why I opted to make it a warn instead of an info.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, in that case, i think i'd also add a span attribute or a span event to the ongoing span indicating that a cycle was found. But can be done in a followup.

miparnisari
miparnisari previously approved these changes Apr 29, 2026
Copy link
Copy Markdown
Contributor

@miparnisari miparnisari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but you gotta fix conflicts

Jdepp007004 and others added 13 commits April 29, 2026 07:41
…ubjects

Fixes #2056

- Add LookupDebugTrace proto message to all 3 Lookup dispatch responses
- Add enable_debug_trace bool to all 3 Lookup dispatch requests
- Add internal/graph/debug.go with zero-cost traversalTracker
- Wire trackVisit in lookupresources2.go, lookupresources3.go, lookupsubjects.go
- Wire debug trace enable via requestmeta header in permissions.go
- Add unit tests in internal/graph/cycle_detection_test.go
- Add integration tests in internal/dispatch/graph/cycle_detection_test.go
- remove all trailer-based debug tracing (responsemeta, SerializeLookupDebugTrace)
- fix streaming logic (buffer lastResp, no double send)
- attach DebugInformationV2 only on final streamed response
- update tests to validate DebugInformationV2 instead of trailers
- clean up unused imports and legacy code

Note: build errors expected until authzed/api and authzed-go updates land
- add root node to DebugTree for clearer traversal structure
- ensure stable node identifiers for debug annotations
- minor test validation improvement
- no behavioral changes
- replace traversal map with stack-based trace
- capture per-edge traversal frames (resource, relation, permission)
- emit trace on MaxRecursionDepthExceeded when debug is enabled
- ensure concurrency safety via CloneTraversalStack
- remove batch-based sampling and fix incorrect traversal representation
- update tests for stack-based validation

Note: current implementation prioritizes trace correctness over batching by
splitting dispatches per ID. This may introduce additional overhead and can
be optimized in a follow-up by preserving batching while maintaining per-edge trace accuracy.
Co-authored-by: Maria Ines Parnisari <maria.ines.parnisari@authzed.com>
@tstirrat15 tstirrat15 force-pushed the tstirrat/continuing-lookup-cycle-debug-work branch from 78f6883 to b3baf82 Compare April 29, 2026 13:45
@tstirrat15 tstirrat15 merged commit 76fc153 into main Apr 29, 2026
45 checks passed
@tstirrat15 tstirrat15 deleted the tstirrat/continuing-lookup-cycle-debug-work branch April 29, 2026 17:19
@github-actions github-actions Bot locked and limited conversation to collaborators Apr 29, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

area/api v1 Affects the v1 API area/dependencies Affects dependencies area/dispatch Affects dispatching of requests area/schema Affects the Schema Language area/tooling Affects the dev or user toolchain (e.g. tests, ci, build tools)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants