Skip to content

Add fork choice write lock tracing context#9306

Open
peter941221 wants to merge 1 commit into
sigp:unstablefrom
peter941221:issue/sigp-lighthouse-8147
Open

Add fork choice write lock tracing context#9306
peter941221 wants to merge 1 commit into
sigp:unstablefrom
peter941221:issue/sigp-lighthouse-8147

Conversation

@peter941221
Copy link
Copy Markdown

@peter941221 peter941221 commented May 15, 2026

Issue Addressed

Part of #8147.

Proposed Changes

Adds source-specific tracing spans around production fork-choice write-lock/update paths so lock wait spans have useful parent context in traces.

This is intentionally behavior-neutral:

  • does not change fork choice locking semantics
  • does not move DB writes outside the lock
  • does not introduce a custom lock wrapper
  • does not alter fork choice, payload, or block import logic

Covered paths include block import, payload envelope import, attestation/slashing application, recompute-head, finalization prune, and execution payload validity updates.

The instrumentation uses a frequency-aware span level split:

  • low-frequency or issue-critical paths use info_span! so parent/source context is visible alongside the existing default #[instrument] fork-choice write-lock span in info-level tracing configurations
  • high-volume attestation application paths use debug_span! to avoid adding info-level trace noise on gossip-heavy paths

Additional Info

This follows the instrumentation-first direction discussed in #8147.

Verifications Performed

  • cargo fmt --check
  • cargo check -p beacon_chain
  • git diff --check

@peter941221 peter941221 force-pushed the issue/sigp-lighthouse-8147 branch from 13fb58f to 5182232 Compare May 15, 2026 01:43
@peter941221 peter941221 force-pushed the issue/sigp-lighthouse-8147 branch from 5182232 to 2612423 Compare May 15, 2026 01:48
@michaelsproul
Copy link
Copy Markdown
Member

Have you done any monitoring of a live node to see whether these spans are working/informative?

@peter941221
Copy link
Copy Markdown
Author

@michaelsproul will run it now and have it test.

@peter941221
Copy link
Copy Markdown
Author

@michaelsproul

I validated this on a running local node rather than only from tests.

I used a reduced Kurtosis setup in WSL with 2 Lighthouse CL nodes, 2 Geth EL nodes, and Tempo, then confirmed the spans from trace details via /api/traces/<trace_id>.

I was able to observe these spans on a healthy local live run:

  1. payload_envelope_import_fork_choice_update
    trace: 70577bcc3be8291f02f7c8fda14458b9

  2. valid_payload_fork_choice_update
    trace: c9a1ee725fb116b7473eda5100a2e033

  3. finalization_fork_choice_prune
    trace: 13ce417dfae7251730a1abd89adbbeaa

The remaining span, invalid_payload_fork_choice_update, did not occur naturally on the healthy reduced net, so I validated that path separately with a local validation-only one-shot InvalidBlockHash harness in notify_forkchoice_updated(...).

With that harness in place, I was also able to observe:

  1. invalid_payload_fork_choice_update
    trace: 6a78a5d864c28a9b3d4047f7ba7cad87

So the precise summary is:

  1. the first three spans were observed on a healthy local live run
  2. the invalid span was observed on a running node too, but only under a forced invalidation validation scenario

Relevant code anchors for the span definitions:

  • payload_envelope_import_fork_choice_update: beacon_node/beacon_chain/src/payload_envelope_verification/import.rs:240
  • valid_payload_fork_choice_update: beacon_node/beacon_chain/src/beacon_chain.rs:6535
  • invalid_payload_fork_choice_update: beacon_node/beacon_chain/src/beacon_chain.rs:6156
  • finalization_fork_choice_prune: beacon_node/beacon_chain/src/canonical_head.rs:1065

One tooling note: in this setup, /api/search?tags=name=... was not reliable for child spans, so I used /api/traces/<trace_id> detail pulls for the actual confirmation.

@peter941221
Copy link
Copy Markdown
Author

PR #9306 Runtime Validation (local running node)

setup

WSL
reduced Kurtosis net
2x Lighthouse CL
2x Geth EL
Tempo trace backend

validation method

child spans confirmed from:
/api/traces/<trace_id>

note:
/api/search?tags=name=... was useful for coarse search,
but not reliable as the final child-span existence check.

span matrix

[OK] payload_envelope_import_fork_choice_update
trace id : 70577bcc3be8291f02f7c8fda14458b9
status : natural local live proof
path : lh_process_execution_payload_envelope
-> process_execution_payload_envelope
-> import_execution_payload_envelope
-> payload_envelope_import_fork_choice_update
code : beacon_node/beacon_chain/src/payload_envelope_verification/import.rs:240

[OK] valid_payload_fork_choice_update
trace id : c9a1ee725fb116b7473eda5100a2e033
status : natural local live proof
path : lh_process_execution_payload_envelope
-> process_execution_payload_envelope
-> lh_recompute_head_at_slot
-> spawn_execution_layer_updates
-> lh_update_el_forkchoice
-> valid_payload_fork_choice_update
code : beacon_node/beacon_chain/src/beacon_chain.rs:6535

[OK] finalization_fork_choice_prune
trace id : 13ce417dfae7251730a1abd89adbbeaa
status : natural local live proof
path : lh_fork_choice_advance
-> lh_recompute_head_at_slot
-> after_finalization
-> finalization_fork_choice_prune
code : beacon_node/beacon_chain/src/canonical_head.rs:1065

[OK] invalid_payload_fork_choice_update
trace id : 6a78a5d864c28a9b3d4047f7ba7cad87
status : forced local live proof
parent : lh_update_el_forkchoice
code : beacon_node/beacon_chain/src/beacon_chain.rs:6156
trigger : local validation-only one-shot InvalidBlockHash
harness : beacon_node/execution_layer/src/lib.rs

bottom line

3 / 4 spans were observed on a healthy local live run.
1 / 4 span was observed on a running node only under a forced invalidation scenario.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants