
Local channel announcements can be silently dropped, leaving channels invisible to the pathfinder #10802

@ziggie1984

Originally reported by the Zeus team after observing zero-conf channels that were open and operational at the link layer but missing from the local channel graph.

Problem

The funding manager adds a newly-opened channel to the local graph via addToGraph (funding/manager.go:3724), which sends the ChannelAnnouncement to the gossiper fire-and-forget. The channel-open state machine then advances to addedToGraph regardless of whether the edge was actually persisted. There is no recovery on restart — the state machine considers the work done.

If the announcement is dropped for any reason, the channel is healthy at the link layer but invisible to routing. This is especially severe for private zero-conf channels: no 6-confirmation re-announce path exists, so the channel can never recover.

Known silent-failure paths

  1. Crash between addToGraph send and saveChannelOpeningState(addedToGraph) (funding/manager.go:4430-4441).
  2. Premature ChannelUpdate evicted from the bounded prematureChannelUpdates LRU before the announcement is persisted — discovery/gossiper.go:3276-3330.
  3. Previously zombified alias SCID causes addEdge to return ErrIgnored without re-validation — graph/builder.go:1112-1115.
  4. Zero-conf peer alias missing in handleChannelReadyReceived — silent early return, no addToGraph call — funding/manager.go:4412-4428.
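
To illustrate path 2, here is a minimal sketch of how a size-bounded cache can drop an entry without any error reaching the caller. The `boundedCache` type and its methods are hypothetical and written only for this example; this is not the gossiper's actual cache, just the general eviction behaviour of a bounded LRU.

```go
package main

import (
	"container/list"
	"fmt"
)

// boundedCache is a hypothetical, simplified LRU keyed by SCID.
// It is NOT lnd's implementation; it only shows how a size bound
// leads to an entry disappearing without an error.
type boundedCache struct {
	capacity int
	order    *list.List               // front = most recently inserted/updated
	items    map[uint64]*list.Element // scid -> list element holding the scid
	vals     map[uint64]string        // scid -> cached update
}

func newBoundedCache(capacity int) *boundedCache {
	return &boundedCache{
		capacity: capacity,
		order:    list.New(),
		items:    map[uint64]*list.Element{},
		vals:     map[uint64]string{},
	}
}

// Put stores an update for scid. When the cache is over capacity it evicts
// the least recently used entry and reports which SCID was evicted.
func (c *boundedCache) Put(scid uint64, update string) (uint64, bool) {
	if el, ok := c.items[scid]; ok {
		c.order.MoveToFront(el)
		c.vals[scid] = update
		return 0, false
	}
	c.items[scid] = c.order.PushFront(scid)
	c.vals[scid] = update
	if c.order.Len() <= c.capacity {
		return 0, false
	}
	oldest := c.order.Back()
	victim := oldest.Value.(uint64)
	c.order.Remove(oldest)
	delete(c.items, victim)
	delete(c.vals, victim)
	return victim, true
}

// Get returns the cached update for scid, if it is still present.
func (c *boundedCache) Get(scid uint64) (string, bool) {
	v, ok := c.vals[scid]
	return v, ok
}

func main() {
	cache := newBoundedCache(2)
	cache.Put(1, "premature update for our channel")
	cache.Put(2, "update for another channel")

	// A third entry arrives before the announcement for SCID 1 is persisted.
	evicted, ok := cache.Put(3, "update for yet another channel")
	fmt.Println(evicted, ok) // 1 true

	_, found := cache.Get(1)
	fmt.Println(found) // false: the update for our channel is gone
}
```

Nothing in this sketch tells the producer of the evicted update that it was lost, which is the property that makes this path silent.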

Impact on routing

The pathfinder reads the channel graph DB exclusively; a missing edge is not compensated for by channeldb or the link layer. Concretely:

  • Outbound payments: newBandwidthManager (routing/bandwidth.go:67-79) and nodeEdgeUnifier.addGraphPolicies (routing/unified_edges.go:119) iterate ForEachNodeDirectedChannel. A missing edge is never enumerated, so the channel cannot be a first hop.
  • Outbound balance accounting: getOutgoingBalance (routing/pathfind.go:512-590) excludes the channel's balance, surfacing as FAILURE_REASON_NO_ROUTE / insufficient_balance despite healthy local_balance in listchannels.
  • Inbound payments: chanCanBeHopHint (routing/lnrpc/invoicesrpc/addinvoice.go:635-699) calls FetchChannelEdgesByID; on failure the channel is dropped from invoice route hints, so payers cannot construct a path to the node.
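
The outbound effect can be sketched in a few lines. This is a toy model, assuming the pathfinder enumerates candidates from the graph and only then consults link-layer state; `firstHopCandidates`, `graphEdges` and `links` are hypothetical names for illustration and do not correspond to lnd functions.

```go
package main

import "fmt"

// firstHopCandidates walks the edges known to the graph and looks up the
// link-layer balance for each. A channel that exists only in links (open and
// healthy, but with no graph edge) is never visited.
// All names here are hypothetical; this is a model, not lnd code.
func firstHopCandidates(graphEdges []uint64, links map[uint64]int64) ([]uint64, int64) {
	var (
		hops  []uint64
		total int64
	)
	for _, scid := range graphEdges {
		balance, ok := links[scid]
		if !ok {
			continue // edge without an active link
		}
		hops = append(hops, scid)
		total += balance
	}
	return hops, total
}

func main() {
	// Two healthy channels at the link layer.
	links := map[uint64]int64{100: 50_000, 200: 70_000}

	// Only channel 100 made it into the graph; 200 was silently dropped.
	graphEdges := []uint64{100}

	hops, total := firstHopCandidates(graphEdges, links)
	fmt.Println(hops, total) // [100] 50000
}
```

In this model a 60,000 sat payment finds no usable first hop even though channel 200 alone could carry it, which matches the "no route despite healthy local_balance" symptom described above.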

Proposed direction

Two complementary changes, both small:

  1. Verify-after-write in the open path. After addToGraph returns, confirm graphDB.HasChannelEdge(scid) before transitioning to addedToGraph. If the edge is missing, do not advance the state; retry the announcement instead.
  2. Startup reconciliation sweep. Walk FetchAllOpenChannels after authGossiper.Start(), check each owned channel's edge, re-announce via an exported ReannounceChannel wrapper for any missing/zombified edge. Config-flag-gated, bounded concurrency.
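
A rough sketch of both changes, to make the intended control flow concrete. The `edgeStore` and `announcer` interfaces, `addAndVerify`, `reconcile` and the `flakyGraph` test double are all hypothetical; the real signatures of HasChannelEdge and the gossiper send path differ, and the config flag, bounded concurrency and any retry backoff are left out for brevity.

```go
package main

import (
	"errors"
	"fmt"
)

// edgeStore is a hypothetical abstraction over the graph DB lookup.
type edgeStore interface {
	HasChannelEdge(scid uint64) bool
}

// announcer is a hypothetical abstraction over the fire-and-forget send.
type announcer interface {
	Announce(scid uint64)
}

var errEdgeMissing = errors.New("edge not persisted after announcement")

// addAndVerify announces the channel and then checks that the edge exists.
// The caller would only advance to addedToGraph when this returns nil.
func addAndVerify(a announcer, s edgeStore, scid uint64, attempts int) error {
	for i := 0; i < attempts; i++ {
		a.Announce(scid)
		if s.HasChannelEdge(scid) {
			return nil
		}
	}
	return errEdgeMissing
}

// reconcile re-announces every open channel whose edge is missing and
// returns the SCIDs that are still missing afterwards.
func reconcile(a announcer, s edgeStore, open []uint64, attempts int) []uint64 {
	var unresolved []uint64
	for _, scid := range open {
		if s.HasChannelEdge(scid) {
			continue
		}
		if err := addAndVerify(a, s, scid, attempts); err != nil {
			unresolved = append(unresolved, scid)
		}
	}
	return unresolved
}

// flakyGraph is a test double that silently drops the first drops[scid]
// announcements for a channel, then persists the edge.
type flakyGraph struct {
	edges map[uint64]bool
	drops map[uint64]int
}

func (g *flakyGraph) Announce(scid uint64) {
	if g.drops[scid] > 0 {
		g.drops[scid]--
		return // dropped without any error, as in the failure paths above
	}
	g.edges[scid] = true
}

func (g *flakyGraph) HasChannelEdge(scid uint64) bool { return g.edges[scid] }

func main() {
	g := &flakyGraph{
		edges: map[uint64]bool{1: true},
		drops: map[uint64]int{2: 1, 3: 5},
	}

	// Open path: the first announcement for SCID 2 is dropped, the retry lands.
	fmt.Println(addAndVerify(g, g, 2, 3)) // <nil>

	// Startup sweep: SCID 3 keeps getting dropped and is reported back.
	fmt.Println(reconcile(g, g, []uint64{1, 2, 3}, 3)) // [3]
}
```

Returning the unresolved SCIDs from the sweep, rather than only logging them, leaves room for whichever answer the open questions settle on (wait, retry, or log-and-skip).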

Plus hardening:

  • graph/builder.go:addEdge should treat "edge already present, same content" as idempotent success, not ErrIgnored.
  • A fresh local announcement should clear a stale zombie marker on an alias SCID we own.
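
The two hardening points could look roughly like this. It is a sketch under assumptions: `graphStore`, `edge`, the `owned` parameter and `errIgnored` are hypothetical stand-ins, and the handling of an existing edge with different content (still ignored here) is my assumption, not something the proposal specifies.

```go
package main

import (
	"errors"
	"fmt"
)

// errIgnored is a hypothetical stand-in for ErrIgnored.
var errIgnored = errors.New("ignored")

// edge is a hypothetical, comparable stand-in for a channel edge.
type edge struct {
	scid   uint64
	policy string // placeholder for the edge's content
}

// graphStore is a hypothetical in-memory model of the graph and zombie index.
type graphStore struct {
	edges   map[uint64]edge
	zombies map[uint64]bool
}

// addEdge models the proposed behaviour. owned marks a fresh local
// announcement for an SCID we own (hypothetical parameter).
func (g *graphStore) addEdge(e edge, owned bool) error {
	if g.zombies[e.scid] {
		if !owned {
			return errIgnored // unchanged behaviour for edges we do not own
		}
		// A fresh local announcement clears the stale zombie marker.
		delete(g.zombies, e.scid)
	}
	if cur, ok := g.edges[e.scid]; ok {
		if cur == e {
			return nil // already present with the same content: idempotent success
		}
		return errIgnored // assumption: differing content keeps the old behaviour
	}
	g.edges[e.scid] = e
	return nil
}

func main() {
	g := &graphStore{
		edges:   map[uint64]edge{},
		zombies: map[uint64]bool{42: true},
	}
	e := edge{scid: 42, policy: "fee=1"}

	fmt.Println(g.addEdge(e, false)) // ignored
	fmt.Println(g.addEdge(e, true))  // <nil>
	fmt.Println(g.addEdge(e, true))  // <nil>
}
```

With idempotent success, a reconciliation sweep that re-announces an edge that is in fact present becomes a harmless no-op instead of an error the caller has to special-case.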

Open questions

  • Should reconciliation also refresh a stale local ChannelUpdate, or strictly fill missing edges?
  • Default-on or default-off for the reconciler flag on first release?
  • For zero-conf with a missing peer alias on startup: wait, retry, or log-and-skip?
