feat(cbf): add auto-restart with exponential backoff #9
Closed
febyeji wants to merge 21 commits into
…ll-request/patch Automated nightly rustfmt (2026-03-29)
Extract the connection logic into `do_connect_peer_internal` and have `do_connect_peer` act as a thin wrapper that always calls `propagate_result_to_subscribers` with the result. This removes the need to manually propagate at every error site, making the code less error-prone. Co-Authored-By: HAL 9000
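The wrapper pattern described in this commit can be sketched roughly as follows. This is a minimal, synchronous illustration, not the actual code: `ConnectionManager`, `ConnectError`, the `reachable` flag, and the `delivered` log are hypothetical stand-ins, and the real functions are presumably async and take different arguments.

```rust
use std::cell::RefCell;

/// Hypothetical stand-in for the real connection error type.
#[derive(Debug, Clone, PartialEq)]
pub enum ConnectError {
    Unreachable,
}

/// Hypothetical stand-in for the component that owns the subscribers.
pub struct ConnectionManager {
    /// Every result handed to subscribers, recorded for illustration only.
    pub delivered: RefCell<Vec<Result<(), ConnectError>>>,
}

impl ConnectionManager {
    pub fn new() -> Self {
        ConnectionManager { delivered: RefCell::new(Vec::new()) }
    }

    /// Inner logic: free to return early at any error site without
    /// having to remember to notify subscribers.
    fn do_connect_peer_internal(&self, reachable: bool) -> Result<(), ConnectError> {
        if !reachable {
            return Err(ConnectError::Unreachable);
        }
        Ok(())
    }

    /// Records the outcome; the real implementation would notify waiters.
    fn propagate_result_to_subscribers(&self, res: &Result<(), ConnectError>) {
        self.delivered.borrow_mut().push(res.clone());
    }

    /// Thin wrapper: the single place where the result is propagated.
    pub fn do_connect_peer(&self, reachable: bool) -> Result<(), ConnectError> {
        let res = self.do_connect_peer_internal(reachable);
        self.propagate_result_to_subscribers(&res);
        res
    }
}
```

Because propagation happens exactly once in the wrapper, adding a new early return to the inner function cannot silently skip it.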
Replace the split setuptools configuration (pyproject.toml + setup.cfg) with a unified hatchling-based setup. This adds a [build-system] section pointing to hatchling and a build hook (hatch_build.py) that marks wheels as platform-specific since we bundle native shared libraries. Hatchling includes all files in the package directory by default, which also fixes the missing *.dll glob that setup.cfg had for Windows. Bump requires-python from >=3.6 to >=3.8 as 3.6/3.7 are long EOL. Co-Authored-By: HAL 9000
…ripts Add `python_build_wheel.sh` which generates bindings and builds a platform-specific wheel via `uv build`, and `python_publish_package.sh` which publishes collected wheels via `uv publish`. The intended workflow is to run the build script on each target platform (Linux, macOS), collect the wheels, and then publish them in one go. Co-Authored-By: HAL 9000
Replace `actions/setup-python` with `astral-sh/setup-uv` and use `uv run` to run tests. Co-Authored-By: HAL 9000
Replace the synchronous, blocking `std::net::ToSocketAddrs::to_socket_addrs()` calls with async `tokio::net::lookup_host` to avoid blocking the tokio runtime during DNS resolution. Additionally, instead of only using the first resolved address, we now iterate over all resolved addresses and try connecting to each in sequence until one succeeds. This improves connectivity for hostnames that resolve to multiple addresses (e.g., dual-stack IPv4/IPv6). Co-Authored-By: HAL 9000
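The "try each resolved address in sequence" part can be illustrated with a small std-only helper. This is a sketch under assumptions: `connect_first` and its closure parameter are hypothetical names, and the code is synchronous for brevity. In the async version described above, the addresses would come from `tokio::net::lookup_host` and the connect step would be awaited.

```rust
use std::io;
use std::net::SocketAddr;

/// Tries `connect` on each address in order and returns the first success.
/// If every attempt fails, the last error is returned; if there were no
/// addresses at all, a `NotFound` error is returned.
pub fn connect_first<T, I, F>(addrs: I, mut connect: F) -> io::Result<T>
where
    I: IntoIterator<Item = SocketAddr>,
    F: FnMut(SocketAddr) -> io::Result<T>,
{
    let mut last_err = None;
    for addr in addrs {
        match connect(addr) {
            Ok(conn) => return Ok(conn),
            // Remember the failure and move on to the next address.
            Err(e) => last_err = Some(e),
        }
    }
    Err(last_err.unwrap_or_else(|| {
        io::Error::new(io::ErrorKind::NotFound, "hostname resolved to no addresses")
    }))
}
```

With a dual-stack hostname, a failing IPv4 address no longer prevents a connection over IPv6, and vice versa.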
Enforce HTTPS for non-localhost URLs per LNURL spec and disable redirect following since the auth flow is a single GET request. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…v-python-build Switch to `uv` python build system
…sync-lookup Switch to async DNS resolution
.. we previously dropped the pin when moving MSRV to 1.85, but it seems that is not sufficient anymore..
…adapter-pin Re-pin `idna_adapter` for MSRV builds
Move the GC pass from after insertion to before, so that stale entries are reclaimed before allocating a new bucket. This avoids unnecessary growth of the user map between GC cycles. AI tools were used in preparing this commit.
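The reordering can be sketched as below. This is an illustration only: `RateLimiter`, `touch`, `max_idle`, and the use of a plain last-seen timestamp instead of a token bucket are all assumptions, not the actual data structures.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical per-user rate limiter that tracks only a last-seen time.
pub struct RateLimiter {
    last_seen: HashMap<u64, Instant>,
    max_idle: Duration,
}

impl RateLimiter {
    pub fn new(max_idle: Duration) -> Self {
        RateLimiter { last_seen: HashMap::new(), max_idle }
    }

    /// Drops every entry that has been idle for at least `max_idle`.
    fn garbage_collect(&mut self, now: Instant) {
        let max_idle = self.max_idle;
        self.last_seen.retain(|_, seen| now.duration_since(*seen) < max_idle);
    }

    /// Records activity for `user`. For a previously unknown user, stale
    /// entries are reclaimed *before* the new entry is inserted.
    pub fn touch(&mut self, user: u64, now: Instant) {
        if !self.last_seen.contains_key(&user) {
            self.garbage_collect(now);
        }
        self.last_seen.insert(user, now);
    }

    pub fn tracked_users(&self) -> usize {
        self.last_seen.len()
    }
}
```

Running the pass first means the map never holds both the stale entries and the new one at the same time.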
Run rate limiter garbage collection before inserting new user
Harden LNURL-auth request handling
Squashed base CBF commits (rebased onto upstream/main):
- Add optional fee source from esplora/electrum
- Add BIP 157 compact block filter chain source
- Add CBF integration tests and documentation
- Fix CBF chain source build errors and UniFFI bindings
- Remove last_synced_height from cbf
Preparation for auto-restart: extract bip157 node build logic into a reusable helper method, add chain_state() from wallet checkpoint to avoid genesis re-sync, and thread Arc<Wallet> through start(). AI: claude
When node.run() exits (e.g. NoReachablePeers from kyoto lightningdevkit#558), the background task rebuilds the node, swaps the requester, and respawns channel processing tasks, up to MAX_RESTART_RETRIES (5) attempts with doubling backoff starting at 500ms.

- Change cbf_runtime_status from Mutex<> to Arc<Mutex<>> so it can be shared with the async restart loop
- Extract build_cbf_node_static() that takes explicit params instead of &self, enabling calls from 'static async blocks
- Move all task spawning (info/warn/event + node.run) into a single restart loop inside spawn_background_task

AI: claude
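The retry schedule described above (5 attempts, doubling from 500ms) can be sketched as follows. This is a synchronous sketch under assumptions: `backoff_delay`, `run_with_restarts`, `INITIAL_BACKOFF`, and the injected `sleep` closure are hypothetical, and the real loop is async and also rebuilds the node and respawns tasks on each iteration. Only `MAX_RESTART_RETRIES` and the 500ms starting value come from the commit message.

```rust
use std::time::Duration;

pub const MAX_RESTART_RETRIES: u32 = 5;
/// Assumed name for the 500ms starting delay.
pub const INITIAL_BACKOFF: Duration = Duration::from_millis(500);

/// Delay before restart number `attempt` (0-based): 500ms, 1s, 2s, 4s, 8s.
pub fn backoff_delay(attempt: u32) -> Duration {
    INITIAL_BACKOFF * 2u32.pow(attempt)
}

/// Runs `run_node` and restarts it after a failure, sleeping with doubling
/// backoff in between. Gives up with the last error once
/// MAX_RESTART_RETRIES restarts have been used.
pub fn run_with_restarts<E>(
    mut run_node: impl FnMut() -> Result<(), E>,
    mut sleep: impl FnMut(Duration),
) -> Result<(), E> {
    let mut attempt = 0;
    loop {
        match run_node() {
            Ok(()) => return Ok(()),
            // Retry budget exhausted: surface the error to the caller.
            Err(e) if attempt >= MAX_RESTART_RETRIES => return Err(e),
            Err(_) => {
                sleep(backoff_delay(attempt));
                attempt += 1;
            }
        }
    }
}
```

Under this schedule the total wait before giving up is 15.5 seconds. Whether the attempt counter resets after a successful run is not stated in the commit message, so the sketch leaves that out.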
requester() now checks is_running() to give callers an immediate failure signal instead of waiting for SendError to propagate through the channel.
Extract cleanup_scan_state() helper and call it on error paths in run_filter_scan() to prevent stale watched_scripts, matched_block_hashes, and filter_skip_height from leaking between scans.
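A rough sketch of the cleanup-on-error idea follows. The field names come from the commit message, but their types, the `ScanState` struct, and the closure-based `run_filter_scan` signature are assumptions made for illustration.

```rust
use std::collections::HashSet;

/// Hypothetical container for per-scan state; field types are assumed.
#[derive(Default)]
pub struct ScanState {
    pub watched_scripts: HashSet<Vec<u8>>,
    pub matched_block_hashes: Vec<[u8; 32]>,
    pub filter_skip_height: Option<u32>,
}

impl ScanState {
    /// Resets everything that must not leak into the next scan.
    pub fn cleanup_scan_state(&mut self) {
        self.watched_scripts.clear();
        self.matched_block_hashes.clear();
        self.filter_skip_height = None;
    }

    /// Registers the scripts, runs `scan`, and cleans up if it fails.
    pub fn run_filter_scan(
        &mut self,
        scripts: Vec<Vec<u8>>,
        scan: impl FnOnce(&mut ScanState) -> Result<usize, String>,
    ) -> Result<usize, String> {
        self.watched_scripts = scripts.into_iter().collect();
        let res = scan(&mut *self);
        if res.is_err() {
            // Error path: leave no stale state behind for the next scan.
            self.cleanup_scan_state();
        }
        res
    }
}
```

Keeping the reset in one helper means a new state field only needs to be added in a single place to be covered on every error path.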
Wrap individual get_block() calls in tokio::time::timeout using the existing per_request_timeout_secs config. Previously only the overall sync had a timeout; individual block fetches could hang indefinitely (kyoto lightningdevkit#556).
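The commit uses `tokio::time::timeout`; the sketch below deliberately swaps in a std-only equivalent (a worker thread plus `recv_timeout`) so that it runs without an async runtime. `get_block_with_timeout`, `FetchError`, and the `fetch` closure are hypothetical names for illustration.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

#[derive(Debug, PartialEq)]
pub enum FetchError {
    /// The fetch did not finish within the per-request timeout.
    Timeout,
    /// The worker ended (e.g. panicked) without producing a result.
    WorkerFailed,
}

/// Runs `fetch` on a worker thread and waits at most `per_request_timeout`.
pub fn get_block_with_timeout<T, F>(
    fetch: F,
    per_request_timeout: Duration,
) -> Result<T, FetchError>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // If the caller already gave up, the send fails and the
        // result is simply dropped.
        let _ = tx.send(fetch());
    });
    match rx.recv_timeout(per_request_timeout) {
        Ok(block) => Ok(block),
        Err(mpsc::RecvTimeoutError::Timeout) => Err(FetchError::Timeout),
        Err(mpsc::RecvTimeoutError::Disconnected) => Err(FetchError::WorkerFailed),
    }
}
```

Unlike the tokio version, this variant does not cancel the slow fetch; the worker thread keeps running until `fetch` returns. It only bounds how long the caller waits for each block.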
Extract build_cbf_node helper, auto-restart with backoff,
liveness check, scan state cleanup, per-block timeout, required_peers default to 1, test fixes