Fully switch to async `KVStore` persistence by tnull · Pull Request #919 · lightningdevkit/ldk-node

tnull · 2026-06-03T10:59:55Z

We planned to make the full switch to the async KVStore for a while. Here we do just that: first step-by-step moving the remaining dependencies over to use KVStore, then finally dropping KVStoreSync and all implementations.

Depends on lightningdevkit/rust-lightning#4658 (will drop the DROPME commit once that's merged).

ldk-reviews-bot · 2026-06-03T10:59:58Z

👋 Thanks for assigning @joostjager as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

joostjager

Overall happy with this simplification. Big changeset though, even though not all that much is happening. Risk nonetheless.

Bigger open question for me is what the latest is on runtime abstraction and how that is going to work in combination with potential wasm support.

Also, are there sync public api calls that can now be made async such as export_pathfinding_scores?

joostjager · 2026-06-03T14:06:33Z

+	signer_provider: &'a test_utils::TestKeysInterface,
+}
+
+impl<K: KVStore + Sync> TestMonitorUpdatePersister<'_, K> {


Commit message says dropping the trait, but this looks more substantial?

Ah right, move it to another commit for now, but will also put up a minor change in LDK so we can reuse the test utils wihtout rewriting this.

That should allow us to avoid it: lightningdevkit/rust-lightning#4665

Will allow us to simply do:

type TestMonitorPersister<K> = MonitorUpdatingPersisterAsync< TestStoreRef<K>, TestFutureSpawner, &'static test_utils::TestLogger, &'static test_utils::TestKeysInterface, &'static test_utils::TestKeysInterface, &'static test_utils::TestBroadcaster, &'static test_utils::TestFeeEstimator, >; type TestAsyncPersister<K> = lightning::chain::chainmonitor::AsyncPersister< TestStoreRef<K>, TestFutureSpawner, &'static test_utils::TestLogger, &'static test_utils::TestKeysInterface, &'static test_utils::TestKeysInterface, &'static test_utils::TestBroadcaster, &'static test_utils::TestFeeEstimator, >; (..) let monitor_persister = TestMonitorPersister::new( store, TestFutureSpawner::new(Arc::clone(&runtime)), &chanmon_cfg.logger, max_pending_updates, &chanmon_cfg.keys_manager, &chanmon_cfg.keys_manager, &chanmon_cfg.tx_broadcaster, &chanmon_cfg.fee_estimator, ); let event_notifier = Arc::new(Notifier::new()); let persister = lightning::chain::chainmonitor::AsyncPersister::new_test( monitor_persister, Arc::clone(&event_notifier), ); (...) test_utils::TestChainMonitor::new_with_event_notifier( Some(&chanmon_cfg.chain_source), &chanmon_cfg.tx_broadcaster, &chanmon_cfg.logger, &chanmon_cfg.fee_estimator, &persister.persister, &chanmon_cfg.keys_manager, Arc::clone(&persister.event_notifier), )

.. rather than the custom TestMonitorPersister

Will await the discussion there. Would be nice to use the incremental persister in tests.

joostjager · 2026-06-03T14:08:34Z

-pub(crate) fn create_persister<'a, K: KVStoreSync + Sync>(
-	store: &'a K, chanmon_cfg: &'a TestChanMonCfg, max_pending_updates: u64,
+pub(crate) fn create_persister<'a, K: KVStore + Sync>(
+	store: &'a K, chanmon_cfg: &'a TestChanMonCfg, _max_pending_updates: u64,


Wasn't max pending updates exercised in a test?

Yes, but we can't use it right now as LDK seems not to expose the right APIs to use async path with the test utils, will open a fix in a bit.

Is it tracked somewhere?

Yes lightningdevkit/rust-lightning#4665

joostjager · 2026-06-03T14:11:31Z

-		);
-
-		let runtime_handle = internal_runtime.handle();
-		let schema_version = tokio::task::block_in_place(|| {


Isn't it a problem that setup can't fail fast here anymore?

We fail fast elsewhere via setup_schema_version.

Can't the schema_version then be stored as a normal field? It seems the code is currently halfway between lazy init and explicit setup.

Well, no, we need to discover it based on the server-side state for backwards compatibility, as unfortunately schema version wasn't always there.

I see you removed the lazy init and made it a field? That was the suggestion indeed

I see you removed the lazy init and made it a field? That was the suggestion indeed

Well, we just returned to the previous behavior using the internal runtime to block_on on the version negotiation. We'll need to return to the lazy init likely in the next step though.

benthecarman

the vss ci job hung for 2 hours hopefully a transient issue

benthecarman · 2026-06-03T19:14:12Z

codex seems to agree on the locks

[P2] Serialize peer-store persistence updates — /home/ben/projects/ldk-node/src/peer_store.rs:43-52
When two peer-store mutations overlap, this now releases the peer map lock before the async write completes, so an
older snapshot can persist after a newer one. For example, concurrent add_peer calls can leave memory with both
peers but the stored peer list with only the first one, causing reconnect peers to be missing after restart. Please
serialize the mutation plus persistence, as DataStore now does with a separate async mutation lock.
[P2] Preserve node-metrics update ordering — /home/ben/projects/ldk-node/src/io/utils.rs:346-351
If two node-metrics updates run concurrently, the lock is released before the async KV write, so an earlier encoded
snapshot can finish after a later one and overwrite it. This can drop one of the timestamp fields on disk and make
the node believe a sync/broadcast task has not run after restart. The previous code explicitly held the write lock
through persistence to avoid this, so this path needs equivalent serialization for the async write.

tnull · 2026-06-05T08:25:40Z

As discussed offline, now updated:

Re-added the internal runtimes for Postgres / VSS to avoid tokio getting stuck if we end up using block_on while the same thread holds the IO driver. We'll be able to revert this/drop the runtimes once we go fully async API in a following PR.
Added mutation locks for node metrics and peer store persistence for now, but we'll be able to simplify more once we switch to tokio::sync, once we go fully-async.
Added a commit that enables eager handoff of the IO driver if we're built under cfg(tokio_unstable) and enables that for bindings builds.

tnull · 2026-06-05T08:41:29Z

Hmm, seems Github CI might have an outage, at least it's not running right now..

joostjager · 2026-06-05T09:49:45Z

 	Handle(tokio::runtime::Handle),
 }

+pub(crate) struct StoreRuntime {


This can also use some serious documentation on why it is needed, even if temporary.

Also wondering if it isn't possible to unify this with the existing Runtime wrapper? That might also make the various modes more clear: current runtime, passed runtime, new runtime, or some combo.

Also wondering if it isn't possible to unify this with the existing Runtime wrapper? That might also make the various modes more clear: current runtime, passed runtime, new runtime, or some combo.

Considered that, but given the plan is to remove it ASAP, I decided to keep it separate which will make the cleanup much more straightforward.

I don't think clean up is really much different, but in the mean time reusing the Runtime does make it more clear that we have different variations of the same concept. Maybe it takes longer than we'd like to get rid of it. And can reuse the thread naming thing too.

I assume my last comment didn't change your view on it. It's temporary, so it's fine.

joostjager · 2026-06-05T09:50:35Z

+				format!("{}-{}", thread_name_prefix, id)
+			})
+			.worker_threads(worker_threads)
+			.max_blocking_threads(worker_threads)


What is the reason for this setting?

It shouldn't matter as we shouldn't hit blocking calls in the persistence pipeline anyways now.

Still curious why it is needed

Still curious why it is needed

Well, I don't think it is (given we should never use blocking threads on these runtimes), but also doesn't hurt to bound the number to be more in-line with the API-configured limit.

joostjager · 2026-06-05T09:57:04Z

-		tokio::task::block_in_place(move || drop(internal_runtime));
+		if let Some(runtime) = self.internal_runtime.take() {
+			if let Ok(runtime) = Arc::try_unwrap(runtime) {
+				runtime.shutdown_background();


What does it mean if we skip shutdown here?

I'm not sure I'm following what you mean with 'skip' in this context?

I mean the Drop path conditionally does not call shutdown_background_tasks, so the question is what state/resources/tasks may remain when that branch is taken?

joostjager · 2026-06-05T10:00:41Z

+	runtime: Option<tokio::runtime::Runtime>,
+}
+
+impl StoreRuntime {


Name describes usage, not really what it does. IsolatedRuntime?

Hmm, the thing itself also doesn't isolate the runtime, no? It's just used in an isolated manner?

Not sure this improves anything? Note it's not the object that isolates anything really, but its usage.

I just mean that if you want to use this generally-usable object for something else, you need to rename it even though the code stays the same.

benthecarman · 2026-06-08T20:47:33Z

+			task.await.map_err(|e| {
+				io::Error::new(
+					io::ErrorKind::Other,
+					format!("PostgreSQL runtime task failed: {}", e),


we have the same message for all of these, would be better if it had the function name in it for easier debugging. however this is removed in the next pr so not very important

Agreed on the duplication

Well we had helper previously, but Joost asked to drop it and move it to StoreRuntime, which however doesn't work well either (given the error messages are store-specific), so decided to simply inline them, esp. given it's just 5 lines, and hiding them in a helper just obfuscates what happens concretely, i.e., what runtime exactly is used to await what etc..

benthecarman · 2026-06-08T20:50:33Z

+					.await
+			});
+			task.await.map_err(|e| {
+				io::Error::new(io::ErrorKind::Other, format!("VSS runtime task failed: {}", e))


See #919 (comment)

benthecarman · 2026-06-08T20:54:45Z

 	}
 }

 struct TestSyncStoreInner {


Can this just be called TestStoreInner/TestStore since its all async now? Also noticed I missed adding postgres to this

Can this just be called TestStoreInner/TestStore since its all async now?

No, this would be pretty confusing, as the Sync part of it is important (i.e., the purpose is to assert all stores are kept in-sync). Note we also have TestStore in LDK test utils which we used to use here, but which unfortunately is now unusable for us as we decided that KVStore path it requires some additional per-operation confirmations that we won't be able to run in LDK Node, hence why we introduced InMemoryStore.

Also noticed I missed adding postgres to this

Ah, right, feels a bit orthogonal to this PR, feel free to open a quick follow-up.

joostjager

Generally happy with where this landed. No blockers.

One process point: the PR is quite large and that makes review and re-review inefficient. Also managing the many discussing threads becomes unwieldy and risks loosing track of something that might be important. In my opinion better to avoid in the future.

I think the extensive comments that you and/or AI added are good context for future humans or AIs reading it.

I’ll take a look at the follow-up PR too, to see how the final async end state takes shape.

tnull · 2026-06-09T07:58:05Z

Generally happy with where this landed. No blockers.

One process point: the PR is quite large and that makes review and re-review inefficient. Also managing the many discussing threads becomes unwieldy and risks loosing track of something that might be important. In my opinion better to avoid in the future.

I think the extensive comments that you and/or AI added are good context for future humans or AIs reading it.

I’ll take a look at the follow-up PR too, to see how the final async end state takes shape.

Squashed fixups without further changes. Given how invasive the changeset is I'd like to move on this one rather fast, any remaining comments / improvements can happen in follow-ups IMO.

tnull · 2026-06-09T07:59:09Z

Ah, grr, still blocked on lightningdevkit/rust-lightning#4661.

Bump LDK to 3dfcc4cca1866c5e5d4d4eaf3b82e09584e2ce5c Co-Authored-By: HAL 9000

Read and write BDK wallet state through async KVStore helpers while keeping the current WalletPersister entry points bridged through the node runtime. This reduces the wallet persistence surface that still depends on KVStoreSync. Co-Authored-By: HAL 9000

Static invoice persistence already runs from async handlers, so use KVStore directly instead of routing those reads and writes through the blocking KVStoreSync trait. Co-Authored-By: HAL 9000

Persist peer store updates through async KVStore operations. The synchronous node APIs keep bridging at their runtime boundary while async event handling awaits peer persistence directly. Co-Authored-By: HAL 9000

Persist DataStore mutations through async KVStore operations while keeping the existing synchronous APIs bridged through the node runtime. Async event handling now awaits payment store writes directly. Co-Authored-By: HAL 9000

Persist node metric updates through async KVStore writes and await them from the chain, gossip, and scoring tasks. This removes the remaining blocking metrics writer while keeping the helper name stable. Co-Authored-By: HAL 9000

Persist the on-chain wallet through BDK's AsyncWalletPersister so wallet state writes use the async KVStore path. Existing synchronous wallet APIs keep bridging through the node runtime until their callers are made async. Co-Authored-By: HAL 9000

Open filesystem stores through the async LDK migration helper so v1-to-v2 store migration no longer depends on the blocking KVStoreSync migration path. Co-Authored-By: HAL 9000

Move the existing in-memory test store into a shared module without changing its behavior. This lets integration tests reuse it while keeping the later async TestSyncStore change separate. Co-Authored-By: HAL 9000

Exercise async KVStore operations in TestSyncStore and filesystem migration tests while keeping the temporary sync comparison path until the final KVStoreSync removal. Co-Authored-By: HAL 9000

Use a local async KVStore-backed monitor persister in store tests so the later blocking KVStore removal can drop the old synchronous MonitorUpdatingPersister path without mixing the test adapter into that change. Co-Authored-By: HAL 9000

Drop the remaining synchronous KV store trait bounds and implementations. After the preceding migrations, custom stores only need to provide async KVStore persistence, and pathfinding score export also reads from the async store. Co-Authored-By: HAL 9000

Add a crate-local runtime wrapper for store backends that need to keep their I/O isolated while shutting down safely from async contexts. Co-Authored-By: HAL 9000

Co-Authored-By: HAL 9000

Enable Tokio's eager driver handoff when building with tokio_unstable so node-owned runtimes can use the dedicated driver handoff path where available. Build binding artifacts and selected CI coverage with tokio_unstable so the cfg-gated runtime path remains exercised. Co-Authored-By: HAL 9000

Give ldk-node-created runtime workers a stable ldk-node-prefixed thread name so runtime activity is easier to identify in debuggers and logs, matching the store runtime naming convention. Co-Authored-By: HAL 9000

tnull · 2026-06-09T08:17:13Z

Ah, grr, still blocked on lightningdevkit/rust-lightning#4661.

Okay, now converted the DROPME commit to an actual bump commit for the 0.3 revision post-merge of that PR.

joostjager

No CI. I assume the tests pass locally?

tnull marked this pull request as draft June 3, 2026 11:00

tnull force-pushed the 2026-06-async-kvstore-persistence branch 2 times, most recently from 42c16e2 to 4371f6d Compare June 3, 2026 11:25

tnull mentioned this pull request Jun 3, 2026

Full WASM support #902

Open

9 tasks

tnull force-pushed the 2026-06-async-kvstore-persistence branch 2 times, most recently from 05ebc28 to 911c169 Compare June 3, 2026 12:15

tnull added this to the 0.8 milestone Jun 3, 2026

tnull force-pushed the 2026-06-async-kvstore-persistence branch 5 times, most recently from 1b4d24f to c62a503 Compare June 3, 2026 12:28

tnull marked this pull request as ready for review June 3, 2026 12:28

tnull requested a review from joostjager June 3, 2026 12:28

tnull mentioned this pull request Jun 3, 2026

Add database persistence benchmarks #915

Open

joostjager reviewed Jun 3, 2026

View reviewed changes

tnull force-pushed the 2026-06-async-kvstore-persistence branch 2 times, most recently from 290b0a5 to 9ad8f2d Compare June 3, 2026 15:42

benthecarman reviewed Jun 3, 2026

View reviewed changes

Comment thread src/peer_store.rs

Comment thread src/io/utils.rs

tnull force-pushed the 2026-06-async-kvstore-persistence branch from 9ad8f2d to 8b93895 Compare June 5, 2026 08:04

tnull requested review from benthecarman and joostjager June 5, 2026 08:25

tnull force-pushed the 2026-06-async-kvstore-persistence branch from 326cbf7 to be2b74d Compare June 5, 2026 08:35

tnull removed the request for review from joostjager June 5, 2026 08:55

joostjager reviewed Jun 5, 2026

View reviewed changes

Comment thread src/runtime.rs

joostjager reviewed Jun 5, 2026

View reviewed changes

benthecarman reviewed Jun 8, 2026

View reviewed changes

joostjager previously approved these changes Jun 9, 2026

View reviewed changes

tnull force-pushed the 2026-06-async-kvstore-persistence branch from 9e095d1 to 78708a3 Compare June 9, 2026 07:54

tnull requested a review from joostjager June 9, 2026 07:58

tnull removed the request for review from joostjager June 9, 2026 07:59

tnull added 17 commits June 9, 2026 10:13

Bump LDK for async store migration

896b627

Bump LDK to 3dfcc4cca1866c5e5d4d4eaf3b82e09584e2ce5c Co-Authored-By: HAL 9000

Use async KV storage for static invoices

e67b7a1

Static invoice persistence already runs from async handlers, so use KVStore directly instead of routing those reads and writes through the blocking KVStoreSync trait. Co-Authored-By: HAL 9000

Move peer persistence onto async KV storage

d49b681

Persist peer store updates through async KVStore operations. The synchronous node APIs keep bridging at their runtime boundary while async event handling awaits peer persistence directly. Co-Authored-By: HAL 9000

Move DataStore persistence onto async KV storage

b3f947c

Persist DataStore mutations through async KVStore operations while keeping the existing synchronous APIs bridged through the node runtime. Async event handling now awaits payment store writes directly. Co-Authored-By: HAL 9000

Move node metrics persistence onto async KV storage

0d48be8

Persist node metric updates through async KVStore writes and await them from the chain, gossip, and scoring tasks. This removes the remaining blocking metrics writer while keeping the helper name stable. Co-Authored-By: HAL 9000

Use BDK's async wallet persister

5ac2ed1

Persist the on-chain wallet through BDK's AsyncWalletPersister so wallet state writes use the async KVStore path. Existing synchronous wallet APIs keep bridging through the node runtime until their callers are made async. Co-Authored-By: HAL 9000

Use async KVStore migration for filesystem stores

a72905c

Open filesystem stores through the async LDK migration helper so v1-to-v2 store migration no longer depends on the blocking KVStoreSync migration path. Co-Authored-By: HAL 9000

Move test InMemoryStore into shared module

98170fe

Move the existing in-memory test store into a shared module without changing its behavior. This lets integration tests reuse it while keeping the later async TestSyncStore change separate. Co-Authored-By: HAL 9000

Move test store checks onto async KV storage

d6fd1dc

Exercise async KVStore operations in TestSyncStore and filesystem migration tests while keeping the temporary sync comparison path until the final KVStoreSync removal. Co-Authored-By: HAL 9000

Add async test monitor persister

99fe33b

Use a local async KVStore-backed monitor persister in store tests so the later blocking KVStore removal can drop the old synchronous MonitorUpdatingPersister path without mixing the test adapter into that change. Co-Authored-By: HAL 9000

Add shared store runtime wrapper

431be00

Add a crate-local runtime wrapper for store backends that need to keep their I/O isolated while shutting down safely from async contexts. Co-Authored-By: HAL 9000

Isolate VSS persistence from the node runtime

0359c22

Co-Authored-By: HAL 9000

Isolate PostgreSQL persistence from the node runtime

81ba551

Co-Authored-By: HAL 9000

Name owned node runtime threads

a3e7653

Give ldk-node-created runtime workers a stable ldk-node-prefixed thread name so runtime activity is easier to identify in debuggers and logs, matching the store runtime naming convention. Co-Authored-By: HAL 9000

tnull dismissed joostjager’s stale review via a3e7653 June 9, 2026 08:15

tnull force-pushed the 2026-06-async-kvstore-persistence branch from 78708a3 to a3e7653 Compare June 9, 2026 08:15

tnull requested a review from joostjager June 9, 2026 08:17

joostjager approved these changes Jun 9, 2026

View reviewed changes

tnull merged commit 00dba45 into lightningdevkit:main Jun 9, 2026

Conversation

tnull commented Jun 3, 2026

Uh oh!

ldk-reviews-bot commented Jun 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

joostjager left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

benthecarman left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

benthecarman commented Jun 3, 2026

Uh oh!

tnull commented Jun 5, 2026

Uh oh!

tnull commented Jun 5, 2026

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

ldk-reviews-bot commented Jun 3, 2026 •

edited

Loading

benthecarman left a comment •

edited

Loading