[ft] SMT Replenish with > 1000 lastrevealed index#1960
Conversation
|
Concept ACK, but need to handle cfg when std feature not enabled. |
|
Thanks for working on this! My main concern is the introduction of an internal We discussed this in the team meetings and the rough consensus for a solution is to cache spks in the changesets. We can structure these changesets in a way so that you can introduce spks for a keychain well in the future. For example: pub struct ChangeSet {
/// Contains for each descriptor_id the last revealed index of derivation
pub last_revealed: BTreeMap<DescriptorId, u32>,
/// Cache of script pubkeys.
///
/// We can choose to populate this or leave it empty. The caller can choose to populate this in
/// any manner they intend; multi-threaded, downloaded asynchronously, etc.
pub spk_cache: BTreeMap<(DescriptorId, u32), ScriptBuf>,
}I'm assuming we need an associated cache either in To understand your requirements better, is there a special way you are loading the wallet? Or is it loaded from a changeset (the typical way)? |
|
I suspect there's a way to do this without need of the I also support the caching approach, but I don't see why we can't do both with this mult-ithreaded solution going in first since it's a smaller change. Caching will still be useful for both std and no-std users. |
I currently have a version of this but the number of libraries that are modified and touched is high at 2 or 3, between bdk-wallet, bdk, and bdk-sqlx. Getting SMT working seems to be an overall much lighter solution that would also speed up the initial generation of the cache. |
7a334cf to
3bc325b
Compare
|
@notmandatory Okay I see what you are saying. To avoid having more complex changes, what do you think about creating a new constructor method that takes in the following: pub fn with_cache(lookahead: u32, cache: impl IntoIterator<Item = KeychainIndexed<K, ScriptBuf>>) -> Self { todo!() }This way, we don't need to be opinionated about how we construct the cache and can leave it up to the caller. To implement this cleaner, I suggest creating a However, I do wonder if it really is that much more difficult to cache spks in the changeset as, it seems like that more complete solution since you don't need to re-derive the spks. |
[CI] Cargo fmt applied [Req] Remove used of mutex and instead use returned results. [std/non-std] variations [fmt] Cargo fmt [Clean] dedup'd code [Clean] Cargo Fmt repair
568744e to
22573f6
Compare
|
@portlandhodl how many public keys are there and how long are the derivation paths for the descriptor in your benchmark. It must be quite a big one to actually hit those numbers. On my computer doing 100,000 derivations for a bip86 style derivation takes around 5 seconds! |
Currently using bip48 descriptors (depth = 6) along with miniscript with multiple spend conditions. |
|
Want me to close in favor of #1963 ? |
62767f0 fix(rusqlite_impl): Fix derived spks create table statement (valued mammal) 603f133 docs(chain): Adds docs for persisting spks in `KeychainTxOutIndex` (志宇) 3126cd2 feat(chain)!: Clean up ergonomics of `IndexedTxGraph` (志宇) a055581 feat(chain): `KeychainTxOutIndex`: Make spk cache optional (志宇) 19e3b5d feat(chain): Add `KeychainTxOutIndex::from_changeset` constructor (志宇) d761265 feat(chain): `KeychainTxOutIndex`: Debug build checks (志宇) 76875e7 fix(chain)!: API and logical issues in `KeychainTxOutIndex` (志宇) d299dae feat(chain)!: Persist spks derived from `KeychainTxOutIndex` (志宇) Pull request description: Replaces #1960 Fixes #1964 ### Description Users with large wallet and/or complex descriptors may experience slow startup of `KeychainTxOutIndex`. This PR addresses this problem by providing the option to persist derived spks so that they no longer need to be re-derived on startup. The `IndexedTxGraph` API has been changed for better ergonomics. Compared to #1960, this is a more longterm solution that does not depend on multi-threading logic. ### Changelog notice ```md Changed - `KeychainTxOutIndex::new` to take in an additional parameter `persist_spks` to control whether derived spks are cached and persisted across restarts. The default of `persist_spks` is false. - `KeychainTxOutIndex` methods (`lookahead_to_target, `next_unused_spk`, `reveal_next_spk`) now return changesets as they may derive spks to be persisted. - The `InsertDescriptorError` type now wraps descriptors in `Box` to reduce enum size. Added - `spk_cache` field to `indexer::keychain_txout::ChangeSet` which persists derived spks. - `IndexedTxGraph::from_changeset` - allows constructing from `indexed_tx_graph::ChangeSet` and only indexing when ready. - `IndexedTxGraph::reindex` method. Fixed - `KeychainTxOutIndex::reveal_to_target` so that it actually returns `None` if the `keychain` does not exist. ``` ### Checklists #### All Submissions: * [x] I've signed all my commits * [x] I followed the [contribution guidelines](https://github.com/bitcoindevkit/bdk/blob/master/CONTRIBUTING.md) * [x] I ran `cargo +nightly fmt` and `cargo clippy` before committing #### New Features: * [ ] I've added tests for the new feature * [x] I've added docs for the new feature ACKs for top commit: ValuedMammal: ACK 62767f0 notmandatory: ACK 62767f0 Tree-SHA512: dc1aa4308ffcc6d121e0d7a1ca4ff9f641ed5db63204747fde47ac02e45dae9b65da95554541705a98b69e59f741c043485a26db736966417061a4c9d220ba29
|
Closed as superseded by #1963 |
Currently in BDK there is a pretty big problem when loading a wallet where the last revealed index is into the value of 1000+ for even decent hardware because of the need to utilize the CPU to generate SPKs for each index and add them to the graph. The process will only use a single CPU core currently. Large wallets will cause timeouts and/or become a DoS vector.
This will if greater than 1000 last revealed use all available CPU cores to generate the SPKs,
Previous, ~3 minutes for 100000 SPKs

Notice the CPU usage is only a single core at a time.
New, 28 Seconds for 100000 SPKs

All cores are being used!
This speed up is linear to the number of CPU cores.