
statement-store: new api implementation#11989

Open
DenzelPenzel wants to merge 42 commits into master from denzelpenzel/statement-store-api


Contributor

@DenzelPenzel DenzelPenzel commented May 5, 2026

Impl #10997

Summary

In this PR, we add the unstable statement-store JSON-RPC surface and wire it into the parachain node RPC stack. This lets clients submit SCALE-encoded statements over RPC, open one long-lived statement subscription, and then attach or remove topic filters on that subscription without opening a new stream for each filter.

The subscription flow is split into statement_unstable_subscribe and statement_unstable_addFilter. A subscription starts empty, each added filter gets its own filterId, and live notifications carry the ids of the filters that matched the statement. That lets a client track several statement topics over a single RPC subscription while still knowing which filters produced each replay or live event.

RPC Shape

We add statement_unstable_submit, statement_unstable_subscribe, statement_unstable_addFilter, and statement_unstable_removeFilter under sc-rpc-spec-v2::statement.

statement_unstable_submit decodes submitted statement bytes and maps store results into RPC-level outcomes: new, known, rejected, or invalid (an expired statement is reported as invalid). Subscription state is scoped to the jsonrpsee connection that created it, so a filter can only be added to or removed from a subscription owned by the same connection.
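The outcome mapping described above can be sketched as follows. This is a minimal illustration, not the PR's code: `StoreOutcome`, `RpcSubmitResult`, and `to_rpc_result` are hypothetical names, and the reason strings are assumptions.

```rust
/// Hypothetical store-level result of a submit call (illustrative names only).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum StoreOutcome {
    New,
    Known,
    Expired,
    Rejected(String),
    Invalid(String),
}

/// RPC-level outcome, using the four labels named in the description.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum RpcSubmitResult {
    New,
    Known,
    Rejected { reason: String },
    Invalid { reason: String },
}

/// Maps a store result to the RPC-level outcome.
pub fn to_rpc_result(outcome: StoreOutcome) -> RpcSubmitResult {
    match outcome {
        StoreOutcome::New => RpcSubmitResult::New,
        StoreOutcome::Known => RpcSubmitResult::Known,
        // Per the description, an expired statement is reported as invalid.
        StoreOutcome::Expired => RpcSubmitResult::Invalid { reason: "expired".to_string() },
        StoreOutcome::Rejected(reason) => RpcSubmitResult::Rejected { reason },
        StoreOutcome::Invalid(reason) => RpcSubmitResult::Invalid { reason },
    }
}
```

Folding the expired case into `invalid` keeps the external surface at four outcomes, which is the behavior the paragraph above describes.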

For filters, the unstable RPC accepts any and matchAll. matchAny is rejected at the RPC boundary for now, which keeps the external API aligned with the current unstable contract while the store internals can still use the optimized filter representation.
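A sketch of the boundary check, assuming a local `TopicFilter` enum with 32-byte topics. The enum, `validate_rpc_filter`, `filter_matches`, and the error text are illustrative assumptions and are not taken from the PR.

```rust
/// Hypothetical filter representation with 32-byte topics.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum TopicFilter {
    Any,
    MatchAll(Vec<[u8; 32]>),
    MatchAny(Vec<[u8; 32]>),
}

/// Accepts `any` and `matchAll`; rejects `matchAny` at the RPC boundary.
pub fn validate_rpc_filter(filter: &TopicFilter) -> Result<(), String> {
    match filter {
        TopicFilter::Any | TopicFilter::MatchAll(_) => Ok(()),
        TopicFilter::MatchAny(_) => {
            Err("matchAny is not supported by the unstable RPC".to_string())
        }
    }
}

/// Internal matching can still understand all three variants.
pub fn filter_matches(filter: &TopicFilter, statement_topics: &[[u8; 32]]) -> bool {
    match filter {
        TopicFilter::Any => true,
        TopicFilter::MatchAll(wanted) => wanted.iter().all(|t| statement_topics.contains(t)),
        TopicFilter::MatchAny(wanted) => wanted.iter().any(|t| statement_topics.contains(t)),
    }
}
```

Validation and matching are kept separate here so that the external contract can stay narrow while the internal representation remains complete.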

Subscription Semantics

A subscription is recorded in the per-connection registry before the jsonrpsee subscription is accepted. The subscription id is read from the pending sink (PendingSubscriptionSink::subscription_id(), available since jsonrpsee 0.24.11) and registered first, so registration happens-before the id is handed to the client. As a result, an addFilter that arrives immediately after subscribe always resolves its subscription – there is no accept/addFilter race window and no timeout-based lookup. If the accept fails, the registry entry is dropped and the subscription is unregistered.
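The register-before-accept ordering can be illustrated with a small stand-in. `ConnectionRegistry` and `subscribe` are hypothetical, and the `accept` closure only models accepting the pending sink; no jsonrpsee types are used.

```rust
use std::collections::HashSet;
use std::sync::{Arc, Mutex};

/// Hypothetical per-connection registry of subscription ids.
#[derive(Default, Clone)]
pub struct ConnectionRegistry {
    subscriptions: Arc<Mutex<HashSet<String>>>,
}

impl ConnectionRegistry {
    pub fn register(&self, id: &str) {
        self.subscriptions.lock().unwrap().insert(id.to_string());
    }

    pub fn unregister(&self, id: &str) {
        self.subscriptions.lock().unwrap().remove(id);
    }

    pub fn contains(&self, id: &str) -> bool {
        self.subscriptions.lock().unwrap().contains(id)
    }
}

/// Registers the id first, then runs `accept` (a stand-in for accepting the
/// pending sink). If accepting fails, the registry entry is rolled back.
pub fn subscribe<F>(registry: &ConnectionRegistry, id: &str, accept: F) -> Result<(), String>
where
    F: FnOnce() -> Result<(), String>,
{
    registry.register(id);
    if let Err(err) = accept() {
        registry.unregister(id);
        return Err(err);
    }
    Ok(())
}
```

Because the entry exists before `accept` runs, any request that can observe the subscription id will also find it in the registry.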

Multi-filter subscriptions are handled by the existing statement subscription matcher workers. addFilter validates capacity, allocates a filterId, queues an AddFilter message for the matcher, and returns without waiting for replay snapshot collection. The matcher then collects the replay snapshot and registers the filter in the same critical section, so live statements cannot slip between the snapshot and filter registration.
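A minimal sketch of why taking the snapshot and registering the filter under one lock closes the gap. `Matcher`, `admit`, and `add_filter` are illustrative; a single `u8` topic and a `u64` hash stand in for the real types, and the real matcher is message-driven rather than called directly.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

type Hash = u64; // stand-in for a statement hash
type FilterId = u32;

#[derive(Default)]
struct MatcherState {
    admitted: Vec<(Hash, u8)>, // (statement hash, topic)
    filters: HashMap<FilterId, u8>,
}

/// Hypothetical matcher guarding both the admitted set and the filter table.
#[derive(Default)]
pub struct Matcher {
    state: Mutex<MatcherState>,
}

impl Matcher {
    /// Admits a live statement and returns the ids of the filters it matched.
    pub fn admit(&self, hash: Hash, topic: u8) -> Vec<FilterId> {
        let mut s = self.state.lock().unwrap();
        s.admitted.push((hash, topic));
        let mut ids: Vec<FilterId> =
            s.filters.iter().filter(|(_, t)| **t == topic).map(|(id, _)| *id).collect();
        ids.sort_unstable();
        ids
    }

    /// Collects the replay snapshot and registers the filter in one critical
    /// section, so no statement can be admitted between the two steps.
    pub fn add_filter(&self, id: FilterId, topic: u8) -> Vec<Hash> {
        let mut s = self.state.lock().unwrap();
        let snapshot: Vec<Hash> =
            s.admitted.iter().filter(|(_, t)| *t == topic).map(|(h, _)| *h).collect();
        s.filters.insert(id, topic);
        snapshot
    }
}
```

Every statement is therefore either in the snapshot (admitted before the lock was taken) or matched live (admitted after it was released), never neither.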

For each added filter the subscription emits:

  • replayStatements batches for already-admitted matching statements
  • replayDone once that filter's replay is drained
  • newStatements for live statements, including all matching filterIds
  • stop if local subscription resource caps are hit
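The four event kinds listed above could be modeled on the client side roughly as below. The field names and payload shapes are assumptions made for illustration (the wire format is not shown in this description); only the event names come from the list.

```rust
use std::collections::BTreeMap;

pub type FilterId = u32;

/// Hypothetical client-side view of the subscription events.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum StatementEvent {
    ReplayStatements { filter_id: FilterId, statements: Vec<Vec<u8>> },
    ReplayDone { filter_id: FilterId },
    NewStatements { filter_ids: Vec<FilterId>, statements: Vec<Vec<u8>> },
    Stop,
}

/// Tracks how many statements each filter has produced so far.
#[derive(Debug, Default)]
pub struct ClientView {
    pub per_filter: BTreeMap<FilterId, usize>,
    pub replay_done: Vec<FilterId>,
    pub stopped: bool,
}

impl ClientView {
    pub fn apply(&mut self, event: StatementEvent) {
        match event {
            StatementEvent::ReplayStatements { filter_id, statements } => {
                *self.per_filter.entry(filter_id).or_insert(0) += statements.len();
            }
            StatementEvent::ReplayDone { filter_id } => self.replay_done.push(filter_id),
            StatementEvent::NewStatements { filter_ids, statements } => {
                // A live event is attributed to every filter that matched it.
                for id in filter_ids {
                    *self.per_filter.entry(id).or_insert(0) += statements.len();
                }
            }
            StatementEvent::Stop => self.stopped = true,
        }
    }
}
```

This shows how one stream can serve several topics: the client routes each event by the filter ids it carries instead of by subscription.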

Live statements that arrive while a replay is still in progress are kept in matcher-owned pending state, then released once replay ordering allows it. Statements already delivered by replay are kept out of the live path for that filter, avoiding duplicate delivery for the common "submit, then subscribe" case.
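One way to picture the per-filter replay state is sketched below. `FilterReplay` and its methods are hypothetical, hashes are plain `u64` values, and the actual matcher may order and batch things differently.

```rust
use std::collections::{HashSet, VecDeque};

type Hash = u64; // stand-in for a statement hash

/// Hypothetical per-filter state while a replay is in flight.
pub struct FilterReplay {
    in_snapshot: HashSet<Hash>,
    replay_queue: VecDeque<Hash>,
    pending_live: Vec<Hash>,
    replay_done: bool,
}

impl FilterReplay {
    pub fn new(snapshot: Vec<Hash>) -> Self {
        Self {
            in_snapshot: snapshot.iter().copied().collect(),
            replay_queue: snapshot.into(),
            pending_live: Vec::new(),
            replay_done: false,
        }
    }

    /// Handles a live statement. Returns `Some(hash)` if it can be delivered
    /// now, `None` if it was buffered or is already covered by the replay.
    pub fn on_live(&mut self, hash: Hash) -> Option<Hash> {
        if self.in_snapshot.contains(&hash) {
            return None; // covered by replay, avoid duplicate delivery
        }
        if self.replay_done {
            Some(hash)
        } else {
            self.pending_live.push(hash);
            None
        }
    }

    /// Takes the next replay batch of at most `max` hashes.
    pub fn next_replay_batch(&mut self, max: usize) -> Vec<Hash> {
        let n = max.min(self.replay_queue.len());
        self.replay_queue.drain(..n).collect()
    }

    /// Once the queue is drained, marks the replay done and releases the
    /// buffered live statements. Returns `None` while replay is unfinished.
    pub fn finish_replay(&mut self) -> Option<Vec<Hash>> {
        if !self.replay_queue.is_empty() {
            return None;
        }
        self.replay_done = true;
        Some(std::mem::take(&mut self.pending_live))
    }
}
```

In this sketch, buffered live statements are released only after the replay queue is empty, which preserves replay-then-live ordering for that filter.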

Each subscription is capped at 128 active filters (MAX_FILTERS_PER_SUBSCRIPTION). Filter removal is idempotent, and dropping the RPC subscription cleans up matcher state.
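The cap and the idempotent removal can be sketched like this. Only `MAX_FILTERS_PER_SUBSCRIPTION` and its value of 128 come from the description; `SubscriptionFilters`, the id scheme, and the error text are assumptions.

```rust
use std::collections::BTreeSet;

pub const MAX_FILTERS_PER_SUBSCRIPTION: usize = 128;

/// Hypothetical bookkeeping for the filters of one subscription.
#[derive(Default)]
pub struct SubscriptionFilters {
    active: BTreeSet<u64>,
    next_id: u64,
}

impl SubscriptionFilters {
    /// Allocates a new filter id, or fails once the cap is reached.
    pub fn add(&mut self) -> Result<u64, &'static str> {
        if self.active.len() >= MAX_FILTERS_PER_SUBSCRIPTION {
            return Err("too many filters on this subscription");
        }
        let id = self.next_id;
        self.next_id += 1;
        self.active.insert(id);
        Ok(id)
    }

    /// Removing an unknown or already-removed id is not an error; the return
    /// value only reports whether anything was actually removed.
    pub fn remove(&mut self, id: u64) -> bool {
        self.active.remove(&id)
    }

    pub fn len(&self) -> usize {
        self.active.len()
    }
}
```

Removing a filter frees a slot, so a client at the cap can swap filters without opening a new subscription.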

@DenzelPenzel DenzelPenzel marked this pull request as draft May 5, 2026 21:12
@DenzelPenzel
Contributor Author

/cmd fmt

@DenzelPenzel DenzelPenzel requested a review from alexggh May 8, 2026 15:57
@DenzelPenzel DenzelPenzel marked this pull request as ready for review May 8, 2026 15:57
@DenzelPenzel
Contributor Author

/cmd fmt

@DenzelPenzel DenzelPenzel added the T0-node (This PR/Issue is related to the topic “node”.) and T10-tests (This PR/Issue is related to tests.) labels May 8, 2026
@DenzelPenzel
Contributor Author

/cmd prdoc --audience node_dev --bump patch

github-actions Bot and others added 2 commits May 11, 2026 11:55
…ent-store-api

# Conflicts:
#	cumulus/zombienet/zombienet-sdk/tests/zombie_ci/statement_store/integration.rs
Comment thread cumulus/zombienet/zombienet-sdk/tests/zombie_ci/statement_store/common.rs Outdated
Comment thread substrate/client/rpc-spec-v2/src/statement/statement.rs
Comment thread substrate/client/statement-store/src/subscription.rs Outdated
Comment thread substrate/client/rpc-spec-v2/src/statement/subscription.rs Outdated
Comment thread substrate/client/rpc-spec-v2/src/statement/statement.rs
Comment thread substrate/client/rpc-spec-v2/src/statement/statement.rs Outdated
Comment thread substrate/client/rpc-spec-v2/src/statement/subscription.rs Outdated
Comment thread substrate/client/rpc-spec-v2/src/statement/statement.rs
Comment thread substrate/client/rpc-spec-v2/src/statement/subscription.rs Outdated
Comment thread substrate/primitives/statement-store/src/store_api.rs
Comment thread substrate/client/rpc-spec-v2/src/statement/statement.rs Outdated
{
    let state = self.state.lock();
    if state.active_filter_ids.len() >= MAX_FILTERS_PER_SUBSCRIPTION ||
        state.pending_replays.len() >= PENDING_REPLAYS_HARD_CAP
Contributor

Although this is capped, the size of the response is not. 128 matchAny requests can be open for each of 16 connections, so that is 2048 matchAny filters that each basically want to return the whole statement store. While one filter is streaming to the client, everything else sits in RAM, because as soon as a filter request is received we SCALE-decode it and keep it in store.pending_replays inside register_filter_with_snapshot. Even with a 100 MB state that would be about 200 GB.
There should be a way to take a "lazy" approach.
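The worst-case figure in this comment can be checked with a few lines of arithmetic. The constants are the numbers quoted in the comment; the function name is illustrative, and the model (every filter holding a full copy of the store) is the comment's worst case, not a measurement.

```rust
/// Filters per subscription, as capped in the PR.
pub const FILTERS: u64 = 128;
/// The factor of 16 quoted in the comment.
pub const FACTOR: u64 = 16;
/// Statement store size assumed in the comment, in MB.
pub const STORE_MB: u64 = 100;

/// Worst case: every open filter materializes a full copy of the store.
pub fn worst_case_pending_mb(filters: u64, factor: u64, store_mb: u64) -> u64 {
    filters * factor * store_mb
}
```

With these inputs the result is 204,800 MB, which is the roughly 200 GB mentioned above.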

Contributor Author

I agree this is a real issue. I see two implementation options:

  1. Option A – lazy by hashes, not bytes: At filter registration time, keep the current snapshot/register atomicity, but store only the matching statement hashes in the pending replay state instead of storing the full SCALE-encoded statements
  2. Option B – real cursor with snapshot watermark: add a replay cursor/watermark at the store level and replay matching statements lazily from the store up to that boundary

wdyt @alexggh @P1sar

Contributor

  1. Option C – have a temporary column where we store the pending replays.

Contributor

@alexggh you mean storing them on disk?

Contributor

Yes, a column in the database.

Contributor Author

@DenzelPenzel DenzelPenzel May 19, 2026

Add the hash instead of materializing full statements, and then load the statements from col::STATEMENTS @alexggh @P1sar

Contributor Author

Let's continue tracking this issue in #12153

@DenzelPenzel
Contributor Author

DenzelPenzel commented May 13, 2026

Statement Store RPC Bench Report

| Parameter | Value |
| --- | --- |
| Nodes | 5 collators |
| Clients | 50,000 |
| Rounds | 1 |
| Interval | 10,000 ms |
| Messages per client | 5 |
| Message size | 512 bytes |
| Messages per round | 250,000 |
| RPC pool | 5,000 connections x 5 nodes = 25,000 total |
| Collator RPC max connections | 51,000 |
| Collator RPC max subscriptions per connection | 160 |
| Runs per bench | 9 |

Aggregate Result

Lower latency, send time, and elapsed time are better. On the main latency metric,
v2 is faster than v1 by 20.64%.

| Metric | v1 bench | v2 unstable_bench | v2 vs v1 |
| --- | --- | --- | --- |
| Send avg, s | 43.091667 | 33.355000 | -22.60% |
| Receive avg, s | 0.001000 | 0.841222 | +0.840222 s |
| Latency avg, s | 43.092556 | 34.196222 | -20.64% |
| Attempts avg/msg | 1.000000 | 1.000000 | 0.00% |
| Elapsed avg, s | 195.00 | 176.89 | -9.29% |

Receive avg is shown as an absolute delta because the v1 value is effectively
near zero. In bench, subscription notifications are already available when the
timed receive phase starts. In unstable_bench, the timed receive phase includes
draining live events from the unstable multi-filter subscription after submit.
The meaningful end-to-end comparison is Latency avg.

Per-Run Data

v1: bench

| Run | Send avg, s | Receive avg, s | Latency avg, s | Latency max, s | Elapsed, s |
| --- | --- | --- | --- | --- | --- |
| 1 | 41.186 | 0.001 | 41.187 | 60.614 | 197 |
| 2 | 39.830 | 0.001 | 39.831 | 60.698 | 191 |
| 3 | 43.228 | 0.001 | 43.229 | 61.045 | 192 |
| 4 | 46.910 | 0.001 | 46.911 | 68.892 | 201 |
| 5 | 39.224 | 0.001 | 39.224 | 58.607 | 189 |
| 6 | 39.419 | 0.001 | 39.420 | 59.423 | 190 |
| 7 | 44.629 | 0.001 | 44.630 | 65.223 | 194 |
| 8 | 47.242 | 0.001 | 47.243 | 69.187 | 202 |
| 9 | 46.157 | 0.001 | 46.158 | 67.934 | 199 |

v2: unstable_bench

| Run | Send avg, s | Receive avg, s | Latency avg, s | Latency max, s | Elapsed, s |
| --- | --- | --- | --- | --- | --- |
| 1 | 33.534 | 0.937 | 34.471 | 46.233 | 179 |
| 2 | 33.413 | 0.809 | 34.222 | 43.961 | 177 |
| 3 | 35.799 | 0.936 | 36.735 | 47.177 | 179 |
| 4 | 27.989 | 0.625 | 28.614 | 41.925 | 173 |
| 5 | 31.313 | 0.785 | 32.098 | 42.184 | 171 |
| 6 | 42.028 | 1.097 | 43.125 | 57.402 | 189 |
| 7 | 32.255 | 0.819 | 33.075 | 49.644 | 181 |
| 8 | 28.545 | 0.655 | 29.200 | 38.125 | 166 |
| 9 | 35.319 | 0.908 | 36.226 | 47.307 | 177 |

Tail Behavior

v2 improves both average and tail latency in this run set.

| Metric | v1 bench | v2 unstable_bench | v2 vs v1 |
| --- | --- | --- | --- |
| Median latency avg, s | 43.229 | 34.222 | -20.84% |
| Mean of latency max, s | 63.514 | 45.995 | -27.58% |
| Worst latency max, s | 69.187 | 57.402 | -17.03% |
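The aggregate and tail figures can be reproduced from the per-run "Latency avg" columns. The helper names below are illustrative; the data arrays are copied from the per-run tables in this report.

```rust
/// Per-run "Latency avg, s" values for v1 (bench).
pub const V1_LATENCY_AVG: [f64; 9] =
    [41.187, 39.831, 43.229, 46.911, 39.224, 39.420, 44.630, 47.243, 46.158];
/// Per-run "Latency avg, s" values for v2 (unstable_bench).
pub const V2_LATENCY_AVG: [f64; 9] =
    [34.471, 34.222, 36.735, 28.614, 32.098, 43.125, 33.075, 29.200, 36.226];

/// Arithmetic mean of a non-empty slice.
pub fn mean(xs: &[f64]) -> f64 {
    xs.iter().sum::<f64>() / xs.len() as f64
}

/// Median of a non-empty slice (average of the two middle values if even).
pub fn median(xs: &[f64]) -> f64 {
    let mut v = xs.to_vec();
    v.sort_by(|a, b| a.total_cmp(b));
    let mid = v.len() / 2;
    if v.len() % 2 == 1 {
        v[mid]
    } else {
        (v[mid - 1] + v[mid]) / 2.0
    }
}

/// Relative change from `old` to `new`, in percent (negative means lower).
pub fn relative_change_pct(old: f64, new: f64) -> f64 {
    (new - old) / old * 100.0
}
```

Applying `relative_change_pct` to the two means gives the "Latency avg" row of the aggregate table, and applying it to the two medians gives the "Median latency avg" row.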

Conclusion

v2 is better on the primary latency path for this run set:

| Comparison | Result |
| --- | --- |
| Average latency | v2 is 20.64% faster |
| Average send time | v2 is 22.60% faster |
| Median per-run latency | v2 is 20.84% faster |
| Mean max latency | v2 is 27.58% better |
| Worst observed max latency | v2 is 17.03% better |
| Average elapsed time | v2 is 9.29% faster |
| Compared sample size | 9 runs per bench |

Comment thread substrate/client/rpc-spec-v2/src/statement/subscription.rs Outdated
Comment thread substrate/client/rpc-spec-v2/src/statement/statement.rs Outdated
Comment thread substrate/client/rpc-spec-v2/src/statement/event.rs Outdated
Comment thread substrate/client/statement-store/src/subscription.rs Outdated
Comment thread substrate/client/statement-store/src/subscription.rs Outdated
@DenzelPenzel
Contributor Author

/cmd fmt

@DenzelPenzel DenzelPenzel force-pushed the denzelpenzel/statement-store-api branch from 113bb18 to 61be0eb Compare May 20, 2026 22:01
@DenzelPenzel DenzelPenzel requested review from P1sar and alexggh May 21, 2026 11:30
Comment thread substrate/client/statement-store/src/subscription.rs Outdated
Comment thread prdoc/pr_11989.prdoc
Comment thread substrate/client/rpc-spec-v2/src/statement/event.rs Outdated
Comment thread substrate/client/rpc-spec-v2/src/statement/statement.rs
Comment thread substrate/client/statement-store/src/subscription.rs Outdated
Comment thread substrate/client/statement-store/src/subscription.rs Outdated
Comment thread substrate/client/statement-store/src/subscription.rs Outdated
Comment thread substrate/client/rpc-spec-v2/src/statement/statement.rs Outdated
Comment thread substrate/client/rpc-spec-v2/src/statement/api.rs Outdated
@alexggh
Contributor

alexggh commented May 27, 2026

Had a detailed look, good job @DenzelPenzel.

I left you a few more comments, nothing major. With those applied, this should be ready for approval from my side.

@DenzelPenzel DenzelPenzel force-pushed the denzelpenzel/statement-store-api branch from 3f29e47 to b626640 Compare May 27, 2026 23:21
@DenzelPenzel DenzelPenzel force-pushed the denzelpenzel/statement-store-api branch from f6debbb to b362729 Compare May 28, 2026 11:08
@paritytech-workflow-stopper

All GitHub workflows were cancelled due to the failure of one of the required jobs.
Failed workflow url: https://github.com/paritytech/polkadot-sdk/actions/runs/26595273713
Failed job name: fmt

@DenzelPenzel DenzelPenzel force-pushed the denzelpenzel/statement-store-api branch from 01522dd to 1042495 Compare May 28, 2026 21:24
@DenzelPenzel
Contributor Author

/cmd fmt

@DenzelPenzel DenzelPenzel requested a review from alexggh May 28, 2026 22:22

Labels

T0-node: This PR/Issue is related to the topic “node”.
T10-tests: This PR/Issue is related to tests.


3 participants