Deterministic runtime crate by Shubham8287 · Pull Request #5016 · clockworklabs/SpacetimeDB

Shubham8287 · 2026-05-13T15:29:36Z

Description of Changes.

Introduces a deterministic runtime crate.
Integrates it with RelationalDB.

I think the best order for review is:

Read the README of the runtime crate.
Look at the integration with existing crates - durability, core, snapshot, etc.
Read the runtime crate's code.

Draft branch for testing the code: #5019

API and ABI breaking changes

NA

Expected complexity level and risk

Not intended to change any production functionality, but it is a large amount of code.

Testing

The new crate contains unit and integration tests.
Existing tests should continue to pass for production code.

kim · 2026-05-19T06:35:48Z

+    Tokio(tokio::task::JoinHandle<T>),
+    #[cfg(feature = "simulation")]
+    Simulation(sim::JoinHandle<T>),
+    Detached(PhantomData<T>),


This one is interesting and could use some commentary!

kim · 2026-05-19T08:14:49Z

+    let original = PTHREAD_ATTR_INIT.get_or_init(|| unsafe {
+        // `RTLD_NEXT` skips this interposed function and finds the libc
+        // implementation that would have been called without the simulator.
+        let ptr = libc::dlsym(libc::RTLD_NEXT, c"pthread_attr_init".as_ptr().cast());


Overriding libc functions is turning out to be limited for detecting sources of nondeterminism. I am experimenting with eBPF, which looks more powerful.

TIL about eBPF. Interesting.

It can be used for many things; see https://github.com/clockworklabs/SpacetimeDB/pull/5177/changes#diff-8c7a3b299146fed405fed8a131643bb6aaac19152857cb129826ef17457d1a89 for an example. That is some experimental low-level code to detect futex calls (under simulation, a futex call means determinism is leaking).

jsdt · 2026-05-29T16:45:00Z

+
+- **Zero dependency.** The simulation core in `sim/` is already `no_std + alloc`. The `sim_std` module is a thin OS-facing wrapper — the std dependency lives there, not in the simulation core itself. It stays until the application logic above this crate also moves to `no_std`.
+
+## Current Limitations


You touched on this in the point about it being a single-threaded runtime, but this is also limited in how granularly it can interleave executions of different tasks, which is going to make it impossible to produce some race conditions that would be possible in a multi-threaded runtime. If I'm reading this correctly, the sim executor picks one of the tasks, then it polls that task (which runs the task until it finishes or an await returns Pending), then it picks a task again. This means we can only mess with scheduling when a task yields.

A simple example of an execution that we can't simulate:

async fn task_a(mut rx: Receiver<Msg>) {
    log("A: waiting");
    rx.recv().await; // Pending until Task B sends.
    log("A: received a message");
    do_a_work(); // Cannot run until Task B yields or finishes.
}

async fn task_b(mut tx: Sender<Msg>) {
    log("B: running");
    tx.send(Msg).await; // Ready immediately; wakes Task A but does not yield.
    do_b_work(); // Always runs before Task A resumes.
    log("B: finished working");
}

We could never produce:

A: waiting
B: running
A: received a message
B: finished working

So if there is a bug when do_a_work runs before do_b_work, we would never be able to catch it.
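To make the limitation concrete, here is a minimal sketch of a single-threaded executor that only switches tasks when a poll returns Pending. It is not the crate's executor: `Recv`, `Slot`, `run`, and `trace` are hypothetical names, and a one-slot cell stands in for the channel. Under those assumptions, task B always finishes before task A resumes, whichever order the tasks are picked in after the first yield.

```rust
use std::cell::RefCell;
use std::collections::VecDeque;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; this toy executor just re-polls pending tasks.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

type Log = Rc<RefCell<Vec<&'static str>>>;
type Slot = Rc<RefCell<Option<u32>>>;
type Task = Pin<Box<dyn Future<Output = ()>>>;

/// Hypothetical one-slot "receive": Pending until a value has been stored.
struct Recv(Slot);
impl Future for Recv {
    type Output = u32;
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<u32> {
        match self.0.borrow_mut().take() {
            Some(v) => Poll::Ready(v),
            None => Poll::Pending,
        }
    }
}

async fn task_a(slot: Slot, log: Log) {
    log.borrow_mut().push("A: waiting");
    Recv(slot).await; // Pending until task B stores a value.
    log.borrow_mut().push("A: received a message");
}

async fn task_b(slot: Slot, log: Log) {
    log.borrow_mut().push("B: running");
    *slot.borrow_mut() = Some(1); // The "send" completes without yielding.
    log.borrow_mut().push("B: finished working");
}

/// Polls one task at a time; a task keeps the thread until it returns
/// Pending or completes, so interleaving happens only at yield points.
fn run(mut tasks: VecDeque<Task>) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    while let Some(mut task) = tasks.pop_front() {
        if task.as_mut().poll(&mut cx).is_pending() {
            tasks.push_back(task);
        }
    }
}

/// Runs both tasks and returns the log lines in the order they were written.
fn trace() -> Vec<&'static str> {
    let log: Log = Rc::default();
    let slot: Slot = Rc::default();
    let mut tasks: VecDeque<Task> = VecDeque::new();
    tasks.push_back(Box::pin(task_a(slot.clone(), log.clone())));
    tasks.push_back(Box::pin(task_b(slot, log.clone())));
    run(tasks);
    let out = log.borrow().clone();
    out
}
```

Calling `trace()` yields "B: finished working" before "A: received a message"; the interleaving listed above never appears because nothing between B's send and B's last log line returns Pending.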

I think the fix for this specific case would be to have channels be part of the runtime crate, so that when send is called, we can insert a yield_now call in the sim version (if send is async).
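A sketch of what such a yield point could look like, assuming a hand-rolled `yield_now` future (Pending once, then Ready) and a hypothetical `sim_send` that stores a message in a one-slot cell. Neither is the crate's API, and `poll_once` exists only to drive the future by hand.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Future that returns Pending exactly once, then Ready.
pub struct YieldNow {
    yielded: bool,
}

impl Future for YieldNow {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            return Poll::Ready(());
        }
        self.yielded = true;
        // Ask the executor to schedule this task again later.
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

pub fn yield_now() -> YieldNow {
    YieldNow { yielded: false }
}

/// Hypothetical simulation-side send: deliver the message, then yield so
/// the scheduler gets a chance to run the receiver before the sender continues.
pub async fn sim_send(slot: &RefCell<Option<u32>>, msg: u32) {
    *slot.borrow_mut() = Some(msg);
    yield_now().await;
}

/// Waker that does nothing; enough to poll a future by hand.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls a pinned future once with a no-op waker.
pub fn poll_once<F: Future + ?Sized>(fut: Pin<&mut F>) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    fut.poll(&mut cx)
}
```

With this shape, the first poll of `sim_send` delivers the message and returns Pending, which is the point where a simulated scheduler could choose to run the receiving task first.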

More generally, if the goal is to be able to simulate any possible race condition, we need to be able to suspend/interleave execution at any point in the code that either affects other tasks/threads or is affected by other tasks/threads.

For things with an async api, having a sim version that inserts a yield_now will make it possible to interleave, but addressing this for sync apis that affect other threads will require changing the executor. Even channels have some sync functions that we would want to be able to suspend (like try_recv or send on an UnboundedSender).

If we want this to work for sync functions that interact with other tasks, then the sim executor will need more than one thread.

We also discussed this out of band, but I'm putting it here too:

More generally, if the goal is to be able to simulate any possible race condition, we need to be able to suspend/interleave execution at any point in the code that either affects other tasks/threads or is affected by other tasks/threads.

Agreed. With the current DB architecture, a single-threaded executor cannot simulate every possible race condition.

That said, I think the broader goal should be to move toward an architecture where data races are impossible by design. In practice, that means moving toward a single-threaded runtime model, which also aligns with our thread-per-core principle.

Practically, I think this would mean allowing only the database thread to touch the datastore, while potentially keeping separate threads for things like ws message handoff. If the “deep core” part of our DB runs single-threaded, we should be able to simulate all of it. The parts that must remain multi-threaded should live outside the “core,” and have to be tackled later.

If we want this to work for sync functions that interact with other tasks, then the sim executor will need more than one thread.

I think putting more than one thread in the executor would make the tests non-replayable by seed.

So I think the current scope should be to keep the executor single-threaded. Even if that does not let us explore all race conditions, it is still very helpful from day 1 for other classes of bugs: schema bugs, replayability bugs, fault-tolerance bugs, etc.

Btw, I’m also checking out https://docs.rs/loom/latest/loom/ to understand how it detects multithread-specific bugs while still being deterministic, and whether we can leverage something from it.

I think putting more than one thread in the executor would make the tests non-replayable by seed.

I'm sure that is not the case. We can have multiple threads while staying replayable/deterministic as long as we ensure that only one thread is unparked at any given time. I'm working on a PoC of that today.
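A rough sketch of that idea, not the PoC itself: real OS threads pass a baton through a mutex and condition variable, and a scheduler seeded with a small xorshift generator decides which worker gets the baton next. `Turn`, `Shared`, and `run` are hypothetical names, and the "work" is just appending to a trace.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Who may run right now: the scheduler, or exactly one worker.
#[derive(Clone, Copy, PartialEq)]
enum Turn {
    Scheduler,
    Worker(usize),
}

struct Shared {
    turn: Mutex<Turn>,
    cv: Condvar,
    trace: Mutex<Vec<(usize, u32)>>,
}

impl Shared {
    /// Blocks the calling thread until the baton is handed to `me`.
    fn wait_for(&self, me: Turn) {
        let mut turn = self.turn.lock().unwrap();
        while *turn != me {
            turn = self.cv.wait(turn).unwrap();
        }
    }

    /// Hands the baton to `next` and wakes all waiters so it can notice.
    fn pass_to(&self, next: Turn) {
        *self.turn.lock().unwrap() = next;
        self.cv.notify_all();
    }
}

/// Tiny xorshift64 step; stands in for a seeded simulation RNG.
fn next_rand(state: &mut u64) -> u64 {
    *state ^= *state << 13;
    *state ^= *state >> 7;
    *state ^= *state << 17;
    *state
}

/// Runs `workers` threads for `steps` steps each and returns the order in
/// which (worker, step) pairs executed. Only one thread holds the baton
/// at a time, so the order depends on the seed alone.
pub fn run(seed: u64, workers: usize, steps: u32) -> Vec<(usize, u32)> {
    let shared = Arc::new(Shared {
        turn: Mutex::new(Turn::Scheduler),
        cv: Condvar::new(),
        trace: Mutex::new(Vec::new()),
    });
    let handles: Vec<_> = (0..workers)
        .map(|id| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                for step in 0..steps {
                    shared.wait_for(Turn::Worker(id));
                    shared.trace.lock().unwrap().push((id, step));
                    shared.pass_to(Turn::Scheduler);
                }
            })
        })
        .collect();

    let mut rng = seed | 1; // xorshift state must be non-zero
    let mut remaining = vec![steps; workers];
    loop {
        let live: Vec<usize> = (0..workers).filter(|&i| remaining[i] > 0).collect();
        if live.is_empty() {
            break;
        }
        let pick = live[(next_rand(&mut rng) % live.len() as u64) as usize];
        remaining[pick] -= 1;
        shared.pass_to(Turn::Worker(pick));
        shared.wait_for(Turn::Scheduler);
    }
    for handle in handles {
        handle.join().unwrap();
    }
    let trace = shared.trace.lock().unwrap().clone();
    trace
}
```

Two calls such as `run(42, 3, 4)` return the same trace because the OS scheduler never chooses between runnable workers; at most one is runnable. A real version would also need to intercept the blocking calls that hand control between threads, which this sketch does not attempt.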

I think the current scope should be to keep the executor single-threaded. Even if that does not let us explore all race conditions, it is still very helpful from day 1 for other classes of bugs: schema bugs, replayability bugs, fault-tolerance bugs, etc.

I think that is reasonable. Maybe if the multi-threaded version is easy to implement, it would make sense to add it sooner, but it can probably be added later without breaking what we have now.

That said, I think the broader goal should be to move toward an architecture where data races are impossible by design. In practice, that means moving toward a single-threaded runtime model, which also aligns with our thread-per-core principle.

Making races impossible sounds great, but even with a thread-per-core design, we will almost certainly still have multiple threads interacting with each other.

I'm sure that is not the case. We can have multiple threads while staying replayable/deterministic as long as we ensure that only one thread is unparked at any given time. I'm working on a PoC of that today.

Right, that's a good approach. You had already explained it, but it slipped my mind :)

cloutiertyler · 2026-06-02T16:28:34Z

+    Tokio(TokioHandle),
+    #[cfg(feature = "simulation")]
+    Simulation(sim::Handle),
+}


I don't want this comment to block merging this PR, so feel free to close it, but I'm curious about the choice of enums vs traits to implement this. Seems reasonable, but could you comment on that choice briefly?

No strong reasons, but this felt more natural given the small set of variants. Using a trait would have required wrapping TokioHandle in a newtype to be able to implement a custom trait.

cloutiertyler · 2026-06-02T16:30:50Z

+    /// Paused-node tasks are diverted into that node's paused buffer instead of
+    /// being polled immediately.
+    fn run_all_ready(&self) {
+        while let Some(runnable) = self.queue.try_recv_random(&self.rng) {


As we discussed, I also think that round-robin order is correct and possibly preferable (within a single node).

Why would round-robin order be preferable? That seems like it would significantly limit the set of race conditions that can be simulated.
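For comparison, a minimal sketch of seeded random selection from a ready queue. `Rng`, `pop_random`, and `drain_order` are hypothetical stand-ins, not the `try_recv_random` implementation from the diff. The point is that the pick order varies with the seed but is fully reproducible for a given seed, whereas round-robin always yields the same single order.

```rust
/// Minimal xorshift64 generator; a stand-in for the executor's seeded RNG.
pub struct Rng(u64);

impl Rng {
    pub fn new(seed: u64) -> Self {
        // xorshift state must be non-zero.
        Rng(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    pub fn next_u64(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }
}

/// Removes and returns a pseudo-randomly chosen ready task, if any.
pub fn pop_random<T>(queue: &mut Vec<T>, rng: &mut Rng) -> Option<T> {
    if queue.is_empty() {
        return None;
    }
    let index = (rng.next_u64() % queue.len() as u64) as usize;
    Some(queue.swap_remove(index))
}

/// Drains a queue of task ids 0..n and returns the order they were picked in.
pub fn drain_order(seed: u64, n: u32) -> Vec<u32> {
    let mut rng = Rng::new(seed);
    let mut queue: Vec<u32> = (0..n).collect();
    let mut order = Vec::new();
    while let Some(task) = pop_random(&mut queue, &mut rng) {
        order.push(task);
    }
    order
}
```

Running `drain_order` over many seeds explores many of the possible orderings of the ready set; a failing seed can then be replayed exactly.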

cloutiertyler · 2026-06-02T16:33:20Z

@@ -0,0 +1,51 @@
+use crate::sim::Runtime;


NIT: You might consider having sim be a separate crate, because eventually you're going to want to mock out things like the disk/network/etc. which isn't really part of the runtime.

cloutiertyler · 2026-06-02T16:35:01Z

This is all very similar to what I have in my experimental project.

cloutiertyler · 2026-06-02T16:41:25Z

This is all reasonable, but I do want to flag that we should work towards just not using these things inside the deep database at all.

Yeah, that's how this crate is structured. Everything inside the sim module is no_std. sim_std is a helper wrapper that stays until the dependent code also moves to no_std.

cloutiertyler

This is quite good overall. I think it's a good start.

jsdt

This looks like a solid start

jsdt · 2026-06-02T18:15:11Z

+                };
+            }
+
+            if self.time.wake_next_timer() {


If we only advance timers when none of the other tasks are runnable, are we going to be unable to simulate some timing scenarios? We could randomly decide to wake a timer instead of running a task sometimes (in run_all_ready), though that might just cause a lot of time outs.

Oh, right! I think a more subtle way to do this would be to occasionally advance time inside run_all_ready by a few microseconds. This can be done by calling the self.time.advance() API, which will also take care of waking the registered sleepers, if needed.
This would advance the clock more naturally and let timer tasks be woken up along with other tasks.

The time.wake_next_timer() call could still be there for when no tasks are present, to accelerate time.

I think time.wake_next_timer() does need to be there to avoid getting stuck (for cases where we are actually hitting a timeout).
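The two clock operations discussed here can be sketched as follows. This is a hypothetical `SimTime`, not the crate's type: the names mirror the discussion, but the signatures are assumptions (for instance, `wake_next_timer` returns the woken timer id here rather than a bool). `advance` moves the clock by a small step and reports timers that became due; `wake_next_timer` jumps straight to the earliest deadline when nothing else is runnable.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;
use std::time::Duration;

/// Hypothetical virtual clock with registered timers.
#[derive(Default)]
pub struct SimTime {
    now: Duration,
    /// Min-heap of (deadline, timer id).
    timers: BinaryHeap<Reverse<(Duration, u32)>>,
}

impl SimTime {
    pub fn now(&self) -> Duration {
        self.now
    }

    /// Registers timer `id` to fire once the clock reaches `deadline`.
    pub fn sleep_until(&mut self, deadline: Duration, id: u32) {
        self.timers.push(Reverse((deadline, id)));
    }

    /// Moves the clock forward by `dt` and returns the ids of all timers
    /// whose deadline has now passed, earliest first.
    pub fn advance(&mut self, dt: Duration) -> Vec<u32> {
        self.now += dt;
        let mut due = Vec::new();
        while let Some(Reverse((deadline, id))) = self.timers.peek().copied() {
            if deadline > self.now {
                break;
            }
            self.timers.pop();
            due.push(id);
        }
        due
    }

    /// Jumps the clock to the earliest pending deadline and returns that
    /// timer's id, or None if no timers are registered.
    pub fn wake_next_timer(&mut self) -> Option<u32> {
        let Reverse((deadline, id)) = self.timers.pop()?;
        self.now = self.now.max(deadline);
        Some(id)
    }
}
```

An executor loop could call `advance` with a small step after each poll, and fall back to `wake_next_timer` only when the ready queue is empty, so a genuine timeout still fires instead of the simulation stalling.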

The idea of randomly/periodically advancing the clock makes sense to me. We could have a parameter like polls_per_second or avg_time_per_poll to control how quickly the clock advances as tasks are executed. With an avg_time_per_poll: Duration parameter, we could just advance time by that amount after every piece of work, or we could make it random with something like avg_time_per_poll.mul_f64(-random::<f64>().ln()).
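The random variant draws each step from an exponential distribution with mean avg_time_per_poll: if U is uniform on (0, 1), then -ln(U) has mean 1. A minimal sketch, assuming a hypothetical seeded `Rng` in place of `random::<f64>()` so the steps stay replayable, and sampling strictly inside (0, 1) so that `ln` never sees zero (which would make `Duration::mul_f64` panic on an infinite factor):

```rust
use std::time::Duration;

/// Minimal xorshift64 generator; a stand-in for a seeded simulation RNG.
pub struct Rng(u64);

impl Rng {
    pub fn new(seed: u64) -> Self {
        // xorshift state must be non-zero.
        Rng(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    pub fn next_u64(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }

    /// Uniform sample strictly inside (0, 1): takes the top 52 bits and
    /// offsets by half a step, so neither 0.0 nor 1.0 is ever returned.
    pub fn next_unit(&mut self) -> f64 {
        ((self.next_u64() >> 12) as f64 + 0.5) / (1u64 << 52) as f64
    }
}

/// Exponentially distributed clock step with mean `avg_time_per_poll`.
pub fn next_step(avg_time_per_poll: Duration, rng: &mut Rng) -> Duration {
    avg_time_per_poll.mul_f64(-rng.next_unit().ln())
}

/// Average of `n` consecutive steps, in seconds; handy for sanity checks.
pub fn mean_step_secs(avg_time_per_poll: Duration, seed: u64, n: u32) -> f64 {
    let mut rng = Rng::new(seed);
    let total: f64 = (0..n)
        .map(|_| next_step(avg_time_per_poll, &mut rng).as_secs_f64())
        .sum();
    total / n as f64
}
```

Most steps come out shorter than the average and a few much longer, which gives timers a chance to fire between polls without every poll costing the same amount of simulated time.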

jsdt · 2026-06-02T22:08:42Z

+
+## Status Definitions
+
+- `Controlled`


I think we need one more status type to distinguish between a source of nondeterminism that we can make deterministic (which is what you are calling controlled) and a source of nondeterminism for which we can deterministically simulate all possible outcomes.

Channels and atomic variables are examples: they are deterministic with this framework, but can result in race conditions that we can't create.

Shubham8287 and others added 30 commits May 8, 2026 16:58

snapshot abstraction

42e55dc

lint

f508a04

Add runtime crate and RuntimeDispatch integration

5356b81

LockedFsRepo

c83ed2e

comments

813e418

cleanup

5946261

Merge remote-tracking branch 'origin/master' into shub/persistence-ab…

1f6bdcb

…straction

lint

2104ced

Signed-off-by: Shubham Mishra <shivam828787@gmail.com>

make sim module mostly non_std

fc2e146

drop durability in reopen test helper

e4de2bd

Merge branch 'shub/persistence-abstraction' into shub/sim

4050da2

drop durability in test

795a704

Merge branch 'shub/persistence-abstraction' into shub/sim

e072845

fix snapshot compressor

425e728

minor fixes

466481c

minor fix

7d1e21d

fixes

a521298

fix unneccessary diff

e59ac12

polishing

d074cf0

more polishing

9789d70

update readme

8cd609c

Merge remote-tracking branch 'origin/master' into shub/sim

c62e8b2

Runtime -> Handle

730028f

Apply suggestions from code review

35cbea9

Co-authored-by: Shubham Mishra <shivam828787@gmail.com> Signed-off-by: Shubham Mishra <shivam828787@gmail.com>

Update crates/commitlog/src/lib.rs

5af7fd9

Signed-off-by: Shubham Mishra <shivam828787@gmail.com>

compile fix

52783ce

lint

30012db

fix Cargo.toml

d9f009b

endlines on README

3b76725

comments

9996a16

lint

76a8228

Shubham8287 self-assigned this May 14, 2026

Shubham8287 and others added 3 commits May 14, 2026 23:11

lint

7876599

unused import lint

0b2c53c

Merge branch 'master' into shub/sim

8d2c0af

Shubham8287 requested review from Centril, joshua-spacetime and kim May 18, 2026 14:14

kim reviewed May 19, 2026

Shubham8287 added 2 commits May 19, 2026 17:09

review commentary

740170a

typo

252c3e4

kim approved these changes May 20, 2026

joshua-spacetime added the asap label May 20, 2026

Merge remote-tracking branch 'origin/master' into shub/sim

622ce5c

Shubham8287 requested review from jsdt and removed request for Centril May 28, 2026 15:50

jsdt reviewed May 29, 2026

cloutiertyler reviewed Jun 2, 2026

cloutiertyler approved these changes Jun 2, 2026

jsdt approved these changes Jun 2, 2026

Shubham8287 added 6 commits June 3, 2026 12:39

Merge remote-tracking branch 'origin/master' into shub/sim

a52d834

executor: advance time per task by nanoseconds

e3bbf01

Merge remote-tracking branch 'origin/master' into shub/sim

f2690f4

fmt

7c683e6

fix cargo lock

9b6e4dc

tokio snapshot helper

9600a17

