Commit ce3e85f

committed
Add more sections
1 parent ffa7ebb commit ce3e85f

1 file changed

+57
-13
lines changed

doc/concurrency_guide.md

Lines changed: 57 additions & 13 deletions
@@ -9,11 +9,10 @@ language. It will go over:
99
* The difference between the VM lock and the GVL.
1010
* How to write code that is ractor safe.
1111
* What a VM barrier is and when to use it.
12-
* The lock hierarchy of some important locks.
12+
* The lock ordering of some important locks.
1313
* How ruby interrupt handling works.
14-
* What happens when IO is performed through ruby.
1514
* The timer thread and what it's responsible for.
16-
15+
* What happens when IO is performed through ruby.
1716

1817
## The VM Lock
1918

@@ -80,29 +79,74 @@ is waiting (blocked) on this same native lock, it can't join the barrier and a d
8079

8180
The VM Lock is a particular lock in the source code. There is only one VM Lock. The GVL, on the other hand, is more of a combination of locks.
8281
It is "acquired" when a ruby thread is about to run or is running. Since many ruby threads can run at the same time if they're in different ractors,
83-
there are many GVLs (1 per `SNT` + 1 for the main ractor). It can no longer be thought of as a "Global VM Lock".
82+
there are many GVLs (1 per `SNT` + 1 for the main ractor). It can no longer be thought of as a "Global VM Lock" like it once was.
8483

8584
## How To Write Ractor-Safe Code
8685

8786
Before ractors, only one ruby thread could run at once. That didn't mean you could forget about concurrency issues, though. Context switches happen
8887
often and need to be taken into account when writing code. Also, threads without the GVL run too, like the timer thread. Sometimes these threads need
8988
to coordinate with ruby threads, and this coordination often needs locks or atomics.
9089

91-
When you add ractors to the mix, it gets more complicated. Take the `fstring` table, for example. It's a global set of strings that each ractor can update
92-
concurrently, and it's used heavily. A lockless solution is preferred to using the VM lock in this case, as taking the VM Lock would cause too many OS context
93-
switches. A lockless solution is also preferable for dealing with call cache tables on classes. These are also updated often and can run from multiple ractors
94-
concurrently. Here, an RCU (Read-Copy-Update) solution is used. What was previously an `st_table` is now a ruby object, and the old and new tables are switched
95-
atomically.
90+
When you add ractors to the mix, it gets more complicated. However, ractors allow you to forget about synchronization for non-shareable objects because
91+
they aren't used across ractors. Only one ruby thread can touch the object at once. Shareable objects are deeply frozen, so there isn't any
92+
mutation on the objects themselves, but things like reading/writing constants across ractors do need to be synchronized. Most synchronization, though, is due
93+
to the VM internals that need to be protected. These internals include structures for the thread scheduler on each ractor, the global ractor scheduler, the
94+
coordination between ruby threads and ractors, global tables (for `fstrings`, encodings, symbols and global vars), etc.
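
As a small illustration of "deeply frozen", the sketch below uses `Ractor.shareable?` and `Ractor.make_shareable` from the Ruby-level API. The method name `shareable_demo` and the sample hash are made up for this example, and it shows only Ruby-visible behaviour, not the VM internals discussed above.

```ruby
# Hypothetical helper for illustration; not part of the VM or this guide's code.
def shareable_demo
  # Unary plus gives unfrozen strings regardless of frozen_string_literal settings.
  config = { name: +"worker", tags: [+"a", +"b"] }

  before = Ractor.shareable?(config)   # a mutable hash is not shareable
  Ractor.make_shareable(config)        # freezes the hash and everything it references

  {
    before: before,
    after: Ractor.shareable?(config),
    deep_frozen: config.frozen? &&
                 config[:name].frozen? &&
                 config[:tags].frozen? &&
                 config[:tags].all?(&:frozen?)
  }
end

p shareable_demo
```

After `make_shareable`, no ractor can mutate the hash or its contents, which is why no per-object synchronization is needed for it.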
9695

9796
## VM Barriers
9897

99-
Sometimes, taking the VM Lock isn't enough and you need a guarantee that all ractors have stopped. This happens when running GC, for instance.
100-
A VM barrier is designed for this use case. It's not used often as taking the barrier slows ractor performance down considerably, but it's useful to
98+
Sometimes, taking the VM Lock isn't enough and you need a guarantee that all ractors have stopped. This happens when running `GC`, for instance.
99+
A `VM barrier` is designed for this use case. It's not used often as taking a barrier slows ractor performance down considerably, but it's useful to
101100
know about and is sometimes the only solution.
102101

103-
## Lock Hierarchy
102+
## Lock Orderings
103+
104+
Avoid holding more than 2 locks at once on the same thread. Locking multiple locks can introduce deadlocks, so do it with care. When locking
105+
multiple locks at once, follow an ordering that is consistent across the program. Here are the orderings of some important locks:
106+
107+
* VM lock before ractor_sched_lock()
108+
* thread_sched_lock() before ractor_sched_lock()
109+
* interrupt_lock() before timer_th.waiting_lock()
110+
* timer_th.waiting_lock() before ractor_sched_lock()
111+
112+
These orderings are subject to change, so check the source if you're not sure. On top of this:
113+
114+
* In some circumstances, the VM lock is taken around a call to a `ubf` (unblock function). See the "Ruby Interrupt Handling" section
115+
for more details.
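
The idea of a consistent lock ordering can be sketched at the Ruby level with plain `Mutex` objects. The names `LOCK_A`, `LOCK_B`, `with_both_locks` and `locked_counter_demo` are hypothetical and stand in for any pair of locks; they are not the VM's internal locks listed above.

```ruby
# Hypothetical locks for illustration only.
LOCK_A = Mutex.new
LOCK_B = Mutex.new

# Every caller takes LOCK_A before LOCK_B. Because no code path takes them in
# the opposite order, two threads can never each hold one lock while waiting
# for the other.
def with_both_locks
  LOCK_A.synchronize do
    LOCK_B.synchronize do
      yield
    end
  end
end

def locked_counter_demo
  counter = 0
  threads = 4.times.map do
    Thread.new do
      1_000.times { with_both_locks { counter += 1 } }
    end
  end
  threads.each(&:join)
  counter
end

puts locked_counter_demo # prints 4000
```

If one thread took `LOCK_B` first and then `LOCK_A`, the program could deadlock, which is the situation the orderings above are meant to prevent.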
104116

105117
## Ruby Interrupt Handling
106118

107119
When the VM runs ruby code, ruby's threads intermittently check ruby-level interrupts. These software interrupts
108-
are for various things, like
120+
are for various things in ruby:
121+
122+
* Ruby threads check whether their timeslice is up, in which case they switch to another thread.
123+
* The timer thread sends a "trap" interrupt to the main thread if any ruby-level signal handlers are pending.
124+
* Ruby threads can have other ruby threads run tasks for them by sending them an interrupt. For instance, ractors send
125+
the main thread an interrupt when they need to `require` a file so that it's done on the main thread. They wait for the
126+
main thread's result.
127+
* During VM shutdown, a "terminate" interrupt is sent to all ractor main threads to stop them right away.
128+
* When calling `Thread#raise`, the caller sends an interrupt to that thread telling it which exception to raise.
129+
* Unlocking a mutex sends the next waiter (if any) an interrupt telling it to grab the lock.
130+
* Signalling or broadcasting on a condition variable tells the waiter(s) to wake up.
131+
132+
This isn't a complete list.
133+
134+
When sending an interrupt to a ruby thread, the ruby thread can be blocked. For example, it could be in the middle of a `TCPSocket#read` call. If so,
135+
the receiving thread's `ubf` (unblock function) gets called from the thread (ruby thread or timer thread) that sent the interrupt.
136+
Each ruby thread has a `ubf` that is set when it enters a blocking operation. By default, this `ubf` function sends a
137+
`SIGVTALRM` to the receiving thread to try to unblock it from the kernel so it can check its interrupts. There are other `ubfs` that
138+
aren't associated with a syscall, such as when calling `Ractor#join` or `sleep`. All `ubfs` are called with the `interrupt_lock` held,
139+
so take that into account when using locks inside `ubfs`.
140+
141+
Remember, `ubfs` can be called from the timer thread, so you cannot assume a current `ec` (execution context) is available inside them.
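
From Ruby code, the effect of interrupting a blocked thread can be observed with `Thread#raise`. This is a minimal sketch: the method name `interrupt_sleeping_thread` is made up for the example, and the `ubf` call itself is internal to the VM and not visible here.

```ruby
# Hypothetical helper for illustration; shows only Ruby-visible behaviour.
def interrupt_sleeping_thread
  events = Queue.new

  worker = Thread.new do
    begin
      events << :about_to_block
      sleep # blocks with no timeout until the thread is interrupted
    rescue RuntimeError => e
      events << e.message
    end
  end

  events.pop                            # wait until the worker is inside the begin block
  worker.raise(RuntimeError, "wake up") # sends the worker an interrupt carrying the exception
  worker.join
  events.pop                            # the message rescued by the worker
end

puts interrupt_sleeping_thread # prints "wake up"
```

The worker is unblocked from `sleep`, checks its pending interrupts, and raises the exception it was sent.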
142+
143+
## The Timer Thread
144+
145+
The timer thread has a few responsibilities:
146+
147+
* Send interrupts to ruby threads that have run for their whole timeslice.
148+
* Wake up M:N ruby threads (threads in non-main ractors) blocked on IO or after a specified timeout. This
149+
uses `kqueue` or `epoll`, depending on the OS, to receive IO events on behalf of the threads.
150+
* Continue sending the `SIGVTALRM` signal if a thread is still blocked on a syscall after the first `ubf` call.
151+
* Signal native threads (`SNT`) waiting on a ractor if there are ractors waiting in the queue.
152+
* Create more `SNT`s if some are blocked waiting, like on IO or calling `Ractor#join`.
