| 1 | +# Thread safety |
| 2 | + |
| 3 | +## Critical Context |
| 4 | + |
| 5 | +### Key Principles |
| 6 | + |
| 7 | +- **Data corruption is the primary concern** - prevention is absolutely critical |
| 8 | +- Assume that code will be executed concurrently by multiple threads/processes |
| 9 | +- Assume that code may be suspended and resumed across fiber boundaries |
| 10 | +- **Fibers and threads are NOT the same thing** |
| 11 | + - They do however share safety requirements |
| 12 | +- **Shared mutable state should be avoided** |
| 13 | +- **Native extensions (e.g. written in C or Rust) can block the fiber scheduler entirely** |
| 14 | + |
| 15 | +## Understanding Fibers vs Threads in Ruby |
| 16 | + |
| 17 | +### Fibers |
| 18 | + |
| 19 | +- **Cooperative multitasking** within a single thread |
| 20 | +- **Explicit yield points** (I/O operations, `Fiber.yield`) |
| 21 | +- **Separate stack** but shared heap memory within the same thread |
| 22 | +- **Faster context switching** than threads (lightweight ~4KB per fiber) |
| 23 | +- **No preemption** - code runs until it yields |
| 24 | +- **No true parallelism** - all fibers run sequentially within their thread |
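The cooperative model above can be sketched with plain fibers. This is only an illustration: the hand-rolled round-robin loop stands in for a real fiber scheduler, and the `worker` lambda and `log` array are hypothetical names.

```ruby
log = []

# Builds a fiber that records two steps with one explicit yield point between them.
worker = lambda do |name|
  Fiber.new do
    log << "#{name}: step 1"
    Fiber.yield # the only place where another fiber can run
    log << "#{name}: step 2"
  end
end

a = worker.call("a")
b = worker.call("b")

# A tiny stand-in for a scheduler: resume each fiber in turn.
[a, b, a, b].each(&:resume)

puts log
```

Nothing preempts a fiber between its yield points, so the interleaving is fully determined by where the code yields.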
| 25 | + |
| 26 | +### Threads |
| 27 | + |
| 28 | +- **Preemptive multitasking** with native OS threads |
| 29 | +- **Can be interrupted** at any point by the scheduler |
| 30 | +- **Separate stacks** but shared heap memory |
| 31 | +- **Heavier context switching** and memory usage |
| 32 | +- **Automatic execution** - OS scheduler manages thread execution |
| 33 | +- **Limited parallelism in MRI Ruby** due to the Global VM Lock (GVL, often called the GIL): |
| 34 | + - Threads only overlap during blocking I/O operations, when the GVL is released |
| 35 | + - C extensions that release the GVL can also run in parallel with Ruby code |
| 36 | + |
| 37 | +## Common patterns with potential issues |
| 38 | + |
| 39 | +1. **Memoization patterns** (`||=`) |
| 40 | + - This is NOT atomic and is problematic with class variables and shared mutable data |
| 41 | +2. **Class and module variables** (`@@variable`, `class_attribute`) |
| 42 | + - Should never be mutated if used |
| 43 | + - Should be treated as generally problematic |
| 44 | +3. **Shared mutable state** (class instance variables accessed by multiple threads/fibers) |
| 45 | + - AVOID |
| 46 | +4. **Lazy initialization** |
| 47 | + - Especially on shared mutable state |
| 48 | +5. **Hash and array mutations on shared objects** |
| 49 | +6. **C extensions that don't respect the fiber scheduler** |
| 50 | +7. **Thread-local storage** |
| 51 | + - When using a fiber scheduler |
| 52 | +8. **Synchronization mechanisms** (Mutex, ReadWriteLock) |
| 53 | + - Beware of deadlocks |
| 54 | +9. **Concurrent data structures** |
| 55 | + - e.g. `Concurrent::Set` if available can reduce risk of thread safety issues |
| 56 | + - This approach, like all approaches, has trade-offs |
| 57 | + - Can only be used if the concurrent-ruby gem is available |
| 58 | + |
| 59 | +## Common unsafe patterns |
| 60 | + |
| 61 | +### 1. Memoization with `||=` on shared data |
| 62 | + |
| 63 | +```ruby |
| 64 | +class Foo |
| 65 | +  def self.bar |
| 66 | +    # Issue: two threads can both see @data as nil, both run the fetch, and |
| 67 | +    # the later assignment silently overwrites the earlier one |
| 68 | +    @data ||= fetch_next_fizzbuzz_from_fizzbuzz_api |
| 69 | +  end |
| 70 | +end |
| 71 | +``` |
| 72 | + |
| 73 | +**Why is this problematic?**: Multiple threads can all see `@data` as `nil` before any of them assigns it, so each one runs the fetch. Best case this wastes resources, worst case the wrong data is used or data corruption occurs. |
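The race can be reproduced deterministically with fibers, since a yield in the middle of the fetch plays the role of an unlucky context switch. This is a sketch under assumptions: `SharedCache`, `slow_fetch`, and the `Fiber.yield` standing in for I/O are all hypothetical.

```ruby
class SharedCache
  attr_reader :fetches

  def initialize
    @fetches = 0
  end

  def data
    @data ||= slow_fetch # check and assignment are two separate steps
  end

  private

  def slow_fetch
    id = (@fetches += 1)
    Fiber.yield # stand-in for I/O: another caller can run here
    "result-#{id}"
  end
end

cache = SharedCache.new
seen = []
first = Fiber.new { seen << cache.data }
second = Fiber.new { seen << cache.data }

first.resume  # sees nil, starts fetch 1, yields
second.resume # still sees nil, starts fetch 2, yields
first.resume  # assigns "result-1"
second.resume # overwrites it with "result-2"

puts "fetches: #{cache.fetches}, seen: #{seen.inspect}"
```

Both callers ran the fetch and each observed a different value, which is exactly the outcome the memoization was meant to prevent.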
| 74 | + |
| 75 | +**Potential fix with mutex** (WARNING: locks like mutex could cause deadlocks) |
| 76 | + |
| 77 | +```ruby |
| 78 | +class Foo |
| 79 | + @mutex = Mutex.new |
| 80 | + |
| 81 | + def self.bar |
| 82 | + @mutex.synchronize do |
| 83 | + return @data if defined?(@data) |
| 84 | + @data = fetch_next_fizzbuzz_from_fizzbuzz_api |
| 85 | + end |
| 86 | + end |
| 87 | +end |
| 88 | +``` |
| 89 | + |
| 90 | +**Concurrent::Map example** - Requires the concurrent-ruby gem |
| 91 | + |
| 92 | +```ruby |
| 93 | +class Foo |
| 94 | +  @cache = Concurrent::Map.new |
| 95 | + |
| 96 | +  def self.bar(key:) |
| 97 | +    # compute_if_absent runs the block atomically, at most once per key, so |
| 98 | +    # concurrent callers for the same key all get the same cached value |
| 99 | +    @cache.compute_if_absent(key) { expensive_operation(key) } |
| 100 | +  end |
| 101 | +end |
| 102 | +``` |
| 103 | + |
| 104 | +**Safe if instances are not shared** |
| 105 | + |
| 106 | +```ruby |
| 107 | +class Foo |
| 108 | + def expensive_operation |
| 109 | + @result ||= calculate # Each instance has own @result |
| 110 | + end |
| 111 | +end |
| 112 | +``` |
| 113 | + |
| 114 | + |
| 115 | +### 2. Class Variables with shared state |
| 116 | + |
| 117 | +```ruby |
| 118 | +class GlobalConfig |
| 119 | + @@settings = {} # Issue: Class variables are shared across inheritance |
| 120 | + @settings = {} # Issue: Shared mutable state without synchronization |
| 121 | +end |
| 122 | +``` |
| 123 | + |
| 124 | +**Why is this problematic?**: Class variables are shared across the entire inheritance hierarchy, and unsynchronized shared mutable state causes race conditions. |
| 125 | + |
| 126 | +**Better alternatives:** |
| 127 | + |
| 128 | +- Use dependency injection for runtime configuration |
| 129 | +- Use `Concurrent::Hash.new` if shared state is required and the gem is available |
| 130 | +- Simply avoid if possible |
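As one possible shape for the dependency injection alternative, the sketch below passes a frozen settings object into the objects that need it instead of reading class-level state. `Settings` and `ApiClient` are hypothetical names chosen for illustration.

```ruby
# Immutable value object holding runtime configuration.
Settings = Struct.new(:timeout, :retries, keyword_init: true)

class ApiClient
  # The configuration is injected rather than looked up in @@settings.
  def initialize(settings)
    @settings = settings
  end

  def describe
    "timeout=#{@settings.timeout}s retries=#{@settings.retries}"
  end
end

settings = Settings.new(timeout: 5, retries: 3).freeze
client = ApiClient.new(settings)
puts client.describe

# Accidental mutation from any thread or fiber fails loudly.
mutation_blocked =
  begin
    settings.timeout = 10
    false
  rescue FrozenError
    true
  end
```

Because the object is frozen before it is shared, concurrent readers never need a lock.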
| 131 | + |
| 132 | +## Fiber safety |
| 133 | + |
| 134 | +**Key Difference**: When a fiber yields during I/O, another fiber may access the shared state. |
| 135 | + |
| 136 | +**Problematic - thread-wide storage for request-scoped data, e.g. when paired with fiber-based servers like Falcon:** |
| 137 | + |
| 138 | +```ruby |
| 139 | +class SomeRequestSpecificData |
| 140 | +  def track_request(some_request_specific_data) |
| 141 | +    # Issue: thread variables are shared by every fiber on this thread, so they leak between requests |
| 142 | +    Thread.current.thread_variable_set(:some_request_specific_data, some_request_specific_data) |
| 143 | +    external_api_call # Fiber yields, other requests may be processed |
| 144 | +    log("Processing #{Thread.current.thread_variable_get(:some_request_specific_data)}") # May have incorrect data! |
| 145 | +  end |
| 146 | +end |
| 147 | +``` |
| 148 | + |
| 149 | +**Why is this problematic?**: When a fiber yields during I/O, a fiber-based web server may process other requests on the same thread, and each one overwrites the shared value. Note that `Thread.current[:key]` is itself fiber-local in Ruby, so it does not leak between sibling fibers, but it is also not inherited by child fibers, whereas `Fiber[:key]` storage is. |
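A minimal sketch of how the two `Thread.current` storage scopes behave when two requests share a thread, using plain fibers as stand-ins for a fiber-based server. The `request` lambda and `log` array are hypothetical; `Thread#[]` is documented as fiber-local, while `thread_variable_set` and `thread_variable_get` are thread-wide.

```ruby
log = []

# Simulates one request handled inside its own fiber.
request = lambda do |id|
  Fiber.new do
    Thread.current.thread_variable_set(:request_id, id) # thread-wide
    Thread.current[:request_id] = id                    # fiber-local
    Fiber.yield # stand-in for I/O: the other request runs here
    log << [
      id,
      Thread.current.thread_variable_get(:request_id),
      Thread.current[:request_id]
    ]
  end
end

a = request.call("A")
b = request.call("B")
[a, b, a, b].each(&:resume)

log.each do |id, thread_wide, fiber_local|
  puts "request #{id}: thread variable=#{thread_wide} fiber-local=#{fiber_local}"
end
```

Request A reads back B's value from the thread variable, while the fiber-local slot keeps each request's own value.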
| 150 | + |
| 151 | +**Better - Use Fiber storage for request data (requires Ruby 3.2+):** |
| 152 | + |
| 153 | +```ruby |
| 154 | +class SomeRequestSpecificData |
| 155 | + def track_request(some_request_specific_data) |
| 156 | + Fiber[:some_request_specific_data] = some_request_specific_data |
| 157 | + external_api_call # Fiber yields |
| 158 | + log("Processing #{Fiber[:some_request_specific_data]}") |
| 159 | + end |
| 160 | +end |
| 161 | +``` |
| 162 | + |
| 163 | +**Acceptable - Thread.current for thread-level concerns:** |
| 164 | + |
| 165 | +```ruby |
| 166 | +def some_non_changing_config |
| 167 | + Thread.current[:some_non_changing_config] || default_config |
| 168 | +end |
| 169 | +``` |
| 170 | + |
| 171 | +## Essential Safe Patterns |
| 172 | + |
| 173 | +### 1. Mutex for Critical Sections |
| 174 | + |
| 175 | +```ruby |
| 176 | +class ResourcePool |
| 177 | + def initialize |
| 178 | + @resources = [] |
| 179 | + @mutex = Mutex.new |
| 180 | + end |
| 181 | + |
| 182 | + def checkout |
| 183 | + @mutex.synchronize { @resources.pop || create_resource } |
| 184 | + end |
| 185 | + |
| 186 | + def checkin(resource) |
| 187 | + @mutex.synchronize { @resources.push(resource) } |
| 188 | + end |
| 189 | +end |
| 190 | +``` |
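One way to use such a pool safely is to pair every checkout with a checkin in an `ensure` block. The sketch below repeats the pool so it runs on its own; `with_resource`, the `created` counter, and the string stub in `create_resource` are assumptions added for illustration.

```ruby
class ResourcePool
  attr_reader :created

  def initialize
    @resources = []
    @mutex = Mutex.new
    @created = 0
  end

  def checkout
    @mutex.synchronize { @resources.pop || create_resource }
  end

  def checkin(resource)
    @mutex.synchronize { @resources.push(resource) }
  end

  # Hypothetical helper: always returns the resource, even if the block raises.
  def with_resource
    resource = checkout
    yield resource
  ensure
    checkin(resource) if resource
  end

  private

  # Stub resource factory; only ever called while the mutex is held.
  def create_resource
    @created += 1
    "resource-#{@created}"
  end
end

pool = ResourcePool.new
threads = 4.times.map do
  Thread.new { 100.times { pool.with_resource { |r| r.length } } }
end
threads.each(&:join)

puts "resources created: #{pool.created}"
```

With four threads, at most four resources are ever checked out at once, so the pool never needs to create more than four.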
| 191 | + |
| 192 | +### 2. Concurrent Data Structures (if available) |
| 193 | + |
| 194 | +Examples: |
| 195 | + |
| 196 | +```ruby |
| 197 | +@cache = Concurrent::Map.new |
| 198 | +@config = Concurrent::Hash.new |
| 199 | +@enabled = Concurrent::AtomicBoolean.new(false) |
| 200 | +``` |
| 201 | + |
| 202 | +### 3. Request-Scoped storage when working with web servers |
| 203 | + |
| 204 | +```ruby |
| 205 | +# Use Fiber[:key] for request-scoped data |
| 206 | +Fiber[:user_context] = current_user |
| 207 | + |
| 208 | +# Access later |
| 209 | +def some_controller_method |
| 210 | + Fiber[:user_context] |
| 211 | +end |
| 212 | +``` |
| 213 | + |
| 214 | +## When `||=` can be safe |
| 215 | + |
| 216 | +### 1. Instance variables on unshared objects |
| 217 | + |
| 218 | +When each instance is only accessed by a single thread/fiber: |
| 219 | + |
| 220 | +```ruby |
| 221 | +class RequestHandler |
| 222 | + def process_request |
| 223 | + @parser ||= create_parser # Safe: each request has its own handler instance |
| 224 | + end |
| 225 | +end |
| 226 | +``` |
| 227 | + |
| 228 | +### 2. Synchronization primitives |
| 229 | + |
| 230 | +Synchronization objects are designed for concurrent access, so duplicate creation is wasteful but not harmful: |
| 231 | + |
| 232 | +```ruby |
| 233 | +class Task |
| 234 | + def wait |
| 235 | + @condition ||= Condition.new # Safe: condition variables handle concurrent access |
| 236 | + @condition.wait |
| 237 | + end |
| 238 | +end |
| 239 | +``` |
| 240 | + |
| 241 | +**Why this is acceptable**: |
| 242 | +- Condition variables, mutexes, and semaphores are intended for protecting critical sections |
| 243 | +- Synchronization still works correctly |
| 244 | + |
| 245 | +### 3. Immutable or effectively immutable objects |
| 246 | + |
| 247 | +```ruby |
| 248 | +class Calculator |
| 249 | + def pi |
| 250 | + @pi ||= Math::PI # Safe: immutable value |
| 251 | + end |
| 252 | + |
| 253 | + def default_config |
| 254 | + @config ||= Config.new.freeze # Safe: frozen object |
| 255 | + end |
| 256 | +end |
| 257 | +``` |
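One caveat worth sketching: `freeze` is shallow, so a frozen container can still hold mutable objects. The `deep_freeze` helper below is a hypothetical illustration, not a standard library method.

```ruby
config = { retries: 3, hosts: ["a.example"] }.freeze

top_level_blocked =
  begin
    config[:retries] = 5
    false
  rescue FrozenError
    true
  end

# The nested array is not frozen, so this mutation still succeeds.
config[:hosts] << "b.example"

# Hypothetical helper that freezes hashes and arrays recursively.
def deep_freeze(obj)
  case obj
  when Hash
    obj.each { |key, value| deep_freeze(key); deep_freeze(value) }
  when Array
    obj.each { |value| deep_freeze(value) }
  end
  obj.freeze
end

safe_config = deep_freeze({ retries: 3, hosts: ["a.example"] })

nested_blocked =
  begin
    safe_config[:hosts] << "b.example"
    false
  rescue FrozenError
    true
  end

puts "top-level blocked: #{top_level_blocked}, nested blocked: #{nested_blocked}"
```

An object is only safe to memoize and share this way if everything reachable from it is immutable as well.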
| 258 | + |
| 259 | +## Performance Considerations |
| 260 | + |
| 261 | +### Synchronization Overhead |
| 262 | + |
| 263 | +| Mechanism | Use Case | Performance Impact | |
| 264 | +| :---- | :---- | :---- | |
| 265 | +| Mutex | General purpose locking | Medium overhead, can cause contention | |
| 266 | +| ReadWriteLock | Read-heavy workloads | Better for many readers, few writers | |
| 267 | +| Concurrent::* | Lock-free operations | Generally faster, but higher memory usage | |
| 268 | +| Atomic operations | Simple counters/flags | Fastest for simple operations | |
| 269 | + |
| 270 | +### Choosing the Right Approach |
| 271 | + |
| 272 | +1. **No shared state** > **Immutable shared state** > **Synchronized mutable state** |
| 273 | +2. **Lock-free (Concurrent::*)** > **Fine-grained locks** > **Coarse-grained locks** |
| 274 | + |
| 275 | +## Quick reference |
| 276 | + |
| 277 | +### Thread Safety |
| 278 | + |
| 279 | +| Unsafe Pattern | Safe Alternative | |
| 280 | +| :---- | :---- | |
| 281 | +| `@data ||= expensive_operation` | `@mutex.synchronize { @data ||= expensive_operation }` | |
| 282 | +| `@@class_var` | dependency injection | |
| 283 | +| `@shared_array << item` | `@concurrent_set.add(item)` if available or mutex | |
| 284 | +| Nested mutex acquisition | Consistent lock ordering | |
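Consistent lock ordering from the table above can be sketched as follows. `Account` and `transfer` are hypothetical, and the sketch assumes `from` and `to` are always different accounts.

```ruby
class Account
  attr_reader :id, :balance, :lock

  def initialize(id, balance)
    @id = id
    @balance = balance
    @lock = Mutex.new
  end

  def deposit(amount)
    @balance += amount
  end

  def withdraw(amount)
    @balance -= amount
  end
end

# Always take the lock of the account with the lower id first, regardless
# of transfer direction, so two opposite transfers cannot deadlock.
def transfer(from, to, amount)
  first, second = [from, to].sort_by(&:id)
  first.lock.synchronize do
    second.lock.synchronize do
      from.withdraw(amount)
      to.deposit(amount)
    end
  end
end

a = Account.new(1, 100)
b = Account.new(2, 100)

threads = [
  Thread.new { 1_000.times { transfer(a, b, 1) } },
  Thread.new { 1_000.times { transfer(b, a, 1) } }
]
threads.each(&:join)

puts "a=#{a.balance} b=#{b.balance}"
```

If each thread instead locked `from` first and `to` second, the two threads could each hold one lock while waiting for the other.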
| 285 | + |
| 286 | +### Fiber Safety |
| 287 | + |
| 288 | +| Unsafe Pattern | Safe Alternative | |
| 289 | +| :---- | :---- | |
| 290 | +| `Thread.current[:some_request_based_id] = id` | `Fiber[:some_request_based_id] = id` | |
| 291 | +| `@cache ||= {}` (in shared objects) | `Fiber[:cache] ||= {}` | |
| 292 | + |
| 293 | +**Note**: `Thread.current` is ok for thread-level concerns like debugging |
| 294 | + |
| 295 | +## Summary |
| 296 | + |
| 297 | +1. **Prevent data corruption** - Primary production concern |
| 298 | +2. **Avoid shared mutable state** |
| 299 | +3. **Request isolation** - **Always use `Fiber[:key]` for request-scoped data** |
| 300 | +4. **Be aware of performance trade-offs** - Synchronization has costs |
| 301 | + |
| 302 | +### Key Takeaways |
| 303 | + |
| 304 | +- **Fibers != Threads**: Different models but **similar safety requirements** |
| 305 | +- **Avoid shared mutable state**: Use dependency injection or immutable objects |
| 306 | +- **Prevention over testing**: Write inherently thread-safe code |
| 307 | +- **Lock ordering matters**: Prevent deadlocks with consistent acquisition order |
| 308 | +- **Snapshot concurrent collections**: Before iteration to avoid inconsistencies |
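The snapshot takeaway can be sketched with a mutex-guarded collection: copy under the lock, then iterate over the copy with the lock released. `Registry` and `each_snapshot` are hypothetical names.

```ruby
class Registry
  def initialize
    @items = []
    @mutex = Mutex.new
  end

  def add(item)
    @mutex.synchronize { @items << item }
    self
  end

  def items
    @mutex.synchronize { @items.dup }
  end

  # Copy under the lock, then iterate without holding it, so the block
  # may safely call back into the registry.
  def each_snapshot(&block)
    items.each(&block)
  end
end

registry = Registry.new
registry.add(1).add(2)

seen = []
registry.each_snapshot do |item|
  seen << item
  registry.add(item * 10) # does not affect the snapshot being iterated
end

puts "seen=#{seen.inspect} items=#{registry.items.inspect}"
```

Iterating while holding the lock would make the callback's `add` fail, because Ruby's `Mutex` is not reentrant.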
| 309 | + |
| 310 | +**Critical**: Data corruption from race conditions is the primary concern. Prevention through proper design is a requirement. |