Commit e6b05f2

Audit docs, drop dead code, add snippet linter (#65)
Fix two stale identifier references (Py_GIL_OWN, multi_executor fallback), delete priv/_erlang_impl/_ssl.py and three uncalled py_util/0,1,2,3 exports, and repair a broken SharedDict example in docs/shared-dict.md. Add tests for py:cast/4, py:async_gather/2 and py:dup_fd/1, plus test/coverage_audit.md mapping every public API to its suite. New scripts/lint_doc_snippets.escript validates py:Fn/N calls and Python syntax in fenced blocks; wired into CI and a Makefile target.
1 parent dad2ede commit e6b05f2

13 files changed

Lines changed: 502 additions & 388 deletions

.github/workflows/ci.yml

Lines changed: 3 additions & 0 deletions
@@ -92,6 +92,9 @@ jobs:
       - name: Run xref
         run: rebar3 xref

+      - name: Lint docs
+        run: escript scripts/lint_doc_snippets.escript
+
   # FreeBSD test using cross-platform action
   test-freebsd:
     name: FreeBSD 14 / Python ${{ matrix.python }}

Makefile

Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+.PHONY: all compile test lint-docs clean
+
+all: compile
+
+compile:
+	rebar3 compile
+
+test:
+	rebar3 ct --readable=compact
+
+# Validate fenced code blocks in README.md and docs/*.md.
+# Erlang `py:Fn(...)` calls must reference a real export at the right
+# arity; Python blocks must parse (IndentationError tolerated for
+# tutorial fragments). Mark a block to skip with `<!-- skip-lint -->`
+# on the line immediately above the opening fence.
+lint-docs: compile
+	escript scripts/lint_doc_snippets.escript
+
+clean:
+	rebar3 clean
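
The lint rules described in the Makefile comment can be illustrated with a small Python sketch. This is not the escript added by the commit: it covers only the Python half of the check (the `py:Fn/N` export lookup is omitted), and the names `iter_fenced_blocks` and `lint_python_blocks` are hypothetical.

```python
import re

# Opening or closing fence: three backticks plus an optional language tag.
FENCE = re.compile(r"^```(\w*)\s*$")
SKIP_MARKER = "<!-- skip-lint -->"


def iter_fenced_blocks(markdown):
    """Yield (lang, fence_line, source, skipped) for each fenced block."""
    lines = markdown.splitlines()
    i = 0
    while i < len(lines):
        match = FENCE.match(lines[i])
        if not match:
            i += 1
            continue
        lang = match.group(1)
        # The marker must sit on the line immediately above the opening fence.
        skipped = i > 0 and lines[i - 1].strip() == SKIP_MARKER
        fence_line = i + 1  # 1-based line number of the opening fence
        body = []
        i += 1
        while i < len(lines) and not FENCE.match(lines[i]):
            body.append(lines[i])
            i += 1
        i += 1  # step past the closing fence
        yield lang, fence_line, "\n".join(body) + "\n", skipped


def lint_python_blocks(markdown):
    """Return (fence_line, message) for Python blocks that fail to parse."""
    problems = []
    for lang, fence_line, source, skipped in iter_fenced_blocks(markdown):
        if lang != "python" or skipped:
            continue
        try:
            compile(source, "<doc-snippet>", "exec")
        except IndentationError:
            # Tolerated: tutorial fragments often start mid-function.
            continue
        except SyntaxError as exc:
            problems.append((fence_line, exc.msg))
    return problems
```

`IndentationError` is a subclass of `SyntaxError`, so it has to be caught first for the tolerance rule to take effect.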

docs/migration.md

Lines changed: 6 additions & 3 deletions
@@ -38,6 +38,7 @@ application:set_env(erlang_python, context_mode, owngil).

 **`py:num_executors/0`** - Removed. Contexts now use per-context worker threads.

+<!-- skip-lint -->
 ```erlang
 %% v2.x - check executor count
 N = py:num_executors().
@@ -254,6 +255,7 @@ N = py_context_router:num_contexts().
 The function for non-blocking Python calls has been renamed to follow gen_server conventions:

 **Before (v1.8.x):**
+<!-- skip-lint -->
 ```erlang
 Ref = py:call_async(math, factorial, [100]),
 {ok, Result} = py:await(Ref).
@@ -355,6 +357,7 @@ For more sophisticated web framework integration, consider the [Reactor API](rea
 The process-binding functions have been removed. The new architecture uses `py_context_router` for automatic scheduler-affinity routing.

 **Before (v1.8.x):**
+<!-- skip-lint -->
 ```erlang
 ok = py:bind(),
 ok = py:exec(<<"x = 42">>),
@@ -760,9 +763,9 @@ ImportError: module does not support subinterpreters
 ```

 Options:
-1. Use Python < 3.12 (falls back to multi_executor mode)
-2. Check if the library has a subinterpreter-compatible version
-3. Isolate the library usage to a single context
+1. Use Python 3.12 or 3.13: the runtime uses `worker` mode (subinterpreters require Python 3.14+).
+2. Check if the library has a subinterpreter-compatible version.
+3. Isolate the library usage to a single context.

 ### Python 3.14: `erlang_loop_import_failed`
768771

docs/owngil_internals.md

Lines changed: 46 additions & 15 deletions
@@ -425,22 +425,53 @@ class EchoProtocol(reactor.Protocol):

 ## Performance Characteristics

-| Operation | Shared-GIL | OWN_GIL |
-|-----------|-----------|---------|
+| Operation | Worker (shared GIL) | OWN_GIL |
+|-----------|--------------------|---------|
 | Call overhead | ~2.5μs | ~10μs |
-| Throughput (single) | 400K/s | 100K/s |
-| Parallelism | None | True |
-| Resource usage | Lower | Higher (1 pthread per context) |
-
-Use OWN_GIL when:
-- CPU-bound Python work that benefits from parallelism
-- Long-running computations
-- Need true concurrent Python execution
-
-Use worker mode when:
-- I/O-bound or short operations
-- High call frequency
-- Resource constraints
+| Throughput (single context) | ~400K/s | ~100K/s |
+| Parallelism (N contexts) | GIL-bound | Linear up to N cores |
+| Resource usage | One pthread per context | One pthread + one subinterpreter per context |
+
+## Pros and Cons
+
+### Pros
+
+- **True CPU parallelism.** Each context owns its GIL, so N contexts run on N cores at once. Worker mode serialises on the main GIL unless Python is built free-threaded (3.13t+).
+- **Crash isolation.** A C-level fault in one subinterpreter leaves the others alive. Worker mode shares the main interpreter, so a corrupt module state can take everything down.
+- **Clean namespace per context.** Each subinterpreter has its own `sys.modules`, so module-level state cannot bleed between contexts. Useful when running adversarial or untrusted code paths side by side.
+- **Predictable scheduling.** Requests are dispatched via mutex/condvar IPC, not dirty schedulers, so OWN_GIL contexts will not be starved by other dirty NIF traffic.
+
+### Cons
+
+- **Python 3.14+ only.** Earlier versions have C-extension global-state bugs (`_decimal`, `numpy`, etc.) that crash inside subinterpreters. See [cpython#106078](https://github.com/python/cpython/issues/106078).
+- **Higher per-call latency.** ~4x the round-trip cost of worker mode (~10μs vs ~2.5μs) because every call crosses a mutex/condvar handoff to the dedicated thread.
+- **Higher memory.** Each subinterpreter imports its own copy of every module. A 50 MB module set across 8 contexts is ~400 MB resident, not 50 MB.
+- **C-extension compatibility is not universal.** Extensions must opt in via the multi-phase init protocol (PEP 489) and `Py_mod_multiple_interpreters`. Pure-Python and well-behaved C extensions work; older ones fail at import inside the subinterpreter.
+- **No shared Python state.** Module globals, class definitions, and cached objects are per-interpreter. Use `py:state_store/2` (ETS-backed) or `erlang.send` for cross-context data.
+- **Callback re-entry is restricted.** When Python in an OWN_GIL context calls `erlang.call`, the callback runs on a thread worker, not back on the OWN_GIL thread (which cannot suspend). Re-entrant Python -> Erlang -> *same* OWN_GIL context calls will not work; use a different context for the nested call, or use `erlang.async_call` from asyncio code.
+- **Process-local envs do not span interpreters.** A `py_env_resource_t` is bound to the interpreter that created it. Reusing one across contexts returns `{error, env_wrong_interpreter}`.
+
+### When to Use Each
+
+Use **OWN_GIL** when:
+
+- The workload is CPU-bound Python (ML inference, numpy/torch compute, parsing, codecs) and you want N-way parallelism per BEAM scheduler.
+- You can pin the per-context memory budget and the modules in use are subinterpreter-safe.
+- You are on Python 3.14+.
+
+Use **worker** (default) when:
+
+- You are on Python 3.12 or 3.13.
+- Calls are short and frequent (every microsecond of overhead matters).
+- You are running modules that are not subinterpreter-safe (some scientific stacks, older C extensions).
+- You are already running free-threaded Python (3.13t+); worker mode gets parallelism for free without the per-interpreter memory cost.
+
+### Common Pitfalls
+
+- **Importing once is not enough.** Imports happen per subinterpreter. Pre-warming a worker context will not pre-warm the OWN_GIL contexts; do it inside each `py_context`.
+- **Sharing Python objects across contexts.** Passing a `PyObject*` reference (via `py_state` or otherwise) between OWN_GIL contexts is undefined behaviour. Round-trip through Erlang terms or ETS-backed state.
+- **Long-running tasks block the dispatcher.** A single OWN_GIL context processes one request at a time. If you have a 30-second compute job, parallelise across contexts; do not queue everything onto context 1.
+- **Callback storms.** Heavy `erlang.call` use inside an OWN_GIL context routes to thread workers, which is fine, but the round-trip cost is then worker-style on top of OWN_GIL dispatch. For tight callback loops, prefer worker mode end-to-end.

 ## Benchmarking
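
The figures in this hunk's table and "Higher memory" bullet can be combined into a back-of-envelope estimate. The sketch below is a simplified model, not a benchmark. It assumes the approximate ~2.5 μs and ~10 μs call overheads quoted in the table, assumes worker mode is fully serialised on one GIL, and assumes OWN_GIL scales linearly up to the core count. All function names are hypothetical.

```python
# Approximate per-call overheads taken from the table (microseconds).
WORKER_OVERHEAD_US = 2.5
OWN_GIL_OVERHEAD_US = 10.0


def worker_calls_per_sec(work_us):
    """Aggregate throughput when every call serialises on the shared GIL."""
    return 1_000_000 / (WORKER_OVERHEAD_US + work_us)


def own_gil_calls_per_sec(work_us, contexts, cores):
    """Aggregate throughput, assuming linear scaling up to the core count."""
    parallel = min(contexts, cores)
    return parallel * 1_000_000 / (OWN_GIL_OVERHEAD_US + work_us)


def resident_memory_mb(module_set_mb, contexts):
    """Each subinterpreter imports its own copy of every module."""
    return module_set_mb * contexts
```

Under these assumptions, 100 μs of Python work per call on 8 contexts and 8 cores favours OWN_GIL by roughly 7x, while near-zero work on a single context favours worker mode by 4x. `resident_memory_mb(50, 8)` reproduces the 400 MB figure from the "Higher memory" bullet.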

docs/scalability.md

Lines changed: 6 additions & 2 deletions
@@ -108,6 +108,10 @@ Ctx = py:context(1),
 - Higher memory usage (each interpreter loads modules separately)
 - Some C extensions don't support subinterpreters
 - Requires Python 3.14+
+- Higher per-call latency (~4x worker)
+- Callback re-entry to the same context is restricted (`erlang.call` from inside an OWN_GIL context routes to a thread worker, not back to that context)
+
+For a fuller breakdown of OWN_GIL tradeoffs, common pitfalls, and a usage decision guide, see [OWN_GIL Internals: Pros and Cons](owngil_internals.md#pros-and-cons).

 ## Subinterpreter Architecture

@@ -144,7 +148,7 @@ Ctx = py:context(1),
 │ │ └──────────┘ │ │ └──────────┘ │ │ └──────────┘ │ │
 │ └──────────────┘ └──────────────┘ └──────────────┘ │
 │ │
-│ Each thread owns its interpreter's GIL (Py_GIL_OWN)
+│ Each thread owns its GIL (PyInterpreterConfig_OWN_GIL)
 │ No GIL contention between threads │
 └─────────────────────────────────────────────────────────────────┘
 ```
@@ -155,7 +159,7 @@ Ctx = py:context(1),

 **py_context_process**: Gen_server that owns a Python context reference and handles call/eval/exec operations.

-**Subinterpreter Thread Pool (C)**: Manages N threads, each with its own Python subinterpreter created with `Py_NewInterpreterFromConfig()` and `Py_GIL_OWN`.
+**Subinterpreter Thread Pool (C)**: Manages N threads, each with its own Python subinterpreter created with `Py_NewInterpreterFromConfig()` and `PyInterpreterConfig_OWN_GIL`.

 ### Request Flow

docs/security.md

Lines changed: 2 additions & 0 deletions
@@ -42,6 +42,7 @@ This provides defense-in-depth - even if Python code tries to import `os` or `su

 When blocked operations are attempted, you'll see:

+<!-- skip-lint -->
 ```python
 >>> import subprocess
 >>> subprocess.run(['ls'])
@@ -50,6 +51,7 @@ fork()/exec() would corrupt the Erlang runtime.
 Use Erlang ports (open_port/2) for subprocess management.
 ```

+<!-- skip-lint -->
 ```python
 >>> import os
 >>> os.fork()

docs/shared-dict.md

Lines changed: 11 additions & 6 deletions
@@ -279,16 +279,21 @@ ok = py:shared_dict_destroy(Session).
 %% Create shared cache
 {ok, Cache} = py:shared_dict_new(),

-%% Python can populate the cache
+%% Inject the handle into Python globals (py:exec/1 has no locals
+%% argument, so we stash it via py:eval with a side effect).
+{ok, _} = py:eval(
+<<"(globals().__setitem__('_cache_handle', handle), None)[-1]">>,
+#{handle => Cache}),
+
+%% Python can now populate the cache
 ok = py:exec(<<"
 from erlang import SharedDict
-cache = SharedDict(handle)
-cache['computed'] = expensive_computation()
-">>,
-ok = py:eval(<<"1">>, #{<<"handle">> => Cache}),
+cache = SharedDict(_cache_handle)
+cache['computed'] = 42
+">>),

 %% Erlang can read cached values
-CachedValue = py:shared_dict_get(Cache, <<"computed">>).
+42 = py:shared_dict_get(Cache, <<"computed">>).
 ```

 ## See Also
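
The expression passed to `py:eval` in the repaired example relies on a plain-Python idiom: a tuple whose first element performs a side effect and whose last element is the value returned. The sketch below reproduces it with the built-in `eval`. It assumes, without confirmation from this diff, that `py:eval` exposes the map entries as local names in a comparable way. The `inject` helper and the `object()` stand-in for the SharedDict handle are hypothetical.

```python
EXPR = "(globals().__setitem__('_cache_handle', handle), None)[-1]"


def inject(namespace, handle):
    """Evaluate EXPR with `handle` as a local, storing it in `namespace`."""
    return eval(EXPR, namespace, {"handle": handle})


module_globals = {}
fake_handle = object()  # stand-in for the SharedDict handle
result = inject(module_globals, fake_handle)

# A later exec sharing the same globals can now see the handle by name.
exec("cache = _cache_handle", module_globals)
```

The expression evaluates to `None`, so nothing but the side effect is carried back to the caller; the handle itself stays in the globals for later statements.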
