README.md: 9 additions & 6 deletions
@@ -9,6 +9,7 @@
`pyfuse` captures a function's source code, dependencies, and imports via a single `@trace` decorator.<br/>Workers reconstruct and execute the function from scratch – no deployment, no shared filesystem. Packages are installed automatically.
Only the entry point needs `@trace`. Everything it calls -- `add()`, imports, class methods -- is captured automatically.
@@ -75,12 +79,11 @@ Common mappings (`cv2` -> `opencv-python`, `PIL` -> `Pillow`, etc.) are built in
 - **Automatic dependency detection** -- AST-based, recursive. Untraced helpers, class methods, module-level constants, class-level attributes, and class decorators are all captured.
 - **Third-party package auto-install** -- Workers install missing packages via pip before execution.
-- **Async support** -- `async def` functions execute transparently. `await result`, `.arun()`, `.amap()`, and `asyncio.gather` all work out of the box.
-- **Notification-based result delivery** -- Push notifications fan out to many waiters via a single backend listener. No polling.
+- **Async-native** -- The entire I/O layer is built on `asyncio`. `.run()`, `.start()`, `.map()`, `await result`, and `asyncio.gather` all work out of the box.
 - **Heartbeat & stall detection** -- Workers send periodic heartbeats. Clients raise `TaskStalled` when a worker stops responding.
 - **Class methods** -- `self.method()` and `cls.method()` dependencies are detected. Entire class hierarchies (including `super()`), class-level attributes, decorators (`@dataclass`, etc.), and metaclass keywords are reconstructed.
 - **Retry and timeout** -- `@trace(timeout=30, retries=3)` with exponential backoff.
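The retry-and-timeout bullet can be illustrated with plain `asyncio`. This is a minimal sketch, not pyfuse's implementation: `run_with_policy`, `base_delay`, and `flaky` are hypothetical names, and it assumes `retries=3` means up to three further attempts after the first, with the delay doubling each time.

```python
import asyncio


async def run_with_policy(func, *args, timeout=30.0, retries=3, base_delay=0.5):
    """Await func(*args) with a per-attempt timeout, retrying on failure.

    Hypothetical helper: the delay doubles after each failed attempt
    (base_delay, 2 * base_delay, 4 * base_delay, ...).
    """
    attempt = 0
    while True:
        try:
            # wait_for cancels the attempt and raises a timeout error if it overruns
            return await asyncio.wait_for(func(*args), timeout=timeout)
        except Exception:
            if attempt >= retries:
                raise  # retries exhausted: surface the last error
            await asyncio.sleep(base_delay * (2 ** attempt))
            attempt += 1


calls = {"count": 0}


async def flaky():
    # Fails twice, then succeeds, to exercise the retry path.
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("transient failure")
    return "ok"


result = asyncio.run(run_with_policy(flaky, timeout=1.0, retries=3, base_delay=0.01))
```

A timeout counts as a failed attempt here, so a function that always hangs is tried `retries + 1` times before the timeout error propagates.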
@@ -52,7 +52,13 @@ Data models and error types. `FunctionNode` represents a function in the graph,
 ### 3. Worker layer (`worker/`)
-Handles remote execution. The `Worker` class reconstructs functions from serialized stores, caches compiled namespaces by subgraph content hash, and executes with retry/timeout policies (including `async def` functions via `asyncio.run()`). `remote.py` orchestrates the connection lifecycle, worker event loop, and heartbeat threads. `Result` is an awaitable future returned by `.run()`. The `ResultWaiter` singleton per backend uses push notifications to fan out results to many waiters without polling.
+Handles remote execution. Built entirely on `asyncio`:
+
+- **`worker.py`**: `Worker` class reconstructs functions from serialized stores, caches compiled namespaces by subgraph content hash, and executes with retry/timeout policies. Async user functions are awaited directly; sync user functions run in `loop.run_in_executor()`. Timeouts use `asyncio.wait_for()`.
+- **`remote.py`**: Orchestrates the connection lifecycle, worker event loop (`asyncio.TaskGroup` + `asyncio.Semaphore` for bounded concurrency), and heartbeat tasks (`asyncio.create_task`).
+- **`result.py`**: `Result` is an awaitable future returned by `.start()`. Simple async polling loop for stall detection.
+- **`deps.py`**: Package installation via `asyncio.create_subprocess_exec`.
+- **`backends/`**: All backend methods are `async def`. `listen()` and `subscribe_results()` are async generators.
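The execution model described for `worker.py` and `remote.py` can be sketched with the standard library alone. The names `execute`, `bounded`, `sync_square`, and `async_double` are hypothetical, and the sketch uses `asyncio.gather` in place of `asyncio.TaskGroup` so that it also runs on Python versions before 3.11.

```python
import asyncio
import inspect
import time


async def execute(func, *args, timeout=None):
    """Run func without blocking the event loop (hypothetical stand-in for Worker execution)."""
    if inspect.iscoroutinefunction(func):
        call = func(*args)  # async function: await the coroutine directly
    else:
        loop = asyncio.get_running_loop()
        # sync function: hand it to the default thread pool executor
        call = loop.run_in_executor(None, func, *args)
    return await asyncio.wait_for(call, timeout=timeout)


async def bounded(semaphore, func, *args):
    # The semaphore caps how many tasks execute at the same time.
    async with semaphore:
        return await execute(func, *args, timeout=5.0)


def sync_square(x):
    time.sleep(0.01)  # blocking call, harmless here because it runs in a thread
    return x * x


async def async_double(x):
    await asyncio.sleep(0.01)
    return 2 * x


async def main():
    semaphore = asyncio.Semaphore(2)  # at most two tasks in flight
    return await asyncio.gather(
        bounded(semaphore, sync_square, 3),
        bounded(semaphore, async_double, 4),
        bounded(semaphore, sync_square, 5),
    )


results = asyncio.run(main())
```

One caveat of this approach: when `asyncio.wait_for()` times out on an executor call, the awaiting side is released but the thread itself keeps running until the sync function returns.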
 ## Data flow
@@ -63,19 +69,21 @@ Handles remote execution. The `Worker` class reconstructs functions from seriali
  -> graph.register(func) # add FunctionNode to graph
- -> backend.send_result(task_id, envelope) # return result
+ -> await namespace[func_name](*args) # execute (or run_in_executor for sync)
+ -> await backend.send_result(task_id, envelope) # return result
```
## Key patterns and conventions
@@ -87,13 +95,12 @@ Worker.run(task)
 - **Cross-module inlining**: Imports like `from utils import helper` where `helper` is a user function get converted from import statements to inline dependency edges, making reconstructed code self-contained.
 - **Module-level variables**: Constants and assignments (`MAX_RETRIES = 5`, `CONFIG = {...}`) referenced by traced functions are captured and emitted in reconstructed source.
 - **Class-level attributes**: Class body statements (assignments, annotated assignments, docstrings) are extracted from AST and emitted in reconstructed class blocks. Class decorators (`@dataclass`, etc.) and metaclass keywords (`metaclass=ABCMeta`) are captured and emitted.
-- **Closure handling**: Multi-tier capture: `repr()` validation, then lambda source extraction, then auto-discovery for non-traced user functions, then constructor expressions for common stdlib types (`defaultdict`, `Counter`, `deque`), then pickle fallback for picklable objects. Traced function references become dependency edges.
 - **Decorator stripping**: `@trace` lines are removed from captured source so reconstructed code doesn't depend on pyfuse.
 - **Backend auto-detection**: `connect()` picks Redis or shared memory based on URL scheme. Falls back to `PYFUSE_BACKEND` env var.
 - **Worker caching**: Keyed by SHA-256 of all reachable content hashes (sorted + joined). Same code from different clients = cache hit.
-- **Async transparency**: Workers detect `async def` functions and run them via `asyncio.run()`. Results are awaitable via `asyncio.Future` fan-out.
-- **Notification fan-out**: `ResultWaiter` singleton per backend runs one listener thread and one heartbeat thread, serving all pending `Result` objects. No per-task polling.
-- **Heartbeat monitoring**: Workers send heartbeats every 1s. Client-side stall detection tracks when heartbeat *values* last changed using local monotonic clock (no cross-machine timestamp comparison).
+- **Async-native I/O**: All backend methods, worker execution, result handling, pip installation, and subprocess management use `asyncio`. Sync user functions run in `loop.run_in_executor()` to avoid blocking the event loop.
+- **Heartbeat**: Workers send heartbeats via `asyncio.create_task`. Client-side stall detection tracks when heartbeat *values* last changed using local monotonic clock (no cross-machine timestamp comparison).
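Two of the conventions above are compact enough to sketch: the worker cache key and client-side stall detection. Both pieces are illustrative assumptions rather than pyfuse's code. `cache_key`, `StallDetector`, the newline separator, and the `stall_after` threshold are hypothetical. The point is that the key ignores discovery order, and that stall detection compares only local clock readings.

```python
import hashlib
import time


def cache_key(content_hashes):
    """SHA-256 over the sorted, joined content hashes of a subgraph."""
    joined = "\n".join(sorted(content_hashes))  # the separator is an assumption
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


class StallDetector:
    """Tracks when a heartbeat value last changed, using only the local clock."""

    def __init__(self, stall_after, clock=time.monotonic):
        self.stall_after = stall_after
        self.clock = clock
        self.last_value = None
        self.last_change = clock()

    def observe(self, value):
        # Only a changed value counts as progress; the value itself is opaque,
        # so the worker's clock never has to agree with the client's.
        if value != self.last_value:
            self.last_value = value
            self.last_change = self.clock()

    def stalled(self):
        return self.clock() - self.last_change > self.stall_after


# Order of discovery does not affect the key.
key_a = cache_key(["h2", "h1", "h3"])
key_b = cache_key(["h3", "h2", "h1"])

# A fake clock keeps the example deterministic.
now = [0.0]
detector = StallDetector(stall_after=3.0, clock=lambda: now[0])
detector.observe(1)          # first heartbeat at t=0
now[0] = 2.0
detector.observe(2)          # value changed at t=2
now[0] = 4.0
detector.observe(2)          # same value again: no progress recorded
fresh = detector.stalled()   # 2 seconds since last change: not stalled
now[0] = 6.0
stale = detector.stalled()   # 4 seconds since last change: stalled
```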
 ## Serialization format (v0.4.0)
@@ -128,24 +135,24 @@ Worker.run(task)
 @trace # capture function
 @trace(timeout=30, retries=3) # with execution options
-15 test modules covering: API surface, AST analysis, async features (aresult, await, arun, amap, gather, heartbeat, stall detection, notification-based result delivery), auto-discovery (including metaclass keywords, class attributes, class decorators, `__init_subclass__`), dependency management, graph operations, integration scenarios, remote execution, runtime tracing (including closure capture of non-traced functions, lambdas, constructor expressions, pickle fallback), shared memory backend, store operations, stress tests (47 functions across 7 files), task serialization, temp venv management, and worker caching/execution.
+15 test modules covering: API surface, AST analysis, async features (Result.result, await, .run(), .start(), .map(), gather, heartbeat, stall detection), auto-discovery (including metaclass keywords, class attributes, class decorators, `__init_subclass__`), dependency management, graph operations, integration scenarios, remote execution, runtime tracing (including closure capture of non-traced functions, lambdas, constructor expressions, pickle fallback), shared memory backend, store operations, stress tests (47 functions across 7 files), task serialization, temp venv management, and worker caching/execution.
+
+All async tests use `pytest-asyncio` with `asyncio_mode = "auto"`.
 ## Development
@@ -176,8 +185,7 @@ pytest # test suite
 - `analyzer.py` is the core of static analysis (~365 lines). Changes here affect what gets captured.
 - `tracing.py` uses `contextvars.ContextVar` for thread/async safety. The `_runtime_deps` dict is guarded by `threading.Lock`.
 - The `Task` wire format keeps `graph` as a JSON string (not nested object) to keep the envelope flat.
-- Backend implementations must satisfy the `Backend` ABC in `backends/base.py`. New methods (`notify_result`, `subscribe_results`, `get_heartbeats`) are non-abstract with safe defaults -- custom backends don't break.
-- `ResultWaiter` in `result.py` is a per-backend singleton with two daemon threads (listener + heartbeat). It uses `loop.call_soon_threadsafe()` for async fan-out and `threading.Event` for sync fan-out.
+- Backend implementations must satisfy the `Backend` ABC in `backends/base.py`. All methods are `async def`. New methods (`notify_result`, `subscribe_results`, `get_heartbeats`) are non-abstract with safe defaults -- custom backends don't break.
 - `install_package_as()` is a no-op at runtime; the AST analyzer in `decorator.py`/`analyzer.py` detects the `with` block pattern and tags `ImportInfo` objects with the package name.
 - `_capture_closure()` in `graph.py` uses a multi-tier strategy: repr validation → traced functions → lambdas (source extraction) → non-traced user functions (auto-registration) → constructor expressions (defaultdict/Counter/deque) → pickle fallback → warning. Returns function objects for auto-registration.
 - `_set_class_metadata()` in `graph.py` captures class-level attributes and decorators from the class source AST. Called from both `_auto_register_class` and `_discover_self_call_deps` to handle both constructor-discovered and directly-traced method classes.
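The multi-tier strategy noted for `_capture_closure()` can be pictured as a chain of fallbacks. The sketch below is a simplified assumption, not the code in `graph.py`: `capture_value` is a hypothetical function that models only three tiers (repr validation, constructor expressions, pickle fallback) and omits the function and lambda tiers, so a lambda ends up as unsupported here.

```python
import pickle
from collections import Counter, defaultdict, deque

# Factories that can be named directly in a constructor expression (assumed list).
_FACTORIES = (int, list, dict, set, str, float)


def capture_value(value):
    """Return (tier, payload) describing how a closure value could be rebuilt."""
    # Tier 1: repr validation. Accept repr() only if it evaluates back to an equal value.
    try:
        text = repr(value)
        if eval(text, {"__builtins__": {}}) == value:
            return ("repr", text)
    except Exception:
        pass
    # Tier 2: constructor expressions for common stdlib containers.
    # Assumes the container contents are themselves plain literals.
    if isinstance(value, defaultdict) and value.default_factory in _FACTORIES:
        name = value.default_factory.__name__
        return ("constructor", f"defaultdict({name}, {dict(value)!r})")
    if isinstance(value, Counter):
        return ("constructor", f"Counter({dict(value)!r})")
    if isinstance(value, deque):
        return ("constructor", f"deque({list(value)!r}, maxlen={value.maxlen!r})")
    # Tier 3: pickle fallback for anything picklable.
    try:
        return ("pickle", pickle.dumps(value))
    except Exception:
        return ("unsupported", None)


literal = capture_value([1, 2, 3])
counter = capture_value(Counter("aab"))
queue = capture_value(deque([1, 2], maxlen=5))
frozen = capture_value(frozenset({1, 2}))
anonymous = capture_value(lambda: 0)
```

A constructor payload is source text, so it can be emitted straight into reconstructed code, whereas a pickle payload has to travel as bytes alongside it.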