README.md (2 additions, 2 deletions)
@@ -73,12 +73,12 @@ Common mappings (`cv2` -> `opencv-python`, `PIL` -> `Pillow`, etc.) are built in
73
73
74
74
## Features
75
75
76
-
- **Automatic dependency detection** -- AST-based, recursive. Untraced helpers, class methods, module-level constants are all captured.
76
+
- **Automatic dependency detection** -- AST-based, recursive. Untraced helpers, class methods, module-level constants, class-level attributes, and class decorators are all captured.
77
77
- **Third-party package auto-install** -- Workers install missing packages via pip before execution.
78
78
- **Async support** -- `async def` functions execute transparently. `await result`, `.arun()`, `.amap()`, and `asyncio.gather` all work out of the box.
79
79
- **Notification-based result delivery** -- Push notifications fan out to many waiters via a single backend listener. No polling.
80
80
- **Heartbeat & stall detection** -- Workers send periodic heartbeats. Clients raise `TaskStalled` when a worker stops responding.
81
-
- **Class methods** -- `self.method()` and `cls.method()` dependencies are detected. Entire class hierarchies (including `super()`) are reconstructed.
81
+
- **Class methods** -- `self.method()` and `cls.method()` dependencies are detected. Entire class hierarchies (including `super()`), class-level attributes, decorators (`@dataclass`, etc.), and metaclass keywords are reconstructed.
82
82
- **Retry and timeout** -- `@trace(timeout=30, retries=3)` with exponential backoff.
docs/CONTEXT.md (17 additions, 10 deletions)
@@ -4,7 +4,7 @@
4
4
5
5
pyfuse is a Python library for distributed function execution via automatic source code serialization. A `@trace` decorator captures a function's source, imports, and full dependency tree via AST analysis. Workers reconstruct and execute functions from scratch with zero prior knowledge of the code, installing missing packages automatically.
│ └── tracing.py # Runtime call-stack tracing via contextvars (TracingMixin)
28
28
└── worker/
29
29
├── worker.py # Worker: reconstruct, cache, execute with retry/timeout
@@ -48,7 +48,7 @@ The `Store` is a content-addressable JSON format where each function is identifi
48
48
49
49
### 2. Core layer (`core/`)
50
50
51
-
Data models and error types. `FunctionNode` represents a function in the graph. `ImportInfo` represents a single import binding. `Task` is a frozen dataclass that bundles a serialized graph with function name, arguments, and execution options (timeout, retries).
51
+
Data models and error types. `FunctionNode` represents a function in the graph, including class metadata (`class_keywords`, `class_attrs`, `class_decorators`). `ImportInfo` represents a single import binding. `Task` is a frozen dataclass that bundles a serialized graph with function name, arguments, and execution options (timeout, retries).
52
52
53
53
### 3. Worker layer (`worker/`)
54
54
@@ -86,19 +86,20 @@ Worker.run(task)
86
86
- **Auto-discovery**: When a traced function calls an untraced user-defined function, pyfuse automatically finds and registers it. This is recursive. Class constructors (`MyClass()`), `@staticmethod`, `@classmethod`, and entire class hierarchies (via `super()`) are discovered too.
87
87
- **Cross-module inlining**: Imports like `from utils import helper` where `helper` is a user function get converted from import statements to inline dependency edges, making reconstructed code self-contained.
88
88
- **Module-level variables**: Constants and assignments (`MAX_RETRIES = 5`, `CONFIG = {...}`) referenced by traced functions are captured and emitted in reconstructed source.
89
-
- **Closure handling**: Captured variables are serialized via `repr()` and hoisted as keyword-only parameters with defaults. Traced function references become dependency edges.
89
+
- **Class-level attributes**: Class body statements (assignments, annotated assignments, docstrings) are extracted from AST and emitted in reconstructed class blocks. Class decorators (`@dataclass`, etc.) and metaclass keywords (`metaclass=ABCMeta`) are captured and emitted.
90
+
- **Closure handling**: Multi-tier capture: `repr()` validation, then lambda source extraction, then auto-discovery for non-traced user functions, then constructor expressions for common stdlib types (`defaultdict`, `Counter`, `deque`), then pickle fallback for picklable objects. Traced function references become dependency edges.
90
91
- **Decorator stripping**: `@trace` lines are removed from captured source so reconstructed code doesn't depend on pyfuse.
91
92
- **Backend auto-detection**: `connect()` picks Redis or shared memory based on URL scheme. Falls back to `PYFUSE_BACKEND` env var.
92
93
- **Worker caching**: Keyed by SHA-256 of all reachable content hashes (sorted + joined). Same code from different clients = cache hit.
93
94
- **Async transparency**: Workers detect `async def` functions and run them via `asyncio.run()`. Results are awaitable via `asyncio.Future` fan-out.
94
95
- **Notification fan-out**: `ResultWaiter` singleton per backend runs one listener thread and one heartbeat thread, serving all pending `Result` objects. No per-task polling.
95
96
- **Heartbeat monitoring**: Workers send heartbeats every 1s. Client-side stall detection tracks when heartbeat *values* last changed using a local monotonic clock (no cross-machine timestamp comparison).
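
The heartbeat bullet above describes stall detection that never compares clocks across machines. A minimal sketch of that idea follows. The class name `StallDetector`, its methods, and the `stall_after` threshold are hypothetical, not pyfuse's API; only the technique (timing *changes* in the heartbeat value against the client's own monotonic clock) comes from the text.

```python
import time


class StallDetector:
    """Flags a worker as stalled when its heartbeat value stops changing."""

    def __init__(self, stall_after, clock=time.monotonic):
        self.stall_after = stall_after  # seconds without a change (assumed unit)
        self.clock = clock              # local clock only; injectable for tests
        self._last_value = {}
        self._last_change = {}

    def observe(self, worker_id, value):
        # Only a *change* in the heartbeat value resets the timer, so the
        # worker's own timestamps never need to agree with the client's clock.
        if worker_id not in self._last_change or self._last_value[worker_id] != value:
            self._last_value[worker_id] = value
            self._last_change[worker_id] = self.clock()

    def is_stalled(self, worker_id):
        last = self._last_change.get(worker_id)
        return last is not None and self.clock() - last > self.stall_after
```

Because the heartbeat value is treated as opaque, a worker whose wall clock is skewed still looks healthy as long as the value keeps changing.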
96
97
97
-
## Serialization format (v0.3.0)
98
+
## Serialization format (v0.4.0)
98
99
99
100
```json
100
101
{
101
-
"version": "0.3.0",
102
+
"version": "0.4.0",
102
103
"objects": {
103
104
"<content_hash>": {
104
105
"name": "func_name",
@@ -109,7 +110,10 @@ Worker.run(task)
109
110
"closure_vars": {},
110
111
"closure_func_refs": {},
111
112
"module_vars": {},
112
-
"class_bases": []
113
+
"class_bases": [],
114
+
"class_keywords": {},
115
+
"class_attrs": [],
116
+
"class_decorators": []
113
117
}
114
118
},
115
119
"deps": {"<hash>": ["<dep_hash>", ...]},
@@ -155,7 +159,7 @@ pytest # run all tests
155
159
pytest tests/test_api.py # specific module
156
160
```
157
161
158
-
15 test modules covering: API surface, AST analysis, async features (aresult, await, arun, amap, gather, heartbeat, stall detection, notification-based result delivery), auto-discovery, dependency management, graph operations, integration scenarios, remote execution, runtime tracing, shared memory backend, store operations, stress tests (47 functions across 7 files), task serialization, temp venv management, and worker caching/execution.
162
+
15 test modules covering: API surface, AST analysis, async features (aresult, await, arun, amap, gather, heartbeat, stall detection, notification-based result delivery), auto-discovery (including metaclass keywords, class attributes, class decorators, `__init_subclass__`), dependency management, graph operations, integration scenarios, remote execution, runtime tracing (including closure capture of non-traced functions, lambdas, constructor expressions, pickle fallback), shared memory backend, store operations, stress tests (47 functions across 7 files), task serialization, temp venv management, and worker caching/execution.
159
163
160
164
## Development
161
165
@@ -175,3 +179,6 @@ pytest # test suite
175
179
- Backend implementations must satisfy the `Backend` ABC in `backends/base.py`. New methods (`notify_result`, `subscribe_results`, `get_heartbeats`) are non-abstract with safe defaults -- custom backends don't break.
176
180
- `ResultWaiter` in `result.py` is a per-backend singleton with two daemon threads (listener + heartbeat). It uses `loop.call_soon_threadsafe()` for async fan-out and `threading.Event` for sync fan-out.
177
181
- `install_package_as()` is a no-op at runtime; the AST analyzer in `decorator.py`/`analyzer.py` detects the `with` block pattern and tags `ImportInfo` objects with the package name.
182
+
- `_capture_closure()` in `graph.py` uses a multi-tier strategy: repr validation → traced functions → lambdas (source extraction) → non-traced user functions (auto-registration) → constructor expressions (defaultdict/Counter/deque) → pickle fallback → warning. Returns function objects for auto-registration.
183
+
- `_set_class_metadata()` in `graph.py` captures class-level attributes and decorators from the class source AST. Called from both `_auto_register_class` and `_discover_self_call_deps` to handle both constructor-discovered and directly-traced method classes.
184
+
- `_resolve_class_bases()` now also extracts class definition keywords (e.g., `metaclass=ABCMeta`) and adds necessary imports for keyword values.
tracing.py Runtime call-stack tracing via contextvars
@@ -153,17 +153,23 @@ Cross-module imports (e.g., `from utils import helper`) are converted from impor
153
153
154
154
Class constructors (`MyClass()`) are auto-discovered: pyfuse registers all user-defined methods of the class. `@staticmethod` and `@classmethod` descriptors are unwrapped and registered correctly. When a method uses `super()`, base classes and their methods are discovered recursively, and `class Foo(Base):` headers are emitted in reconstructed source.
155
155
156
+
Class-level attributes (assignments, annotated assignments, docstrings) are extracted from the class source AST and emitted in reconstructed class blocks. Class decorators (e.g., `@dataclass`) are captured and emitted above the class header. Metaclass keywords (e.g., `metaclass=ABCMeta`) and other class keywords are extracted from the class definition and included in the reconstructed header.
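
As an illustration of the paragraph above, here is a rough sketch of pulling decorators, class keywords, and simple body statements out of class source with the standard `ast` module. The function `class_metadata` is hypothetical and is not pyfuse's `_set_class_metadata()`. The returned keys reuse the `class_decorators`, `class_keywords`, and `class_attrs` field names from the serialization format. It assumes Python 3.9+ for `ast.unparse`.

```python
import ast
import textwrap


def class_metadata(source):
    """Describe the first class found in source (hypothetical helper)."""
    tree = ast.parse(textwrap.dedent(source))
    cls = next(n for n in ast.walk(tree) if isinstance(n, ast.ClassDef))

    attrs = []
    for stmt in cls.body:
        is_docstring = (
            isinstance(stmt, ast.Expr)
            and isinstance(stmt.value, ast.Constant)
            and isinstance(stmt.value.value, str)
        )
        # Keep assignments, annotated assignments, and docstrings; methods
        # are handled elsewhere as separate function nodes.
        if isinstance(stmt, (ast.Assign, ast.AnnAssign)) or is_docstring:
            attrs.append(ast.unparse(stmt))

    return {
        # Decorator source without the leading "@".
        "class_decorators": [ast.unparse(d) for d in cls.decorator_list],
        # Header keywords such as metaclass=ABCMeta; **kwargs entries are skipped.
        "class_keywords": {
            k.arg: ast.unparse(k.value) for k in cls.keywords if k.arg is not None
        },
        "class_attrs": attrs,
    }
```

Parsing never executes the class body, so names such as `dataclass` or `ABCMeta` do not need to be importable at analysis time.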
157
+
156
158
Module-level constants and variables referenced by traced functions (e.g., `MAX_RETRIES = 5`) are captured and emitted in reconstructed source.
157
159
158
160
**Not auto-discovered:** standard library functions, third-party packages (kept as imports).
159
161
160
162
### 5. Closure capture
161
163
162
-
If the function captures variables from an enclosing scope:
163
-
- Values are serialized via `repr()` and validated with `ast.parse()`.
164
-
- Valid reprs become keyword-only parameters with defaults in reconstructed code.
165
-
- Traced function references are recorded as dependency edges.
166
-
- Invalid reprs trigger a warning and are skipped.
164
+
If the function captures variables from an enclosing scope, pyfuse uses a multi-tier capture strategy:
165
+
166
+
1. **`repr()` validation** -- Values whose `repr()` is valid Python (passes `ast.parse()`) are stored directly. They become keyword-only parameters with defaults in reconstructed code.
167
+
2. **Traced functions** -- References to `@trace`-decorated functions are recorded as dependency edges.
168
+
3. **Lambda functions** -- Source is extracted via `inspect.getsource()` + AST walking, stored as a closure variable expression.
169
+
4. **Non-traced user functions** -- Automatically discovered and registered as dependencies (same as traced functions).
170
+
5. **Constructor expressions** -- Common stdlib types (`defaultdict`, `Counter`, `deque`) whose `repr()` isn't valid Python are captured via self-contained constructor expressions (e.g., `__import__('collections').defaultdict(int, {'a': 1})`).
171
+
6. **Pickle fallback** -- Picklable objects are serialized via `pickle.dumps()` + base64 encoding into a self-contained expression.
172
+
7. **Warning** -- Objects that can't be captured by any method trigger a warning with the variable name and type.
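
The value-oriented tiers in the list above (1, 5, 6, and 7) can be sketched as follows. This is a simplified illustration under stated assumptions, not pyfuse's `_capture_closure()`: the names `capture_value` and `is_expression` are hypothetical, tier 5 handles only `defaultdict` with a builtin factory, and the function tiers (2 to 4) are left out.

```python
import ast
import base64
import builtins
import pickle
import warnings
from collections import defaultdict


def is_expression(text):
    """True if text parses as a single Python expression."""
    try:
        ast.parse(text, mode="eval")
        return True
    except SyntaxError:
        return False


def capture_value(name, value):
    """Return a self-contained expression that rebuilds value, or None."""
    # Tier 1: a repr() that is itself valid Python is stored directly.
    text = repr(value)
    if is_expression(text):
        return text

    # Tier 5 (simplified): defaultdict whose factory is a builtin such as
    # int or list, so the factory can be named without an import.
    if isinstance(value, defaultdict):
        factory = value.default_factory
        factory_name = getattr(factory, "__name__", "")
        items = repr(dict(value))
        if getattr(builtins, factory_name, None) is factory and is_expression(items):
            return f"__import__('collections').defaultdict({factory_name}, {items})"

    # Tier 6: pickle + base64, wrapped in an expression that needs no imports.
    try:
        payload = base64.b64encode(pickle.dumps(value)).decode("ascii")
    except Exception:
        # Tier 7: nothing worked, so report the variable name and type.
        warnings.warn(
            f"cannot capture closure variable {name!r} of type {type(value).__name__}"
        )
        return None
    return f"__import__('pickle').loads(__import__('base64').b64decode('{payload}'))"
```

Each returned string can be emitted as the default of a keyword-only parameter, which is why every tier produces an expression that evaluates without extra import statements.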
167
173
168
174
### 6. Runtime tracing
169
175
@@ -204,7 +210,7 @@ When arguments contain class instances, a custom JSON encoder serializes them vi
Because dependencies are excluded from the hash, adding or removing an edge never changes a node's hash. This enables workers to cache objects by hash and request only missing ones: `missing = incoming.keys() - cached.keys()`.
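
A small sketch of why this works, assuming SHA-256 over a canonical JSON dump of the node's own fields. The source does not state the hash function or which fields are included, so `content_hash` and `missing_objects` are illustrative names only.

```python
import hashlib
import json


def content_hash(node):
    """Hash a node's own content; dependency edges live elsewhere."""
    payload = json.dumps(node, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def missing_objects(incoming, cached):
    """Hashes the worker still needs to request from the client."""
    return incoming.keys() - cached.keys()


helper = {"name": "helper", "source": "def helper():\n    return 1\n"}
main = {"name": "main", "source": "def main():\n    return helper()\n"}

objects = {content_hash(helper): helper, content_hash(main): main}
# Edges are stored next to the objects, not inside them, so editing
# this mapping cannot change any key in `objects`.
deps = {content_hash(main): [content_hash(helper)]}

worker_cache = {content_hash(helper): helper}
to_fetch = missing_objects(objects, worker_cache)  # only main's hash
```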
317
324
@@ -323,7 +330,7 @@ Given a store and a target function name:
323
330
2. **Walk** -- BFS through `deps` to collect all transitive dependencies.
324
331
3. **Sort** -- Topological sort: dependencies before dependents.
325
332
4. **Deduplicate imports** -- Merge imports across all functions.
326
-
5. **Assemble** -- Emit imports, then module-level variable assignments, then functions in order. Methods are grouped into `class` blocks with proper base classes. Closure variables become keyword-only parameters with defaults.
333
+
5. **Assemble** -- Emit imports, then module-level variable assignments, then functions in order. Methods are grouped into `class` blocks with decorators, base classes, metaclass keywords, class-level attributes, and methods. Closure variables become keyword-only parameters with defaults.
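
The walk and sort steps above can be sketched with the standard library. This assumes `deps` has the `{"<hash>": ["<dep_hash>", ...]}` shape shown in the serialization format. The helper names `reachable` and `emission_order` are hypothetical, and `graphlib.TopologicalSorter` (Python 3.9+) stands in for whatever sort pyfuse actually uses.

```python
from collections import deque
from graphlib import TopologicalSorter


def reachable(deps, root):
    """BFS from root through deps, returning every transitive dependency."""
    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for dep in deps.get(node, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen


def emission_order(deps, root):
    """Order reachable nodes so dependencies come before dependents."""
    nodes = reachable(deps, root)
    # TopologicalSorter expects node -> predecessors, which is exactly
    # node -> dependencies here. static_order() raises graphlib.CycleError
    # if the graph contains a cycle.
    graph = {node: list(deps.get(node, [])) for node in nodes}
    return list(TopologicalSorter(graph).static_order())
```

Restricting the sort to the reachable set keeps unrelated functions in the store out of the reconstructed source.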
327
334
328
335
## Data model
329
336
@@ -356,6 +363,9 @@ One function in the dependency graph:
356
363
| `closure_func_refs` | `dict[str, str]` -- references to traced functions captured in closures |
| `class_decorators` | `list[str]` -- class decorator source strings (without `@` prefix) |
359
369
360
370
## Dependency auto-installation
361
371
@@ -449,16 +459,15 @@ When a module contains `from X import *`:
449
459
- Circular dependencies raise `CycleError` during reconstruction.
450
460
451
461
### Closure capture
452
-
- Values whose `repr()` is not valid Python (file handles, sockets, etc.) are skipped with a warning.
453
-
- Non-traced callables captured in closures are skipped.
462
+
- Objects that are neither repr-serializable, picklable, nor user-defined callables are skipped with a warning (e.g., file handles, sockets).
454
463
455
464
### Imports
456
465
- Relative star imports (`from . import *`) are not supported.
457
466
- Aliased cross-module imports (`from utils import helper as h`) are skipped to avoid name mismatches.
458
467
459
468
### Classes
460
-
- Metaclasses and `__init_subclass__` hooks are not replayed on the worker.
461
-
- Class-level attributes that aren't assignments (e.g., descriptors created by external decorators) may not be captured.
469
+
- Metaclasses and `__init_subclass__` hooks are replayed when the parent class is in the dependency tree (i.e., referenced via `super()` or constructor call).
470
+
- Class-level attributes defined via complex descriptors or external decorators (beyond simple assignments) may not be captured.