Background
When the ARC ingestion pipeline dispatches a sync task to Celery, the ARC is currently passed as a Python dict. Celery's JSON serializer then serializes it internally, and on the worker side json.dumps must be called again before ARC.from_rocrate_json_string can be invoked — two serialization steps.
The API client already uses the intended pattern: it passes the ARC as a serialized JSON string, not a dict.
Proposed Change
Pass the ARC as a JSON string at the point of Celery dispatch (API side). The worker receives the string and can call ARC.from_rocrate_json_string directly — one explicit serialization step, no intermediate json.dumps.
Affected code:
- middleware/api/src/middleware/api/business_logic/arc_manager.py: serialize to JSON string before dispatching
- Celery task handler (worker side): remove any intermediate json.dumps call
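A minimal sketch of the two flows, for illustration only. It does not import Celery: simulate_celery_transport is a hypothetical stand-in that mimics a JSON round trip of the task arguments, and parse_rocrate is a hypothetical stand-in for ARC.from_rocrate_json_string. The task function names and the sample payload are likewise assumptions, not the actual handler code.

```python
import json


def simulate_celery_transport(*args):
    """Hypothetical stand-in for Celery's JSON serializer: encode the task
    arguments on the producer side, decode them on the worker side."""
    return json.loads(json.dumps(list(args)))


def parse_rocrate(rocrate_json: str) -> dict:
    """Hypothetical stand-in for ARC.from_rocrate_json_string."""
    if not isinstance(rocrate_json, str):
        raise TypeError("expected a JSON string")
    return json.loads(rocrate_json)


def sync_task_current(arc_dict: dict) -> dict:
    # Current flow: the worker receives a dict and must re-serialize it.
    return parse_rocrate(json.dumps(arc_dict))


def sync_task_proposed(rocrate_json: str) -> dict:
    # Proposed flow: the worker receives the string and parses it directly.
    return parse_rocrate(rocrate_json)


arc = {"@id": "./", "name": "demo-arc"}  # made-up sample payload

# Current: dict crosses the boundary, json.dumps runs again on the worker.
(received_dict,) = simulate_celery_transport(arc)
result_current = sync_task_current(received_dict)

# Proposed: serialize once on the API side, pass the string through as-is.
payload = json.dumps(arc)
(received_str,) = simulate_celery_transport(payload)
result_proposed = sync_task_proposed(received_str)
```

Both flows yield the same parsed result here. The difference is that the proposed flow has a single explicit json.dumps on the API side and none on the worker.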
Pros of JSON string over dict
- One explicit serialization step instead of two
- Celery treats the string as an opaque value — no risk of float-rounding or key-order changes introduced by Celery's own serializer
- Directly compatible with ARC.from_rocrate_json_string on the worker, with no intermediate step
- Easier to log and debug (the payload is already a valid JSON document)
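The "opaque value" point can be illustrated with the standard library alone. This sketch assumes Python's json module behaves like the serializer Celery uses for task arguments. The sample document is made up. It shows that a string payload survives a JSON round trip character for character, while decoding to a dict and re-encoding can change how a number is written.

```python
import json

# Made-up JSON document with a number written in exponent notation.
original_text = '{"b": 1.0, "a": 1e2}'

# String payload: the transport wraps it in a JSON string and unwraps it.
string_after_transport = json.loads(json.dumps(original_text))

# Dict payload: decoded once, then re-encoded on the worker side.
dict_payload = json.loads(original_text)
text_after_reencode = json.dumps(dict_payload)

print(string_after_transport == original_text)  # True
print(text_after_reencode == original_text)     # False
print(text_after_reencode)                      # {"b": 1.0, "a": 100.0}
```

The parsed values are equal in both cases. Only the textual form differs, which matters if the exact document is logged, hashed, or compared downstream.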
Related
middleware/api/spec/arc-manager/design.md — Key Decision 7 (currently documents dict; update once implemented)
spec/principles.md — "ARC objects must not cross process boundaries via pickle"