PytorchConnectomics · akgohain · Apr 13, 2026
diff --git a/docs/integrations/cave-integration-spike.md b/docs/integrations/cave-integration-spike.md
@@ -0,0 +1,82 @@
+# CAVE Integration Spike (Non-invasive)
+
+## Purpose and target use case
+
+This spike defines how the existing workflow artifacts produced by this app could be transformed into **CAVE-compatible payloads** for downstream visualization and annotation pipelines, while preserving current behavior.
+
+Target use case in this app:
+- A workflow route produces internal artifacts (e.g., dataset IDs, segmentation IDs, metadata blobs, and coordinate references).
+- An explicit future integration step (not in the default execution path) invokes a CAVE adapter to:
+  1. normalize workflow artifacts into a CAVE request model,
+  2. submit or hand off payloads to a CAVE-facing client,
+  3. map CAVE responses back into app-level integration results.
+
+Non-goals for this spike:
+- No live CAVE API calls.
+- No authentication or network wiring.
+- No changes to existing routes, services, or startup behavior.
+
+## Expected adapter inputs and outputs
+
+### Input contract (adapter-facing)
+
+Expected input object fields for conversion:
+- `workflow_id: str` — stable workflow/run identifier.
+- `artifact_uri: str` — URI/path to workflow-produced artifact.
+- `dataset_id: str` — source dataset identifier used by this app.
+- `segmentation_id: str | None` — optional segmentation or object collection handle.
+- `point_xyz: tuple[float, float, float] | None` — optional spatial anchor.
+- `metadata: dict[str, object] | None` — optional passthrough metadata for provenance.
+
+### Output contract (adapter-facing)
+
+Expected output object fields after conversion from CAVE-like response:
+- `workflow_id: str` — original workflow ID echoed for correlation.
+- `cave_object_id: str` — external object/resource identifier.
+- `status: str` — normalized adapter status (`"ready"`, `"pending"`, `"error"`).
+- `detail: dict[str, object]` — provider-specific details retained for debugging.
+
+## Mapping: workflow artifacts → CAVE concepts
+
+| App workflow artifact | CAVE concept | Notes |
+|---|---|---|
+| `dataset_id` | CAVE `datastack` / dataset namespace | one-to-one candidate mapping; final naming TBD |
+| `segmentation_id` | CAVE segmentation/table reference | may map to a root ID table depending on deployment |
+| `point_xyz` | CAVE spatial query point | coordinate frame assumptions must be validated |
+| `artifact_uri` | CAVE payload provenance reference | used for traceability, not transport itself |
+| `metadata` | CAVE annotation/provenance fields | allow selective pass-through with allowlist |
+
+## Proposed integration flow (future, explicit path only)
+
+1. Existing workflow completes artifact generation exactly as today.
+2. Optional integration entrypoint calls `CaveWorkflowAdapter.build_payload(...)`.
+3. A future network client (not part of this spike) sends payload to CAVE.
+4. Optional integration entrypoint calls `CaveWorkflowAdapter.parse_result(...)`.
+5. Caller decides whether/how to persist mapped result.
+
+Because steps 2–5 are opt-in and not wired into current routes, runtime behavior remains unchanged.
+
+## TODOs and assumptions
+
+### Auth assumptions (TODO)
+- TODO: define auth mode (service account token vs user-delegated token).
+- TODO: define secure token source (secret manager / env var injection policy).
+- TODO: define token refresh and expiration handling contract.
+
+### Network assumptions (TODO)
+- TODO: identify CAVE endpoint base URL(s) per environment.
+- TODO: define retry/backoff and timeout defaults for network client.
+- TODO: define circuit-breaker behavior for upstream CAVE outages.
+
+### Deployment assumptions (TODO)
+- TODO: decide whether adapter + client run in API container or async worker.
+- TODO: define feature flag gating and rollout plan (staging → production).
+- TODO: define observability requirements (structured logs, metrics, tracing).
+
+## Actionable next steps
+
+1. Confirm canonical mapping for `dataset_id`, `segmentation_id`, and coordinate frame with platform owners.
+2. Add an explicit feature flag and non-default route hook for integration execution.
+3. Implement a separate CAVE client module with auth/network concerns isolated from adapter mapping logic.
+4. Add integration tests with mocked CAVE responses and failure scenarios.
+5. Prepare deployment runbook covering credentials, endpoints, and rollback steps.
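Review note: the opt-in flow in the doc (steps 2 to 5) could be gated roughly as follows. This is a minimal sketch and not part of the diff. `run_cave_integration`, `integration_enabled`, the `CAVE_INTEGRATION_ENABLED` variable, the `AdapterLike` protocol, and the injected `send` callable are all hypothetical names; only `build_payload` and `parse_result` come from the PR.

```python
from __future__ import annotations

import os
from typing import Any, Callable, Mapping, Optional, Protocol


class AdapterLike(Protocol):
    """Structural type matching the two adapter methods named in the spike."""

    def build_payload(self, artifact: Any) -> Any: ...

    def parse_result(self, workflow_id: str, response: dict[str, Any]) -> Any: ...


def integration_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Hypothetical feature flag: off unless CAVE_INTEGRATION_ENABLED is "1"."""
    source = os.environ if env is None else env
    return source.get("CAVE_INTEGRATION_ENABLED", "0") == "1"


def run_cave_integration(
    artifact: Any,
    workflow_id: str,
    adapter: AdapterLike,
    send: Callable[[Any], dict[str, Any]],
    env: Optional[Mapping[str, str]] = None,
) -> Optional[Any]:
    """Run steps 2-4 of the proposed flow; return None when the flag is off."""
    if not integration_enabled(env):
        return None  # default path: nothing is built, sent, or parsed
    payload = adapter.build_payload(artifact)  # step 2
    response = send(payload)  # step 3: stand-in for the future network client
    # Step 4; persisting the mapped result (step 5) is left to the caller.
    return adapter.parse_result(workflow_id, response)
```

Passing the client in as a plain callable keeps auth and network concerns out of the adapter, which matches next step 3, and lets tests substitute a fake without patching.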
diff --git a/server_api/workflow/cave_adapter.py b/server_api/workflow/cave_adapter.py
@@ -0,0 +1,64 @@
+"""CAVE workflow adapter spike.
+
+This module intentionally provides interface stubs only.
+It is inert unless explicitly imported and called by future integration code.
+"""
+
+from __future__ import annotations
+
+from dataclasses import dataclass
+from typing import Any
+
+
+@dataclass(frozen=True)
+class WorkflowArtifact:
+    """App-level artifact payload expected by the adapter."""
+
+    workflow_id: str
+    artifact_uri: str
+    dataset_id: str
+    segmentation_id: str | None = None
+    point_xyz: tuple[float, float, float] | None = None
+    metadata: dict[str, Any] | None = None
+
+
+@dataclass(frozen=True)
+class CavePayload:
+    """Normalized payload to be consumed by a future CAVE client."""
+
+    workflow_id: str
+    datastack: str
+    segmentation_ref: str | None
+    point_xyz: tuple[float, float, float] | None
+    provenance: dict[str, Any]
+
+
+@dataclass(frozen=True)
+class CaveResult:
+    """Mapped adapter output normalized for app-level callers."""
+
+    workflow_id: str
+    cave_object_id: str
+    status: str
+    detail: dict[str, Any]
+
+
+class CaveWorkflowAdapter:
+    """Adapter interface scaffold for future CAVE integration.
+
+    No network/auth behavior is implemented in this spike.
+    """
+
+    def build_payload(self, artifact: WorkflowArtifact) -> CavePayload:
+        """Convert a workflow artifact into a CAVE payload.
+
+        TODO: finalize mapping with production CAVE schema.
+        """
+        raise NotImplementedError("Spike scaffold: mapping logic is not implemented")
+
+    def parse_result(self, workflow_id: str, response: dict[str, Any]) -> CaveResult:
+        """Convert a CAVE response into a normalized app-level result.
+
+        TODO: finalize result normalization once CAVE response schema is fixed.
+        """
+        raise NotImplementedError("Spike scaffold: result parsing is not implemented")
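Review note: to make the mapping table concrete, here is one way the two stubs might eventually be filled in. It is a sketch under stated assumptions, not the planned implementation: the allowlist keys, the `"id"` and `"state"` response fields, and the `STATUS_MAP` values are invented for illustration, since the CAVE schema is still marked TODO. The dataclasses are repeated only so the block runs on its own.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Optional

# Hypothetical allowlist; the real set of pass-through keys is still TBD.
METADATA_ALLOWLIST = frozenset({"created_by", "pipeline_version"})
# Hypothetical provider states mapped onto the three documented statuses.
STATUS_MAP = {"done": "ready", "queued": "pending", "running": "pending"}


@dataclass(frozen=True)
class WorkflowArtifact:  # mirrors the dataclass in the diff
    workflow_id: str
    artifact_uri: str
    dataset_id: str
    segmentation_id: Optional[str] = None
    point_xyz: Optional[tuple[float, float, float]] = None
    metadata: Optional[dict[str, Any]] = None


@dataclass(frozen=True)
class CavePayload:  # mirrors the dataclass in the diff
    workflow_id: str
    datastack: str
    segmentation_ref: Optional[str]
    point_xyz: Optional[tuple[float, float, float]]
    provenance: dict[str, Any]


@dataclass(frozen=True)
class CaveResult:  # mirrors the dataclass in the diff
    workflow_id: str
    cave_object_id: str
    status: str
    detail: dict[str, Any]


def build_payload(artifact: WorkflowArtifact) -> CavePayload:
    """Apply the mapping table: one-to-one IDs, allowlisted metadata only."""
    metadata = artifact.metadata or {}
    kept = {k: v for k, v in metadata.items() if k in METADATA_ALLOWLIST}
    return CavePayload(
        workflow_id=artifact.workflow_id,
        datastack=artifact.dataset_id,  # assumed one-to-one mapping
        segmentation_ref=artifact.segmentation_id,
        point_xyz=artifact.point_xyz,  # coordinate frame is not validated here
        provenance={"artifact_uri": artifact.artifact_uri, **kept},
    )


def parse_result(workflow_id: str, response: dict[str, Any]) -> CaveResult:
    """Normalize a CAVE-like response; unknown or missing states map to "error"."""
    return CaveResult(
        workflow_id=workflow_id,
        cave_object_id=str(response.get("id", "")),
        status=STATUS_MAP.get(str(response.get("state")), "error"),
        detail=dict(response),  # shallow copy kept for debugging
    )
```

Defaulting unrecognized states to `"error"` is a deliberate choice in this sketch so that a schema change upstream surfaces as a failure rather than a silent `"ready"`.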
diff --git a/tests/test_cave_adapter_contract.py b/tests/test_cave_adapter_contract.py
@@ -0,0 +1,49 @@
+"""Contract tests for CAVE adapter spike scaffolding."""
+
+from __future__ import annotations
+
+import importlib.util
+import sys
+from pathlib import Path
+
+
+MODULE_PATH = Path(__file__).resolve().parents[1] / "server_api" / "workflow" / "cave_adapter.py"
+
+
+def _load_module():
+    spec = importlib.util.spec_from_file_location("cave_adapter", MODULE_PATH)
+    assert spec is not None and spec.loader is not None
+    module = importlib.util.module_from_spec(spec)
+    sys.modules[spec.name] = module  # register before exec, per the importlib recipe
+    spec.loader.exec_module(module)
+    return module
+
+
+def test_adapter_module_can_be_loaded_explicitly_without_side_effects():
+    module = _load_module()
+    assert hasattr(module, "CaveWorkflowAdapter")
+    assert hasattr(module, "WorkflowArtifact")
+
+
+def test_adapter_interface_methods_are_stubbed_only():
+    module = _load_module()
+    adapter = module.CaveWorkflowAdapter()
+    artifact = module.WorkflowArtifact(
+        workflow_id="wf-1",
+        artifact_uri="s3://bucket/artifact.json",
+        dataset_id="dataset-A",
+    )
+
+    try:
+        adapter.build_payload(artifact)
+    except NotImplementedError as exc:
+        assert "Spike scaffold" in str(exc)
+    else:
+        raise AssertionError("build_payload should be a stub in this spike")
+
+    try:
+        adapter.parse_result("wf-1", {"id": "obj-1"})
+    except NotImplementedError as exc:
+        assert "Spike scaffold" in str(exc)
+    else:
+        raise AssertionError("parse_result should be a stub in this spike")
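Review note: the two try/except/else blocks in the last test repeat the same check. A small helper could express it once. The sketch below is illustrative only; `assert_spike_stub` and `_StubAdapter` are hypothetical names, and `_StubAdapter` stands in for `CaveWorkflowAdapter` so the block runs without the repository on the path.

```python
from __future__ import annotations

from typing import Any, Callable


def assert_spike_stub(call: Callable[..., Any], *args: Any) -> None:
    """Fail unless call(*args) raises NotImplementedError mentioning the spike."""
    try:
        call(*args)
    except NotImplementedError as exc:
        assert "Spike scaffold" in str(exc)
    else:
        name = getattr(call, "__name__", repr(call))
        raise AssertionError(f"{name} should be a stub in this spike")


class _StubAdapter:
    """Stand-in with the same stub behavior as CaveWorkflowAdapter in the diff."""

    def build_payload(self, artifact: Any) -> Any:
        raise NotImplementedError("Spike scaffold: mapping logic is not implemented")

    def parse_result(self, workflow_id: str, response: dict[str, Any]) -> Any:
        raise NotImplementedError("Spike scaffold: result parsing is not implemented")


adapter = _StubAdapter()
assert_spike_stub(adapter.build_payload, object())
assert_spike_stub(adapter.parse_result, "wf-1", {"id": "obj-1"})
```

If pytest is the test runner, `pytest.raises(NotImplementedError, match="Spike scaffold")` expresses the same check without a helper.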