This is a demo-ready skeleton implementing the case study described in Scispot Case Study (1).pdf (see prompt). It runs an example 96-well ELISA analysis workflow end-to-end with pluggable backends:
- S3: real AWS S3 (recommended) or mock filesystem fallback under data/s3/ (app/mocks.py)
- Elasticsearch: real Docker Elasticsearch (recommended) or mock JSONL fallback under data/es/ (app/mocks.py)
- Agentic report: multi-step “agent” sequence (Summarizer → Comparator → ReportWriter), with optional Cerebras LLM
flowchart LR
Client[Scientist / UI / CLI] --> API[FastAPI Service]
API -->|Start/Query| Temporal[Temporal Cluster]
Temporal --> Worker[Python Worker]
Worker --> S3[(S3: AWS or local mock)]
Worker --> ES[(Elasticsearch: Docker or local mock)]
Worker --> LLM["Agent (Cerebras or stub)"]
- Workflow builder concept: workflow templates are stored as JSON with a list of reusable steps (a tiny DSL).
- Temporal:
- WorkflowRunnerWorkflow: executes a template step-by-step.
- ElisaAnalysisWorkflow: the anchor scenario as a concrete workflow (also runnable directly).
- FastAPI surface:
- Create/update template
- Trigger a run for an experiment
- Check run status
- Fetch outputs (analysis object + report)
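The "tiny DSL" idea above can be sketched as a JSON template plus a registry that maps step names to Python callables. This is a minimal illustration, not the project's actual schema or runner: the field names (`steps`, `name`, `params`), the step names, the `blank_wells` parameter, and the sample well values are all hypothetical.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical template shape; the project's real schema may differ.
TEMPLATE_JSON = """
{
  "template_id": "elisa_v1",
  "steps": [
    {"name": "load_raw_plate", "params": {}},
    {"name": "analyze_plate", "params": {"blank_wells": ["A1", "A2"]}},
    {"name": "write_report", "params": {}}
  ]
}
"""

Context = Dict[str, Any]
StepFn = Callable[[Context, Dict[str, Any]], Context]
STEP_REGISTRY: Dict[str, StepFn] = {}


def step(name: str) -> Callable[[StepFn], StepFn]:
    """Register a reusable step under the name used in templates."""
    def decorator(fn: StepFn) -> StepFn:
        STEP_REGISTRY[name] = fn
        return fn
    return decorator


@step("load_raw_plate")
def load_raw_plate(ctx: Context, params: Dict[str, Any]) -> Context:
    # Stand-in data; a real step would read the plate from storage.
    return {**ctx, "raw": {"A1": 0.05, "A2": 0.07, "B1": 1.25}}


@step("analyze_plate")
def analyze_plate(ctx: Context, params: Dict[str, Any]) -> Context:
    # Subtract the mean of the blank wells from every well.
    blanks = [ctx["raw"][well] for well in params["blank_wells"]]
    blank_mean = sum(blanks) / len(blanks)
    corrected = {w: round(v - blank_mean, 4) for w, v in ctx["raw"].items()}
    return {**ctx, "corrected": corrected}


@step("write_report")
def write_report(ctx: Context, params: Dict[str, Any]) -> Context:
    return {**ctx, "report": f"{len(ctx['corrected'])} wells analyzed"}


def run_template(template: Dict[str, Any], ctx: Context) -> Context:
    """Execute each step in order, threading the context through."""
    for spec in template["steps"]:
        name = spec["name"]
        if name not in STEP_REGISTRY:
            raise ValueError(f"unknown step: {name}")
        ctx = STEP_REGISTRY[name](ctx, spec.get("params", {}))
    return ctx


result = run_template(json.loads(TEMPLATE_JSON), {"experiment_id": "EXP-123"})
print(result["report"])
```

In a Temporal setting, each registered step would more plausibly be an activity invoked by the workflow rather than a plain function call; the plain-function version keeps the sketch runnable without a cluster.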
- Start Temporal + UI:
docker compose up -d
If you’re running this from WSL and see Docker permission errors, run with sudo (for demo simplicity).
- Create and activate a venv, install deps:
python -m venv .venv
.\.venv\Scripts\activate
pip install -r requirements.txt
- Start the worker (in one terminal):
python -m app.worker
- Start the API (in another terminal):
python -m app.api
- Temporal UI: http://localhost:8081
- API docs: http://localhost:8000/docs

By default, the agent step is a stub (no external LLM calls). To enable a real model call, set:
- LLM_PROVIDER=cerebras + CEREBRAS_API_KEY=... (Cerebras OpenAI-compatible endpoint; default model gpt-oss-120b)
Optional overrides:
- CEREBRAS_MODEL (default: gpt-oss-120b)
- CEREBRAS_BASE_URL (default: https://api.cerebras.ai/v1)
- CEREBRAS_TIMEOUT_SECONDS (default: 30)
- LLM_FALLBACK_TO_STUB (default: true): if false, the workflow will fail (and Temporal will retry) on LLM errors.

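One way these settings could be read and applied is sketched below. The variable names and defaults come from the list above; everything else is an assumption for illustration, including the `LLMConfig` class, the `"stub"` value used when `LLM_PROVIDER` is unset, and the `call_llm` / `stub` callables that stand in for the real client and the built-in stub.

```python
import os
from dataclasses import dataclass
from typing import Callable, Mapping, Optional


@dataclass(frozen=True)
class LLMConfig:  # hypothetical container, not the project's own class
    provider: str
    api_key: Optional[str]
    model: str
    base_url: str
    timeout_seconds: float
    fallback_to_stub: bool


def load_llm_config(env: Mapping[str, str] = os.environ) -> LLMConfig:
    """Read the documented variables, applying the documented defaults."""
    fallback = env.get("LLM_FALLBACK_TO_STUB", "true").strip().lower()
    return LLMConfig(
        provider=env.get("LLM_PROVIDER", "stub"),  # "stub" is an assumed default
        api_key=env.get("CEREBRAS_API_KEY"),
        model=env.get("CEREBRAS_MODEL", "gpt-oss-120b"),
        base_url=env.get("CEREBRAS_BASE_URL", "https://api.cerebras.ai/v1"),
        timeout_seconds=float(env.get("CEREBRAS_TIMEOUT_SECONDS", "30")),
        fallback_to_stub=fallback in {"1", "true", "yes"},
    )


def generate_report(
    prompt: str,
    cfg: LLMConfig,
    call_llm: Callable[[str, LLMConfig], str],
    stub: Callable[[str], str],
) -> str:
    """Use the real model only when enabled; otherwise (or on error) use the stub."""
    if cfg.provider != "cerebras":
        return stub(prompt)
    try:
        return call_llm(prompt, cfg)
    except Exception:
        if cfg.fallback_to_stub:
            return stub(prompt)
        # Re-raise so the surrounding activity fails and Temporal can retry it.
        raise


def stub_report(prompt: str) -> str:
    return f"[stub report] {prompt[:40]}"
```

Passing the client in as a callable keeps the fallback logic testable without network access.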
By default, this project uses a local filesystem “mock S3” under data/s3/.
If you created a real S3 bucket (single bucket + prefixes), set:
- S3_BACKEND=aws
- S3_BUCKET=scispot-case-study (your bucket name)
- AWS_REGION=us-east-2 (or your bucket’s region)
Optional:
- S3_KEY_PREFIX=dev (writes to s3://<bucket>/dev/raw-plates/..., etc.)
The code will store objects under prefixes matching the existing demo buckets:
- raw-plates/<experiment_id>/<run_id>/raw_plate.json
- analysis-results/<experiment_id>/<run_id>/analysis.json
- reports/<experiment_id>/<run_id>/report.json
- runs/<workflow_run_id>/run_manifest.json
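The key layout above, including the optional `S3_KEY_PREFIX`, can be expressed as a small pure function. This is a sketch under the stated layout; the function name `build_object_keys` and the dictionary labels are hypothetical and not taken from the project code.

```python
from typing import Dict


def build_object_keys(
    experiment_id: str,
    run_id: str,
    workflow_run_id: str,
    key_prefix: str = "",
) -> Dict[str, str]:
    """Return deterministic object keys for one run, optionally under a prefix."""
    keys = {
        "raw_plate": f"raw-plates/{experiment_id}/{run_id}/raw_plate.json",
        "analysis": f"analysis-results/{experiment_id}/{run_id}/analysis.json",
        "report": f"reports/{experiment_id}/{run_id}/report.json",
        "run_manifest": f"runs/{workflow_run_id}/run_manifest.json",
    }
    prefix = key_prefix.strip("/")
    if prefix:
        keys = {name: f"{prefix}/{key}" for name, key in keys.items()}
    return keys


keys = build_object_keys("EXP-123", "run-1", "wf-1", key_prefix="dev")
print(keys["raw_plate"])
```

Because the keys depend only on the identifiers, a retried activity writes to the same location and overwrites its earlier attempt instead of creating duplicates.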
Credentials:
- Use any standard AWS credential method supported by boto3 (env vars, ~/.aws/credentials, IAM role).

By default, this project uses a local JSONL “mock Elasticsearch” under data/es/.
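For readers unfamiliar with the idea, a JSONL-backed mock index can be as simple as appending one JSON document per line and filtering on read. The sketch below is an illustration of that approach only; the `JsonlIndex` class, its method names, and the record fields are hypothetical and may not match what app/mocks.py does.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, List


class JsonlIndex:
    """Append-only JSONL file standing in for a single search index."""

    def __init__(self, root: Path, index_name: str) -> None:
        root.mkdir(parents=True, exist_ok=True)
        self.path = root / f"{index_name}.jsonl"

    def index(self, doc_id: str, doc: Dict[str, Any]) -> None:
        # One JSON object per line; re-indexing an id appends a newer version.
        with self.path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps({"_id": doc_id, "_source": doc}) + "\n")

    def search(self, **term_filters: Any) -> List[Dict[str, Any]]:
        """Return the latest version of each document matching all filters."""
        if not self.path.exists():
            return []
        latest: Dict[str, Dict[str, Any]] = {}
        with self.path.open(encoding="utf-8") as fh:
            for line in fh:
                if line.strip():
                    record = json.loads(line)
                    latest[record["_id"]] = record["_source"]  # later lines win
        return [
            doc for doc in latest.values()
            if all(doc.get(k) == v for k, v in term_filters.items())
        ]


with tempfile.TemporaryDirectory() as tmp:
    idx = JsonlIndex(Path(tmp), "experiment-summaries")
    idx.index("EXP-123", {"experiment_id": "EXP-123", "status": "running"})
    idx.index("EXP-123", {"experiment_id": "EXP-123", "status": "complete"})
    hits = idx.search(experiment_id="EXP-123")
    print(hits)
```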
For local development, you can run Elasticsearch via Docker:
docker compose up -d elasticsearch
Then enable it in .env:
- ES_BACKEND=elasticsearch
- ES_URL=http://localhost:9200
- Optional: ES_INDEX_EXPERIMENT_SUMMARIES=experiment-summaries
Verify it’s up:
curl http://localhost:9200
curl "http://localhost:9200/_cat/indices?v"
After running a workflow, you can see indexed documents with:
curl "http://localhost:9200/experiment-summaries/_search?pretty"

- Create a workflow template (or use the built-in elisa_v1 template).
- Trigger a run for an experiment_id.
- Poll status; fetch outputs when complete.
Example (curl):
curl -X POST http://localhost:8000/runs -H "Content-Type: application/json" -d "{\"template_id\":\"elisa_v1\",\"experiment_id\":\"EXP-123\"}"
Then:
curl http://localhost:8000/runs/<workflow_id>/status
curl "http://localhost:8000/runs/<workflow_id>/outputs?experiment_id=EXP-123"

- Idempotency: activities write deterministic keys (based on run_id + experiment_id) and overwrite safely.
- Retries: activities have Temporal retry policies; external calls are isolated to activities.
- Waiting for data: in production, use a wait_for_s3_object activity with backoff and/or a Temporal signal when an uploader finishes.
- Workflow builder: next step is validation + versioning, plus a UI that composes reusable steps with typed inputs/outputs.
- Observability: add OpenTelemetry, structured logs with correlation IDs, and metrics per step (duration, retries).
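
The "waiting for data" note above mentions polling with backoff. A minimal sketch of that polling loop follows, assuming a caller-supplied `exists` callable (hypothetical; in practice it might wrap an S3 existence check). The function name, parameters, and defaults are illustrative, and inside a Temporal activity the same effect could instead be achieved by failing fast and relying on the activity's retry policy.

```python
import time
from typing import Callable


def wait_for_object(
    exists: Callable[[], bool],
    *,
    initial_delay: float = 1.0,
    max_delay: float = 30.0,
    factor: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Poll `exists` with exponential backoff; return the number of attempts.

    Raises TimeoutError if the next wait would exceed the overall timeout.
    """
    deadline = clock() + timeout
    delay = initial_delay
    attempts = 0
    while True:
        attempts += 1
        if exists():
            return attempts
        if clock() + delay > deadline:
            raise TimeoutError(f"object not found after {attempts} attempts")
        sleep(delay)
        delay = min(delay * factor, max_delay)  # cap the backoff


# Usage with a fake check that succeeds on the third poll and a no-op sleep.
calls = {"n": 0}
delays = []


def fake_exists() -> bool:
    calls["n"] += 1
    return calls["n"] >= 3


attempts = wait_for_object(fake_exists, sleep=delays.append)
print(attempts, delays)
```

Injecting `sleep` and `clock` keeps the loop deterministic under test; production code would use the defaults.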