Commit 93379f2

docs: refresh README with visual demo assets (#123)
Improve README readability and GitHub rendering by restructuring sections, adding demo and workflow images, and clarifying quickstart guidance.
1 parent 71e109d commit 93379f2

7 files changed

Lines changed: 162 additions & 104 deletions

README.md

Lines changed: 162 additions & 104 deletions
@@ -1,202 +1,260 @@
1-
# DeepEye
1+
# DeepEye: A Steerable Self-driving Data Agent System
22

3+
[![arXiv](https://img.shields.io/badge/arXiv-2603.28889-b31b1b.svg?logo=arXiv)](https://arxiv.org/abs/2603.28889)
34
[![CI](https://github.com/HKUSTDial/DeepEye/actions/workflows/pytest.yml/badge.svg)](https://github.com/HKUSTDial/DeepEye/actions/workflows/pytest.yml)
45
[![License: Apache-2.0](https://img.shields.io/badge/License-Apache--2.0-blue.svg)](LICENSE)
6+
[![SIGMOD](https://img.shields.io/badge/SIGMOD-Demo%202026-orange.svg)](https://doi.org/10.1145/3788853.3801612)
7+
[![GitHub Stars](https://img.shields.io/github/stars/HKUSTDial/DeepEye?style=social)](https://github.com/HKUSTDial/DeepEye)
58

6-
DeepEye harnesses the reasoning capabilities of large language models to autonomously orchestrate complex data analysis workflows. It combines a chat workspace, workflow orchestration, sandboxed execution, and artifact rendering for analytical outputs such as reports, dashboards, and data videos.
9+
**DeepEye** is a production-ready, steerable self-driving data agent system. Unlike linear "ChatBI" tools, DeepEye adopts a **workflow-centric architecture** that handles heterogeneous data sources and complex iterative analysis without context explosion. It autonomously orchestrates multi-step workflows to produce three classes of rich analytical artifacts:
710

8-
> **Project Status: Active Development Preview**
9-
>
10-
> DeepEye is being shared publicly while the codebase, documentation, tests, deployment workflow, and security hardening are still being actively improved. The repository is useful for local evaluation, collaboration, and understanding the architecture, but APIs, internal contracts, and setup details may continue to change as the project is refined.
11+
- 🎬 **Data Videos** — narrated, animated data stories rendered from structured analysis
12+
- 📊 **Dashboards** — interactive, live-updating visual dashboards
13+
- 📝 **Analytical Reports** — structured, analyst-grade written reports
1114

12-
## What DeepEye Does
15+
![DeepEye System Overview](./assets/overview.png)
1316

14-
- Turns data analysis requests into visible workflow drafts and execution runs.
15-
- Works with uploaded files and database-backed data sources.
16-
- Produces structured artifacts such as reports, dashboards, tables, files, and video previews.
17-
- Provides a React workspace for chat, workflow inspection, output preview, and session state.
18-
- Runs backend services, workers, storage, and runtime control through a local Docker Compose stack.
17+
Key architectural advantages:
18+
- 🔗 **Unified Multimodal Orchestration** — seamlessly integrates databases, documents, CSV/Excel files, and APIs in a single workflow
19+
- 🧠 **Hierarchical Reasoning** — decomposes complex intents into isolated AgentNodes and deterministic ToolNodes to eliminate hallucinations
20+
- ⚙️ **Workflow Engine** — a database-inspired execution pipeline that guarantees structural correctness and accelerates runtime through topology-aware scheduling
21+
- 👁 **Human-in-the-loop** — transparent step-by-step execution with a live workflow inspector and runtime steering controls
1922

20-
## Current Focus
23+
---
2124

22-
The project is currently being stabilized around a unified workflow model:
25+
## 🔥 News
2326

24-
```text
25-
session -> turn -> draft -> run -> artifact
26-
```
27+
- **[2026.05]**: DeepEye [paper](https://arxiv.org/abs/2603.28889) and [code](https://github.com/HKUSTDial/DeepEye) are released! Accepted at **SIGMOD Demo 2026**.
2728

28-
Ongoing work includes:
29+
---
2930

30-
- improving documentation and onboarding paths
31-
- tightening generated-code and sandbox execution boundaries
32-
- converging report, dashboard, and video flows onto shared workflow/artifact contracts
33-
- expanding automated tests and integration coverage
34-
- cleaning up legacy or duplicated internal paths
31+
## 🖥 Demo
32+
[![DeepEye Demo](./assets/demo.gif)](./assets/demo.mp4)
3533

36-
Progress is tracked in [docs/open_source_remediation_checklist.md](docs/open_source_remediation_checklist.md).
34+
🎬 Full demo video: [assets/demo.mp4](./assets/demo.mp4)
3735

38-
## Repository Layout
36+
Given heterogeneous data sources, DeepEye autonomously orchestrates a workflow and delivers Data Videos 🎬, Dashboards 📊, and Analytical Reports 📝 in one run.
3937

40-
| Path | Purpose |
41-
| --- | --- |
42-
| [packages/backend](packages/backend) | FastAPI API, Celery workers, workflow orchestration, persistence, sandbox/runtime integration |
43-
| [packages/core](packages/core) | Shared agent, datasource, workflow, graph, and sandbox primitives |
44-
| [packages/frontend](packages/frontend) | React + TypeScript workspace UI for chat, workflow, reports, dashboards, and video preview panels |
45-
| [docker](docker) | Dockerfiles, nginx config, scripts, and local runtime assets |
46-
| [docs](docs) | Architecture notes, RFCs, UI notes, and remediation tracking |
4738

48-
## Quick Start
39+
---
40+
41+
## 🚀 Quick Start
4942

5043
### Prerequisites
5144

5245
- Docker and Docker Compose
53-
- A supported LLM provider key and model
46+
- A supported LLM provider key and model (`LLM_API_KEY`, `LLM_BASE_URL`, `LLM_MODEL`)
5447
- `uv` for Python development and tests
55-
- Node.js/npm only if running the frontend outside Docker
48+
- Node.js / npm only if running the frontend outside Docker
5649

5750
### 1. Configure Environment
5851

59-
Copy the example environment file:
60-
6152
```bash
6253
cp env.example .env
6354
```
6455

65-
Then update the values in `.env`. At minimum, review:
56+
Open `.env` and set at minimum:
6657

67-
- `LLM_API_KEY`, `LLM_BASE_URL`, `LLM_MODEL`
68-
- `JWT_SECRET_KEY`
69-
- `POSTGRES_PASSWORD`
70-
- `MINIO_ACCESS_KEY`, `MINIO_SECRET_KEY`
71-
- `RETAIL_OPS_DB_USER`, `RETAIL_OPS_DB_PASSWORD`, `RETAIL_OPS_DB_NAME`
72-
- `HOST_GATEWAY_PORT` if port `8080` is already in use
58+
```bash
59+
LLM_API_KEY=...
60+
LLM_BASE_URL=...
61+
LLM_MODEL=...
62+
JWT_SECRET_KEY=...
63+
POSTGRES_PASSWORD=...
64+
MINIO_ACCESS_KEY=...
65+
MINIO_SECRET_KEY=...
66+
```
7367

74-
For shared development machines, set a unique `COMPOSE_PROJECT_NAME` and `HOST_GATEWAY_PORT` to avoid container, volume, and port conflicts.
68+
For shared machines, also set a unique `COMPOSE_PROJECT_NAME` and `HOST_GATEWAY_PORT` to avoid container, volume, and port conflicts.
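As a quick sanity check before starting the stack, the minimum settings listed above can be verified with a small script. This is a minimal sketch, not part of DeepEye: `parse_env` and `missing_keys` are hypothetical helper names, the parser only handles plain `KEY=VALUE` lines, and treating `...` as an unfilled placeholder is an assumption based on the snippet above.

```python
from pathlib import Path

# Keys the README asks you to set at minimum.
REQUIRED_KEYS = (
    "LLM_API_KEY", "LLM_BASE_URL", "LLM_MODEL",
    "JWT_SECRET_KEY", "POSTGRES_PASSWORD",
    "MINIO_ACCESS_KEY", "MINIO_SECRET_KEY",
)


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text: str, required=REQUIRED_KEYS) -> list[str]:
    """Return required keys that are absent, empty, or still set to '...'."""
    env = parse_env(text)
    return [key for key in required if env.get(key, "") in ("", "...")]


if __name__ == "__main__":
    env_path = Path(".env")
    if env_path.exists():
        print("Missing or placeholder keys:", missing_keys(env_path.read_text()))
    else:
        print("No .env found; run `cp env.example .env` first.")
```

Run it from the repository root after copying `env.example`; an empty list means every key in the minimum set has a value.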
7569

76-
### 2. Start the Local Stack
70+
### 2. Start the Stack
7771

7872
```bash
7973
docker compose up --build
8074
```
8175

82-
The Compose stack starts Postgres, Redis, MinIO, the backend API, Celery worker, runtime-control service, frontend, and nginx gateway. Database migrations are applied automatically before backend services start.
76+
This starts Postgres, Redis, MinIO, the backend API, Celery worker, runtime-control service, frontend, and nginx gateway. Database migrations run automatically.
8377

8478
### 3. Open DeepEye
8579

86-
By default, the app is available at:
87-
88-
```text
80+
```
8981
http://localhost:8080
9082
```
9183

92-
If you changed `HOST_GATEWAY_PORT`, use that port instead.
84+
> If `http://localhost:8080` is unavailable, verify your `.env` port settings (for example `HOST_GATEWAY_PORT`) and check that all Docker services are healthy with `docker compose ps`.
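The port logic can be sketched in a few lines: use `HOST_GATEWAY_PORT` if it is set, otherwise fall back to `8080`, then probe the resulting URL. This is an illustrative sketch only; `gateway_url` and `is_reachable` are hypothetical helpers, not DeepEye functions, and the probe reports only whether something answers on the port, not whether every service is healthy.

```python
import urllib.error
import urllib.request

DEFAULT_PORT = "8080"  # default gateway port named in the README


def gateway_url(env: dict[str, str]) -> str:
    """Build the local URL, honoring HOST_GATEWAY_PORT when it is set."""
    port = env.get("HOST_GATEWAY_PORT") or DEFAULT_PORT
    return f"http://localhost:{port}"


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if any HTTP response comes back from the URL."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, even if with an error status code.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout.
        return False
```

For example, `is_reachable(gateway_url({"HOST_GATEWAY_PORT": "9090"}))` probes `http://localhost:9090`. If it returns `False`, `docker compose ps` is still the place to look for unhealthy services.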
9385
9486
### 4. Stop the Stack
9587

9688
```bash
9789
docker compose down
90+
# add -v only when you intentionally want to remove local volumes and stored data
9891
```
9992

100-
Use `docker compose down -v` only when you intentionally want to remove local volumes and stored development data.
93+
---
10194

102-
## Development
95+
## 💡 Running Example
10396

104-
Run the full local quality gate:
97+
This walkthrough shows DeepEye completing a full sales analysis end-to-end.
10598

106-
```bash
107-
make check
108-
```
99+
### Step 1 — Connect Data Sources
109100

110-
Install dependencies before checking:
101+
In the **Data Sources** panel, add your files or databases. DeepEye supports:
102+
- Structured: PostgreSQL / MySQL databases, CSV, Excel
103+
- Semi-structured: JSON, XML
104+
- Unstructured: PDF, TXT, Markdown (indexed as Knowledge Bases)
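
The three categories above can be expressed as a simple lookup by file extension. This is a hedged sketch for illustration: `classify_source` is a hypothetical helper, the specific suffixes (`.xlsx`, `.xls`, `.md`) are assumptions about how the listed formats appear on disk, and database sources are connections rather than files, so they are out of scope here.

```python
from pathlib import Path

# Assumed mapping from file suffix to the categories listed in the README.
CATEGORY_BY_SUFFIX = {
    ".csv": "structured",
    ".xlsx": "structured",
    ".xls": "structured",
    ".json": "semi-structured",
    ".xml": "semi-structured",
    ".pdf": "unstructured",
    ".txt": "unstructured",
    ".md": "unstructured",
}


def classify_source(filename: str) -> str:
    """Return the data-source category for a file name, or 'unknown'."""
    suffix = Path(filename).suffix.lower()
    return CATEGORY_BY_SUFFIX.get(suffix, "unknown")
```

Calling `classify_source("notes.md")` returns `"unstructured"`; anything outside the table returns `"unknown"` instead of guessing.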
105+
106+
### Step 2 — Describe Your Goal in Chat
111107

112-
```bash
113-
make check-install
108+
```
109+
@Financial Metrics @Meta-data @Sales Database
110+
Analyze the 2025 global sales performance,
111+
generate a video, dashboard, and analytical report.
114112
```
115113

116-
Validate Docker Compose configuration:
114+
DeepEye automatically creates a workflow plan and asks for confirmation before running.
117115

118-
```bash
119-
make compose-config
116+
### Step 3 — Inspect the Workflow
117+
118+
The **Workflow Graph** panel renders the generated DAG, for example:
119+
120+
![Generated Workflow DAG](./assets/workflow.png)
121+
122+
DeepEye provides node-level execution details:
123+
124+
![Workflow Node Inspector](./assets/node.png)
125+
126+
Each node shows its inputs, the exact SQL queries executed, status, and outputs.
127+
128+
### Step 4 — Review Outputs
129+
130+
Outputs are rendered inline in the right panel:
131+
132+
![DeepEye Output Artifacts](./assets/example.png)
133+
134+
Artifacts are versioned per session and can be exported or shared.
135+
136+
---
137+
138+
## 🏗 Architecture
139+
140+
```
141+
session → turn → draft → run → artifact
120142
```
121143

122-
### Backend and Core
144+
DeepEye's Workflow Engine processes every draft through four stages before execution:
145+
146+
| Stage | Role |
147+
|---|---|
148+
| **Compiler** | Parses the LLM-generated workflow plan into a typed DAG |
149+
| **Validator** | Checks node types, required parameters, and edge constraints |
150+
| **Optimizer** | Reorders independent nodes for maximum parallel execution |
151+
| **Executor** | Runs the DAG with isolated sandboxed code environments per node |
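
To make the Validator and Optimizer stages more concrete, here is a minimal sketch of validating a plan and grouping independent nodes into parallel batches. It is not DeepEye's engine: the plan format, the `agent`/`tool` type names, and the `validate`/`schedule` functions are all hypothetical, and Python's standard-library `graphlib` stands in for whatever scheduler the project actually uses.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical plan format: node id -> {"type": ..., "deps": [...]}.
ALLOWED_TYPES = {"agent", "tool"}


def validate(plan: dict[str, dict]) -> None:
    """Check node types and that every dependency refers to a defined node."""
    for node_id, node in plan.items():
        if node.get("type") not in ALLOWED_TYPES:
            raise ValueError(f"{node_id}: unknown node type {node.get('type')!r}")
        for dep in node.get("deps", []):
            if dep not in plan:
                raise ValueError(f"{node_id}: depends on undefined node {dep!r}")


def schedule(plan: dict[str, dict]) -> list[list[str]]:
    """Group nodes into batches; nodes within a batch do not depend on each other."""
    sorter = TopologicalSorter(
        {node_id: set(node.get("deps", [])) for node_id, node in plan.items()}
    )
    try:
        sorter.prepare()  # raises CycleError if the graph is not a DAG
    except CycleError as exc:
        raise ValueError(f"workflow contains a cycle: {exc.args[1]}") from exc
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything runnable right now
        batches.append(ready)
        sorter.done(*ready)
    return batches


if __name__ == "__main__":
    plan = {
        "load_sales": {"type": "tool", "deps": []},
        "load_metrics": {"type": "tool", "deps": []},
        "analyze": {"type": "agent", "deps": ["load_sales", "load_metrics"]},
        "report": {"type": "agent", "deps": ["analyze"]},
        "dashboard": {"type": "agent", "deps": ["analyze"]},
    }
    validate(plan)
    print(schedule(plan))
```

For the sample plan, the two loaders form the first batch, `analyze` the second, and `dashboard` and `report` the third, so each batch could be dispatched in parallel by an executor.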
152+
153+
### Repository Layout
154+
155+
| Path | Purpose |
156+
|---|---|
157+
| [`packages/backend`](packages/backend) | FastAPI API, Celery workers, workflow orchestration, persistence, sandbox/runtime |
158+
| [`packages/core`](packages/core) | Shared agent, datasource, workflow, graph, and sandbox primitives |
159+
| [`packages/frontend`](packages/frontend) | React + TypeScript workspace: chat, workflow graph, artifact preview panels |
160+
| [`docker`](docker) | Dockerfiles, nginx config, scripts, and local runtime assets |
161+
| [`docs`](docs) | Architecture notes, RFCs, and remediation tracking |
162+
163+
---
123164

124-
Run the default backend/core test set:
165+
## 🛠 Development
166+
167+
Run the full local quality gate:
125168

126169
```bash
127-
uv run pytest packages/backend/app/test packages/core/tests -q
170+
make check # run after installing deps
171+
make check-install # install then check
172+
make compose-config # validate Docker Compose config
128173
```
129174

130-
Run Docker-backed sandbox integration tests explicitly:
175+
### Backend and Core Tests
131176

132177
```bash
178+
# Default test suite
179+
uv run pytest packages/backend/app/test packages/core/tests -q
180+
181+
# Docker-backed sandbox integration tests (requires Docker)
133182
DEEPEYE_RUN_DOCKER_TESTS=1 uv run pytest \
134183
packages/backend/app/test/test_sandbox.py \
135184
packages/backend/app/test/test_sandbox_manager.py -q
136-
```
137185

138-
Apply migrations manually when working outside the Compose flow:
139-
140-
```bash
186+
# Apply migrations manually outside Compose
141187
uv run alembic -c packages/backend/alembic.ini upgrade head
142188
```
143189

144190
### Frontend
145191

146-
The recommended path is to run the frontend through Docker Compose with the rest of the stack. For frontend-only development:
147-
148192
```bash
149193
cd packages/frontend
150194
npm install
151-
npm run dev
195+
npm run dev # dev server
196+
npm run build # production build
152197
```
153198

154-
Build the frontend:
199+
---
155200

156-
```bash
157-
npm run build
158-
```
159-
160-
## Security and Deployment Notes
201+
## 🔒 Security and Deployment
161202

162-
DeepEye orchestrates LLM-assisted workflows and Docker-backed execution runtimes. Treat the current stack as a local development environment unless you have reviewed and hardened the deployment for your own threat model.
203+
DeepEye orchestrates LLM-assisted workflows with Docker-backed code execution. Treat the current stack as a **local development environment** unless you have reviewed and hardened the deployment for your threat model.
163204

164205
Before exposing DeepEye beyond a trusted local environment:
165206

166-
- replace all example secrets and development credentials
167-
- review authentication, cookie, CORS, and gateway settings
168-
- review Docker socket access and runtime-control boundaries
169-
- review generated-code execution paths for reports, dashboards, and video generation
170-
- set resource limits and cleanup policies appropriate for your infrastructure
207+
- Replace all example secrets and development credentials
208+
- Review authentication, cookie, CORS, and gateway settings
209+
- Review Docker socket access and runtime-control boundaries
210+
- Review generated-code execution paths (reports, dashboards, video generation)
211+
- Set resource limits and cleanup policies for your infrastructure
212+
213+
See [docs/security_model.md](docs/security_model.md) for details.
171214

172-
## Documentation
215+
---
216+
217+
## 📚 Documentation
173218

174219
- [Documentation index](docs/README.md)
175220
- [Local quickstart](docs/quickstart_local.md)
176-
- [Security model](docs/security_model.md)
177-
- [Backend README](packages/backend/README.md)
178-
- [Frontend README](packages/frontend/README.md)
179-
- [Core README](packages/core/README.md)
180221
- [Workflow-native agent refactor RFC](docs/rfcs/workflow_native_agent_refactor.md)
181222
- [Artifact protocol RFC](docs/rfcs/artifact_protocol.md)
182-
- [Maintainability refactor plan](docs/rfcs/maintainability_refactor_plan.md)
183223
- [Open-source remediation checklist](docs/open_source_remediation_checklist.md)
184-
- [Release process](docs/release_process.md)
185224

186-
## Community And Governance
225+
---
226+
227+
## 👏 Contribution
228+
229+
We welcome all forms of contribution. Authors of merged PRs will be credited as contributors.
187230

188-
- [Contributing guide](CONTRIBUTING.md)
189-
- [Security policy](SECURITY.md)
190-
- [Support](SUPPORT.md)
191-
- [Roadmap](ROADMAP.md)
192-
- [Changelog](CHANGELOG.md)
193-
- [Code of conduct](CODE_OF_CONDUCT.md)
194-
- [Maintainer guide](docs/maintainer_guide.md)
231+
- Bug reports and feature requests: open an [Issue](https://github.com/HKUSTDial/DeepEye/issues)
232+
- Code contributions: submit a [Pull Request](https://github.com/HKUSTDial/DeepEye/pulls)
233+
- Please read [CONTRIBUTING.md](CONTRIBUTING.md) before submitting
195234

196-
## License
235+
---
197236

198-
DeepEye is licensed under the [Apache License 2.0](LICENSE).
237+
## 🖋 Citation
238+
239+
If DeepEye is useful for your research or work, please cite:
240+
241+
```bibtex
242+
@inproceedings{10.1145/3788853.3801612,
243+
author = {Li, Boyan and Peng, Yiran and Xie, Yupeng and Lu, Sirong and Zhu, Yizhang and Mu, Xing and Liu, Xinyu and Luo, Yuyu},
244+
title = {DeepEye: A Steerable Self-driving Data Agent System},
245+
year = {2026},
246+
isbn = {9798400724503},
247+
publisher = {Association for Computing Machinery},
248+
address = {New York, NY, USA},
249+
url = {https://doi.org/10.1145/3788853.3801612},
250+
doi = {10.1145/3788853.3801612},
251+
booktitle = {Companion of the International Conference on Management of Data},
252+
pages = {74--77},
253+
numpages = {4},
254+
location = {India},
255+
series = {SIGMOD Companion '26}
256+
}
257+
```
199258

200-
## Project Maturity
259+
If you have questions, feel free to open an issue.
201260

202-
DeepEye is moving quickly. Some modules are production-shaped, while others are still being consolidated or documented. If you are evaluating the project, please expect active iteration and prefer the documented Docker Compose workflow as the most reliable way to run it locally.

assets/demo.gif

46.7 MB

assets/demo.mp4

18.9 MB
Binary file not shown.

assets/example.png

1.96 MB

assets/node.png

1.68 MB

assets/overview.png

932 KB

assets/workflow.png

198 KB
