
Commit 046c25c

docs: split readme into bilingual docs
1 parent 7d778ff commit 046c25c

12 files changed

Lines changed: 1320 additions & 716 deletions

README.md

Lines changed: 49 additions & 364 deletions
Large diffs are not rendered by default.

README.zh-CN.md

Lines changed: 50 additions & 352 deletions
Large diffs are not rendered by default.

docs/DEVELOPMENT.md

Lines changed: 179 additions & 0 deletions
@@ -0,0 +1,179 @@
# ChatCrystal Development Guide

English | [简体中文](DEVELOPMENT.zh-CN.md)

This guide covers repository structure, architecture, development commands, testing, and release workflows.

## Project Overview

ChatCrystal is a local-first AI conversation crystallization tool. It imports conversations from AI coding tools, generates structured notes with LLMs, builds embeddings for semantic search, and exposes UI, CLI, and MCP workflows.

## Monorepo Layout

```
ChatCrystal/
├── shared/ # Shared TypeScript types
├── server/ # Fastify backend, CLI, MCP server
├── client/ # React SPA
├── electron/ # Electron main and preload processes
├── skills/ # Publishable ChatCrystal agent skills
├── docs/ # Maintainer and user documentation
├── scripts/ # Release and utility scripts
└── site/ # Project website
```

## Tech Stack

| Layer | Technology |
|---|---|
| Backend | Node.js, Fastify v5, TypeScript |
| Frontend | Vite v8, React 19, Tailwind CSS v4, TanStack React Query v5 |
| Desktop | Electron, electron-builder |
| Database | sql.js WASM SQLite |
| LLM | Vercel AI SDK v6 |
| Embeddings | vectra local vector index |
| Queue | p-queue |
| File watching | chokidar |

## Development Commands

```bash
npm run dev # Start server (port 3721) and client (port 13721)
npm run build # Build server and client
npm start # Production server
npm run lint # Biome + client ESLint
npm run lint:fix # Apply safe lint fixes
npm run test -w server # Server tests
npm run dev:electron # Electron dev mode
npm run build:electron # Build Windows installer
npm run pack:electron # Build unpacked Electron app
npm run eval:experience -w server # Offline calibration suite
```

`npm run eval:experience -w server` runs the offline calibration suite for the experience quality gate.

## Runtime Data

Runtime data is stored in `config.json` and `chatcrystal.db` under the active data directory.

Default data directory:

- CLI, MCP, npm package, repository checkout, and Electron: `~/.chatcrystal/data`
- Explicit override: `DATA_DIR`

Electron sets `ELECTRON=true`, `DATA_DIR`, and `ELECTRON_PACKAGED` when applicable.

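The directory lookup described in this section can be sketched as follows. This is only an illustration, not ChatCrystal's actual code: the helper names `resolveDataDir` and `runtimeFiles` are hypothetical, and treating an empty `DATA_DIR` as unset is an assumption. The only behavior taken from this guide is that `DATA_DIR` overrides the default `~/.chatcrystal/data` and that the directory holds `config.json` and `chatcrystal.db`.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: pick the active data directory.
// DATA_DIR wins when set; otherwise fall back to ~/.chatcrystal/data.
function resolveDataDir(
  env: Record<string, string | undefined> = process.env,
): string {
  // Assumption: an empty DATA_DIR is treated the same as an unset one.
  if (env.DATA_DIR) {
    return path.resolve(env.DATA_DIR);
  }
  return path.join(os.homedir(), ".chatcrystal", "data");
}

// Hypothetical helper: locate the two runtime files inside the data directory.
function runtimeFiles(dataDir: string): { config: string; database: string } {
  return {
    config: path.join(dataDir, "config.json"),
    database: path.join(dataDir, "chatcrystal.db"),
  };
}

console.log(runtimeFiles(resolveDataDir()));
```

Passing the environment as a parameter keeps the lookup easy to test without mutating `process.env`.
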
## Data Flow

```
AI tool conversation files
-> SourceAdapter scan/parse
-> Import service deduplication
-> SQLite conversations/messages
-> Summarization queue
-> LLM structured note generation
-> Embedding generation
-> vectra semantic index
-> REST API, UI, CLI, MCP
```

## Summarization Pipeline

ChatCrystal uses turn-based transcript preparation before summarization:

1. Split messages into user-assistant turns.
2. Keep the user instruction plus the first and last substantial assistant replies in each turn.
3. Score turns by instruction length and assistant engagement.
4. Always include the first turn and final turns.
5. Fill the remaining budget with high-value middle turns.
6. Compress skipped turns into one-line previews.

Structured output uses Vercel AI SDK `generateObject()` with Zod schemas. This avoids fragile JSON extraction, and model output that fails schema validation can be retried.

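The six preparation steps listed in this section can be sketched as below. This is a simplified illustration under stated assumptions, not the project's implementation: the type and function names are hypothetical, the budget is counted in turns (the real budget may be measured in characters or tokens), the "substantial" threshold of 40 characters is invented, and the score simply adds instruction length to total reply length.

```typescript
type Role = "user" | "assistant";
interface Message { role: Role; text: string }
interface Turn { index: number; instruction: string; replies: string[] }

const MIN_SUBSTANTIAL_CHARS = 40; // assumed threshold for a "substantial" reply

// Step 1: each user message opens a new turn; assistant messages attach to it.
// Assistant messages that arrive before any user message are dropped.
function splitIntoTurns(messages: Message[]): Turn[] {
  const turns: Turn[] = [];
  let current: Turn | null = null;
  for (const m of messages) {
    if (m.role === "user") {
      current = { index: turns.length, instruction: m.text, replies: [] };
      turns.push(current);
    } else if (current) {
      current.replies.push(m.text);
    }
  }
  return turns;
}

// Step 2: keep only the first and last substantial assistant replies.
function keyReplies(turn: Turn): string[] {
  const substantial = turn.replies.filter((r) => r.length >= MIN_SUBSTANTIAL_CHARS);
  if (substantial.length <= 2) return substantial;
  return [...substantial.slice(0, 1), ...substantial.slice(-1)];
}

// Step 3: assumed scoring, instruction length plus total reply length.
function scoreTurn(turn: Turn): number {
  return turn.instruction.length + turn.replies.reduce((n, r) => n + r.length, 0);
}

// Steps 4 and 5: always keep the first and final turns, then fill the
// remaining budget with the highest-scoring middle turns.
function selectTurns(turns: Turn[], budget: number, tailCount = 2): Set<number> {
  const keep = new Set<number>();
  if (turns.length === 0) return keep;
  keep.add(0);
  for (let i = Math.max(0, turns.length - tailCount); i < turns.length; i++) keep.add(i);
  const middle = turns
    .filter((t) => !keep.has(t.index))
    .sort((a, b) => scoreTurn(b) - scoreTurn(a));
  for (const t of middle) {
    if (keep.size >= budget) break;
    keep.add(t.index);
  }
  return keep;
}

// Step 6: render kept turns in full and skipped turns as one-line previews.
function prepareTranscript(messages: Message[], budget: number): string {
  const turns = splitIntoTurns(messages);
  const keep = selectTurns(turns, budget);
  return turns
    .map((t) =>
      keep.has(t.index)
        ? [`User: ${t.instruction}`, ...keyReplies(t).map((r) => `Assistant: ${r}`)].join("\n")
        : `[skipped turn ${t.index + 1}] ${t.instruction.slice(0, 60)}`,
    )
    .join("\n");
}
```

In this sketch the mandatory turns are kept even when they exceed the budget, so the budget only limits how many middle turns are added.
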
## Source Adapters

Add a new source by implementing `SourceAdapter`:

```typescript
interface SourceAdapter {
  name: string;
  displayName: string;
  detect(): Promise<SourceInfo | null>;
  scan(): Promise<ConversationMeta[]>;
  parse(meta: ConversationMeta): Promise<ParsedConversation>;
}
```

Built-in adapters:

| Adapter | Data Source | Format |
|---|---|---|
| `claude-code` | `~/.claude/projects/**/*.jsonl` | JSONL conversation log |
| `codex` | `~/.codex/sessions/**/rollout-*.jsonl` | JSONL event stream |
| `cursor` | Cursor `workspaceStorage/state.vscdb` | SQLite KV store |
| `trae` | Trae `workspaceStorage/state.vscdb` | SQLite KV store |
| `copilot` | VS Code `workspaceStorage/chatSessions/*.jsonl` | JSONL snapshots |

Create the adapter under `server/src/parser/adapters/` and register it in `server/src/parser/index.ts`.

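A new adapter might look roughly like the sketch below. Only the `SourceAdapter` interface itself comes from this guide. The shapes of `SourceInfo`, `ConversationMeta`, and `ParsedConversation` are hypothetical placeholders (the real types live in the repository and may differ), and the `example-tool` adapter reads from an in-memory map rather than the filesystem so that the example stays self-contained.

```typescript
// Hypothetical type shapes, invented for this example only.
interface SourceInfo { name: string; path: string }
interface ConversationMeta { id: string; path: string }
interface ParsedMessage { role: string; content: string }
interface ParsedConversation { id: string; messages: ParsedMessage[] }

interface SourceAdapter {
  name: string;
  displayName: string;
  detect(): Promise<SourceInfo | null>;
  scan(): Promise<ConversationMeta[]>;
  parse(meta: ConversationMeta): Promise<ParsedConversation>;
}

// Parse JSONL text: one JSON object per line, skipping blank,
// malformed, or incomplete records instead of failing the whole file.
function parseJsonl(text: string): ParsedMessage[] {
  const messages: ParsedMessage[] = [];
  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed) continue;
    try {
      const record = JSON.parse(trimmed) as { role?: unknown; content?: unknown };
      if (typeof record.role === "string" && typeof record.content === "string") {
        messages.push({ role: record.role, content: record.content });
      }
    } catch {
      // Malformed line: ignore it and keep going.
    }
  }
  return messages;
}

// Hypothetical adapter backed by a map of path -> JSONL text.
function createExampleAdapter(files: Map<string, string>): SourceAdapter {
  return {
    name: "example-tool",
    displayName: "Example Tool",
    // detect(): report the source only when there is something to import.
    async detect() {
      return files.size > 0 ? { name: "example-tool", path: "(in-memory)" } : null;
    },
    // scan(): list conversations without parsing their contents.
    async scan() {
      return Array.from(files.keys()).map((p) => ({ id: p, path: p }));
    },
    // parse(): turn one conversation file into messages.
    async parse(meta) {
      return { id: meta.id, messages: parseJsonl(files.get(meta.path) ?? "") };
    },
  };
}
```

Keeping the line parser as a pure function, separate from the adapter object, makes it straightforward to unit-test against sample logs.
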
## API Surface

Key REST endpoints:

| Method | Path | Description |
|---|---|---|
| GET | `/api/status` | Server status and statistics |
| GET | `/api/config` | Current config with secrets redacted |
| POST | `/api/config` | Update provider config |
| POST | `/api/import/scan` | Trigger import |
| GET | `/api/conversations` | List conversations |
| GET | `/api/conversations/:id` | Conversation detail |
| POST | `/api/conversations/:id/summarize` | Summarize one conversation |
| POST | `/api/summarize/batch` | Batch summarization |
| GET | `/api/notes` | List notes |
| GET | `/api/notes/:id` | Note detail |
| GET | `/api/search?q=...&expand=true` | Semantic search |
| GET | `/api/relations/graph` | Knowledge graph data |
| GET | `/api/queue/status` | Queue status |

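As a rough illustration of calling the search endpoint from a script, the sketch below builds the `/api/search` URL and fetches it. The `localhost` host is an assumption paired with the server port 3721 listed under the development commands, the helper names are hypothetical, and the response is typed as `unknown` because this guide does not document its shape.

```typescript
// Assumption: the dev server from `npm run dev` is reachable on localhost:3721.
const BASE_URL = "http://localhost:3721";

// Build the semantic search URL; `expand=true` is added only when requested.
function buildSearchUrl(query: string, expand = true, baseUrl = BASE_URL): string {
  const url = new URL("/api/search", baseUrl);
  url.searchParams.set("q", query); // URLSearchParams handles the encoding
  if (expand) url.searchParams.set("expand", "true");
  return url.toString();
}

// Hypothetical client helper. Requires a running server and Node 18+ (global fetch).
async function searchNotes(query: string): Promise<unknown> {
  const res = await fetch(buildSearchUrl(query));
  if (!res.ok) {
    throw new Error(`Search failed with HTTP ${res.status}`);
  }
  return res.json();
}

// Usage, with the server running:
//   searchNotes("vite proxy error").then((results) => console.log(results));
```
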
## Knowledge Graph

The relation system supports these relation types:

| Relation | Meaning |
|---|---|
| `CAUSED_BY` | Causation |
| `LEADS_TO` | Leads to |
| `RESOLVED_BY` | Resolved by |
| `SIMILAR_TO` | Similar topic |
| `CONTRADICTS` | Contradiction |
| `DEPENDS_ON` | Dependency |
| `EXTENDS` | Extension |
| `REFERENCES` | Reference |

Relations can be discovered by an LLM, added manually, or followed during semantic search expansion.

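One way to model these relation types in TypeScript, together with a one-hop expansion of search hits, is sketched below. The eight type names come from the table in this section. The `Relation` shape, the numeric note ids, and the expansion strategy (follow every edge touching a hit, in either direction, one hop only) are assumptions made for illustration and may not match how ChatCrystal expands results.

```typescript
const RELATION_TYPES = [
  "CAUSED_BY",
  "LEADS_TO",
  "RESOLVED_BY",
  "SIMILAR_TO",
  "CONTRADICTS",
  "DEPENDS_ON",
  "EXTENDS",
  "REFERENCES",
] as const;

type RelationType = (typeof RELATION_TYPES)[number];

// Hypothetical edge shape: a typed link between two note ids.
interface Relation {
  from: number;
  to: number;
  type: RelationType;
}

// Runtime guard, useful when a relation type arrives as free text (for example from an LLM).
function isRelationType(value: string): value is RelationType {
  return (RELATION_TYPES as readonly string[]).includes(value);
}

// One-hop expansion: return the original hits followed by any note
// directly connected to a hit, without duplicates.
function expandHits(hitIds: number[], relations: Relation[]): number[] {
  const hits = new Set(hitIds);
  const result = new Set(hitIds);
  for (const r of relations) {
    if (hits.has(r.from)) result.add(r.to);
    if (hits.has(r.to)) result.add(r.from);
  }
  return Array.from(result);
}
```

Deriving `RelationType` from the constant array keeps the compile-time union and the runtime check in sync when a type is added.
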
## Testing

Primary verification:

```bash
npm run test -w server
npm run build
npm run lint
npm run eval:experience -w server
```

Use focused server tests while iterating, then run the full commands before committing.

## Release

```bash
npm run release
npm run release -- minor # Minor version bump
npm run release -- major # Major version bump
npm run release -- 1.0.0 # Explicit version
```

The release script bumps the version, creates a git tag, and pushes so that CI can build and publish artifacts.


docs/DEVELOPMENT.zh-CN.md

Lines changed: 179 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,179 @@
# ChatCrystal Developer Guide

[English](DEVELOPMENT.md) | 简体中文

This document covers repository structure, architecture, development commands, testing, and the release process.

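## Project Overview

ChatCrystal is a local-first tool for crystallizing experience from AI conversations. It imports conversations from AI coding tools, generates structured notes with LLMs, builds embeddings for semantic search, and provides UI, CLI, and MCP workflows.
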
## Monorepo Structure

```
ChatCrystal/
├── shared/ # Shared TypeScript types
├── server/ # Fastify backend, CLI, MCP server
├── client/ # React SPA
├── electron/ # Electron main/preload processes
├── skills/ # Publishable ChatCrystal agent skills
├── docs/ # User and maintainer documentation
├── scripts/ # Release and helper scripts
└── site/ # Project website
```

## Tech Stack

| Layer | Technology |
|---|---|
| Backend | Node.js, Fastify v5, TypeScript |
| Frontend | Vite v8, React 19, Tailwind CSS v4, TanStack React Query v5 |
| Desktop | Electron, electron-builder |
| Database | sql.js WASM SQLite |
| LLM | Vercel AI SDK v6 |
| Embeddings | vectra local vector index |
| Queue | p-queue |
| File watching | chokidar |

## Development Commands

```bash
npm run dev # Start server (port 3721) and client (port 13721)
npm run build # Build server and client
npm start # Production server
npm run lint # Biome + client ESLint
npm run lint:fix # Apply safe lint fixes
npm run test -w server # Server tests
npm run dev:electron # Electron dev mode
npm run build:electron # Build Windows installer
npm run pack:electron # Build unpacked Electron app
npm run eval:experience -w server
```

`npm run eval:experience -w server` runs the offline calibration samples for the experience quality gate.

## Runtime Data

Runtime data is stored in `config.json` and `chatcrystal.db` under the active data directory.

Default data directory:

- CLI, MCP, npm package, repository checkout, and Electron: `~/.chatcrystal/data`
- Explicit override: `DATA_DIR`

Electron sets `ELECTRON=true`, `DATA_DIR`, and `ELECTRON_PACKAGED` as needed.

## Data Flow

```
AI tool conversation files
-> SourceAdapter scan/parse
-> Import service deduplication
-> SQLite conversations/messages
-> Summarization queue
-> LLM structured note generation
-> Embedding generation
-> vectra semantic index
-> REST API, UI, CLI, MCP
```

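## Summarization Pipeline

ChatCrystal preprocesses conversations turn by turn before summarization:

1. Split messages into user-assistant turns.
2. For each turn, keep the user instruction and the assistant's first and last substantial replies.
3. Score turns by instruction length and assistant engagement.
4. Always keep the first turn and the last few turns.
5. Give the remaining budget to high-value middle turns.
6. Compress skipped turns into one-line previews.

Structured output uses the Vercel AI SDK's `generateObject()` with Zod schemas. This avoids fragile JSON extraction and retries automatically when model output does not match the schema.
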
## Source Adapters

Adding a new data source requires implementing `SourceAdapter`:

```typescript
interface SourceAdapter {
  name: string;
  displayName: string;
  detect(): Promise<SourceInfo | null>;
  scan(): Promise<ConversationMeta[]>;
  parse(meta: ConversationMeta): Promise<ParsedConversation>;
}
```

Built-in adapters:

| Adapter | Data Source | Format |
|---|---|---|
| `claude-code` | `~/.claude/projects/**/*.jsonl` | JSONL conversation log |
| `codex` | `~/.codex/sessions/**/rollout-*.jsonl` | JSONL event stream |
| `cursor` | Cursor `workspaceStorage/state.vscdb` | SQLite KV store |
| `trae` | Trae `workspaceStorage/state.vscdb` | SQLite KV store |
| `copilot` | VS Code `workspaceStorage/chatSessions/*.jsonl` | JSONL snapshots |

Create the adapter under `server/src/parser/adapters/` and register it in `server/src/parser/index.ts`.

## API Surface

Main REST endpoints:

| Method | Path | Description |
|---|---|---|
| GET | `/api/status` | Service status and statistics |
| GET | `/api/config` | Current config, with secrets redacted |
| POST | `/api/config` | Update provider config |
| POST | `/api/import/scan` | Trigger import |
| GET | `/api/conversations` | Conversation list |
| GET | `/api/conversations/:id` | Conversation detail |
| POST | `/api/conversations/:id/summarize` | Summarize a single conversation |
| POST | `/api/summarize/batch` | Batch summarization |
| GET | `/api/notes` | Note list |
| GET | `/api/notes/:id` | Note detail |
| GET | `/api/search?q=...&expand=true` | Semantic search |
| GET | `/api/relations/graph` | Knowledge graph data |
| GET | `/api/queue/status` | Queue status |

## Knowledge Graph

The relation system supports the following types:

| Relation | Meaning |
|---|---|
| `CAUSED_BY` | Causation |
| `LEADS_TO` | Leads to |
| `RESOLVED_BY` | Resolved by |
| `SIMILAR_TO` | Similar topic |
| `CONTRADICTS` | Contradiction |
| `DEPENDS_ON` | Dependency |
| `EXTENDS` | Extension |
| `REFERENCES` | Reference |

Relations can be discovered by an LLM, added manually, or followed when semantic search expands its results.

## Testing

Main verification commands:

```bash
npm run test -w server
npm run build
npm run lint
npm run eval:experience -w server
```

Run focused tests while developing, then run the full set of commands before committing.

## Release

```bash
npm run release
npm run release -- minor
npm run release -- major
npm run release -- 1.0.0
```

The release script updates the version, creates a git tag, and pushes. CI then builds and publishes the artifacts.

