| 1 | +# ChatCrystal Development Guide |
| 2 | + |
| 3 | +English | [简体中文](DEVELOPMENT.zh-CN.md) |
| 4 | + |
| 5 | +This guide covers repository structure, architecture, development commands, testing, and release workflows. |
| 6 | + |
| 7 | +## Project Overview |
| 8 | + |
| 9 | +ChatCrystal is a local-first AI conversation crystallization tool. It imports conversations from AI coding tools, generates structured notes with LLMs, builds embeddings for semantic search, and exposes both UI and MCP workflows. |
| 10 | + |
| 11 | +## Monorepo Layout |
| 12 | + |
| 13 | +``` |
| 14 | +ChatCrystal/ |
| 15 | +├── shared/ # Shared TypeScript types |
| 16 | +├── server/ # Fastify backend, CLI, MCP server |
| 17 | +├── client/ # React SPA |
| 18 | +├── electron/ # Electron main and preload processes |
| 19 | +├── skills/ # Publishable ChatCrystal agent skills |
| 20 | +├── docs/ # Maintainer and user documentation |
| 21 | +├── scripts/ # Release and utility scripts |
| 22 | +└── site/ # Project website |
| 23 | +``` |
| 24 | + |
| 25 | +## Tech Stack |
| 26 | + |
| 27 | +| Layer | Technology | |
| 28 | +|---|---| |
| 29 | +| Backend | Node.js, Fastify v5, TypeScript | |
| 30 | +| Frontend | Vite v8, React 19, Tailwind CSS v4, TanStack React Query v5 | |
| 31 | +| Desktop | Electron, electron-builder | |
| 32 | +| Database | sql.js WASM SQLite | |
| 33 | +| LLM | Vercel AI SDK v6 | |
| 34 | +| Embeddings | vectra local vector index | |
| 35 | +| Queue | p-queue | |
| 36 | +| File watching | chokidar | |
| 37 | + |
| 38 | +## Development Commands |
| 39 | + |
| 40 | +```bash |
| 41 | +npm run dev # Server on port 3721 + client on port 13721 |
| 42 | +npm run build # Build server and client |
| 43 | +npm start # Production server |
| 44 | +npm run lint # Biome + client ESLint |
| 45 | +npm run lint:fix # Apply safe lint fixes |
| 46 | +npm run test -w server # Server tests |
| 47 | +npm run dev:electron # Electron dev mode |
| 48 | +npm run build:electron # Build Windows installer |
| 49 | +npm run pack:electron # Build unpacked Electron app |
| 50 | +npm run eval:experience -w server |
| 51 | +``` |
| 52 | + |
| 53 | +`npm run eval:experience -w server` runs the offline calibration suite for the experience quality gate. |
| 54 | + |
| 55 | +## Runtime Data |
| 56 | + |
| 57 | +Runtime data is stored in `config.json` and `chatcrystal.db` under the active data directory. |
| 58 | + |
| 59 | +Data directory resolution: |
| 60 | + |
| 61 | +- Default (CLI, MCP, npm package, repository checkout, and Electron): `~/.chatcrystal/data` |
| 62 | +- Explicit override: the `DATA_DIR` environment variable |
| 63 | + |
| 64 | +Electron sets `ELECTRON=true`, `DATA_DIR`, and `ELECTRON_PACKAGED` when applicable. |
| 65 | + |
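The lookup described above could be sketched as follows. This is a minimal illustration, not the project's actual code: the function names `resolveDataDir` and `runtimeFiles` are hypothetical, and treating an empty `DATA_DIR` as unset is an assumption.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: prefer DATA_DIR, otherwise fall back to ~/.chatcrystal/data.
// Assumption: an empty or whitespace-only DATA_DIR counts as "not set".
function resolveDataDir(
  env: Record<string, string | undefined> = process.env,
): string {
  const override = env["DATA_DIR"];
  if (override !== undefined && override.trim() !== "") {
    return override;
  }
  return path.join(os.homedir(), ".chatcrystal", "data");
}

// Hypothetical helper: the two runtime files named in this section.
function runtimeFiles(dataDir: string): { config: string; database: string } {
  return {
    config: path.join(dataDir, "config.json"),
    database: path.join(dataDir, "chatcrystal.db"),
  };
}
```

Passing the environment as a parameter keeps the helper easy to test without mutating `process.env`.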
| 66 | +## Data Flow |
| 67 | + |
| 68 | +``` |
| 69 | +AI tool conversation files |
| 70 | + -> SourceAdapter scan/parse |
| 71 | + -> Import service deduplication |
| 72 | + -> SQLite conversations/messages |
| 73 | + -> Summarization queue |
| 74 | + -> LLM structured note generation |
| 75 | + -> Embedding generation |
| 76 | + -> vectra semantic index |
| 77 | + -> REST API, UI, CLI, MCP |
| 78 | +``` |
| 79 | + |
| 80 | +## Summarization Pipeline |
| 81 | + |
| 82 | +ChatCrystal uses turn-based transcript preparation before summarization: |
| 83 | + |
| 84 | +1. Split messages into user-assistant turns. |
| 85 | +2. Keep the user instruction plus the first and last substantial assistant replies in each turn. |
| 86 | +3. Score turns by instruction length and assistant engagement. |
| 87 | +4. Always include the first turn and final turns. |
| 88 | +5. Fill the remaining budget with high-value middle turns. |
| 89 | +6. Compress skipped turns into one-line previews. |
| 90 | + |
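The six steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the server's implementation: the `Message` and `Turn` shapes, the scoring weights, the 20-character "substantial" threshold, and a budget counted in turns (rather than tokens or characters) are all hypothetical.

```typescript
// Hypothetical shapes for illustration only.
type Message = { role: "user" | "assistant"; text: string };
type Turn = { index: number; user: string; replies: string[] };

// Step 1: every user message opens a new turn; assistant messages attach to it.
function splitTurns(messages: Message[]): Turn[] {
  const turns: Turn[] = [];
  let current: Turn | undefined;
  for (const message of messages) {
    if (message.role === "user") {
      current = { index: turns.length, user: message.text, replies: [] };
      turns.push(current);
    } else if (current) {
      current.replies.push(message.text);
    }
  }
  return turns;
}

// Step 2: keep the instruction plus the first and last substantial replies.
function renderTurn(turn: Turn, minLength = 20): string {
  const substantial = turn.replies.filter((reply) => reply.length >= minLength);
  const kept =
    substantial.length > 1
      ? [...substantial.slice(0, 1), ...substantial.slice(-1)]
      : substantial;
  return [`User: ${turn.user}`, ...kept.map((reply) => `Assistant: ${reply}`)].join("\n");
}

// Step 3: assumed score, instruction length plus a bonus per assistant reply.
function scoreTurn(turn: Turn): number {
  return turn.user.length + 50 * turn.replies.length;
}

// Steps 4-6: pin the first and final turns, fill the budget with the
// highest-scoring middle turns, and reduce everything else to a preview.
function selectTurns(turns: Turn[], budget: number, tail = 1): string[] {
  const keep = new Set<number>();
  if (turns.length > 0) keep.add(0);
  for (let i = Math.max(0, turns.length - tail); i < turns.length; i++) keep.add(i);

  const middle = turns
    .filter((turn) => !keep.has(turn.index))
    .sort((a, b) => scoreTurn(b) - scoreTurn(a));
  for (const turn of middle) {
    if (keep.size >= budget) break;
    keep.add(turn.index);
  }

  return turns.map((turn) =>
    keep.has(turn.index) ? renderTurn(turn) : `[skipped] ${turn.user.slice(0, 40)}`,
  );
}
```

The output preserves the original turn order, so the prepared transcript still reads chronologically even when middle turns are collapsed.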
| 91 | +Structured output uses the Vercel AI SDK `generateObject()` with Zod schemas. This avoids fragile JSON extraction: model output that fails schema validation is rejected and can be retried instead of being parsed by hand. |
| 92 | + |
| 93 | +## Source Adapters |
| 94 | + |
| 95 | +Add a new source by implementing `SourceAdapter`: |
| 96 | + |
| 97 | +```typescript |
| 98 | +interface SourceAdapter { |
| 99 | + name: string; |
| 100 | + displayName: string; |
| 101 | + detect(): Promise<SourceInfo | null>; |
| 102 | + scan(): Promise<ConversationMeta[]>; |
| 103 | + parse(meta: ConversationMeta): Promise<ParsedConversation>; |
| 104 | +} |
| 105 | +``` |
| 106 | + |
| 107 | +Built-in adapters: |
| 108 | + |
| 109 | +| Adapter | Data Source | Format | |
| 110 | +|---|---|---| |
| 111 | +| `claude-code` | `~/.claude/projects/**/*.jsonl` | JSONL conversation log | |
| 112 | +| `codex` | `~/.codex/sessions/**/rollout-*.jsonl` | JSONL event stream | |
| 113 | +| `cursor` | Cursor `workspaceStorage/state.vscdb` | SQLite KV store | |
| 114 | +| `trae` | Trae `workspaceStorage/state.vscdb` | SQLite KV store | |
| 115 | +| `copilot` | VS Code `workspaceStorage/chatSessions/*.jsonl` | JSONL snapshots | |
| 116 | + |
| 117 | +Create the adapter under `server/src/parser/adapters/` and register it in `server/src/parser/index.ts`. |
| 118 | + |
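As a rough illustration of the interface above, the sketch below implements a toy adapter over an in-memory map of JSONL strings. The `SourceInfo`, `ConversationMeta`, and `ParsedConversation` shapes shown here are simplified assumptions (the real definitions live in the repository), and `InMemoryJsonlAdapter` is a hypothetical name; a real adapter would read from the file system and be registered as described above.

```typescript
// Assumed, simplified shapes; the project's real types may differ.
interface SourceInfo { path: string }
interface ConversationMeta { id: string; path: string }
interface ParsedMessage { role: string; content: string }
interface ParsedConversation { id: string; messages: ParsedMessage[] }

interface SourceAdapter {
  name: string;
  displayName: string;
  detect(): Promise<SourceInfo | null>;
  scan(): Promise<ConversationMeta[]>;
  parse(meta: ConversationMeta): Promise<ParsedConversation>;
}

// Toy adapter: each map entry stands in for one JSONL conversation file.
class InMemoryJsonlAdapter implements SourceAdapter {
  name = "in-memory";
  displayName = "In-memory JSONL (example)";
  private files: Map<string, string>;

  constructor(files: Map<string, string>) {
    this.files = files;
  }

  // Report the source as present only when there is something to import.
  async detect(): Promise<SourceInfo | null> {
    return this.files.size > 0 ? { path: "memory://" } : null;
  }

  // List one conversation per "file" without parsing its content.
  async scan(): Promise<ConversationMeta[]> {
    return Array.from(this.files.keys()).map((path) => ({ id: path, path }));
  }

  // Parse one JSON object per non-empty line.
  async parse(meta: ConversationMeta): Promise<ParsedConversation> {
    const raw = this.files.get(meta.path) ?? "";
    const messages = raw
      .split("\n")
      .filter((line) => line.trim() !== "")
      .map((line) => JSON.parse(line) as ParsedMessage);
    return { id: meta.id, messages };
  }
}
```

Keeping `scan()` cheap and deferring parsing to `parse()` mirrors the split in the interface: discovery returns metadata only, and content is read per conversation.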
| 119 | +## API Surface |
| 120 | + |
| 121 | +Key REST endpoints: |
| 122 | + |
| 123 | +| Method | Path | Description | |
| 124 | +|---|---|---| |
| 125 | +| GET | `/api/status` | Server status and statistics | |
| 126 | +| GET | `/api/config` | Current config with secrets redacted | |
| 127 | +| POST | `/api/config` | Update provider config | |
| 128 | +| POST | `/api/import/scan` | Trigger import | |
| 129 | +| GET | `/api/conversations` | List conversations | |
| 130 | +| GET | `/api/conversations/:id` | Conversation detail | |
| 131 | +| POST | `/api/conversations/:id/summarize` | Summarize one conversation | |
| 132 | +| POST | `/api/summarize/batch` | Batch summarization | |
| 133 | +| GET | `/api/notes` | List notes | |
| 134 | +| GET | `/api/notes/:id` | Note detail | |
| 135 | +| GET | `/api/search?q=...&expand=true` | Semantic search | |
| 136 | +| GET | `/api/relations/graph` | Knowledge graph data | |
| 137 | +| GET | `/api/queue/status` | Queue status | |
| 138 | + |
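A small client for the search endpoint might look like the sketch below. It assumes the development server is reachable at `http://localhost:3721` (the host is an assumption; the port comes from the `npm run dev` comment), and the helper names `buildSearchUrl` and `search` are hypothetical. The response shape is not documented here, so it is left as `unknown`.

```typescript
// Assumption: the dev server listens on localhost at port 3721.
const DEFAULT_BASE_URL = "http://localhost:3721";

// Build the URL for GET /api/search, adding expand=true only when requested.
function buildSearchUrl(
  query: string,
  expand = true,
  baseUrl: string = DEFAULT_BASE_URL,
): string {
  const url = new URL("/api/search", baseUrl);
  url.searchParams.set("q", query);
  if (expand) {
    url.searchParams.set("expand", "true");
  }
  return url.toString();
}

// Call the endpoint; the JSON payload is returned untyped.
async function search(query: string, expand = true): Promise<unknown> {
  const response = await fetch(buildSearchUrl(query, expand));
  if (!response.ok) {
    throw new Error(`Search failed with status ${response.status}`);
  }
  return response.json();
}
```

Using `URL` and `searchParams` rather than string concatenation keeps the query correctly encoded for arbitrary search text.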
| 139 | +## Knowledge Graph |
| 140 | + |
| 141 | +The relation system supports these relation types: |
| 142 | + |
| 143 | +| Relation | Meaning | |
| 144 | +|---|---| |
| 145 | +| `CAUSED_BY` | Causation | |
| 146 | +| `LEADS_TO` | Leads to | |
| 147 | +| `RESOLVED_BY` | Resolved by | |
| 148 | +| `SIMILAR_TO` | Similar topic | |
| 149 | +| `CONTRADICTS` | Contradiction | |
| 150 | +| `DEPENDS_ON` | Dependency | |
| 151 | +| `EXTENDS` | Extension | |
| 152 | +| `REFERENCES` | Reference | |
| 153 | + |
| 154 | +Relations can be discovered by the LLM, added manually, or followed during semantic search expansion. |
| 155 | + |
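To make the expansion idea concrete, here is a minimal sketch of a one-hop expansion over typed relations. The `Relation` shape, numeric note IDs, the single-hop depth, and the function name `expandHits` are assumptions made for illustration; only the relation type names come from the table above.

```typescript
// Relation types listed in the table above.
type RelationType =
  | "CAUSED_BY"
  | "LEADS_TO"
  | "RESOLVED_BY"
  | "SIMILAR_TO"
  | "CONTRADICTS"
  | "DEPENDS_ON"
  | "EXTENDS"
  | "REFERENCES";

// Hypothetical edge shape: a typed link between two note IDs.
interface Relation {
  from: number;
  to: number;
  type: RelationType;
}

// One-hop expansion: keep the original hits first, then add every note
// directly connected to a hit, following edges in either direction.
function expandHits(hitIds: number[], relations: Relation[]): number[] {
  const hits = new Set<number>(hitIds);
  const result = new Set<number>(hitIds);
  for (const relation of relations) {
    if (hits.has(relation.from)) result.add(relation.to);
    if (hits.has(relation.to)) result.add(relation.from);
  }
  return Array.from(result);
}
```

A real expansion step would likely rank or filter neighbors by relation type; this sketch treats all eight types equally.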
| 156 | +## Testing |
| 157 | + |
| 158 | +Primary verification: |
| 159 | + |
| 160 | +```bash |
| 161 | +npm run test -w server |
| 162 | +npm run build |
| 163 | +npm run lint |
| 164 | +npm run eval:experience -w server |
| 165 | +``` |
| 166 | + |
| 167 | +Use focused server tests while iterating, then run the full commands before committing. |
| 168 | + |
| 169 | +## Release |
| 170 | + |
| 171 | +```bash |
| 172 | +npm run release |
| 173 | +npm run release -- minor |
| 174 | +npm run release -- major |
| 175 | +npm run release -- 1.0.0 |
| 176 | +``` |
| 177 | + |
| 178 | +The release script bumps the version, creates a git tag, and pushes it so CI can build and publish artifacts. |
| 179 | + |