| 1 | +# AGENTS.md |
| 2 | + |
| 3 | +This file provides guidance to Codex (Codex.ai/code) when working with code in this repository. |
| 4 | + |
| 5 | +## Project Overview |
| 6 | + |
| 7 | +ChatCrystal is an AI conversation knowledge crystallization tool. It imports conversations from AI coding tools (Codex, Cursor, Trae, and VS Code Copilot), uses an LLM to generate structured notes (title, summary, key conclusions, code snippets, tags), and provides semantic search via embeddings. The UI is in Simplified Chinese. |
| 8 | + |
| 9 | +## Commands |
| 10 | + |
| 11 | +```bash |
| 12 | +# Development (server port 3721 + client port 13721) |
| 13 | +npm run dev |
| 14 | + |
| 15 | +# Build for production |
| 16 | +npm run build |
| 17 | + |
| 18 | +# Production server (serves frontend statically on port 3721) |
| 19 | +npm start |
| 20 | + |
| 21 | +# Electron desktop app |
| 22 | +npm run dev:electron # dev mode (Vite HMR + Electron window) |
| 23 | +npm run build:electron # build NSIS installer → release/ |
| 24 | +npm run pack:electron # build unpacked directory (faster for testing) |
| 25 | + |
| 26 | +# Release (bump version + git tag + push → CI builds & publishes) |
| 27 | +npm run release # patch bump (0.1.0 → 0.1.1) |
| 28 | +npm run release -- minor # minor bump |
| 29 | +npm run release -- major # major bump |
| 30 | +npm run release -- 1.0.0 # explicit version |
| 31 | + |
| 32 | +# System tray (legacy, replaced by Electron tray) |
| 33 | +npm run tray # with console |
| 34 | +npm run tray:silent # silent (VBS launcher) |
| 35 | + |
| 36 | +# Lint |
| 37 | +npm run lint |
| 38 | +npm run lint:fix |
| 39 | +``` |
| 40 | + |
| 41 | +### CLI (`crystal`) |
| 42 | + |
| 43 | +Install globally: `npm install -g chatcrystal` |
| 44 | + |
| 45 | +```bash |
| 46 | +# Core commands |
| 47 | +crystal status # Server status and DB stats |
| 48 | +crystal import [--source Codex] # Scan and import conversations |
| 49 | +crystal search "query" [--limit 10] # Semantic search |
| 50 | +crystal notes list [--tag X] # Browse notes |
| 51 | +crystal notes get <id> # View note detail |
| 52 | +crystal notes relations <id> # View note relations |
| 53 | +crystal tags # List tags with counts |
| 54 | +crystal summarize <id> # Summarize one conversation |
| 55 | +crystal summarize --all # Batch summarize |
| 56 | +crystal config get # View config |
| 57 | +crystal config set llm.provider openai # Update config |
| 58 | +crystal config test # Test LLM connection |
| 59 | + |
| 60 | +# Server management |
| 61 | +crystal serve # Start server (foreground) |
| 62 | +crystal serve -d # Start server (daemon) |
| 63 | +crystal serve stop # Stop daemon |
| 64 | +crystal serve status # Check if running |
| 65 | + |
| 66 | +# MCP Server (for AI tool integration) |
| 67 | +crystal mcp # Start MCP stdio server |
| 68 | +``` |
| 69 | + |
| 70 | +Global options: `--base-url` (server URL), `--json` (force JSON output), `--version`. |
| 71 | + |
| 72 | +Auto-start: commands that need the server launch it in the background if it is not already running. |
| 73 | + |
| 74 | +#### MCP Configuration |
| 75 | + |
| 76 | +Codex (`settings.json`): |
| 77 | +```json |
| 78 | +{ |
| 79 | + "mcpServers": { |
| 80 | + "chatcrystal": { |
| 81 | + "command": "crystal", |
| 82 | + "args": ["mcp"] |
| 83 | + } |
| 84 | + } |
| 85 | +} |
| 86 | +``` |
| 87 | + |
| 88 | +MCP exposes 4 read-only tools: `search_knowledge`, `get_note`, `list_notes`, `get_relations`. |
| 89 | + |
| 90 | +## Architecture |
| 91 | + |
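| 92 | +Monorepo with three npm workspaces (`shared/`, `server/`, `client/`), plus the separately compiled `electron/` app and the legacy `scripts/` directory: |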
| 93 | + |
| 94 | +### `shared/` — Shared Types (`@chatcrystal/shared`) |
| 95 | +- No build step; exports TypeScript types directly from `types/index.ts` |
| 96 | + |
| 97 | +### `server/` — Fastify Backend (`@chatcrystal/server`) |
| 98 | +- **Runtime:** tsx (dev + prod) |
| 99 | +- **Framework:** Fastify v5 with CORS and static file serving (production SPA fallback) |
| 100 | +- **Database:** sql.js (WASM SQLite) at `data/chatcrystal.db`, auto-saved every 30s |
| 101 | +- **Key modules:** |
| 102 | + - `db/` — Schema (7 tables), utils (`resultToObjects`) |
| 103 | + - `parser/` — Plugin architecture via `SourceAdapter`. Five built-in adapters: |
| 104 | + - `adapters/Codex.ts` — JSONL from `~/.Codex/projects/`. Sanitizes `<system-reminder>`, `<command-name>` tags. |
| 105 | + - `adapters/codex.ts` — JSONL event stream from `~/.codex/sessions/`. Reconstructs conversation from event_msg/response_item events. |
| 106 | + - `adapters/cursor.ts` — SQLite `state.vscdb` from Cursor's workspaceStorage/globalStorage. Reads composer metadata + bubble data via sql.js. |
| 107 | + - `adapters/trae.ts` — SQLite `state.vscdb` from Trae's workspaceStorage. Reads `memento/icube-ai-agent-storage` key; extracts content from agentTaskContent for agent responses. |
| 108 | + - `adapters/copilot.ts` — JSONL from VS Code's workspaceStorage/chatSessions + globalStorage/emptyWindowChatSessions. Parses session snapshots (kind:0) with requests/response arrays. |
| 109 | + - `services/llm.ts` — Provider factory: Ollama/OpenAI/Custom via Vercel AI SDK |
| 110 | + - `services/summarize.ts` — Conversation preprocessing (truncate ~8000 tokens) + LLM call + JSON parsing + DB persistence. Auto-generates embeddings after summarization. |
| 111 | + - `services/embedding.ts` — Embedding model factory + vectra LocalIndex + text chunking |
| 112 | + - `services/import.ts` — Scan + dedup (file size + mtime) + batch insert |
| 113 | + - `routes/` — REST endpoints: status, config, import, conversations CRUD, notes CRUD, tags, search, queue status, batch operations |
| 114 | + - `queue/` — p-queue singleton (concurrency=1, 1 req/sec, 429 retry) |
| 115 | + - `watcher/` — chokidar watches all data source paths (Codex JSONL, Codex sessions JSONL, Cursor global vscdb), debounced auto-import |
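The tag sanitizing mentioned for the first adapter can be pictured roughly as follows. This is a minimal sketch, assuming each tag is removed together with the text it encloses; the `SYSTEM_TAGS` list and the regex approach are illustrative and are not taken from the project's own `sanitizeContent()`.

```typescript
// Assumption: these two tag names come from the adapter description above;
// the real list and matching rules may differ.
const SYSTEM_TAGS = ["system-reminder", "command-name"] as const;

function sanitizeContent(content: string): string {
  let result = content;
  for (const tag of SYSTEM_TAGS) {
    // Remove the paired tag and everything inside it (non-greedy, across newlines).
    const paired = new RegExp(`<${tag}>[\\s\\S]*?</${tag}>`, "g");
    result = result.replace(paired, "");
  }
  // Collapse runs of blank lines left behind by removed blocks.
  return result.replace(/\n{3,}/g, "\n\n").trim();
}

// Usage: the reminder block disappears, the surrounding text stays.
const cleaned = sanitizeContent(
  "Hello<system-reminder>internal note</system-reminder> world"
);
console.log(cleaned); // prints "Hello world"
```

A regex is enough here only because the tags are assumed not to nest; nested or malformed tags would need a real parser.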
| 116 | + |
| 117 | +### `client/` — React SPA |
| 118 | +- **Build:** Vite v8 + `@vitejs/plugin-react` |
| 119 | +- **Styling:** Tailwind CSS v4 via `@tailwindcss/vite`. Custom utility classes in `index.css` reference CSS variables injected by ThemeProvider. |
| 120 | +- **State:** TanStack React Query v5; React Context for theming |
| 121 | +- **Routing:** React Router v7. Pages: Dashboard, Conversations, ConversationDetail, Notes, NoteDetail, Search, Settings |
| 122 | +- **Components:** MarkdownRenderer (react-markdown + remark-gfm + react-syntax-highlighter/Prism), ToolCallGroup (collapsible), Layout, Sidebar |
| 123 | +- **Path alias:** `@/` maps to `client/src/` |
| 124 | +- **Theming:** Runtime CSS variable injection. Theme: `dark-workshop` |
| 125 | + |
| 126 | +### `electron/` — Electron Desktop App |
| 127 | +- **Not an npm workspace** — compiled separately via `tsc -p electron/tsconfig.json` |
| 128 | +- `main.ts` — Main process: single-instance lock, port detection, Fastify server startup (embedded), BrowserWindow creation, system tray, window state persistence, data migration |
| 129 | +- `preload.ts` — Minimal contextBridge exposing `electronAPI.isElectron` and version info |
| 130 | +- `tray.ts` — System tray icon + context menu (open window, open in browser, quit) |
| 131 | +- `icon.svg/png/ico` — Application icon (crystal gem + chat bubble) |
| 132 | +- **Lifecycle:** Window close → hide to tray; tray quit → graceful shutdown (watcher stop → DB save → Fastify close → tray destroy) |
| 133 | +- **Data directory:** |
| 134 | + - Dev mode: `./data` (project root) |
| 135 | + - Packaged: `%APPDATA%/ChatCrystal/data` (auto-migrates from old `data/` on first launch) |
| 136 | +- **Environment vars set by Electron:** `ELECTRON=true`, `DATA_DIR`, `ELECTRON_PACKAGED` (packaged only) |
| 137 | +- **Server import:** Production uses dynamic `import()` via `file://` URL to load compiled server ESM |
| 138 | +- **Packaging:** `electron-builder.yml` → NSIS installer, `sql-wasm.wasm` as extraResource, aggressive node_modules filtering |
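The shutdown ordering in the lifecycle bullet (watcher stop → DB save → Fastify close → tray destroy) could be expressed as a small sequential runner. This is a sketch only: `ShutdownStep` and `shutdownInOrder` are hypothetical names, and continuing after a failed step is an assumption rather than documented behavior of `main.ts`.

```typescript
// Hypothetical shape: only the ordering comes from the lifecycle described above.
interface ShutdownStep {
  name: string;
  run: () => Promise<void> | void;
}

// Runs the steps strictly one after another and returns the names that succeeded.
async function shutdownInOrder(steps: ShutdownStep[]): Promise<string[]> {
  const completed: string[] = [];
  for (const step of steps) {
    try {
      await step.run();
      completed.push(step.name);
    } catch (err) {
      // Assumption: keep going so one failing step does not block the rest.
      console.error(`shutdown step "${step.name}" failed:`, err);
    }
  }
  return completed;
}

// Usage with placeholder steps standing in for the real watcher, DB, server, and tray.
const demoSteps: ShutdownStep[] = [
  { name: "watcher stop", run: async () => {} },
  { name: "DB save", run: async () => {} },
  { name: "Fastify close", run: async () => {} },
  { name: "tray destroy", run: () => {} },
];
shutdownInOrder(demoSteps).then((done) => console.log(done.join(" -> ")));
```

Awaiting each step before starting the next is what guarantees the database is saved before the server closes.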
| 139 | + |
| 140 | +### `scripts/` — Legacy System Tray & Launchers |
| 141 | +- `tray.ps1` — PowerShell WinForms NotifyIcon tray app (superseded by Electron tray) |
| 142 | +- `start-silent.vbs` — VBS wrapper for hidden launch |
| 143 | + |
| 144 | +## Data Flow |
| 145 | + |
| 146 | +``` |
| 147 | +~/.Codex/projects/**/*.jsonl → Codex Adapter (JSONL parse + sanitize) |
| 148 | +~/.codex/sessions/**/*.jsonl → Codex Adapter (event stream → conversation) |
| 149 | +Cursor workspaceStorage/state.vscdb → Cursor Adapter (SQLite KV → messages) |
| 150 | +Trae workspaceStorage/state.vscdb → Trae Adapter (SQLite KV → agent task content) |
| 151 | +Code workspaceStorage/chatSessions → Copilot Adapter (JSONL session snapshots) |
| 152 | + → Import Service (dedup by id+source, insert) |
| 153 | + → SQLite |
| 154 | + → Fastify REST API |
| 155 | + → React client (React Query hooks) |
| 156 | + |
| 157 | +Summarization: |
| 158 | + Conversation → prepareTranscript (truncate) → generateText (AI SDK) → extractJSON → saveNote → generateEmbeddings → vectra index |
| 159 | + |
| 160 | +Search: |
| 161 | + Query → embed(query) → vectra.queryItems → deduplicate by noteId → enrich with tags |
| 162 | +``` |
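The "deduplicate by noteId" step of the search flow can be illustrated with a small helper. This is a sketch under assumptions: the `ChunkHit` shape (one hit per embedded text chunk, carrying a `noteId` and a similarity `score`) is hypothetical, and keeping the highest-scoring chunk per note is one plausible policy, not necessarily the one the server uses.

```typescript
// Hypothetical result shape for one matched chunk.
interface ChunkHit {
  noteId: number;
  score: number;
  text: string;
}

// Keeps the best-scoring hit for each note, then orders notes by that score.
function dedupeByNoteId(hits: ChunkHit[]): ChunkHit[] {
  const best = new Map<number, ChunkHit>();
  for (const hit of hits) {
    const current = best.get(hit.noteId);
    if (current === undefined || hit.score > current.score) {
      best.set(hit.noteId, hit);
    }
  }
  return Array.from(best.values()).sort((a, b) => b.score - a.score);
}

// Usage: note 1 matches twice but appears once, with its stronger chunk.
const deduped = dedupeByNoteId([
  { noteId: 1, score: 0.5, text: "intro" },
  { noteId: 2, score: 0.9, text: "fix" },
  { noteId: 1, score: 0.8, text: "details" },
]);
console.log(deduped.map((h) => h.noteId)); // prints [ 2, 1 ]
```

Deduplication is needed because a long note is split into several chunks, so a single note can match the query more than once.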
| 163 | + |
| 164 | +## Environment |
| 165 | + |
| 166 | +Copy `.env.example` to `.env`. Key variables: |
| 167 | +- `PORT` (default 3721) |
| 168 | +- `CLAUDE_PROJECTS_DIR` — path to Codex projects |
| 169 | +- `CODEX_SESSIONS_DIR` — path to Codex CLI sessions |
| 170 | +- `CURSOR_DATA_DIR` — path to Cursor data (auto-detected per platform) |
| 171 | +- `TRAE_DATA_DIR` — path to Trae data (auto-detected per platform) |
| 172 | +- `COPILOT_DATA_DIR` — path to VS Code Copilot data (auto-detected per platform) |
| 173 | +- `LLM_PROVIDER` / `LLM_MODEL` — for summarization (ollama, openai, anthropic, google, custom) |
| 174 | +- `EMBEDDING_PROVIDER` / `EMBEDDING_MODEL` — for semantic search |
| 175 | + |
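| 176 | +> **Note: the LLM and the embedding model must be configured separately.** Semantic search requires an embedding model that supports the `/v1/embeddings` endpoint. Large language models (such as Codex, GPT-4, or Qwen) **cannot** be used as embedding models. Commonly used embedding models: |
| 177 | +> - Ollama (local): `nomic-embed-text`, `mxbai-embed-large` |
| 178 | +> - OpenAI: `text-embedding-3-small`, `text-embedding-3-large` |
| 179 | +> - Google: `text-embedding-004` |
| 180 | +> |
| 181 | +> If semantic search returns a 500 "Not Found" error, the usual cause is a misconfigured embedding model. |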
| 182 | + |
| 183 | +## Key Patterns |
| 184 | + |
| 185 | +- **SourceAdapter plugin interface** (`parser/adapter.ts`): implement `detect()`, `scan()`, `parse()` to add new sources. Currently 5 adapters: Codex (JSONL), codex (JSONL events), cursor (SQLite vscdb), trae (SQLite vscdb), copilot (JSONL sessions) |
| 186 | +- **Shared types are the contract**: both server and client import from `@chatcrystal/shared` |
| 187 | +- **No ORM**: raw SQL via sql.js with parameterized queries |
| 188 | +- **sanitizeContent()**: strips Codex system XML tags from message content |
| 189 | +- **Consecutive tool-use messages**: grouped and collapsed in frontend (ToolCallGroup component) |
| 190 | +- **Production SPA fallback**: Fastify serves `client/dist`, non-API 404s return `index.html` |
| 191 | +- **Dual mode**: `npm start` runs standalone web server; Electron embeds the same server in its main process via `createServer()` export |
| 192 | +- **Window state persistence**: Electron saves/restores window bounds (position, size, maximized) to `%APPDATA%/ChatCrystal/window-state.json` |
| 193 | +- **Single instance**: `app.requestSingleInstanceLock()` prevents duplicate instances; second launch focuses existing window |
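To make the SourceAdapter pattern above more concrete, here is a minimal sketch of what such an interface and a toy adapter might look like. Only the method names `detect()`, `scan()`, and `parse()` come from the text; every signature, the `ParsedConversation` and `ParsedMessage` types, and the `InMemoryJsonlAdapter` class are assumptions for illustration and may not match `parser/adapter.ts`.

```typescript
// Assumed shapes; the real shared types may differ.
interface ParsedMessage {
  role: "user" | "assistant";
  content: string;
}

interface ParsedConversation {
  id: string;
  source: string;
  messages: ParsedMessage[];
}

interface SourceAdapter {
  name: string;
  detect(): Promise<boolean>; // is this source present on the machine?
  scan(): Promise<string[]>; // list conversation references, e.g. file paths
  parse(ref: string): Promise<ParsedConversation>;
}

// Toy adapter backed by an in-memory map of "file name -> JSONL text".
class InMemoryJsonlAdapter implements SourceAdapter {
  name = "in-memory";
  private files: Map<string, string>;

  constructor(files: Map<string, string>) {
    this.files = files;
  }

  async detect(): Promise<boolean> {
    return this.files.size > 0;
  }

  async scan(): Promise<string[]> {
    return Array.from(this.files.keys());
  }

  async parse(ref: string): Promise<ParsedConversation> {
    const text = this.files.get(ref);
    if (text === undefined) throw new Error(`unknown reference: ${ref}`);
    // One JSON object per non-empty line.
    const messages = text
      .split("\n")
      .filter((line) => line.trim() !== "")
      .map((line) => JSON.parse(line) as ParsedMessage);
    return { id: ref, source: this.name, messages };
  }
}
```

Splitting discovery (`detect`, `scan`) from parsing lets an importer skip unchanged files before paying the cost of reading them.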