- Project structure with workspaces
- Root package.json with scripts
- .gitignore and .env.example
- README with quick start guide
- Install scripts
- Frameless window with custom titlebar
- Screen capture (3-5 second intervals)
- System tray with menu
- IPC bridge (preload.js)
- macOS permission handling
- Next.js 14 with App Router
- Tailwind CSS configured
- Glassmorphism styling
- Custom titlebar component
- Basic layout and page structure
- FastAPI server with CORS
- Socket.IO for real-time communication
- API routes structure
- Health check endpoint
- Error handling
- React Flow integration
- Custom animated nodes
- Glassmorphism graph styling
- WebSocket connection
- State management with Zustand
- Empty state UI
- pyttsx3 TTS integration
- Casual personality/text casualization
- Proactive vs. on-demand modes
- Voice toggle API endpoint
- Threaded non-blocking speech
- Greeting on startup
- Frontend voice toggle component
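
The casualization and threaded, non-blocking speech items above can be sketched roughly as follows. This is an illustration only, not the contents of `voice_interface.py`: the `casualize` rules, the `SpeechWorker` class, and the injected `speak_fn` callback are hypothetical names. In the backend, `speak_fn` would presumably wrap pyttsx3 (`engine.say(text)` followed by `engine.runAndWait()`); here it is a plain callable so the sketch runs without a TTS engine. The `force` flag mirrors the `force=true` query parameter of the speak endpoint.

```python
import queue
import re
import threading

# Hypothetical replacement table; the project's actual casualization rules are not shown.
CASUAL_REPLACEMENTS = {
    r"\bdo not\b": "don't",
    r"\bI am\b": "I'm",
    r"\bcannot\b": "can't",
    r"\bHello\b": "Hey",
}


def casualize(text: str) -> str:
    """Rewrite formal phrasing into a more relaxed tone."""
    for pattern, replacement in CASUAL_REPLACEMENTS.items():
        text = re.sub(pattern, replacement, text)
    return text


class SpeechWorker:
    """Speaks queued text on a background thread so callers never block."""

    def __init__(self, speak_fn, enabled: bool = True):
        self.speak_fn = speak_fn   # assumed: a wrapper around the TTS engine
        self.enabled = enabled     # assumed: flipped by the voice toggle endpoint
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def say(self, text: str, force: bool = False) -> bool:
        """Queue text for speech; return False if voice is off and not forced."""
        if not (self.enabled or force):
            return False
        self._queue.put(casualize(text))
        return True

    def _run(self) -> None:
        while True:
            text = self._queue.get()
            try:
                self.speak_fn(text)
            except Exception:
                pass  # a TTS failure should not kill the worker thread
            finally:
                self._queue.task_done()

    def wait(self) -> None:
        """Block until everything queued so far has been spoken."""
        self._queue.join()
```

A single worker thread with a queue keeps utterances in order and avoids calling the TTS engine from several threads at once, which is one plausible reason to prefer it over spawning a thread per utterance.
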
Cortex/
├── electron/
│   ├── main.js                      ✅ Electron entry, window management
│   ├── preload.js                   ✅ IPC bridge
│   ├── screen-capture.js            ✅ Screenshot logic
│   ├── tray.js                      ✅ System tray
│   └── package.json                 ✅
├── frontend/
│   ├── app/
│   │   ├── layout.tsx               ✅ Root layout
│   │   ├── page.tsx                 ✅ Main dashboard
│   │   └── globals.css              ✅ Glassmorphism styles
│   ├── components/
│   │   ├── titlebar/
│   │   │   └── CustomTitlebar.tsx   ✅
│   │   ├── mind-map/
│   │   │   ├── MindMap.tsx          ✅ React Flow container
│   │   │   └── CustomNode.tsx       ✅ Animated nodes
│   │   └── chat/
│   │       └── VoiceToggle.tsx      ✅ Voice control
│   ├── hooks/
│   │   └── useWebSocket.ts          ✅ Real-time connection
│   ├── lib/
│   │   └── store.ts                 ✅ Zustand stores
│   ├── tailwind.config.ts           ✅
│   └── package.json                 ✅
├── backend/
│   ├── main.py                      ✅ FastAPI server
│   ├── requirements.txt             ✅
│   ├── core/
│   │   └── voice_interface.py       ✅ TTS system
│   ├── api/
│   │   └── routes.py                ✅ API endpoints
│   └── logs/                        ✅
├── test_voice.py                    ✅ Voice testing script
├── package.json                     ✅ Root workspace
├── .env.example                     ✅
├── .gitignore                       ✅
└── README.md                        ✅

# Root dependencies
npm install
# Frontend
(cd frontend && npm install)
# Backend
cd backend
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

cp .env.example .env
# Edit .env and add your GEMINI_API_KEY

Option A: All at once
npm run dev

Option B: Separate terminals
# Terminal 1: Backend
cd backend && source venv/bin/activate && python3 main.py
# Terminal 2: Frontend
cd frontend && npm run dev
# Terminal 3: Electron (when ready)
cd electron && npm start

# Test voice system
python3 test_voice.py
# Test via API
curl -X POST "http://localhost:8000/api/speak?text=Hello%20from%20Cortex&force=true"

- Background insight generation
- Graph pattern detection
- Proactive suggestions
- Integration with voice interface
- 2-3 minute interval loop
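
A background loop with a 2-3 minute interval, as listed above, might look something like the following. This is a sketch under assumptions: `InsightLoop`, `generate_insight`, and `on_insight` are hypothetical names, and the real reasoning engine (graph pattern detection, Gemini calls) is stood in for by whatever callable is passed in.

```python
import random
import threading


class InsightLoop:
    """Calls `generate_insight` repeatedly with a randomized pause between runs."""

    def __init__(self, generate_insight, on_insight,
                 min_interval: float = 120.0, max_interval: float = 180.0):
        self.generate_insight = generate_insight  # assumed: returns a string or None
        self.on_insight = on_insight              # assumed: hands off to the voice interface
        self.min_interval = min_interval          # 2 minutes
        self.max_interval = max_interval          # 3 minutes
        self._stop = threading.Event()
        self._thread = None

    def start(self) -> None:
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()
        if self._thread is not None:
            self._thread.join()

    def _run(self) -> None:
        while not self._stop.is_set():
            try:
                insight = self.generate_insight()
                if insight:
                    self.on_insight(insight)
            except Exception:
                pass  # one failed pass should not end the loop
            # Event.wait doubles as a sleep that stop() can interrupt.
            self._stop.wait(random.uniform(self.min_interval, self.max_interval))
```

Using `Event.wait` rather than `time.sleep` means `stop()` returns promptly instead of waiting out the remainder of a multi-minute pause.
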
- Gemini API client wrapper
- OCR with EasyOCR
- Screen analysis pipeline
- Context extraction
- Connect to screen capture
- NetworkX graph implementation
- Node/edge management
- Daily reset logic
- React Flow converter
- Graph visualization
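
To make the daily reset and React Flow converter items concrete, here is a minimal sketch. It uses plain dictionaries as a stand-in for NetworkX so that it runs with the standard library alone; with NetworkX, `graph.nodes(data=True)` and `graph.edges()` would supply the same information. The `DailyGraph` class, its method names, and the circular layout are assumptions for illustration. The output follows React Flow's node shape (`id`, `position`, `data`) and edge shape (`id`, `source`, `target`).

```python
import math
from datetime import date


class DailyGraph:
    """Tiny in-memory graph that clears itself when the date changes."""

    def __init__(self, today=date.today):
        self._today = today   # injectable clock, useful for testing
        self._day = today()
        self.nodes = {}       # node id -> label
        self.edges = set()    # (source id, target id)

    def _reset_if_new_day(self) -> None:
        if self._today() != self._day:
            self._day = self._today()
            self.nodes.clear()
            self.edges.clear()

    def add_node(self, node_id: str, label: str) -> None:
        self._reset_if_new_day()
        self.nodes[node_id] = label

    def add_edge(self, source: str, target: str) -> None:
        self._reset_if_new_day()
        # Ignore edges whose endpoints are not in the graph.
        if source in self.nodes and target in self.nodes:
            self.edges.add((source, target))

    def to_react_flow(self, radius: float = 200.0) -> dict:
        """Lay nodes out on a circle and emit React Flow node/edge objects."""
        self._reset_if_new_day()
        count = max(len(self.nodes), 1)
        nodes = []
        for i, (node_id, label) in enumerate(self.nodes.items()):
            angle = 2 * math.pi * i / count
            nodes.append({
                "id": node_id,
                "position": {"x": round(radius * math.cos(angle), 1),
                             "y": round(radius * math.sin(angle), 1)},
                "data": {"label": label},
            })
        edges = [{"id": f"e-{s}-{t}", "source": s, "target": t}
                 for s, t in sorted(self.edges)]
        return {"nodes": nodes, "edges": edges}
```

The resulting dictionary can be serialized as JSON and sent to the frontend, where the `nodes` and `edges` arrays map directly onto the props of the same names on the React Flow component.
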
- PyAutoGUI automation engine
- Action planning with Gemini
- macOS window detection
- Action execution
- Silent error handling
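
Action execution with silent error handling could be structured along these lines. The plan format (a list of dictionaries with an `action` key), the `execute_plan` function, and the logger name are all assumptions; in the real engine the plan would presumably come from Gemini and the handlers would wrap PyAutoGUI calls such as `pyautogui.click(x=..., y=...)` or `pyautogui.write(...)`. Handlers are injected here so the sketch runs without controlling the actual mouse or keyboard.

```python
import logging

logger = logging.getLogger("cortex.automation")  # hypothetical logger name


def execute_plan(plan, handlers):
    """Run each planned action; log failures instead of raising.

    `plan` is a list of dicts such as {"action": "click", "x": 10, "y": 20}.
    `handlers` maps an action name to a callable taking the remaining keys.
    Returns the number of actions that completed.
    """
    completed = 0
    for step in plan:
        params = dict(step)               # copy so the plan is not mutated
        name = params.pop("action", None)
        handler = handlers.get(name)
        if handler is None:
            logger.warning("Skipping unknown action: %r", name)
            continue
        try:
            handler(**params)
            completed += 1
        except Exception:
            # "Silent" here means the user is not interrupted; the error is still logged.
            logger.exception("Action %r failed", name)
    return completed
```

Returning the count of completed actions gives the caller a cheap way to notice a partially executed plan without any exception reaching the user.
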
Completed: Phases 0-7 (Foundation + Voice)
Remaining: Phases 8-11 (AI Intelligence)
Estimated time to MVP: 4-6 hours
- Electron window opens
- Titlebar works (minimize/close)
- Next.js loads
- Glassmorphism styles render
- React Flow canvas renders
- Nodes appear with animation
- WebSocket connects
- Empty state shows
- TTS speaks on startup
- Voice toggle works
- Casualization works
- Health endpoint shows voice status
- Test script runs
- Gemini API Key: Required for Phases 9-11
  - Get from: https://aistudio.google.com/app/apikey
  - Add to `.env` as `GEMINI_API_KEY=your_key`
- Accessibility: Required for Phase 11 (computer control)
  - System Preferences → Security & Privacy → Privacy → Accessibility
- Screen Recording: Required for Phase 9 (screen capture)
  - System Preferences → Security & Privacy → Privacy → Screen Recording
- Voice interface uses macOS built-in TTS (no API required)
- Electron integration pending (Phases 1-2 ready, needs integration)
- Frontend is styled and ready for real data
- Backend endpoints are scaffolded and functional
- Electron not integrated: Frontend runs standalone, needs Electron wrapper
- Screen capture pending: Needs Gemini Vision (Phase 9)
- Graph empty: Needs knowledge graph implementation (Phase 10)
- No real insights yet: Needs reasoning engine (Phase 8)
- Beautiful glassmorphism UI with custom titlebar
- Real-time WebSocket infrastructure
- Casual voice personality working perfectly
- Clean project structure with workspaces
- Comprehensive API documentation
- Testing scripts ready

Most exciting: The voice interface sounds genuinely casual and friendly! 🎤

Last Updated: March 10, 2026
Current Phase: Completed Phase 7 (Voice Interface)
Next Target: Phase 8 (Reasoning Engine)