|
| 1 | +Context for Sentient |
| 2 | +Vision and Goals |
| 3 | +Sentient is a personal AI companion envisioned as an AI friend that lives across your devices, learns about you, and assists with tasks to help you achieve your goals. Unlike typical generative AI applications that focus on multi-chat interfaces, Sentient aims to provide a unified conversation experience, mimicking human-like interaction with advanced memory management and asynchronous task execution. The ultimate goal is to democratize AI companions, making them accessible to everyone. |
| 4 | +Key Objectives |
| 5 | +Deliver a seamless, unified chat interface with text and voice capabilities. |
| 6 | + |
| 7 | +Implement sophisticated memory systems for long-term, short-term, and episodic contexts. |
| 8 | + |
| 9 | +Enable asynchronous task execution for actions like email sending or calendar queries. |
| 10 | + |
| 11 | +Introduce autonomous context awareness to proactively assist users. |
| 12 | + |
| 13 | +Launch an open-source, self-hosted version for enthusiasts, followed by a cloud-hosted version for general consumers with plans for a mobile app. |
| 14 | + |
| 15 | +Development Strategy |
| 16 | +Current Focus: Building an open-source, self-hosted version for Windows using an Electron frontend and a Python backend, targeting a single user without authentication. |
| 17 | + |
| 18 | +Future Vision: Transition to a cloud-hosted model (e.g., AWS or Azure) with multi-user support, authentication, and a mobile app, funded by traction from the open-source version. |
| 19 | + |
| 20 | +Key Features |
| 21 | +Unified Chat Interface |
| 22 | +Single Conversation Stream: Users interact through one continuous chat, eliminating the need to switch between multiple chats. |
| 23 | + |
| 24 | +Input/Output Modes: |
| 25 | +Text: Standard text-based interaction. |
| 26 | + |
| 27 | +Standard Voice: User audio is converted to text via Speech-to-Text (STT), processed, and responses are converted to audio via Text-to-Speech (TTS). |
| 28 | + |
| 29 | +Advanced Voice: A multimodal model generates audio directly for a true audio-to-audio experience. |
| 30 | + |
| 31 | +Context Management: |
| 32 | +Time-based Conversation Blocking: After a period of inactivity (e.g., 10 minutes), Sentient starts a new internal "chat" to refresh context. |
| 33 | + |
| 34 | +Tag-based Context Switching: Messages are tagged (e.g., "personal - pets"), and relevant history is loaded dynamically when the conversation shifts topics. |
| 35 | + |
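Time-based conversation blocking can be sketched in a few lines of Python. This is a minimal illustration, not Sentient's implementation: the function name `split_into_blocks`, the `(timestamp, text)` message shape, and the fixed 10-minute threshold are all assumptions made for the example.

```python
from datetime import datetime, timedelta

# Assumed threshold, taken from the "e.g., 10 minutes" figure above.
INACTIVITY_GAP = timedelta(minutes=10)


def split_into_blocks(messages, gap=INACTIVITY_GAP):
    """Group (timestamp, text) pairs into blocks separated by inactivity.

    A new block starts whenever the pause since the previous message
    exceeds `gap`.
    """
    blocks = []
    for ts, text in sorted(messages, key=lambda m: m[0]):
        if blocks and ts - blocks[-1][-1][0] <= gap:
            blocks[-1].append((ts, text))  # still the same internal chat
        else:
            blocks.append([(ts, text)])  # inactivity exceeded: new chat
    return blocks
```

Each returned block corresponds to one internal "chat"; only the latest block would need to be loaded by default, with older blocks pulled in through tag-based switching.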
| 36 | +Memory Stores |
| 37 | +Long-term Memory: Stored in a Neo4j graph database, capturing persistent facts about the user (e.g., preferences, relationships). Retrieved using GraphRAG for inference-time context. |
| 38 | + |
| 39 | +Short-term Memory: Stored in an SQL database (e.g., SQLite), holding time-sensitive data (e.g., "interview in 2 weeks") with timestamps for expiration or reminders. |
| 40 | + |
| 41 | +Episodic Memory: Stored in LowDB, managing conversation history with tags and timestamps to support context switching in the unified chat. |
| 42 | + |
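As one possible shape for the short-term store, the sketch below uses Python's built-in `sqlite3` module with an expiry timestamp per row. The table name `short_term_memory`, its columns, and the helper names are hypothetical; the document only specifies that entries carry timestamps for expiration or reminders.

```python
import sqlite3
from datetime import datetime, timedelta


def open_store(path=":memory:"):
    """Open (or create) the short-term memory table. Schema is assumed."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS short_term_memory (
               id INTEGER PRIMARY KEY,
               content TEXT NOT NULL,
               created_at TEXT NOT NULL,
               expires_at TEXT NOT NULL
           )"""
    )
    return conn


def remember(conn, content, ttl, now):
    """Store a fact that should expire `ttl` after `now`."""
    conn.execute(
        "INSERT INTO short_term_memory (content, created_at, expires_at) "
        "VALUES (?, ?, ?)",
        (content, now.isoformat(timespec="seconds"),
         (now + ttl).isoformat(timespec="seconds")),
    )
    conn.commit()


def active_memories(conn, now):
    """Return unexpired facts, soonest-to-expire first."""
    rows = conn.execute(
        "SELECT content FROM short_term_memory "
        "WHERE expires_at > ? ORDER BY expires_at",
        (now.isoformat(timespec="seconds"),),
    )
    return [row[0] for row in rows]


def purge_expired(conn, now):
    """Delete expired facts and return how many were removed."""
    cur = conn.execute(
        "DELETE FROM short_term_memory WHERE expires_at <= ?",
        (now.isoformat(timespec="seconds"),),
    )
    conn.commit()
    return cur.rowcount
```

Timestamps are stored as ISO 8601 strings with a fixed precision so that plain string comparison in SQL matches chronological order. Passing `now` explicitly keeps the helpers easy to test.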
| 43 | +Actions and Agents |
| 44 | +Task Queue: An asynchronous queue (LowDB-based) for executing user-requested tasks (e.g., sending emails, fetching calendar events). Tasks include priority levels, status tracking, and completion details. |
| 45 | + |
| 46 | +Memory Operations Queue: An asynchronous queue for updating memory stores (create, update, delete) without delaying chat responses. |
| 47 | + |
| 48 | +Agent Orchestrator: Monitors chat history, detects tasks and memory updates, and assigns them to the appropriate queues. |
| 49 | + |
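The task queue's priority levels and status tracking might look like the following sketch. It swaps in an in-memory `asyncio.PriorityQueue` for the LowDB-backed queue described above, and the `Task` fields, status strings, and `handler` callback are illustrative assumptions rather than the project's actual schema.

```python
import asyncio
import itertools
from dataclasses import dataclass, field
from typing import Optional

_seq = itertools.count()  # tie-breaker so equal priorities stay FIFO


@dataclass(order=True)
class Task:
    priority: int                      # lower number = more urgent
    seq: int
    description: str = field(compare=False)
    status: str = field(default="pending", compare=False)
    result: Optional[str] = field(default=None, compare=False)


def make_task(description, priority):
    return Task(priority, next(_seq), description)


async def _worker(queue, handler, done):
    """Pull tasks in priority order and record status and completion details."""
    while True:
        task = await queue.get()
        task.status = "running"
        try:
            task.result = await handler(task.description)
            task.status = "completed"
        except Exception as exc:  # keep the worker alive on task failure
            task.result = str(exc)
            task.status = "failed"
        finally:
            done.append(task)
            queue.task_done()


async def run_tasks(tasks, handler):
    """Run all tasks through one worker and return them in completion order."""
    queue = asyncio.PriorityQueue()
    for task in tasks:
        queue.put_nowait(task)
    done = []
    worker = asyncio.create_task(_worker(queue, handler, done))
    await queue.join()
    worker.cancel()
    await asyncio.gather(worker, return_exceptions=True)
    return done
```

Because the worker runs as a separate asyncio task, the chat loop can enqueue work and return immediately. A Memory Operations Queue could follow the same pattern, with the handler dispatching create, update, and delete operations.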
| 50 | +External Integrations: |
| 51 | +API-based Automations: Supports Gmail, Google Docs, Calendar, Drive, Sheets, and Slides. |
| 52 | + |
| 53 | +Sandboxed Environment: A cloud-hosted VM or Browser-Use instance for general tasks (e.g., file operations, web browsing). |
| 54 | + |
| 55 | +Intent and Context Engine |
| 56 | +Context Monitoring: Streams data from desktop/mobile notifications and microphone input (e.g., conversation topics, speaker detection). |
| 57 | + |
| 58 | +Autonomous Actions: Adds tasks or memories to queues based on user-configurable autonomy levels (full autonomy or confirmation required). |
| 59 | + |
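The two autonomy levels can be illustrated with a small routing function. This is a sketch under assumptions: the `Autonomy` enum, the list-based queues, and the audit log entries are hypothetical stand-ins for whatever the Context Engine actually uses.

```python
from enum import Enum


class Autonomy(Enum):
    FULL = "full"        # act without asking
    CONFIRM = "confirm"  # ask the user first


def route_proposal(proposal, autonomy, task_queue, pending, audit_log):
    """Send a proposed action to the queue or to the confirmation list."""
    if autonomy is Autonomy.FULL:
        task_queue.append(proposal)
        audit_log.append(("auto-queued", proposal))
        return "queued"
    pending.append(proposal)
    audit_log.append(("awaiting-confirmation", proposal))
    return "needs_confirmation"


def confirm(proposal, task_queue, pending, audit_log):
    """Move a proposal the user approved from pending into the task queue."""
    pending.remove(proposal)
    task_queue.append(proposal)
    audit_log.append(("user-confirmed", proposal))
```

Logging every decision, including the automatic ones, gives the user a trail to review when an autonomous action turns out to be unwanted.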
| 60 | +Architecture Overview |
| 61 | +Sentient’s architecture is modular, separating the frontend, backend, memory stores, and external integrations. The current self-hosted version connects the Electron frontend to a Python backend via ngrok tunneling, allowing the backend to run on any server. |
| 62 | +Components |
| 63 | +Frontend (Electron) |
| 64 | +Provides the unified chat interface for text and voice interactions. |
| 65 | + |
| 66 | +Handles input/output processing and communicates with the backend via HTTP APIs and WebSockets. |
| 67 | + |
| 68 | +Backend (Python) |
| 69 | +API Layer: Manages HTTP endpoints and WebSockets for real-time interaction. |
| 70 | + |
| 71 | +Conversation Manager: Oversees the unified chat, storing history in LowDB and implementing time-based blocking and tag-based context switching. |
| 72 | + |
| 73 | +Memory Manager: Interfaces with Neo4j (long-term), SQL DB (short-term), and LowDB (episodic) to supply memories for inference. |
| 74 | + |
| 75 | +Task Queue: Manages asynchronous task execution with prioritization (e.g., immediate info requests vs. background tasks). |
| 76 | + |
| 77 | +Memory Operations Queue: Processes asynchronous memory updates. |
| 78 | + |
| 79 | +Agent Orchestrator: Monitors chats, assigns tasks and memory operations to queues, and executes tasks via integrations. |
| 80 | + |
| 81 | +Context Engine: Streams context from notifications and microphone, adding tasks/memories to queues. |
| 82 | + |
| 83 | +Databases |
| 84 | +Neo4j: Long-term memory as a knowledge graph. |
| 85 | + |
| 86 | +SQL DB (SQLite): Short-term memory with time-sensitive data. |
| 87 | + |
| 88 | +LowDB: Episodic memory for conversation history. |
| 89 | + |
| 90 | +External Integrations |
| 91 | +APIs: Specific tools (e.g., Gmail, Calendar). |
| 92 | + |
| 93 | +Sandboxed Environment: Cloud-hosted VM or Browser-Use for general automations. |
| 94 | + |
| 95 | +Component Interactions |
| 96 | +Frontend ↔ Backend: Connected via HTTP/WebSockets through ngrok tunneling. |
| 97 | + |
| 98 | +Conversation Flow: The Conversation Manager uses LowDB for history and the Memory Manager for context. |
| 99 | + |
| 100 | +Agent Operations: The Agent Orchestrator monitors chats, feeding the Task Queue and Memory Operations Queue. |
| 101 | + |
| 102 | +Context Engine: Monitors frontend inputs (notifications and microphone) and adds tasks or memories to the queues according to the configured autonomy level. |
| 103 | + |
| 104 | +External Actions: The Task Queue leverages APIs or the sandbox for execution. |
| 105 | + |
| 106 | +Mermaid Architecture Diagram |
| 107 | +mermaid |
| 108 | + |
| 109 | +graph TD |
| 110 | + subgraph Frontend |
| 111 | + UI["User Interface<br>(Electron)"] |
| 112 | + end |
| 113 | + |
| 114 | + subgraph Backend |
| 115 | + API["API Layer<br>(HTTP/WS)"] |
| 116 | + ConvMgr[Conversation Manager] |
| 117 | + MemMgr[Memory Manager] |
| 118 | + TaskQ[Task Queue] |
| 119 | + MemOpQ[Memory Operations Queue] |
| 120 | + AgentOrch[Agent Orchestrator] |
| 121 | + ContextEng[Context Engine] |
| 122 | + end |
| 123 | + |
| 124 | + subgraph Databases |
| 125 | + Neo4j["Neo4j<br>(Long-term Memory)"] |
| 126 | + SQL["SQL DB<br>(Short-term Memory)"] |
| 127 | + LowDB["LowDB<br>(Episodic Memory)"] |
| 128 | + end |
| 129 | + |
| 130 | + subgraph External |
| 131 | + APIs["External APIs<br>(Gmail, Calendar, etc.)"] |
| 132 | + Sandbox["Sandboxed Environment<br>(VM/Browser-Use)"] |
| 133 | + end |
| 134 | + |
| 135 | + UI -->|HTTP/WS via ngrok| API |
| 136 | + API --> ConvMgr |
| 137 | + ConvMgr -->|Reads/Writes| LowDB |
| 138 | + ConvMgr -->|Retrieves Memories| MemMgr |
| 139 | + MemMgr -->|Reads/Writes| Neo4j |
| 140 | + MemMgr -->|Reads/Writes| SQL |
| 141 | + AgentOrch -->|Monitors| ConvMgr |
| 142 | + AgentOrch -->|Adds Tasks| TaskQ |
| 143 | + AgentOrch -->|Adds Memory Ops| MemOpQ |
| 144 | + TaskQ -->|Executes| AgentOrch |
| 145 | + MemOpQ -->|Processes| MemMgr |
| 146 | + ContextEng -->|Monitors| UI |
| 147 | + ContextEng -->|Adds Tasks/Memories| TaskQ |
| 148 | + ContextEng -->|Adds Tasks/Memories| MemOpQ |
| 149 | + TaskQ -->|Uses| APIs |
| 150 | + TaskQ -->|Uses| Sandbox |
| 151 | + |
| 152 | +Potential Issues and Fixes |
| 153 | +Context Switching Complexity: |
| 154 | +Issue: Overlapping topics or multi-tagged messages may confuse tag-based switching. |
| 155 | + |
| 156 | +Fix: Use simple keyword tagging initially, refining with user feedback; retain recent multi-tagged messages to maintain context. |
| 157 | + |
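A first pass at keyword tagging could be as simple as the sketch below. The tag vocabulary in `TAG_KEYWORDS`, the message dictionary shape, and the `limit` default are invented for illustration; the document does not specify them.

```python
import re

# Hypothetical tag vocabulary; the real tags and keywords are not specified.
TAG_KEYWORDS = {
    "personal - pets": {"dog", "cat", "vet", "puppy"},
    "work - meetings": {"meeting", "agenda", "standup"},
}


def tag_message(text):
    """Return every tag whose keyword set overlaps the words in `text`."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(tag for tag, kws in TAG_KEYWORDS.items() if words & kws)


def history_for(tags, history, limit=20):
    """Pick the most recent messages sharing at least one tag with `tags`.

    Multi-tagged messages match any of their tags, so they stay available
    when the conversation moves between overlapping topics.
    """
    wanted = set(tags)
    matches = [m for m in history if wanted & set(m["tags"])]
    return matches[-limit:]
```

Matching on whole words rather than substrings avoids false hits such as "cat" inside "category". Messages with no matching tag return an empty list and could fall back to the current time-based block.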
| 158 | +Memory Retrieval Efficiency: |
| 159 | +Issue: Large Neo4j graphs or SQL queries may slow inference. |
| 160 | + |
| 161 | +Fix: Add indexing and caching for frequent memory access. |
| 162 | + |
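For the caching half of this fix, one lightweight option is an in-process LRU cache around the lookup function. The sketch assumes a hypothetical `fetch` callable that performs the real Neo4j or SQLite query and takes a hashable key; `CachedMemoryReader` is an illustrative name, not part of the codebase.

```python
from functools import lru_cache


class CachedMemoryReader:
    """Wrap a slow lookup function with an in-process LRU cache."""

    def __init__(self, fetch, maxsize=256):
        # `fetch` is a hypothetical callable, e.g. one that runs a
        # Neo4j or SQLite query for the given hashable key.
        self._cached = lru_cache(maxsize=maxsize)(fetch)

    def get(self, key):
        """Return the cached result, calling `fetch` only on a miss."""
        return self._cached(key)

    def invalidate(self):
        """Call after any memory write so stale results are not served."""
        self._cached.cache_clear()
```

Clearing the whole cache on every write is the simplest safe policy. If the Memory Operations Queue writes often, per-key invalidation would likely be worth the extra bookkeeping.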
| 163 | +Task Prioritization: |
| 164 | +Issue: Automatic urgency detection may be inaccurate. |
| 165 | + |
| 166 | +Fix: Apply heuristics (e.g., "now" = high priority) and allow UI overrides. |
| 167 | + |
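The heuristic-plus-override approach might be sketched as follows. Only "now" comes from the text above; the other urgency words, the priority labels, and the `override` parameter are assumptions for the example.

```python
import re

# "now" is from the text above; the other words are assumed additions.
URGENT_WORDS = {"now", "urgent", "asap", "immediately"}


def infer_priority(request, override=None):
    """Guess task priority from wording unless the UI supplied one.

    `override` models a priority the user picked in the UI; when it is
    set, the heuristic is skipped entirely.
    """
    if override is not None:
        return override
    words = set(re.findall(r"[a-z]+", request.lower()))
    return "high" if words & URGENT_WORDS else "normal"
```

For instance, `infer_priority("Send the email now")` yields `"high"`, while `infer_priority("Send the email now", override="normal")` respects the user's choice.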
| 168 | +Autonomous Actions: |
| 169 | +Issue: Misinterpreted intent could lead to unwanted actions. |
| 170 | + |
| 171 | +Fix: Default to confirmation prompts; log autonomous decisions for review. |
| 172 | + |
| 173 | +Scalability: |
| 174 | +Issue: Single-user design may hinder multi-user scaling. |
| 175 | + |
| 176 | +Fix: Use modular design and containerization (e.g., Docker) for future cloud deployment. |
| 177 | + |
| 178 | +Implementation Plan |
| 179 | +A phased approach respecting dependencies: |
| 180 | +Unified Chat Interface: |
| 181 | +Build Electron UI with text input/output. |
| 182 | + |
| 183 | +Integrate LowDB for episodic memory. |
| 184 | + |
| 185 | +Add basic conversation logic in Python. |
| 186 | + |
| 187 | +Memory Stores: |
| 188 | +Set up Neo4j for long-term memory with GraphRAG. |
| 189 | + |
| 190 | +Configure SQLite for short-term memory. |
| 191 | + |
| 192 | +Implement Memory Manager. |
| 193 | + |
| 194 | +Context Management: |
| 195 | +Add time-based blocking. |
| 196 | + |
| 197 | +Develop tag-based switching (keyword-based initially). |
| 198 | + |
| 199 | +Asynchronous Queues: |
| 200 | +Implement Task Queue in LowDB. |
| 201 | + |
| 202 | +Build Memory Operations Queue. |
| 203 | + |
| 204 | +Adapt existing pipelines for async execution. |
| 205 | + |
| 206 | +Agent Orchestrator: |
| 207 | +Create monitoring agent for tasks and memory ops. |
| 208 | + |
| 209 | +Test with mock tasks. |
| 210 | + |
| 211 | +External Integrations: |
| 212 | +Integrate Google APIs. |
| 213 | + |
| 214 | +Set up a basic sandboxed VM. |
| 215 | + |
| 216 | +Context Engine: |
| 217 | +Add notification monitoring. |
| 218 | + |
| 219 | +Implement microphone input (mock initially). |
| 220 | + |
| 221 | +Connect to queues with autonomy settings. |
| 222 | + |
| 223 | +Voice Modes: |
| 224 | +Integrate STT/TTS for standard voice. |
| 225 | + |
| 226 | +Experiment with advanced voice. |
| 227 | + |
| 228 | +Polish and Testing: |
| 229 | +Refine UI/UX. |
| 230 | + |
| 231 | +Test all features thoroughly. |
| 232 | + |
| 233 | +Project File Structure |
| 234 | +Sentient’s codebase is split into interface (frontend) and model (backend), with the backend now designed to run on a separate server (local or cloud) connected via ngrok. |
| 235 | +interface (Frontend - Electron/Next.js) |
| 236 | +app: Pages for chat, integrations, profile, settings, etc. |
| 237 | + |
| 238 | +components: UI elements (e.g., ChatBubble, ModelSelection). |
| 239 | + |
| 240 | +hooks: Custom React hooks (e.g., useMousePosition). |
| 241 | + |
| 242 | +main: Electron entry points (e.g., index.js, preload.js). |
| 243 | + |
| 244 | +public: Static assets (e.g., logos). |
| 245 | + |
| 246 | +scripts: Server scripts (e.g., appServer.js). |
| 247 | + |
| 248 | +styles: CSS and styling configs. |
| 249 | + |
| 250 | +utils: Helper functions (e.g., api.js, auth.js). |
| 251 | + |
| 252 | +model (Backend - Python) |
| 253 | +agents: Agent logic and runnables. |
| 254 | + |
| 255 | +app: Core backend logic. |
| 256 | + |
| 257 | +auth: Authentication (unused in self-hosted version). |
| 258 | + |
| 259 | +chat: Chat-specific functions and prompts. |
| 260 | + |
| 261 | +common: Shared utilities. |
| 262 | + |
| 263 | +input: User input data files. |
| 264 | + |
| 265 | +memory: Memory management logic. |
| 266 | + |
| 267 | +scraper: Web scraping utilities. |
| 268 | + |
| 269 | +utils: General helpers. |
| 270 | + |
| 271 | +.env: Configuration variables. |
| 272 | + |
| 273 | +requirements.txt: Dependencies. |
| 274 | + |
| 275 | +run_servers.sh: Starts backend servers. |
| 276 | + |
| 277 | +Additional Files |
| 278 | +chatsDb.json: Chat history storage. |
| 279 | + |
| 280 | +userProfileDb.json: User profile data. |
| 281 | + |
| 282 | +get-tree.ps1: Generates file tree. |
| 283 | + |
| 284 | +Additional Information |
| 285 | +Self-Hosted Design: The Electron frontend connects to a Python backend on a local or cloud server via ngrok. No authentication; single-user focus. |
| 286 | + |
| 287 | +Backend Separation: All backend logic (Neo4j, queues, agents) runs on a separate server, making the frontend a lightweight shell. |