| title | Chat |
|---|---|
| desc | An end-to-end RAG loop integrating retrieval, generation, and storage — supporting personalized responses with MemCube and automatic memory crystallization. |
:::note For a complete reference of API fields and formats, see the Chat API Documentation. :::
Endpoints:

- Complete Response: `POST /product/chat/complete`
- Streaming Response (SSE): `POST /product/chat/stream`
Description: The core business orchestration entry point of MemOS. It automatically recalls relevant memories from the specified `readable_cube_ids`, generates contextual responses, and optionally writes conversation results back to `writable_cube_ids` for continuous AI self-evolution.
- Memory Retrieval: Calls SearchHandler based on `readable_cube_ids` to extract relevant facts, preferences, and tool context from isolated Cubes.
- Context-Augmented Generation: Injects recalled memory fragments into the prompt, then calls the specified LLM (via `model_name_or_path`) to generate a targeted response.
- Automatic Memory Loop: If `add_message_on_answer=true`, the system calls AddHandler to asynchronously store the conversation in the specified Cubes; no manual add call is required.
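All three phases are driven by a single request. As a minimal sketch, the body for `POST /product/chat/complete` can be assembled like this (field names follow the parameter tables on this page; values are placeholders):

```python
import json

def build_chat_request(query, user_id, readable_cube_ids, writable_cube_ids,
                       mode="fast", add_message_on_answer=True):
    """Assemble the body for POST /product/chat/complete.

    One request drives all three phases: retrieval (readable_cube_ids),
    generation (query + mode), and writeback (writable_cube_ids +
    add_message_on_answer).
    """
    return {
        "query": query,
        "user_id": user_id,
        "readable_cube_ids": readable_cube_ids,
        "writable_cube_ids": writable_cube_ids,
        "mode": mode,
        "add_message_on_answer": add_message_on_answer,
    }

body = build_chat_request(
    "Suggest an R data cleaning workflow",
    "dev_user_01",
    readable_cube_ids=["private_cube_01"],
    writable_cube_ids=["private_cube_01"],
)
print(json.dumps(body, indent=2))
```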
| Parameter | Type | Required | Description |
|---|---|---|---|
| `query` | `str` | Yes | The user's current question. |
| `user_id` | `str` | Yes | Unique user identifier for auth and data isolation. |
| `history` | `list` | No | Short-term conversation history for maintaining session coherence. |
| `session_id` | `str` | No | Session ID. Acts as a "soft signal" to boost recall weight for in-session memories. |
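For multi-turn sessions, `history` and `session_id` travel alongside the required fields. The turn shape below (`role`/`content` dicts) is an assumption; check the Chat API Documentation for the exact schema:

```python
# Assumed shape for `history`: a list of {"role", "content"} turns.
history = [
    {"role": "user", "content": "I mostly work with tidyverse in R."},
    {"role": "assistant", "content": "Got it, I'll prefer tidyverse idioms."},
]

request_body = {
    "query": "Continue from my last question about joins",
    "user_id": "dev_user_01",
    "history": history,
    # Optional soft signal: boosts recall weight for memories
    # created earlier in this same session.
    "session_id": "sess_42",
}
print(request_body["session_id"])
```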
| Parameter | Type | Default | Description |
|---|---|---|---|
| `readable_cube_ids` | `list` | - | Read: Memory Cubes allowed for retrieval (can span personal and shared Cubes). |
| `writable_cube_ids` | `list` | - | Write: Target Cubes for auto-generated memories after the conversation. |
| `add_message_on_answer` | `bool` | `true` | Whether to enable auto-writeback. Recommended to keep enabled. |
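Because reads and writes are governed separately, a read-only pattern is possible: recall from a shared knowledge base without crystallizing anything back. A sketch of such a request body (Cube IDs are placeholders):

```python
# Read-only usage: retrieve from a shared Cube, write nothing back
# (empty writable list, auto-writeback disabled).
read_only_body = {
    "query": "What does our style guide say about pipe operators?",
    "user_id": "dev_user_01",
    "readable_cube_ids": ["public_kb_r_lang"],
    "writable_cube_ids": [],
    "add_message_on_answer": False,
}
print(read_only_body["add_message_on_answer"])
```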
| Parameter | Type | Default | Description |
|---|---|---|---|
| `mode` | `str` | `fast` | Retrieval mode: `fast`, `fine`, or `mixture`. |
| `model_name_or_path` | `str` | - | LLM model name or path. |
| `system_prompt` | `str` | - | Override the default system prompt. |
| `temperature` | `float` | - | Sampling temperature for controlling creativity. |
| `threshold` | `float` | `0.5` | Relevance threshold; memories scoring below it are filtered out. |
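To illustrate what `threshold` controls, here is a minimal client-side sketch of the relevance cutoff. The actual scoring happens server-side; the memory dict shape here is purely illustrative:

```python
def filter_by_relevance(memories, threshold=0.5):
    # Drop recalled memories whose relevance score falls below the
    # threshold, mirroring the server-side `threshold` parameter.
    return [m for m in memories if m["score"] >= threshold]

recalled = [
    {"text": "User prefers tidyverse", "score": 0.82},
    {"text": "Unrelated meeting note", "score": 0.31},
]
kept = filter_by_relevance(recalled, threshold=0.5)
print([m["text"] for m in kept])  # only the 0.82 entry survives
```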
Complete Response:

- Returns the full JSON response after the model finishes generation.
- Best for non-interactive tasks, background processing, or simple applications.

Streaming Response (SSE):

- Uses Server-Sent Events (SSE) to push tokens in real time.
- Best for chatbots and assistants requiring typewriter-style UI feedback.
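One minimal way to consume the SSE stream, sketched with the Python standard library. The auth header name, the `data:` event format, and the `[DONE]` sentinel are assumptions; consult the Chat API Documentation for the actual wire format:

```python
import json
import urllib.request

def parse_sse_line(line):
    # SSE data lines look like "data: <payload>"; anything else
    # (comments, blank keep-alives) is ignored.
    if line.startswith("data:"):
        return line[len("data:"):].strip()
    return None

def stream_chat(base_url, api_key, body):
    # Hypothetical sketch: POST the request and yield each SSE data
    # payload as it arrives.
    req = urllib.request.Request(
        f"{base_url}/product/chat/stream",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed header name
            "Content-Type": "application/json",
            "Accept": "text/event-stream",
        },
    )
    with urllib.request.urlopen(req) as resp:
        for raw in resp:
            data = parse_sse_line(raw.decode("utf-8").rstrip("\r\n"))
            if data and data != "[DONE]":  # "[DONE]" sentinel is an assumption
                yield data
```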
```python
from memos.api.client import MemOSClient

client = MemOSClient(api_key="...", base_url="...")

res = client.chat(
    user_id="dev_user_01",
    query="Based on my preferences, suggest an R data cleaning workflow",
    readable_cube_ids=["private_cube_01", "public_kb_r_lang"],
    writable_cube_ids=["private_cube_01"],
    add_message_on_answer=True,
    mode="fine",
)
if res:
    print(f"AI response: {res.data}")
```

:::note Developer Tip: For debugging in a Playground environment, use the dedicated stream endpoint /product/chat/stream/playground. :::