Commit 634b273

Add request metadata and cache-aware usage to UniResponse (#111)
* Fix: messages schema in OpenAI Adapter with ReAct Strategy
* Feat: adapter with more informations
* Rub: R2R_Map.get
* Docs: remove outdated and add new docs.
* Fix: attr accesses
* lint: docs
1 parent d407a8b commit 634b273

38 files changed

Lines changed: 338 additions & 553 deletions

docs/docs/.vitepress/config.mts

Lines changed: 4 additions & 4 deletions
@@ -309,8 +309,8 @@ export default withMermaid({
   link: "/guide/api-reference/classes/SuspendEnum",
 },
 {
-  text: "SuspendObjectStream",
-  link: "/guide/api-reference/classes/SuspendObjectStream",
+  text: "RequestMetadata",
+  link: "/guide/api-reference/classes/RequestMetadata",
 },
 {
   text: "EmbeddingChunk",
@@ -663,8 +663,8 @@ export default withMermaid({
   link: "/zh/guide/api-reference/classes/SuspendEnum",
 },
 {
-  text: "SuspendObjectStream",
-  link: "/zh/guide/api-reference/classes/SuspendObjectStream",
+  text: "RequestMetadata",
+  link: "/zh/guide/api-reference/classes/RequestMetadata",
 },
 {
   text: "EmbeddingChunk",

docs/docs/guide/api-reference/classes/ChatManager.md

Lines changed: 0 additions & 8 deletions
@@ -32,16 +32,12 @@ Clean up running chat objects under the specified key, keeping only up to `maxit
 
 **Returns:** `bool` — `True` if cleanup was performed, `False` otherwise
 
----
-
 ### `get_all_objs() -> list[ChatObjectMeta]`
 
 Get metadata for all running chat objects across all sessions.
 
 **Returns:** `list[ChatObjectMeta]` — List of all running chat object metadata snapshots
 
----
-
 ### `get_objs(session_id: str) -> list[ChatObject]`
 
 Get all active chat objects for a given session ID.
@@ -52,8 +48,6 @@ Get all active chat objects for a given session ID.
 
 **Returns:** `list[ChatObject]` — List of chat objects for the session
 
----
-
 ### `async clean_chat_objects(maxitems: int = 10) -> None`
 
 Asynchronously clean up all running chat objects across all sessions, limiting each session to `maxitems` objects.
@@ -62,8 +56,6 @@ Asynchronously clean up all running chat objects across all sessions, limiting e
 
 - `maxitems` (`int`, optional): Maximum number of objects per session. Defaults to `10`.
 
----
-
 ### `async add_chat_object(chat_object: ChatObject) -> None`
 
 Register a new `ChatObject` instance with the manager. Creates a metadata snapshot and inserts the object at the beginning of the session's list.
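
The behaviour described in the context lines above (new objects inserted at the front of a session's list, cleanup trimming a session to `maxitems`) can be sketched as follows. This is a minimal stand-in, not the `ChatManager` implementation: `SessionRegistry`, its method names, and the use of plain strings in place of `ChatObject` are assumptions made for illustration, the real methods are async, and reading "cleanup was performed" as "at least one object was dropped" is also an assumption.

```python
from collections import defaultdict


class SessionRegistry:
    """Hypothetical stand-in for the per-session bookkeeping in ChatManager.md."""

    def __init__(self) -> None:
        # session_id -> objects, newest first
        self._objs: dict[str, list[str]] = defaultdict(list)

    def add(self, session_id: str, obj: str) -> None:
        # Insert at the beginning so the newest object is always at index 0.
        self._objs[session_id].insert(0, obj)

    def clean(self, session_id: str, maxitems: int = 10) -> bool:
        # Keep only the newest `maxitems` objects.
        # Assumption: the return value reports whether anything was dropped.
        objs = self._objs.get(session_id, [])
        if len(objs) <= maxitems:
            return False
        del objs[maxitems:]
        return True

    def get_objs(self, session_id: str) -> list[str]:
        # Return a copy so callers cannot mutate the registry's list.
        return list(self._objs.get(session_id, []))
```

Because insertion happens at the front, trimming the tail always discards the oldest objects first.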

docs/docs/guide/api-reference/classes/ClientManager.md

Lines changed: 0 additions & 2 deletions
@@ -57,8 +57,6 @@ Initializes the ClientManager (runs only once due to singleton pattern).
 
 **Note:** Initialization logic executes only on the first instantiation.
 
----
-
 _All other methods are inherited from [`MultiClientManager`](MultiClientManager.md):_
 
 - `get_client_by_script(server_script)` - Get client by server script

docs/docs/guide/api-reference/classes/MemoryLimiter.md

Lines changed: 0 additions & 4 deletions
@@ -66,16 +66,12 @@ Override the default abstract instruction used for context summarization.
 - `TypeError`: If instruction is not a string
 - `ValueError`: If instruction is empty
 
----
-
 ### `get_abstract_instruction() -> str`
 
 Get the current abstract instruction text.
 
 **Returns:** `str`
 
----
-
 ### `reset_abstract_instruction()`
 
 Reset the abstract instruction to the framework default.
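
The override, get, and reset contract described in this hunk (a `TypeError` for non-strings, a `ValueError` for empty input, and a reset back to the default) might look roughly like the sketch below. The class name `AbstractInstruction`, its method names, and the placeholder default text are hypothetical; the diff does not show the setter's name or the framework's actual default. Treating whitespace-only input as empty is likewise an assumption.

```python
# Placeholder text, NOT the framework's real default instruction.
DEFAULT_INSTRUCTION = "Summarize the conversation so far."


class AbstractInstruction:
    """Hypothetical holder mirroring the override/get/reset contract."""

    def __init__(self) -> None:
        self._value = DEFAULT_INSTRUCTION

    def set(self, instruction: str) -> None:
        # Validate the type first, then the content, as the docs list both errors.
        if not isinstance(instruction, str):
            raise TypeError("instruction must be a string")
        if not instruction.strip():  # assumption: whitespace-only counts as empty
            raise ValueError("instruction must not be empty")
        self._value = instruction

    def get(self) -> str:
        return self._value

    def reset(self) -> None:
        # Restore the default so later summaries use the stock instruction.
        self._value = DEFAULT_INSTRUCTION
```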

docs/docs/guide/api-reference/classes/ModelAdapter.md

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ The `ModelAdapter` class provides a unified interface for integrating different
 
 Adapters are automatically registered with the [`AdapterManager`](#adaptermanager) when defined, unless marked as abstract or explicitly disabled from registration.
 
-> **Note**: The `ModelAdapter` base class has been moved from `amrita_core.protocol` to `amrita_core.base.adapter`. The `amrita_core.protocol` module is now a deprecated re-export wrapper.
+> **Note**: The `ModelAdapter` base class has been moved from `amrita_core.protocol` to `amrita_core.base.adapter`. The `amrita_core.protocol` compatibility endpoint was removed in v0.10.x+; import from `amrita_core.base.adapter`.
 
 ## Class Definition
 
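
Since the updated note says imports must now come from `amrita_core.base.adapter`, downstream code that still uses the old path needs a one-line change. The helper below is a small sketch of automating that rewrite for source text; `migrate_model_adapter_import` is a hypothetical name, not part of the library. It deliberately touches only lines that import `ModelAdapter` alone, because the diff does not show what else `amrita_core.protocol` used to export.

```python
import re

# Matches a whole line that imports ModelAdapter, and nothing else,
# from the removed module. Only spaces/tabs are allowed as padding so
# the pattern never swallows neighbouring blank lines.
_OLD_IMPORT = re.compile(
    r"^([ \t]*)from[ \t]+amrita_core\.protocol[ \t]+import[ \t]+ModelAdapter[ \t]*$",
    re.MULTILINE,
)


def migrate_model_adapter_import(source: str) -> str:
    """Rewrite the old ModelAdapter import to the new module path."""
    return _OLD_IMPORT.sub(
        r"\1from amrita_core.base.adapter import ModelAdapter", source
    )
```

Lines that import several names from `amrita_core.protocol` are left untouched and need manual review.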

docs/docs/guide/api-reference/classes/RequestMetadata.md

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
+# RequestMetadata
+
+`RequestMetadata` captures per-request diagnostic information returned by every adapter call through `UniResponse.metadata`.
+
+## Properties
+
+- `request_id` (str): Auto-generated unique request ID (UUID4). Defaults to a new UUID if not provided.
+- `original_request_id` (str | None): Original request ID returned by the LLM provider adapter (e.g., OpenAI's `_request_id`, Anthropic's `request_id`). `None` when unavailable.
+- `model` (str): The model used for the request. Defaults to `"__NOT_GIVEN__"` when not available (e.g., streaming before the first chunk).
+- `stop_sequence` (str | None): The stop sequence that terminated generation, if any.
+- `stop_reason` (STOP_REASON | None): Why the generation stopped. One of:
+
+| Value             | Meaning                    |
+| ----------------- | -------------------------- |
+| `"end_turn"`      | Natural completion         |
+| `"max_tokens"`    | Hit max token limit        |
+| `"stop_sequence"` | Matched a stop sequence    |
+| `"tool_use"`      | Model called a tool        |
+| `"pause_turn"`    | Anthropic pause turn       |
+| `"refusal"`       | Content filtered / refused |
+
+## Usage
+
+```python
+from amrita_core.types.response import RequestMetadata
+
+# Accessed via UniResponse
+response: UniResponse = ...
+print(response.metadata.model)  # e.g. "gpt-4o"
+print(response.metadata.stop_reason)  # e.g. "end_turn"
+print(response.metadata.original_request_id)  # Provider's request ID
+```
+
+> **Note**: `extra="allow"` is configured, so provider-specific fields may appear in addition to the standard ones.
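
To make the field defaults listed in the new page concrete, here is a standard-library approximation of the shape it describes. `RequestMetadataSketch` and `was_truncated` are hypothetical names used only for illustration. The `extra="allow"` note suggests the real class is a Pydantic model, but that is an inference from the docs rather than something this diff shows, and the dataclass below does not accept extra provider-specific fields.

```python
import uuid
from dataclasses import dataclass, field
from typing import Literal, Optional

# The six stop reasons listed in the table of the new doc page.
STOP_REASON = Literal[
    "end_turn", "max_tokens", "stop_sequence", "tool_use", "pause_turn", "refusal"
]


@dataclass
class RequestMetadataSketch:
    """Stand-in with the documented fields and defaults (not the real class)."""

    request_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    original_request_id: Optional[str] = None
    model: str = "__NOT_GIVEN__"
    stop_sequence: Optional[str] = None
    stop_reason: Optional[STOP_REASON] = None


def was_truncated(meta: RequestMetadataSketch) -> bool:
    """True when generation stopped because the token limit was hit."""
    return meta.stop_reason == "max_tokens"
```

A caller could use a check like `was_truncated` to decide whether to retry with a larger token budget.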

docs/docs/guide/api-reference/classes/StrategyLikedObject.md

Lines changed: 0 additions & 14 deletions
@@ -44,8 +44,6 @@ Called once by the framework when the execution context is ready. Subclasses may
 
 **Returns:** `Self`
 
----
-
 ### `async single_execute() -> bool`
 
 Execute a single agent step for `"agent"` and `"agent-mixed"` category strategies. Called by the framework to perform one iteration of tool calling.
@@ -54,8 +52,6 @@ Execute a single agent step for `"agent"` and `"agent-mixed"` category strategie
 
 **Note:** This method is used by `"agent"` and `"agent-mixed"` category strategies. `"rag"` and `"workflow"` category strategies should implement `run()` instead.
 
----
-
 ### `async run() -> None`
 
 Run the complete agent strategy for `"rag"` and `"workflow"` category strategies. Gives full control to the strategy implementation for managing tool calling iterations, context construction, error handling, and response generation.
@@ -67,8 +63,6 @@ Run the complete agent strategy for `"rag"` and `"workflow"` category strategies
 
 **Note:** This method is used by `"rag"` and `"workflow"` category strategies. `"agent"` and `"agent-mixed"` category strategies should implement `single_execute()` instead.
 
----
-
 ### `async call_tool(tool_call: ToolCall) -> str`
 
 Execute a single tool call without modifying the agent's context.
@@ -83,16 +77,12 @@ Execute a single tool call without modifying the agent's context.
 
 **Returns:** `str` — The string response from the tool execution, or a default message if the tool returns `None`
 
----
-
 ### `async on_limited() -> None`
 
 Handle the event when the agent reaches its tool calling limit. Called when the agent strategy has reached the maximum allowed number of tool calls.
 
 **Default behavior:** Sends a notification message to the user about the limit being reached.
 
----
-
 ### `async on_exception(exc: BaseException) -> None`
 
 Handle exceptions that occur during strategy execution.
@@ -101,14 +91,10 @@ Handle exceptions that occur during strategy execution.
 
 - `exc` (`BaseException`): The exception that occurred
 
----
-
 ### `async on_post_process() -> None`
 
 Used to process after all steps are completed successfully.
 
----
-
 ### `classmethod get_category() -> Literal["agent", "workflow", "rag", "agent-mixed"]`
 
 Get the category of the agent strategy. This is an abstract method that must be implemented by subclasses.
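
The split between `single_execute()` (for `"agent"` and `"agent-mixed"`) and `run()` (for `"rag"` and `"workflow"`) implies a dispatcher along the lines sketched below. This is not the framework's loop: `drive`, `max_steps`, and `CountingAgent` are hypothetical, and reading a `True` return from `single_execute()` as "another step is needed" is an assumption, since the diff only shows the `-> bool` signature.

```python
import asyncio

STEPWISE = {"agent", "agent-mixed"}  # categories that implement single_execute()
FULL_RUN = {"rag", "workflow"}       # categories that implement run()


async def drive(strategy, max_steps: int = 5) -> str:
    """Hypothetical driver choosing the entry point from get_category()."""
    category = strategy.get_category()
    if category in FULL_RUN:
        await strategy.run()
        return "run"
    if category in STEPWISE:
        for _ in range(max_steps):
            # Assumption: True means "continue", False means "done".
            if not await strategy.single_execute():
                return "finished"
        # Step budget exhausted: let the strategy handle the limit.
        await strategy.on_limited()
        return "limited"
    raise ValueError(f"unknown strategy category: {category!r}")


class CountingAgent:
    """Toy strategy that asks to continue twice, then reports completion."""

    def __init__(self) -> None:
        self.steps = 0
        self.limited = False

    @classmethod
    def get_category(cls) -> str:
        return "agent"

    async def single_execute(self) -> bool:
        self.steps += 1
        return self.steps < 3

    async def on_limited(self) -> None:
        self.limited = True
```

Running `asyncio.run(drive(CountingAgent()))` exercises the stepwise path; lowering `max_steps` below 3 exercises the `on_limited()` path instead.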

docs/docs/guide/api-reference/classes/SuspendObjectStream.md

Lines changed: 0 additions & 33 deletions
This file was deleted.

docs/docs/guide/api-reference/classes/UniResponse.md

Lines changed: 1 addition & 0 deletions
@@ -10,6 +10,7 @@ The UniResponse class provides a unified response format.
 - `tool_calls` (T_TOOL): Tool call results, T_TOOL is a generic parameter
 - `reasoning_content` (str | None): Reasoning/thinking content from the model, if the model supports it (e.g., o1, Claude with extended thinking)
 - `reasoning_signature` (str | None): Anthropic thinking signature, required for round-tripping thinking content with Anthropic API
+- `metadata` ([RequestMetadata](RequestMetadata.md)): Request metadata containing request ID, model name, stop reason, and original provider request ID
 
 ## Description
 

docs/docs/guide/api-reference/classes/UniResponseUsage.md

Lines changed: 2 additions & 0 deletions
@@ -7,6 +7,8 @@ UniResponseUsage class represents usage statistics for responses.
 - `prompt_tokens` (T_INT): Number of tokens used in the prompt
 - `completion_tokens` (T_INT): Number of tokens used in the completion (generation)
 - `total_tokens` (T_INT): Total number of tokens used
+- `cache_creation` (int | None): Number of tokens used to create the cache entry (Anthropic prompt caching)
+- `cache_hit` (int | None): Number of tokens read from the cache (Anthropic prompt caching)
 
 ## Description
 
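
One way a caller might use the two new cache fields is to estimate how much cache traffic was reads rather than writes. The sketch below assumes a stand-in `UsageSketch` dataclass and a hypothetical helper `cache_read_share`; neither exists in the library. It avoids dividing by `prompt_tokens` because the diff does not say whether that count includes cached tokens.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UsageSketch:
    """Stand-in carrying the documented usage fields (not the real class)."""

    prompt_tokens: int
    completion_tokens: int
    total_tokens: int
    cache_creation: Optional[int] = None
    cache_hit: Optional[int] = None


def cache_read_share(usage: UsageSketch) -> Optional[float]:
    """Share of cache-related tokens that were reads; None if nothing was reported."""
    if usage.cache_hit is None and usage.cache_creation is None:
        return None
    hit = usage.cache_hit or 0
    created = usage.cache_creation or 0
    if hit + created == 0:
        return None
    return hit / (hit + created)
```

Returning `None` when neither field is set keeps providers without prompt caching distinguishable from a genuine zero-read result.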
