Commit dfba0e9

feat(core): Unified Context Management and Tool Distillation. (#24157)
1 parent 117a2d3 commit dfba0e9

22 files changed

Lines changed: 1719 additions & 316 deletions

docs/cli/settings.md

Lines changed: 12 additions & 15 deletions
@@ -155,21 +155,18 @@ they appear in the UI.
 
 ### Experimental
 
-| UI Label                           | Setting                                        | Description                                                                                                                                               | Default |
-| ---------------------------------- | ---------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------- | ------- |
-| Enable Tool Output Masking         | `experimental.toolOutputMasking.enabled`       | Enables tool output masking to save tokens.                                                                                                               | `true`  |
-| Enable Git Worktrees               | `experimental.worktrees`                       | Enable automated Git worktree management for parallel work.                                                                                               | `false` |
-| Use OSC 52 Paste                   | `experimental.useOSC52Paste`                   | Use OSC 52 for pasting. This may be more robust than the default system when using remote terminal sessions (if your terminal is configured to allow it). | `false` |
-| Use OSC 52 Copy                    | `experimental.useOSC52Copy`                    | Use OSC 52 for copying. This may be more robust than the default system when using remote terminal sessions (if your terminal is configured to allow it). | `false` |
-| Plan                               | `experimental.plan`                            | Enable Plan Mode.                                                                                                                                         | `true`  |
-| Model Steering                     | `experimental.modelSteering`                   | Enable model steering (user hints) to guide the model during tool execution.                                                                              | `false` |
-| Direct Web Fetch                   | `experimental.directWebFetch`                  | Enable web fetch behavior that bypasses LLM summarization.                                                                                                | `false` |
-| Memory Manager Agent               | `experimental.memoryManager`                   | Replace the built-in save_memory tool with a memory manager subagent that supports adding, removing, de-duplicating, and organizing memories.             | `false` |
-| Agent History Truncation           | `experimental.agentHistoryTruncation`          | Enable truncation window logic for the Agent History Provider.                                                                                            | `false` |
-| Agent History Truncation Threshold | `experimental.agentHistoryTruncationThreshold` | The maximum number of messages before history is truncated.                                                                                               | `30`    |
-| Agent History Retained Messages    | `experimental.agentHistoryRetainedMessages`    | The number of recent messages to retain after truncation.                                                                                                 | `15`    |
-| Agent History Summarization        | `experimental.agentHistorySummarization`       | Enable summarization of truncated content via a small model for the Agent History Provider.                                                               | `false` |
-| Topic & Update Narration           | `experimental.topicUpdateNarration`            | Enable the experimental Topic & Update communication model for reduced chattiness and structured progress reporting.                                      | `false` |
+| UI Label                   | Setting                                  | Description                                                                                                                                               | Default |
+| -------------------------- | ---------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------- | ------- |
+| Enable Tool Output Masking | `experimental.toolOutputMasking.enabled` | Enables tool output masking to save tokens.                                                                                                               | `true`  |
+| Enable Git Worktrees       | `experimental.worktrees`                 | Enable automated Git worktree management for parallel work.                                                                                               | `false` |
+| Use OSC 52 Paste           | `experimental.useOSC52Paste`             | Use OSC 52 for pasting. This may be more robust than the default system when using remote terminal sessions (if your terminal is configured to allow it). | `false` |
+| Use OSC 52 Copy            | `experimental.useOSC52Copy`              | Use OSC 52 for copying. This may be more robust than the default system when using remote terminal sessions (if your terminal is configured to allow it). | `false` |
+| Plan                       | `experimental.plan`                      | Enable Plan Mode.                                                                                                                                         | `true`  |
+| Model Steering             | `experimental.modelSteering`             | Enable model steering (user hints) to guide the model during tool execution.                                                                              | `false` |
+| Direct Web Fetch           | `experimental.directWebFetch`            | Enable web fetch behavior that bypasses LLM summarization.                                                                                                | `false` |
+| Memory Manager Agent       | `experimental.memoryManager`             | Replace the built-in save_memory tool with a memory manager subagent that supports adding, removing, de-duplicating, and organizing memories.             | `false` |
+| Enable Context Management  | `experimental.contextManagement`         | Enable logic for context management.                                                                                                                      | `false` |
+| Topic & Update Narration   | `experimental.topicUpdateNarration`      | Enable the experimental Topic & Update communication model for reduced chattiness and structured progress reporting.                                      | `false` |
 
 ### Skills
 

docs/reference/configuration.md

Lines changed: 45 additions & 19 deletions
@@ -1702,25 +1702,8 @@ their corresponding top-level category object in your `settings.json` file.
   - **Default:** `false`
   - **Requires restart:** Yes
 
-- **`experimental.agentHistoryTruncation`** (boolean):
-  - **Description:** Enable truncation window logic for the Agent History
-    Provider.
-  - **Default:** `false`
-  - **Requires restart:** Yes
-
-- **`experimental.agentHistoryTruncationThreshold`** (number):
-  - **Description:** The maximum number of messages before history is truncated.
-  - **Default:** `30`
-  - **Requires restart:** Yes
-
-- **`experimental.agentHistoryRetainedMessages`** (number):
-  - **Description:** The number of recent messages to retain after truncation.
-  - **Default:** `15`
-  - **Requires restart:** Yes
-
-- **`experimental.agentHistorySummarization`** (boolean):
-  - **Description:** Enable summarization of truncated content via a small model
-    for the Agent History Provider.
+- **`experimental.contextManagement`** (boolean):
+  - **Description:** Enable logic for context management.
   - **Default:** `false`
   - **Requires restart:** Yes
 
@@ -1815,6 +1798,49 @@ their corresponding top-level category object in your `settings.json` file.
     prioritize available tools dynamically.
   - **Default:** `[]`
 
+#### `contextManagement`
+
+- **`contextManagement.historyWindow.maxTokens`** (number):
+  - **Description:** The number of tokens to allow before triggering
+    compression.
+  - **Default:** `150000`
+  - **Requires restart:** Yes
+
+- **`contextManagement.historyWindow.retainedTokens`** (number):
+  - **Description:** The number of tokens to always retain.
+  - **Default:** `40000`
+  - **Requires restart:** Yes
+
+- **`contextManagement.messageLimits.normalMaxTokens`** (number):
+  - **Description:** The target number of tokens to budget for a normal
+    conversation turn.
+  - **Default:** `2500`
+  - **Requires restart:** Yes
+
+- **`contextManagement.messageLimits.retainedMaxTokens`** (number):
+  - **Description:** The maximum number of tokens a single conversation turn can
+    consume before truncation.
+  - **Default:** `12000`
+  - **Requires restart:** Yes
+
+- **`contextManagement.messageLimits.normalizationHeadRatio`** (number):
+  - **Description:** The ratio of tokens to retain from the beginning of a
+    truncated message (0.0 to 1.0).
+  - **Default:** `0.25`
+  - **Requires restart:** Yes
+
+- **`contextManagement.toolDistillation.maxOutputTokens`** (number):
+  - **Description:** Maximum tokens to show when truncating large tool outputs.
+  - **Default:** `10000`
+  - **Requires restart:** Yes
+
+- **`contextManagement.toolDistillation.summarizationThresholdTokens`**
+  (number):
+  - **Description:** Threshold above which truncated tool outputs will be
+    summarized by an LLM.
+  - **Default:** `20000`
+  - **Requires restart:** Yes
+
 #### `admin`
 
 - **`admin.secureModeEnabled`** (boolean):
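
The `messageLimits` descriptions in this hunk mention truncating an oversized turn while retaining a share of its beginning. The commit text shown here does not include that logic, so the following is only a minimal sketch of how a head ratio such as `normalizationHeadRatio` could be applied. It assumes the message is already split into a token array; `truncateWithHeadRatio` and the marker string are hypothetical names, not part of the commit.

```typescript
// Hypothetical helper, not taken from the commit. It keeps a share of the
// token budget from the start of the message and the rest from the end.
function truncateWithHeadRatio(
  tokens: string[],
  maxTokens: number,
  headRatio: number,
): string[] {
  if (headRatio < 0 || headRatio > 1) {
    throw new RangeError('headRatio must be between 0.0 and 1.0');
  }
  if (tokens.length <= maxTokens) {
    return tokens; // Under budget: nothing to truncate.
  }
  const headCount = Math.floor(maxTokens * headRatio);
  const tailCount = maxTokens - headCount;
  const head = tokens.slice(0, headCount);
  // Guard against slice(-0), which would return the whole array.
  const tail = tailCount > 0 ? tokens.slice(tokens.length - tailCount) : [];
  // The marker is illustrative; the real marker text is an assumption.
  return [...head, '[...truncated...]', ...tail];
}
```

With a budget of 8 tokens and a ratio of 0.25, this sketch keeps the first 2 and the last 6 tokens of a longer message, separated by the marker.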

packages/a2a-server/src/utils/testing_utils.ts

Lines changed: 2 additions & 6 deletions
@@ -109,12 +109,8 @@ export function createMockConfig(
         enableEnvironmentVariableRedaction: false,
       },
     }),
-    isExperimentalAgentHistoryTruncationEnabled: vi.fn().mockReturnValue(false),
-    getExperimentalAgentHistoryTruncationThreshold: vi.fn().mockReturnValue(50),
-    getExperimentalAgentHistoryRetainedMessages: vi.fn().mockReturnValue(30),
-    isExperimentalAgentHistorySummarizationEnabled: vi
-      .fn()
-      .mockReturnValue(false),
+    isAutoDistillationEnabled: vi.fn().mockReturnValue(false),
+    getContextManagementConfig: vi.fn().mockReturnValue({ enabled: false }),
     ...overrides,
   } as unknown as Config;
 

packages/cli/src/config/config.ts

Lines changed: 4 additions & 8 deletions
@@ -977,14 +977,10 @@ export async function loadCliConfig(
     disabledSkills: settings.skills?.disabled,
     experimentalJitContext: settings.experimental?.jitContext,
     experimentalMemoryManager: settings.experimental?.memoryManager,
-    experimentalAgentHistoryTruncation:
-      settings.experimental?.agentHistoryTruncation,
-    experimentalAgentHistoryTruncationThreshold:
-      settings.experimental?.agentHistoryTruncationThreshold,
-    experimentalAgentHistoryRetainedMessages:
-      settings.experimental?.agentHistoryRetainedMessages,
-    experimentalAgentHistorySummarization:
-      settings.experimental?.agentHistorySummarization,
+    contextManagement: {
+      enabled: settings.experimental?.contextManagement,
+      ...settings?.contextManagement,
+    },
     modelSteering: settings.experimental?.modelSteering,
     topicUpdateNarration: settings.experimental?.topicUpdateNarration,
     toolOutputMasking: settings.experimental?.toolOutputMasking,
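
The new `contextManagement` object in this hunk combines the experimental flag with a shallow spread of the top-level `contextManagement` settings. The sketch below isolates that expression to show its merge semantics. The `SettingsLike` and `ContextManagementSettings` shapes and the `buildContextManagement` name are simplified, hypothetical stand-ins for the real types.

```typescript
// Simplified, hypothetical shapes; the real Settings type is much larger.
interface ContextManagementSettings {
  enabled?: boolean;
  historyWindow?: { maxTokens?: number; retainedTokens?: number };
}

interface SettingsLike {
  experimental?: { contextManagement?: boolean };
  contextManagement?: ContextManagementSettings;
}

// Mirrors the object literal in the hunk above.
function buildContextManagement(
  settings: SettingsLike,
): ContextManagementSettings {
  return {
    enabled: settings.experimental?.contextManagement,
    // The spread comes last, so a same-named key in
    // settings.contextManagement would override `enabled` above.
    // Spreading `undefined` is a no-op.
    ...settings.contextManagement,
  };
}

const merged = buildContextManagement({
  experimental: { contextManagement: true },
  contextManagement: { historyWindow: { maxTokens: 1000 } },
});
console.log(merged);
```

Two properties follow from ordinary object-spread behavior. The merge is shallow, so nested objects such as `historyWindow` are copied by reference and are not combined with defaults at this point. And because the spread is written after `enabled`, an `enabled` key placed under `contextManagement` would take precedence over the experimental flag, although the schema in this commit defines no such key.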

packages/cli/src/config/settingsSchema.ts

Lines changed: 115 additions & 34 deletions
@@ -2169,44 +2169,13 @@ const SETTINGS_SCHEMA = {
           'Replace the built-in save_memory tool with a memory manager subagent that supports adding, removing, de-duplicating, and organizing memories.',
         showInDialog: true,
       },
-      agentHistoryTruncation: {
+      contextManagement: {
         type: 'boolean',
-        label: 'Agent History Truncation',
+        label: 'Enable Context Management',
         category: 'Experimental',
         requiresRestart: true,
         default: false,
-        description:
-          'Enable truncation window logic for the Agent History Provider.',
-        showInDialog: true,
-      },
-      agentHistoryTruncationThreshold: {
-        type: 'number',
-        label: 'Agent History Truncation Threshold',
-        category: 'Experimental',
-        requiresRestart: true,
-        default: 30,
-        description:
-          'The maximum number of messages before history is truncated.',
-        showInDialog: true,
-      },
-      agentHistoryRetainedMessages: {
-        type: 'number',
-        label: 'Agent History Retained Messages',
-        category: 'Experimental',
-        requiresRestart: true,
-        default: 15,
-        description:
-          'The number of recent messages to retain after truncation.',
-        showInDialog: true,
-      },
-      agentHistorySummarization: {
-        type: 'boolean',
-        label: 'Agent History Summarization',
-        category: 'Experimental',
-        requiresRestart: true,
-        default: false,
-        description:
-          'Enable summarization of truncated content via a small model for the Agent History Provider.',
+        description: 'Enable logic for context management.',
         showInDialog: true,
       },
       topicUpdateNarration: {
@@ -2485,6 +2454,118 @@ const SETTINGS_SCHEMA = {
     },
   },
 
+  contextManagement: {
+    type: 'object',
+    label: 'Context Management',
+    category: 'Experimental',
+    requiresRestart: true,
+    default: {},
+    description:
+      'Settings for agent history and tool distillation context management.',
+    showInDialog: false,
+    properties: {
+      historyWindow: {
+        type: 'object',
+        label: 'History Window Settings',
+        category: 'Context Management',
+        requiresRestart: true,
+        default: {},
+        showInDialog: false,
+        properties: {
+          maxTokens: {
+            type: 'number',
+            label: 'Max Tokens',
+            category: 'Context Management',
+            requiresRestart: true,
+            default: 150_000,
+            description:
+              'The number of tokens to allow before triggering compression.',
+            showInDialog: false,
+          },
+          retainedTokens: {
+            type: 'number',
+            label: 'Retained Tokens',
+            category: 'Context Management',
+            requiresRestart: true,
+            default: 40_000,
+            description: 'The number of tokens to always retain.',
+            showInDialog: false,
+          },
+        },
+      },
+      messageLimits: {
+        type: 'object',
+        label: 'Message Limits',
+        category: 'Context Management',
+        requiresRestart: true,
+        default: {},
+        showInDialog: false,
+        properties: {
+          normalMaxTokens: {
+            type: 'number',
+            label: 'Normal Maximum Tokens',
+            category: 'Context Management',
+            requiresRestart: true,
+            default: 2500,
+            description:
+              'The target number of tokens to budget for a normal conversation turn.',
+            showInDialog: false,
+          },
+          retainedMaxTokens: {
+            type: 'number',
+            label: 'Retained Maximum Tokens',
+            category: 'Context Management',
+            requiresRestart: true,
+            default: 12000,
+            description:
+              'The maximum number of tokens a single conversation turn can consume before truncation.',
+            showInDialog: false,
+          },
+          normalizationHeadRatio: {
+            type: 'number',
+            label: 'Normalization Head Ratio',
+            category: 'Context Management',
+            requiresRestart: true,
+            default: 0.25,
+            description:
+              'The ratio of tokens to retain from the beginning of a truncated message (0.0 to 1.0).',
+            showInDialog: false,
+          },
+        },
+      },
+      toolDistillation: {
+        type: 'object',
+        label: 'Tool Distillation',
+        category: 'Context Management',
+        requiresRestart: true,
+        default: {},
+        showInDialog: false,
+        properties: {
+          maxOutputTokens: {
+            type: 'number',
+            label: 'Max Output Tokens',
+            category: 'Context Management',
+            requiresRestart: true,
+            default: 10_000,
+            description:
+              'Maximum tokens to show when truncating large tool outputs.',
+            showInDialog: false,
+          },
+          summarizationThresholdTokens: {
+            type: 'number',
+            label: 'Tool Summarization Threshold',
+            category: 'Context Management',
+            requiresRestart: true,
+            default: 20_000,
+            description:
+              'Threshold above which truncated tool outputs will be summarized by an LLM.',
+            showInDialog: false,
+          },
+        },
+      },
+    },
+  },
+
   admin: {
     type: 'object',
     label: 'Admin',
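
The schema hunk above declares a default for each leaf setting, while each group object defaults to `{}`. The commit text shown here does not reveal how those leaf defaults are applied to a partial user configuration, so the following is only a sketch of one possible resolution step. `DEFAULTS` copies the default values from the schema; `ResolvedContextManagement`, `UserContextManagement`, and `resolveContextManagement` are hypothetical names.

```typescript
// Hypothetical types and helper; only the numeric defaults come from the schema.
interface ResolvedContextManagement {
  historyWindow: { maxTokens: number; retainedTokens: number };
  messageLimits: {
    normalMaxTokens: number;
    retainedMaxTokens: number;
    normalizationHeadRatio: number;
  };
  toolDistillation: {
    maxOutputTokens: number;
    summarizationThresholdTokens: number;
  };
}

// Every group and every leaf is optional in user-provided settings.
type UserContextManagement = {
  [K in keyof ResolvedContextManagement]?: Partial<ResolvedContextManagement[K]>;
};

const DEFAULTS: ResolvedContextManagement = {
  historyWindow: { maxTokens: 150_000, retainedTokens: 40_000 },
  messageLimits: {
    normalMaxTokens: 2500,
    retainedMaxTokens: 12000,
    normalizationHeadRatio: 0.25,
  },
  toolDistillation: {
    maxOutputTokens: 10_000,
    summarizationThresholdTokens: 20_000,
  },
};

// Merge one level deeper than a plain spread so that overriding a single
// leaf does not discard its sibling defaults.
function resolveContextManagement(
  user: UserContextManagement = {},
): ResolvedContextManagement {
  return {
    historyWindow: { ...DEFAULTS.historyWindow, ...user.historyWindow },
    messageLimits: { ...DEFAULTS.messageLimits, ...user.messageLimits },
    toolDistillation: { ...DEFAULTS.toolDistillation, ...user.toolDistillation },
  };
}

const resolved = resolveContextManagement({
  historyWindow: { maxTokens: 100_000 },
});
console.log(resolved.historyWindow);
```

In this sketch, overriding only `historyWindow.maxTokens` leaves `retainedTokens` at its schema default of 40,000. A single top-level spread would instead replace the whole `historyWindow` object.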
