Commit e832b5b

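feat(model): implement dynamic implicit output reservation based on the context window
- Derive the reserved output dynamically as 20% of the total context, clamped to between 4096 and 30000 tokens.
- Update the config store and the provider core logic to support the dynamic reservation, and revise the multilingual docs and configuration metadata accordingly.
- Add test cases for different context sizes, and refine the prompt instructions of the commit message generator.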
1 parent e4aa32f commit e832b5b

8 files changed

Lines changed: 135 additions & 86 deletions

DEV.md

Lines changed: 2 additions & 1 deletion
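@@ -161,7 +161,8 @@ npm run test:pages
 - `anthropic`: requests `baseUrl + /messages`
 - `coding-plans.vendors[].usageUrl` is an optional plan usage endpoint; it is currently polled with `Authorization: Bearer <API Key>` by default, and any recognized hourly, weekly, or count-based quota is shown as a percentage in the status bar.
 - `coding-plans.vendors[].models[].contextSize` is now the preferred field for describing a model's context.
-- `coding-plans.advanced.defaultReservedOutput` defaults to `60000` and serves as the global output budget; it is automatically capped to the model limit when a request is sent.
+- When `maxOutputTokens` is not set explicitly, the runtime dynamically derives the implicit output reservation from the total context: `min(30000, max(4096, floor(totalContextWindow * 0.2)))`; for very small context windows, the result is further clamped against the total window.
+- `coding-plans.advanced.defaultReservedOutput` defaults to `60000` and serves as a request-side output budget override; it is automatically capped to the model limit when a request is sent, and it does not change the formula that derives the model's implicit default output reservation.
 - `coding-plans.vendors[].models[].maxInputTokens` / `maxOutputTokens` are marked deprecated and kept for compatibility with old configurations and for special overrides. Both may still be set to `0`. `maxInputTokens: 0` means "unset"; `maxOutputTokens` defaults to `0`, which also means "unset": under `openai-chat` / `openai-responses`, `max_tokens` / `max_output_tokens` is not sent proactively, but when the upstream protocol endpoint requires `max_tokens`, a compatible value must be sent automatically. `maxInputTokens` is still used only for local metadata and budgeting and is not passed to the API directly. Automatic refresh/write-back of the `vendors` configuration no longer adds these two fields by default; only existing model entries explicitly configured by the user are preserved as-is.
 - New sampling parameters:
 - `coding-plans.vendors[].defaultTemperature` / `defaultTopP`: vendor-level default sampling values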

README.md

Lines changed: 3 additions & 3 deletions
@@ -159,10 +159,10 @@ code --install-extension techfetch-dev.coding-plans-for-copilot
 | `coding-plans.vendors[].models[].temperature` | `number` | Inherit from vendor | Model-level temperature override. |
 | `coding-plans.vendors[].models[].topP` | `number` | Inherit from vendor | Model-level topP override. |
 | `coding-plans.vendors[].models[].capabilities` | `object` | `{ tools: true, vision: false }` | Model capability declaration. |
-| `coding-plans.vendors[].models[].contextSize` | `number` | Empty | Total context window of the model. |
+| `coding-plans.vendors[].models[].contextSize` | `number` | Empty | Total context window of the model; when `maxOutputTokens` is not set explicitly, the runtime dynamically derives the implicit output reservation from it. |
 | `coding-plans.vendors[].models[].maxInputTokens` | `number` | Empty | Deprecated; use `contextSize` instead. |
-| `coding-plans.vendors[].models[].maxOutputTokens` | `number` | `0` | Deprecated; use `contextSize` instead. |
-| `coding-plans.advanced.defaultReservedOutput` | `number` | `60000` | Global default output token budget. |
+| `coding-plans.vendors[].models[].maxOutputTokens` | `number` | `0` | Deprecated; use `contextSize` instead. `0` means unset, in which case the runtime derives the implicit output reservation as `20%` of the total context by default, clamped to `4096-30000`. |
+| `coding-plans.advanced.defaultReservedOutput` | `number` | `60000` | Request-side default output token budget; it only acts as a budget override when a request is sent and is still capped by the model's output limit. |
 | `coding-plans.commitMessage.showGenerateCommand` | `boolean` | `true` | Whether to show the "Generate Commit Message" command. |
 | `coding-plans.commitMessage.language` | `string` | `en` | Commit message language: `en` / `zh-cn`. |
 | `coding-plans.commitMessage.useRecentCommitStyle` | `boolean` | `false` | Whether to follow the style of the 20 most recent commits. |

README_en.md

Lines changed: 4 additions & 4 deletions
@@ -159,10 +159,10 @@ To switch to OpenAI-compatible endpoints, modify the vendor's `baseUrl` and `def
 | `coding-plans.vendors[].models[].temperature` | `number` | Inherit from vendor | Model-level temperature override. |
 | `coding-plans.vendors[].models[].topP` | `number` | Inherit from vendor | Model-level topP override. |
 | `coding-plans.vendors[].models[].capabilities` | `object` | `{ tools: true, vision: false }` | Model capability declaration. |
-| `coding-plans.vendors[].models[].contextSize` | `number` | Empty | Model total context window. |
+| `coding-plans.vendors[].models[].contextSize` | `number` | Empty | Model total context window. When `maxOutputTokens` is unset, runtime derives the implicit reserved output budget from this total window. |
 | `coding-plans.vendors[].models[].maxInputTokens` | `number` | Empty | Deprecated; use `contextSize` instead. |
-| `coding-plans.vendors[].models[].maxOutputTokens` | `number` | `0` | Deprecated; use `contextSize` instead. |
-| `coding-plans.advanced.defaultReservedOutput` | `number` | `60000` | Global default output token budget. |
+| `coding-plans.vendors[].models[].maxOutputTokens` | `number` | `0` | Deprecated; use `contextSize` instead. `0` means unset; runtime then derives an implicit reserved output budget as 20% of total context, clamped to 4096-30000. |
+| `coding-plans.advanced.defaultReservedOutput` | `number` | `60000` | Request-side default output token budget. It only overrides request budgeting and is still capped by the model output limit. |
 | `coding-plans.commitMessage.showGenerateCommand` | `boolean` | `true` | Whether to show "Generate Commit Message" command. |
 | `coding-plans.commitMessage.language` | `string` | `en` | Commit message language: `en` / `zh-cn`. |
 | `coding-plans.commitMessage.useRecentCommitStyle` | `boolean` | `false` | Whether to reference the style of the last 20 commits. |
@@ -242,4 +242,4 @@ MIT License
 2. Create a feature branch (`git checkout -b feature/AmazingFeature`)
 3. Commit changes (`git commit -m 'Add some AmazingFeature'`)
 4. Push to the branch (`git push origin feature/AmazingFeature`)
-5. Open a Pull Request
+5. Open a Pull Request

package.json

Lines changed: 3 additions & 3 deletions
@@ -2,7 +2,7 @@
   "name": "coding-plans-for-copilot",
   "displayName": "%displayName%",
   "description": "%description%",
-  "version": "0.6.28",
+  "version": "0.7.0",
   "publisher": "techfetch-dev",
   "repository": {
     "type": "git",
@@ -315,7 +315,7 @@
 },
 "contextSize": {
   "type": "number",
-  "description": "Preferred total context window size for the model. Language model context display uses this value directly when provided."
+  "description": "Preferred total context window size for the model. Language model context display uses this value directly when provided. When maxOutputTokens is left unset, runtime derives an implicit reserved output budget from this total window."
 },
 "maxInputTokens": {
   "type": "number",
@@ -325,7 +325,7 @@
 "maxOutputTokens": {
   "type": "number",
   "default": 0,
-  "description": "Deprecated: legacy maximum output tokens override. Prefer contextSize for model context. When contextSize is provided and this value exceeds it, it will be capped to contextSize. Defaults to 0, which treats it as unset. For openai-chat/openai-responses it suppresses proactively sending output-token limits, while protocol endpoints that require max_tokens will trigger an automatic compatible retry.",
+  "description": "Deprecated: legacy maximum output tokens override. Prefer contextSize for model context. When contextSize is provided and this value exceeds it, it will be capped to contextSize. Defaults to 0, which treats it as unset; runtime then derives an implicit reserved output budget from total context (20%, clamped to 4096-30000). For openai-chat/openai-responses it suppresses proactively sending output-token limits, while protocol endpoints that require max_tokens will trigger an automatic compatible retry.",
   "deprecationMessage": "Deprecated: prefer contextSize to describe model context. Keep maxOutputTokens only when you need a legacy output-cap override."
 },
 "apiStyle": {

src/config/configStore.ts

Lines changed: 3 additions & 3 deletions
@@ -5,8 +5,8 @@ import {
     DEFAULT_MODEL_CAPABILITIES_TOOLS,
     DEFAULT_MODEL_CAPABILITIES_VISION,
     DEFAULT_CONTEXT_WINDOW_SIZE,
-    DEFAULT_RESERVED_OUTPUT_TOKENS,
-    VENDOR_API_KEY_PREFIX
+    VENDOR_API_KEY_PREFIX,
+    resolveImplicitReservedOutputTokens
 } from '../constants';
 
 export type VendorApiStyle = 'openai-chat' | 'openai-responses' | 'anthropic';
@@ -520,7 +520,7 @@ export class ConfigStore implements vscode.Disposable {
     ): { maxInputTokens: number; maxOutputTokens: number } {
         const hasExplicitTotalContextWindow = legacyContextWindow !== undefined;
         const fallbackTotal = Math.max(2, Math.floor(legacyContextWindow ?? DEFAULT_CONTEXT_WINDOW_SIZE));
-        const defaultReservedOutputTokens = Math.max(1, Math.min(DEFAULT_RESERVED_OUTPUT_TOKENS, fallbackTotal - 1));
+        const defaultReservedOutputTokens = resolveImplicitReservedOutputTokens(fallbackTotal);
         const normalizeTokenValue = (value: number | undefined): number | undefined => {
             if (value === undefined) {
                 return undefined;
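
To make the effect of this one-line change concrete, the sketch below compares the removed expression with the new helper for a few context sizes. The constants and the helper body follow this commit's `src/constants.ts`; the wrapper name `legacyReservedOutput` and the sample sizes are illustrative assumptions introduced here, not part of the repository.

```typescript
// Constants as defined in src/constants.ts after this commit.
const DEFAULT_RESERVED_OUTPUT_TOKENS = 30000;
const MIN_DYNAMIC_RESERVED_OUTPUT_TOKENS = 4096;
const DEFAULT_RESERVED_OUTPUT_RATIO = 0.2;

// The expression removed by this commit, wrapped in a hypothetical helper name.
function legacyReservedOutput(fallbackTotal: number): number {
  return Math.max(1, Math.min(DEFAULT_RESERVED_OUTPUT_TOKENS, fallbackTotal - 1));
}

// Same logic as the helper added in src/constants.ts.
function resolveImplicitReservedOutputTokens(totalContextWindow: number): number {
  const normalized = Math.max(2, Math.floor(totalContextWindow));
  const desired = Math.min(
    DEFAULT_RESERVED_OUTPUT_TOKENS,
    Math.max(
      MIN_DYNAMIC_RESERVED_OUTPUT_TOKENS,
      Math.floor(normalized * DEFAULT_RESERVED_OUTPUT_RATIO)
    )
  );
  // Always leave at least one token of the window for input.
  return Math.max(1, Math.min(desired, normalized - 1));
}

// Illustrative context sizes (assumed values, not taken from the test suite).
for (const total of [8192, 32768, 128000, 200000]) {
  console.log(total, legacyReservedOutput(total), resolveImplicitReservedOutputTokens(total));
}
```

For a 32768-token window, the old expression reserves 30000 tokens for output, while the new helper reserves 6553. For a 200000-token window both yield 30000.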

src/constants.ts

Lines changed: 15 additions & 0 deletions
@@ -79,9 +79,24 @@ export const DEFAULT_TOKEN_SIDE_LIMIT = 200000;
 export const DEFAULT_CONTEXT_WINDOW_SIZE = DEFAULT_TOKEN_SIDE_LIMIT;
 export const DEFAULT_REQUEST_MAX_TOKENS = DEFAULT_TOKEN_SIDE_LIMIT;
 export const DEFAULT_RESERVED_OUTPUT_TOKENS = 30000;
+export const MIN_DYNAMIC_RESERVED_OUTPUT_TOKENS = 4096;
+export const DEFAULT_RESERVED_OUTPUT_RATIO = 0.2;
 export const DEFAULT_MODEL_CAPABILITIES_TOOLS = true;
 export const DEFAULT_MODEL_CAPABILITIES_VISION = false;
 
+export function resolveImplicitReservedOutputTokens(totalContextWindow: number): number {
+    const normalizedTotalContextWindow = Math.max(2, Math.floor(totalContextWindow));
+    const desiredReservedOutputTokens = Math.min(
+        DEFAULT_RESERVED_OUTPUT_TOKENS,
+        Math.max(
+            MIN_DYNAMIC_RESERVED_OUTPUT_TOKENS,
+            Math.floor(normalizedTotalContextWindow * DEFAULT_RESERVED_OUTPUT_RATIO)
+        )
+    );
+
+    return Math.max(1, Math.min(desiredReservedOutputTokens, normalizedTotalContextWindow - 1));
+}
+
 export const LOG_LEVEL_PRIORITY = {
     DEBUG: 10,
     INFO: 20,
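
The helper added above is piecewise: depending on the window size, the result comes from the 4096 floor, the 20% ratio, the 30000 cap, or the final `total - 1` safety clamp. The sketch below labels which branch applies. `explainReservedOutput` and the regime names are hypothetical, introduced only for illustration; the arithmetic follows the function in this diff.

```typescript
const CAP = 30000;   // DEFAULT_RESERVED_OUTPUT_TOKENS
const FLOOR = 4096;  // MIN_DYNAMIC_RESERVED_OUTPUT_TOKENS
const RATIO = 0.2;   // DEFAULT_RESERVED_OUTPUT_RATIO

type Regime = 'window-clamp' | 'floor' | 'ratio' | 'cap';

// Hypothetical helper: same arithmetic as resolveImplicitReservedOutputTokens,
// but it also reports which branch determined the result.
function explainReservedOutput(totalContextWindow: number): { reserved: number; regime: Regime } {
  const total = Math.max(2, Math.floor(totalContextWindow));
  const byRatio = Math.floor(total * RATIO);
  const desired = Math.min(CAP, Math.max(FLOOR, byRatio));
  const reserved = Math.max(1, Math.min(desired, total - 1));

  if (reserved < desired) {
    return { reserved, regime: 'window-clamp' }; // window too small for the desired value
  }
  if (byRatio >= CAP) {
    return { reserved, regime: 'cap' };
  }
  if (byRatio <= FLOOR) {
    return { reserved, regime: 'floor' };
  }
  return { reserved, regime: 'ratio' };
}

// Illustrative sizes chosen around the branch boundaries.
for (const total of [2, 4096, 4097, 20480, 100000, 150000]) {
  console.log(total, explainReservedOutput(total));
}
```

Under these constants, the safety clamp only matters for windows of 4096 tokens or fewer, and the cap takes over from 150000 tokens upward.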

src/providers/baseProvider.ts

Lines changed: 3 additions & 2 deletions
@@ -4,7 +4,8 @@ import {
     DEFAULT_CONTEXT_WINDOW_SIZE,
     DEFAULT_RESERVED_OUTPUT_TOKENS,
     DEFAULT_TOKEN_SIDE_LIMIT,
-    MODEL_VERSION_LABEL
+    MODEL_VERSION_LABEL,
+    resolveImplicitReservedOutputTokens
 } from '../constants';
 import { logger } from '../logging/outputChannelLogger';
 
@@ -463,7 +464,7 @@ export abstract class BaseAIProvider implements vscode.Disposable {
     ): Pick<ResolvedModelRuntimeSettings, 'maxTokens' | 'maxInputTokens' | 'maxOutputTokens'> {
         const hasExplicitTotalContextWindow = totalContextWindow !== undefined;
         const fallbackTotal = Math.max(2, Math.floor(totalContextWindow ?? DEFAULT_CONTEXT_WINDOW_SIZE));
-        const defaultReservedOutputTokens = resolveImplicitReservedOutputTokens(fallbackTotal);
+        const defaultReservedOutputTokens = resolveImplicitReservedOutputTokens(fallbackTotal);
         const normalizeTokenValue = (value: number | undefined): number | undefined => {
             if (value === undefined) {
                 return undefined;
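
The rest of this method is not shown in the diff, so the following is only a rough sketch of how an explicit `maxOutputTokens` might interact with the implicit default, based on the `package.json` description in this commit (`0` means unset; an explicit value above `contextSize` is capped to it). `ModelTokenConfig` and `effectiveReservedOutput` are hypothetical names, and passing through an explicit value when `contextSize` is absent is an assumption.

```typescript
// Hypothetical shape covering only the two fields discussed in this commit.
interface ModelTokenConfig {
  contextSize?: number;     // total context window
  maxOutputTokens?: number; // deprecated override; 0 or undefined means unset
}

const DEFAULT_CONTEXT_WINDOW_SIZE = 200000; // value from src/constants.ts

// Same logic as the helper added in src/constants.ts.
function resolveImplicitReservedOutputTokens(totalContextWindow: number): number {
  const total = Math.max(2, Math.floor(totalContextWindow));
  const desired = Math.min(30000, Math.max(4096, Math.floor(total * 0.2)));
  return Math.max(1, Math.min(desired, total - 1));
}

// Hypothetical helper: chooses between the explicit override and the implicit default.
function effectiveReservedOutput(
  model: ModelTokenConfig
): { tokens: number; source: 'explicit' | 'implicit' } {
  const explicit = model.maxOutputTokens;
  if (explicit !== undefined && explicit > 0) {
    // Cap to contextSize only when contextSize is provided (assumed otherwise unchanged).
    const tokens =
      model.contextSize !== undefined
        ? Math.min(Math.floor(explicit), Math.floor(model.contextSize))
        : Math.floor(explicit);
    return { tokens, source: 'explicit' };
  }
  const total = model.contextSize ?? DEFAULT_CONTEXT_WINDOW_SIZE;
  return { tokens: resolveImplicitReservedOutputTokens(total), source: 'implicit' };
}

console.log(effectiveReservedOutput({ contextSize: 128000 }));
console.log(effectiveReservedOutput({ contextSize: 16000, maxOutputTokens: 64000 }));
```

Keeping the unset check (`0` or missing) in one place means both the config store and the provider can fall back to the same derived value, which appears to be the intent of sharing `resolveImplicitReservedOutputTokens` between them.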
