feat(claude): support AWS Bedrock Claude count_tokens forwarding#4926
feat(claude): support AWS Bedrock Claude count_tokens forwarding#4926B1F030 wants to merge 1 commit into
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds a /v1/messages/count_tokens endpoint with middleware to limit allowed channel types, propagates allowed-type filtering through service/model/cache selection, implements AWS Bedrock Claude token-counting, and integrates a dedicated relay handler and route short-circuit. ChangesClaude Token Counting with Channel Type Filtering
Sequence Diagram(s)sequenceDiagram
participant Client
participant Router
participant LimitChannelTypes
participant Distribute
participant Relay
participant Service
participant Model
participant Cache
participant AWSBedrock
Client->>Router: POST /v1/messages/count_tokens
Router->>LimitChannelTypes: apply allowed types (AWS)
LimitChannelTypes->>Distribute: store allowed types in context
Distribute->>Relay: call Relay (Claude format)
Relay->>Relay: detect RelayModeClaudeCountTokens -> relayClaudeCountTokens
Relay->>Service: CacheGetRandomSatisfiedChannel (AllowedChannelTypes)
Service->>Cache: GetRandomSatisfiedChannelWithTypes
Cache->>Model: GetChannelWithChannelTypes (if uncached)
Relay->>AWSBedrock: CountClaudeTokens request (InvokeModel)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
controller/relay.go (1)
126-129: ⚡ Quick winCount-tokens early return bypasses failure sampling.
At Line 128, this returns before the common
perfmetrics.RecordRelaySample(..., false, 0)path, so/messages/count_tokensfailures won’t be sampled like other relay failures.Proposed patch
if relayFormat == types.RelayFormatClaude && relayInfo.RelayMode == relayconstant.RelayModeClaudeCountTokens { newAPIError = relayClaudeCountTokens(c, relayInfo, request) + if newAPIError != nil { + gopool.Go(func() { + perfmetrics.RecordRelaySample(relayInfo, false, 0) + }) + } return }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@controller/relay.go` around lines 126 - 129, The early return for count-tokens bypasses the common failure sampling path; after calling relayClaudeCountTokens(...) but before returning, check if newAPIError != nil and call perfmetrics.RecordRelaySample with the same parameters used elsewhere (using relayInfo, relayFormat, false, 0) so failures for /messages/count_tokens are sampled; alternatively, change relayClaudeCountTokens to return an error state and let the existing common return path call perfmetrics.RecordRelaySample. Ensure you reference relayClaudeCountTokens, newAPIError, and perfmetrics.RecordRelaySample in the fix.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@middleware/distributor.go`:
- Around line 62-64: The forbidden message is hardcoded; replace the raw string
in the branch that calls abortWithOpenAiMessage after
service.IsChannelTypeAllowed(...) with a localized message via i18n.T(...). Use
a clear i18n key (e.g. "errors.channel_type_not_allowed") and pass
i18n.T("errors.channel_type_not_allowed") to abortWithOpenAiMessage so the
middleware uses the same localization system as other messages; update any
callers or tests expecting the raw string accordingly.
In `@relay/channel/aws/relay-aws.go`:
- Around line 173-177: CountClaudeTokens currently treats every error from
buildAwsCountTokensInput as a BadRequestBody/HTTP 400 with skip-retry; change
buildAwsCountTokensInput to return a typed *types.NewAPIError (rather than a
generic error) so callers can distinguish parsing/body errors (map to
ErrorCodeBadRequest + 400) from channel/server setup errors (map to
channel/internal error codes and appropriate HTTP 5xx or channel error status
without skip-retry). Update CountClaudeTokens to inspect the returned
*types.NewAPIError from buildAwsCountTokensInput and return it unchanged (or
remap its status/code) instead of always wrapping into BadRequestBody; reference
buildAwsCountTokensInput and CountClaudeTokens to locate the changes and ensure
retry flags (ErrOptionWithSkipRetry) are only applied for true client input
errors.
---
Nitpick comments:
In `@controller/relay.go`:
- Around line 126-129: The early return for count-tokens bypasses the common
failure sampling path; after calling relayClaudeCountTokens(...) but before
returning, check if newAPIError != nil and call perfmetrics.RecordRelaySample
with the same parameters used elsewhere (using relayInfo, relayFormat, false, 0)
so failures for /messages/count_tokens are sampled; alternatively, change
relayClaudeCountTokens to return an error state and let the existing common
return path call perfmetrics.RecordRelaySample. Ensure you reference
relayClaudeCountTokens, newAPIError, and perfmetrics.RecordRelaySample in the
fix.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 31233d9a-6691-4224-ab16-7844c9097c70
📒 Files selected for processing (15)
constant/context_key.gocontroller/relay.gomiddleware/channel_type_limit_test.gomiddleware/distributor.gomodel/ability.gomodel/channel_cache.gomodel/channel_cache_test.gorelay/channel/aws/relay-aws.gorelay/channel/aws/relay_aws_test.gorelay/claude_count_tokens.gorelay/claude_handler.gorelay/constant/relay_mode.gorelay/constant/relay_mode_test.gorouter/relay-router.goservice/channel_select.go
81f2c6c to
15a9519
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
controller/relay.go (1)
126-129: ⚡ Quick winPreserve failure perf sampling on count_tokens short-circuit path.
This early return skips the existing failure sampling at Line 250, so Claude
/count_tokensfailures won’t be recorded viaperfmetrics.RecordRelaySample.📊 Suggested patch
if relayFormat == types.RelayFormatClaude && relayInfo.RelayMode == relayconstant.RelayModeClaudeCountTokens { newAPIError = relayClaudeCountTokens(c, relayInfo, request) + if newAPIError != nil { + gopool.Go(func() { + perfmetrics.RecordRelaySample(relayInfo, false, 0) + }) + } return }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@controller/relay.go` around lines 126 - 129, The short-circuit handling for Claude count_tokens in the relay (the if checking relayFormat == types.RelayFormatClaude && relayInfo.RelayMode == relayconstant.RelayModeClaudeCountTokens that calls relayClaudeCountTokens and returns) bypasses the failure perf sampling at perfmetrics.RecordRelaySample; modify the flow so that after calling relayClaudeCountTokens you still invoke perfmetrics.RecordRelaySample on error (or propagate newAPIError without an early return so the existing sampling path runs), ensuring newAPIError is checked and recorded before returning.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@controller/relay.go`:
- Around line 126-129: The short-circuit handling for Claude count_tokens in the
relay (the if checking relayFormat == types.RelayFormatClaude &&
relayInfo.RelayMode == relayconstant.RelayModeClaudeCountTokens that calls
relayClaudeCountTokens and returns) bypasses the failure perf sampling at
perfmetrics.RecordRelaySample; modify the flow so that after calling
relayClaudeCountTokens you still invoke perfmetrics.RecordRelaySample on error
(or propagate newAPIError without an early return so the existing sampling path
runs), ensuring newAPIError is checked and recorded before returning.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 54c46eaf-fc26-4700-8067-2b5cabcaeb0f
📒 Files selected for processing (15)
constant/context_key.gocontroller/relay.gomiddleware/channel_type_limit_test.gomiddleware/distributor.gomodel/ability.gomodel/channel_cache.gomodel/channel_cache_test.gorelay/channel/aws/relay-aws.gorelay/channel/aws/relay_aws_test.gorelay/claude_count_tokens.gorelay/claude_handler.gorelay/constant/relay_mode.gorelay/constant/relay_mode_test.gorouter/relay-router.goservice/channel_select.go
🚧 Files skipped from review as they are similar to previous changes (14)
- constant/context_key.go
- relay/constant/relay_mode_test.go
- router/relay-router.go
- middleware/channel_type_limit_test.go
- relay/channel/aws/relay_aws_test.go
- relay/constant/relay_mode.go
- relay/claude_count_tokens.go
- middleware/distributor.go
- model/channel_cache_test.go
- service/channel_select.go
- model/channel_cache.go
- relay/channel/aws/relay-aws.go
- relay/claude_handler.go
- model/ability.go
|
/cc @Calcium-Ion Would you please take a look? Thanks! |
5c55900 to
1d37ac6
Compare
|
/cc @seefs001 Would you please take a look? Thanks! |
51dd734 to
d11c57b
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (2)
middleware/distributor.go (2)
213-245: ⚡ Quick winLocalize the hardcoded error message.
Line 223 returns the raw English string
"invalid JSON request body", while the rest of this middleware usesi18n.T(...)for localization (e.g., lines 46, 52, 200, 208). Please route this error through i18n for consistent client localization.🌐 Proposed fix
Add an i18n key and use it:
if !gjson.ValidBytes(requestBody) { - return nil, errors.New("invalid JSON request body") + return nil, errors.New(i18n.T(c, i18n.MsgDistributorInvalidJSON)) }Then define
MsgDistributorInvalidJSONin your i18n keys/locales.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@middleware/distributor.go` around lines 213 - 245, getModelFromJSONBody currently returns a hardcoded error "invalid JSON request body"; replace that literal with a localized message by calling i18n.T with a new key (e.g., MsgDistributorInvalidJSON) so the function uses i18n.T("MsgDistributorInvalidJSON") instead of the raw string; add the MsgDistributorInvalidJSON entry to your i18n/locales and update any callers expecting the error string if necessary.
247-255: ⚡ Quick winLocalize the error message.
Line 252 uses
fmt.Errorfwith a raw English string, while the rest of this middleware usesi18n.T(...)for consistency. To localize this error, you would need to pass the*gin.ContextintogetJSONStringValueso it can calli18n.T(c, ...).🌐 Proposed fix
Refactor to accept context:
-func getJSONStringValue(result gjson.Result, field string) (string, error) { +func getJSONStringValue(c *gin.Context, result gjson.Result, field string) (string, error) { if !result.Exists() || result.Type == gjson.Null { return "", nil } if result.Type != gjson.String { - return "", fmt.Errorf("field %s must be a string", field) + return "", errors.New(i18n.T(c, i18n.MsgDistributorFieldMustBeString, map[string]any{"Field": field})) } return result.String(), nil }Update callers on lines 227 and 231 to pass
c.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@middleware/distributor.go` around lines 247 - 255, Change getJSONStringValue to accept a *gin.Context (e.g., func getJSONStringValue(c *gin.Context, result gjson.Result, field string) (string, error)), replace the fmt.Errorf(...) call with a localized error using i18n.T(c, "...") and update every caller of getJSONStringValue to pass the current *gin.Context instance so the middleware continues to compile and returns localized errors.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@middleware/distributor.go`:
- Around line 213-245: getModelFromJSONBody currently returns a hardcoded error
"invalid JSON request body"; replace that literal with a localized message by
calling i18n.T with a new key (e.g., MsgDistributorInvalidJSON) so the function
uses i18n.T("MsgDistributorInvalidJSON") instead of the raw string; add the
MsgDistributorInvalidJSON entry to your i18n/locales and update any callers
expecting the error string if necessary.
- Around line 247-255: Change getJSONStringValue to accept a *gin.Context (e.g.,
func getJSONStringValue(c *gin.Context, result gjson.Result, field string)
(string, error)), replace the fmt.Errorf(...) call with a localized error using
i18n.T(c, "...") and update every caller of getJSONStringValue to pass the
current *gin.Context instance so the middleware continues to compile and returns
localized errors.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 030fd4b3-6d29-4d96-b1a1-1ab1015b95eb
📒 Files selected for processing (4)
controller/relay.gomiddleware/distributor.gomodel/channel_cache.gorelay/claude_handler.go
🚧 Files skipped from review as they are similar to previous changes (3)
- model/channel_cache.go
- controller/relay.go
- relay/claude_handler.go
Signed-off-by: B1F030 <b1fzhang@gmail.com>
a5b3e75 to
71c35a2
Compare
Important
📝 变更描述 / Description
新增 Anthropic 兼容的
POST /v1/messages/count_tokens接口,当前仅面向 AWS Bedrock Claude 渠道。Claude SDK / CLI 会在发送部分请求前调用
messages/count_tokens获取输入 token 数。对于 AWS Bedrock Claude,Bedrock Runtime 已提供官方CountTokens能力,因此本 PR 不在本地估算 token,而是在现有鉴权、限速和渠道分发流程之后,将请求转发到 AWS BedrockCountTokens。实现上新增了 Claude count tokens relay mode,并为该路由增加 AWS-only 渠道过滤,避免同一模型同时存在非 AWS 渠道时被错误分发。controller 中为该模式单独分支处理,请求成功时直接返回 Anthropic 兼容结果
{"input_tokens": <number>},不进入本地 token 估算、模型价格计算、预扣费或消费结算流程。AWS channel 内部使用 Bedrock SDK 的
CountTokens,输入采用InvokeModelTokensRequest,并复用现有 AWS Claude 请求转换、模型映射、cross-region model id、header override 与 param override 逻辑。非 Claude AWS 模型(例如 Nova)会被拒绝,不提供本地 fallback。🚀 变更类型 / Type of change
🔗 关联任务 / Related Issue
/v1/messages/count_tokens的场景说明;本 PR 仅覆盖 AWS Bedrock Claude,并转发上游而非本地估算✅ 提交前检查项 / Checklist
Bug fix,我已提交或关联对应 Issue,且不会将设计取舍、预期不一致或理解偏差直接归类为 bug。📸 运行证明 / Proof of Work
已运行以下本地检查:
Summary by CodeRabbit
New Features
Tests
Localization