feat: add useNativeTokenCount flag to skip token counting API calls #2255

Merged
opieter-aws merged 2 commits into strands-agents:main from opieter-aws:feat/native-token-counting
May 6, 2026

Conversation

@opieter-aws
Contributor

@opieter-aws opieter-aws commented May 6, 2026

Description

Model providers with native token counting APIs (Bedrock, Anthropic, Gemini, OpenAI Responses, llama.cpp) make an additional API call on every count_tokens() invocation. In scenarios where latency or cost of these calls is undesirable — such as high-frequency proactive compression checks — users need a way to opt out and fall back to the local estimator.

Each provider that overrides count_tokens() with a native API call gains a new optional use_native_token_count field in its config:

from strands.models import BedrockModel

# Before: native API always called (with estimator fallback on error)
model = BedrockModel(model_id="us.anthropic.claude-sonnet-4-20250514")

# After: skip native API, always use local estimator
model = BedrockModel(
    model_id="us.anthropic.claude-sonnet-4-20250514",
    use_native_token_count=False,
)

Defaults to True (current behavior). Available on BedrockConfig, AnthropicConfig, GeminiConfig, OpenAIResponsesConfig, and LlamaCppConfig — the five providers that override count_tokens() with native API calls. The base Model class is unaffected since it only has the local estimator implementation.
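
To illustrate the behavior described above, here is a minimal sketch of how such a flag might gate a native counting call. It is not the SDK's implementation: `SketchModel`, `estimate_tokens`, the injected `native_counter` callable, the four-characters-per-token heuristic, and the synchronous signature are all assumptions made for the example.

```python
from typing import Any, Callable, Dict


def estimate_tokens(text: str) -> int:
    """Hypothetical local estimator: roughly four characters per token."""
    return max(1, len(text) // 4)


class SketchModel:
    """Hypothetical provider that can count tokens natively or locally."""

    def __init__(self, native_counter: Callable[[str], int], **config: Any) -> None:
        # native_counter stands in for the provider's token counting API call.
        self._native_counter = native_counter
        self.config: Dict[str, Any] = dict(config)

    def count_tokens(self, text: str) -> int:
        # Opting out skips the remote call entirely.
        if self.config.get("use_native_token_count") is False:
            return estimate_tokens(text)
        try:
            return self._native_counter(text)
        except Exception:
            # Mirrors the "estimator fallback on error" noted in the description.
            return estimate_tokens(text)
```

With the flag left unset, every `count_tokens()` call in this sketch reaches `native_counter`; with `use_native_token_count=False`, the callable is never invoked.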

Related Issues

Python port of strands-agents/sdk-typescript#1009

Documentation PR

NA

Type of Change

New feature

Testing

How have you tested the change? Verify that the changes do not break functionality or introduce warnings in consuming repositories: agents-docs, agents-tools, agents-cli

  • I ran hatch run prepare

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@codecov

codecov Bot commented May 6, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.


@github-actions

github-actions Bot commented May 6, 2026

Issue: This PR adds a new configuration parameter to every public model provider constructor (Anthropic, Bedrock, Gemini, LlamaCpp, OpenAIResponses). This is a public API change that affects all customers.

Per the API Bar Raising guidelines, the PR should:

  1. Have the needs-api-review label
  2. Document expected use cases in the PR description
  3. Provide example code snippets showing usage
  4. Include complete API signatures

Suggestion: Please update the PR description with:

  • The motivation/use case (e.g., reducing API call costs, reducing latency in high-throughput scenarios)
  • Example usage: model = BedrockModel(model_id="...", use_native_token_count=False)
  • Any alternatives considered (e.g., a class-level method override vs. config flag)
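
For context on the "class-level method override" alternative mentioned in the last bullet, a rough sketch of what that approach could look like follows. The PR does not say which alternatives were actually evaluated; every class name here is hypothetical, and the word-count "native" counter merely stands in for a remote API call.

```python
class BaseSketchModel:
    """Hypothetical base class that only has a local estimator."""

    def count_tokens(self, text: str) -> int:
        # Rough heuristic: about four characters per token.
        return max(1, len(text) // 4)


class NativeSketchModel(BaseSketchModel):
    """Hypothetical provider that overrides counting with a native call."""

    def count_tokens(self, text: str) -> int:
        return self._call_remote_counter(text)

    def _call_remote_counter(self, text: str) -> int:
        # Stand-in for a network request to a token counting endpoint.
        return len(text.split())


class LocalOnlySketchModel(NativeSketchModel):
    """Opt out by overriding the method instead of passing a config flag."""

    def count_tokens(self, text: str) -> int:
        # Bypass the provider override and reuse the base estimator.
        return BaseSketchModel.count_tokens(self, text)
```

Under this alternative, each provider would need its own subclass to opt out, whereas a config flag keeps the choice at construction time.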

@github-actions

github-actions Bot commented May 6, 2026

Issue: The PR checklist items are all unchecked: hatch run prepare hasn't been run, CONTRIBUTING.md hasn't been read, and there is no confirmation of tests or documentation. The PR description sections (Description, Related Issues, Documentation PR, Type of Change) are all left as template placeholders.

Suggestion: Please complete the PR template:

  • Fill in the description explaining why this change is needed
  • Link to any related issue
  • Select the correct change type (this is "New feature")
  • Mark checklist items that have been completed

@github-actions

github-actions Bot commented May 6, 2026

Review Summary

Assessment: Request Changes

This PR adds a use_native_token_count configuration option to 5 model providers, allowing users to skip native token-counting API calls in favor of the local heuristic estimator. The feature is straightforward and tests are provided for each provider.

Key Areas to Address
  • API Bar Raising: This modifies the public constructor API of all major model providers. Per project guidelines, it needs the needs-api-review label and proper documentation in the PR description (use cases, examples, API signatures).
  • DRY / Architecture: The same field definition and 2-line check is duplicated across 5 files. This belongs in BaseModelConfig and the base Model class to ensure consistency for current and future providers.
  • Documentation Accuracy: The docstring claims "True (default)" but no default is set — the behavior relies on None is not False. This should be clarified.
  • Naming: Consider whether use_native_token_count clearly communicates intent (it reads as a noun, not a verb/action).
  • PR Description: Template is unfilled — no motivation, use cases, or linked issues.

The implementation is functional and well-tested; the main concerns are around API design process and code organization.
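
The DRY and default-value points in the review can be sketched as follows. This illustrates the reviewer's suggestion, not the merged code (the description states the base Model class is unaffected). `BaseSketchModel`, `WordCountSketchModel`, `native_counting_enabled`, and the private hook names are hypothetical.

```python
from typing import Any, Dict


def native_counting_enabled(config: Dict[str, Any]) -> bool:
    # An unset key yields None, and `None is not False` is True, so
    # native counting stays on unless the caller explicitly passes False.
    return config.get("use_native_token_count") is not False


class BaseSketchModel:
    """Hypothetical base class that owns the opt-out check once."""

    def __init__(self, **config: Any) -> None:
        self.config: Dict[str, Any] = dict(config)

    def _estimate_tokens(self, text: str) -> int:
        # Rough heuristic: about four characters per token.
        return max(1, len(text) // 4)

    def _count_tokens_native(self, text: str) -> int:
        # Providers with a native API would override this hook.
        raise NotImplementedError

    def count_tokens(self, text: str) -> int:
        if native_counting_enabled(self.config):
            try:
                return self._count_tokens_native(text)
            except Exception:
                pass  # fall through to the local estimator
        return self._estimate_tokens(text)


class WordCountSketchModel(BaseSketchModel):
    """Hypothetical provider; word count stands in for a remote call."""

    def _count_tokens_native(self, text: str) -> int:
        return len(text.split())
```

In this arrangement a new provider only implements `_count_tokens_native`, and the flag check cannot drift between providers. The helper also shows why a docstring claiming "True (default)" can hold without an explicit default: a missing key and an explicit `True` behave identically.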

@opieter-aws opieter-aws marked this pull request as ready for review May 6, 2026 16:09
@opieter-aws opieter-aws enabled auto-merge (squash) May 6, 2026 19:28
@opieter-aws opieter-aws merged commit 800e7c4 into strands-agents:main May 6, 2026
36 of 38 checks passed