[DOC] Agent-level Token Limit parameter #12430

@GugaGlonti

What do you want to do?

  • Request a change to existing documentation
  • Add new documentation
  • Report a technical problem with the documentation
  • Other

Tell us about your request. Provide a summary of the request.
The documentation should be updated to include the new token_limit feature proposed in issue #4728.

Add a new paragraph under the existing Tracking token usage section explaining the token_limit parameter and its behavior.

The documentation should clarify that token limiting works through an over-exhaustion mechanism: token usage is validated after LLM calls complete, by checking the cumulative token usage against the configured token_limit, and execution stops once that total exceeds the limit. Because the check runs after the call, the call that crosses the threshold still completes, so the final usage can end up above token_limit.
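
To make the over-exhaustion behavior concrete for the new paragraph, here is a minimal Python sketch. It is not the ml-commons implementation: `TokenBudget`, `record`, `TokenLimitExceeded`, and `run_agent` are hypothetical names, and only `token_limit` comes from the proposal. It assumes the check runs after each call's usage is added to the running total.

```python
class TokenLimitExceeded(RuntimeError):
    """Hypothetical error raised once cumulative usage passes the limit."""


class TokenBudget:
    """Hypothetical tracker illustrating a post-call (over-exhaustion) check."""

    def __init__(self, token_limit):
        self.token_limit = token_limit
        self.used = 0

    def record(self, tokens):
        # Usage is only known after the LLM call has completed, so add it first.
        self.used += tokens
        # Then compare the cumulative total against the configured limit.
        if self.used > self.token_limit:
            raise TokenLimitExceeded(
                f"used {self.used} tokens, limit is {self.token_limit}"
            )


def run_agent(call_costs, token_limit):
    """Simulate an agent loop; call_costs stands in for per-call token usage."""
    budget = TokenBudget(token_limit)
    completed = 0
    try:
        for cost in call_costs:
            completed += 1       # the call itself always runs to completion
            budget.record(cost)  # validation happens afterwards
    except TokenLimitExceeded:
        pass                     # execution stops; no further calls are made
    return completed, budget.used
```

With a limit of 1000 and calls costing 400 tokens each, `run_agent([400, 400, 400, 400], 1000)` completes three calls and ends at 1200 tokens: the third call crosses the threshold, and the fourth is never made.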

Also mention that token limit support has been added for:

  • Plan-Execute-Reflect (PER) agents
  • V1 agents
  • V2 agents

The explanation should make it clear that token usage aggregation includes all LLM calls performed during execution, including planning, execution, reflection, and subagent calls.
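
The aggregation rule could be illustrated along these lines. This is a sketch under stated assumptions: `UsageNode` and its fields are hypothetical, the token counts are made up for illustration, and only the phase names (planning, execution, reflection) and the inclusion of subagent calls come from this request.

```python
from dataclasses import dataclass, field


@dataclass
class UsageNode:
    """Hypothetical record of the token usage of one agent run."""
    name: str
    calls: dict = field(default_factory=dict)      # phase name -> tokens used
    subagents: list = field(default_factory=list)  # nested UsageNode objects

    def total(self):
        # Sum this agent's own LLM calls across all phases...
        own = sum(self.calls.values())
        # ...and add everything its subagents consumed, recursively.
        return own + sum(child.total() for child in self.subagents)


# Illustrative numbers only.
researcher = UsageNode("researcher", {"execution": 250})
parent = UsageNode(
    "planner",
    {"planning": 300, "execution": 500, "reflection": 150},
    [researcher],
)
print(parent.total())  # 1200
```

The value compared against token_limit would then be the parent's aggregated total rather than the usage of any single phase or subagent.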

Version: List the OpenSearch version to which this issue applies, e.g. 2.14, 2.12--2.14, or all.
Upcoming 3.7

What other resources are available? Provide links to related issues, POCs, steps for testing, etc.
Issue: opensearch-project/ml-commons#4728
PR: opensearch-project/ml-commons#4820

Metadata

Labels

Backlog - DEV: Developer assigned to issue is responsible for creating PR.
v3.7.0
