What do you want to do?
Tell us about your request. Provide a summary of the request.
The documentation should be updated to include the new token_limit feature proposed in issue #4728.
Add a new paragraph under the existing Tracking token usage section explaining the token_limit parameter and its behavior.
The documentation should clarify that token limiting works through an over-exhaustion mechanism: token usage is validated after LLM calls complete, by comparing the cumulative token usage against the configured token_limit. Execution stops once the accumulated usage exceeds that threshold. Because the check runs after a call has finished, the final total can be higher than token_limit.
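To make the over-exhaustion behavior concrete for the docs, here is a minimal Python sketch of a check that runs after a call completes. It is not the ml-commons implementation: `TokenBudget`, `TokenLimitExceeded`, and `run` are hypothetical names, and the per-call token counts are made-up inputs.

```python
class TokenLimitExceeded(Exception):
    """Hypothetical error raised once cumulative usage passes the limit."""


class TokenBudget:
    """Hypothetical tracker; only the idea of a token_limit comes from the issue."""

    def __init__(self, token_limit):
        self.token_limit = token_limit
        self.used = 0

    def record(self, tokens):
        # Called only after an LLM call has completed, so its cost is already spent.
        self.used += tokens
        if self.used > self.token_limit:
            raise TokenLimitExceeded(
                f"used {self.used} tokens, limit is {self.token_limit}"
            )


def run(call_costs, token_limit):
    """Simulate a sequence of LLM calls; return (completed calls, tokens used)."""
    budget = TokenBudget(token_limit)
    completed = 0
    try:
        for cost in call_costs:
            completed += 1       # the call itself always finishes
            budget.record(cost)  # validation happens afterwards
    except TokenLimitExceeded:
        pass                     # execution stops; no further calls are made
    return completed, budget.used
```

With a limit of 100 and four calls of 40 tokens each, `run([40, 40, 40, 40], 100)` stops after the third call with 120 tokens used: the call that crosses the threshold still completes, and the fourth call is never made.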
Also mention that token limit support has been added for:
Plan-Execute-Reflect (PRE) agents
V1 agents
V2 agents
The explanation should make it clear that token usage aggregation includes all LLM calls performed during execution, including planning, execution, reflection, and subagent calls.
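The aggregation described above could be illustrated in the docs with a sketch along these lines. The phase labels mirror the ones listed in this request; the function names and the `(phase, tokens)` record shape are assumptions made for illustration, not the ml-commons data model.

```python
from collections import Counter


def aggregate_usage(calls):
    """Sum token usage over (phase, tokens) records from every LLM call.

    Returns a per-phase breakdown and the overall total.
    """
    per_phase = Counter()
    for phase, tokens in calls:
        per_phase[phase] += tokens
    return dict(per_phase), sum(per_phase.values())


def exceeds_limit(calls, token_limit):
    """True when the total across all phases is above the limit."""
    _, total = aggregate_usage(calls)
    return total > token_limit


# Made-up usage numbers for one agent run.
calls = [
    ("planning", 30),
    ("execution", 50),
    ("reflection", 20),
    ("subagent", 25),
]
per_phase, total = aggregate_usage(calls)
```

No single phase here is above a limit of 100, but the combined total of 125 is, which is the point the documentation should convey: the limit applies to the whole run, including subagent calls.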
Version: List the OpenSearch version to which this issue applies, e.g. 2.14, 2.12--2.14, or all.
Upcoming 3.7
What other resources are available? Provide links to related issues, POCs, steps for testing, etc.
Issue: opensearch-project/ml-commons#4728
PR: opensearch-project/ml-commons#4820