feat!: Track step_count, token_usage and tool_call_counts in Agent's State#11427
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub. 1 Skipped Deployment
|
Coverage reportClick to see where and how coverage changed
This report was generated by python-coverage-comment-action |
||||||||||||||||||||||||||||||||||||
julian-risch
left a comment
There was a problem hiding this comment.
Looks quite good to me already. Minor comments.
| A dictionary with the following keys: | ||
| - "messages": List of all messages exchanged during the LLM's run. | ||
| - "last_message": The last message exchanged during the LLM's run. | ||
| - "token_usage": Token usage from the LLM call (e.g. prompt_tokens, completion_tokens). Empty if the |
There was a problem hiding this comment.
What about adding this to the release note under enhancements? I suggest we do that.
| A dictionary with the following keys: | ||
| - "messages": List of all messages exchanged during the agent's run. | ||
| - "last_message": The last message exchanged during the agent's run. | ||
| - "step_count": The number of steps the agent ran. |
There was a problem hiding this comment.
I thought a bit about the naming of step_count and whether it's intuitive enough to users.
llm_call_count or iteration_count came to mind as alternatives but step_count seems best to me as long as we explain in documentation what we refer to as one step.
| "http_client_kwargs": None, | ||
| }, | ||
| }, | ||
| "tools": [ |
There was a problem hiding this comment.
The changes here in test_from_dict_state_schema_none seem unrelated?
There was a problem hiding this comment.
yes that's true, I did this to help reduce line count in the tests. The tools are unnecessary for what the unit test is claiming to test and so it's wasted lines.
| agent.run([ChatMessage.from_user("Hello")]) | ||
| result = agent.run([ChatMessage.from_user("Hello")]) | ||
| assert "Agent reached maximum agent steps" in caplog.text | ||
| assert result["step_count"] == 0 |
There was a problem hiding this comment.
When max_steps is exceeded the loop returns before incrementing the counter. Should we maybe increase the step_count before exiting to count the attempted steps?
There was a problem hiding this comment.
Actually in this case the chat generator nor any tools where called, the whole loop is skipped since the max run steps is set to 0. If it were set to 1 then step_count would also be returned as 1.
There was a problem hiding this comment.
Ah, that's great. Thanks for the explanation!
| **kwargs, | ||
| ) | ||
| # Inherited Agent-internal bookkeeping that isn't useful at the LLM surface. | ||
| result.pop("step_count", None) |
There was a problem hiding this comment.
If we add another key to the Agent's _INTERNAL_STATE_KEYS, we need to remember to also pop it here. Just a comment. No change request.
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
Related Issues
Proposed Changes:
Automatically add three new keys into Agent's
state_schema:step_count,token_usageandtool_call_counts. These are managed by the Agent within State and are exposed as outputs from the Agent.Besides this information being generally useful for downstream consumption, this is part of a larger goal to add callbacks to the Agent loop to allow tools to be triggered by the current status of the State object. See the
Condition paramsection here https://github.com/deepset-ai/haystack-private/issues/329#issuecomment-4327141039 for more details on this idea.How did you test it?
Added new tests
Notes for the reviewer
This is partially a breaking change because we now raise an error if a user tries to add a state_schema key that we are now reserving. I don't imagine this will cause many users problems, but I've added an upgrade section to the reno and added an entry to the Migration guide.
Checklist
fix:,feat:,build:,chore:,ci:,docs:,style:,refactor:,perf:,test:and added!in case the PR includes breaking changes.