LLM class makes duplicate API calls — wastes tokens and increases latency #1137

@MervinPraison

Description
Summary

The LLM class has a known issue documented directly in the code at praisonaiagents/llm/llm.py:73-74:

# TODO: Include in-build tool calling in LLM class
# TODO: Restructure so that duplicate calls are not made (Sync with agent.py)

Impact

  • Direct cost impact: duplicate LLM API calls double token usage and billing
  • Latency: each duplicate call adds a full round trip to the LLM provider
  • Rate limiting: duplicate calls double the chance of hitting provider rate limits
  • Scope: this affects every agent interaction, since the LLM class is on the core execution path

Context

The issue appears to stem from the interaction between llm.py and agent.py — the agent layer and LLM layer are not properly synchronized, causing the same request to be sent twice.

Suggested Fix

  1. Audit the call path from agent.py → llm.py to identify where the duplication occurs
  2. Restructure so tool calling is handled in a single pass
  3. Add request deduplication or caching for identical prompts within the same turn
  4. Add telemetry to track and alert on duplicate calls
  5. Add a test that verifies exactly N API calls are made for an N-turn conversation
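Steps 3–5 could be prototyped together with a small per-turn cache that also counts outbound calls. This is a minimal sketch only: `TurnCache`, `call_provider`, and all names below are hypothetical illustrations, not the actual praisonaiagents API, and the real fix would live inside llm.py's call path.

```python
# Hypothetical sketch of per-turn request deduplication (step 3) with a
# built-in call counter usable for telemetry and tests (steps 4-5).
# `call_provider` stands in for whatever function performs the real API call.
import hashlib

class TurnCache:
    """Caches LLM responses for identical (model, prompt) pairs within one turn."""

    def __init__(self, call_provider):
        self._call_provider = call_provider
        self._cache = {}
        self.api_calls = 0  # telemetry: number of real provider calls made

    def _key(self, prompt: str, model: str) -> str:
        # Hash model + prompt so the key stays small for long prompts.
        return hashlib.sha256(f"{model}\x00{prompt}".encode()).hexdigest()

    def complete(self, prompt: str, model: str = "default") -> str:
        key = self._key(prompt, model)
        if key not in self._cache:
            self.api_calls += 1
            self._cache[key] = self._call_provider(prompt, model)
        return self._cache[key]

    def new_turn(self):
        """Clear the cache so a later turn can legitimately re-ask the same prompt."""
        self._cache.clear()
```

A test along the lines of step 5 would then inject a fake provider, issue the same prompt twice in one turn, and assert `api_calls == 1`, then start a new turn and assert the counter advances to exactly 2.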
