
fix: handle JSON parsing for Gemini self-reflection#800

Merged
MervinPraison merged 2 commits into main from claude/issue-796-20250710_222402
Jul 10, 2025

Conversation

Owner

@MervinPraison commented Jul 10, 2025

User description

Fixes #796

Problem

When using Gemini models with self-reflection enabled, streaming responses were causing incomplete JSON chunks to be parsed, resulting in the error "Expecting property name enclosed in double quotes".
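The failure is reproducible with the standard library alone: a streaming chunk that cuts a JSON object off after a comma triggers exactly this message. (A minimal sketch; the payload below is illustrative, not an actual Gemini response.)

```python
import json

# A complete reflection payload parses fine:
complete = '{"reflection": "Looks good", "satisfactory": "yes"}'
assert json.loads(complete)["satisfactory"] == "yes"

# A streaming chunk cut off after a comma reproduces the reported error:
partial = '{"reflection": "Looks good",'
try:
    json.loads(partial)
except json.JSONDecodeError as e:
    print(e.msg)  # Expecting property name enclosed in double quotes
```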

Solution

  • Disable streaming for custom LLMs during self-reflection to avoid incomplete JSON chunks
  • Add manual JSON parsing for custom LLM reflection responses
  • Use clean_json_output method to handle markdown-wrapped JSON
  • Ensure reflection regeneration also uses non-streaming for custom LLMs

This fix ensures that Gemini and other custom LLM providers can use self-reflection without encountering JSON parsing errors.
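The steps above can be sketched end to end. The clean_json_output stub here is a local stand-in mirroring the fence-stripping helper this PR reuses, and the sample payload is illustrative:

```python
import json

def clean_json_output(text: str) -> str:
    """Strip markdown code fences so json.loads sees bare JSON
    (a local stand-in for the helper method this PR reuses)."""
    cleaned = text.strip()
    if cleaned.startswith("```json"):
        cleaned = cleaned[len("```json"):].strip()
    elif cleaned.startswith("```"):
        cleaned = cleaned[3:].strip()
    if cleaned.endswith("```"):
        cleaned = cleaned[:-3].strip()
    return cleaned

# With stream=False the whole reflection payload arrives at once, so even a
# markdown-wrapped response can be cleaned and parsed in one step.
raw = '```json\n{"reflection": "Answer is correct", "satisfactory": "yes"}\n```'
reflection_data = json.loads(clean_json_output(raw))
print(reflection_data["satisfactory"])  # yes
```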

Generated with Claude Code


PR Type

Bug fix


Description

  • Fix JSON parsing errors for Gemini self-reflection

  • Disable streaming for custom LLMs during reflection

  • Add manual JSON parsing for custom LLM responses

  • Ensure reflection regeneration uses non-streaming for custom LLMs


Changes diagram

flowchart LR
  A["Custom LLM Request"] --> B["Disable Streaming"]
  B --> C["Get Complete Response"]
  C --> D["Clean JSON Output"]
  D --> E["Manual JSON Parsing"]
  E --> F["Create Reflection Object"]

Changes walkthrough 📝

Relevant files

Bug fix
agent.py: Implement custom LLM self-reflection with JSON parsing
src/praisonai-agents/praisonaiagents/agent/agent.py (+36/-21)
  • Replace custom LLM self-reflection skip with proper JSON handling
  • Add manual JSON parsing for custom LLM reflection responses
  • Use clean_json_output method to handle markdown-wrapped JSON
  • Disable streaming for custom LLMs during reflection regeneration

Tests
test_gemini_fix.py: Add Gemini self-reflection test script
src/praisonai-agents/test_gemini_fix.py (+32/-0)
  • Create test script for Gemini self-reflection functionality
  • Configure agent with Gemini model and self-reflection enabled
  • Test simple prompt to verify JSON parsing fix

Summary by CodeRabbit

    • New Features

      • Added a test script to verify improved JSON parsing for Gemini model self-reflection.
    • Bug Fixes

      • Enhanced self-reflection support for custom LLM providers by ensuring reflection output is properly parsed and handled, even when not using OpenAI.
    • Chores

      • Improved error handling and output visibility in the new test script for easier debugging.

    - Disable streaming for custom LLMs during self-reflection to avoid incomplete JSON chunks
    - Add manual JSON parsing for custom LLM reflection responses
    - Use clean_json_output method to handle markdown-wrapped JSON
    - Ensure reflection regeneration also uses non-streaming for custom LLMs
    
    This fixes the "Expecting property name enclosed in double quotes" error
    when using Gemini models with self-reflection enabled.
    
    Fixes #796
    
    Co-authored-by: Mervin Praison <MervinPraison@users.noreply.github.com>

    coderabbitai bot commented Jul 10, 2025

    Caution

    Review failed

    The pull request is closed.

    Walkthrough

    The chat method in the Agent class has been updated to improve self-reflection handling for custom LLM providers, particularly when OpenAI is not used. Instead of skipping reflection, the method now manually parses and cleans JSON responses from custom LLMs. Additionally, a new test script demonstrates this fix with the Gemini model.

    Changes

    File(s) | Change Summary
    src/praisonai-agents/praisonaiagents/agent/agent.py | Updated self-reflection logic in chat to handle custom LLMs: manual JSON parsing, explicit streaming control.
    src/praisonai-agents/gemini_self_reflection.py | Added a new test script to verify Gemini JSON parsing and improved self-reflection handling.

    Sequence Diagram(s)

    sequenceDiagram
        participant User
        participant Agent
        participant LLM (Custom/OpenAI)
    
        User->>Agent: chat(prompt, reflection enabled)
        Agent->>LLM (Custom): send prompt (stream=False for reflection)
        LLM (Custom)-->>Agent: response (possibly malformed JSON)
        Agent->>Agent: clean and parse JSON manually
        Agent->>User: return structured reflection output
    

    Assessment against linked issues

    Objective (Addressed / Explanation)
    • Fix JSON parsing error with Gemini LLM: ensure agent handles malformed/non-JSON responses correctly (#796)
    • Adjust agent to handle plain text or malformed JSON from Gemini without decode errors (#796)
    • Do not attempt JSON parsing if LLM response is not valid JSON, consistent with "response_format": "text" (#796)

    Possibly related PRs

    Poem

    A rabbit hopped through JSON streams,
    Parsing dreams with clever schemes.
    Gemini’s quirks now understood,
    Reflection flows just as it should.
    With custom LLMs, no more fright—
    The answers now come out just right!
    🐇✨


    📜 Recent review details

    Configuration used: CodeRabbit UI
    Review profile: CHILL
    Plan: Pro

    📥 Commits

    Reviewing files that changed from the base of the PR and between 1869e72 and fc20e4c.

    📒 Files selected for processing (1)
    • src/praisonai-agents/gemini_self_reflection.py (1 hunks)

    @MervinPraison
    Owner Author

    @claude review this solution again: is this the better fix, with backward compatibility preserved and no features removed?


    @gemini-code-assist bot left a comment


    Summary of Changes

    Hello @MervinPraison, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

    This pull request addresses a critical issue where custom LLMs, particularly Gemini, encountered JSON parsing errors during self-reflection due to incomplete streaming responses. My changes implement a robust solution by disabling streaming for these models during reflection, performing manual JSON parsing, and ensuring proper handling of markdown-wrapped JSON outputs. This significantly improves the reliability of self-reflection for non-OpenAI models.

    Highlights

    • Custom LLM Reflection Handling: I've modified the self-reflection logic to specifically handle custom LLMs (like Gemini) by disabling streaming (stream=False) for reflection requests. This ensures complete JSON responses are received, preventing parsing errors that occur with partial streams.
    • Manual JSON Parsing and Cleaning: For custom LLM reflection responses, I've implemented manual JSON parsing. This includes calling self.clean_json_output to correctly extract JSON from potentially markdown-wrapped text and then using json.loads() to parse it. A CustomReflectionOutput class was introduced to map the parsed data to the expected reflection object structure.
    • Reflection Regeneration Consistency: I've ensured that the reflection regeneration process also forces non-streaming for custom LLMs. This maintains consistency across all reflection attempts and prevents similar JSON parsing issues in subsequent iterations.
    • New Test Case: A new dedicated test script (test_gemini_fix.py) has been added to specifically verify the fix for Gemini models when using self-reflection, confirming the resolution of the JSON parsing problem.

    @qodo-code-review

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
    🧪 PR contains tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Code Duplication

    The CustomReflectionOutput class is defined inline within the method, which could lead to code duplication if similar functionality is needed elsewhere. Consider extracting this to a module-level class or using a more generic approach.

    class CustomReflectionOutput:
        def __init__(self, data):
            self.reflection = data.get('reflection', '')
            self.satisfactory = data.get('satisfactory', 'no').lower()
    
    reflection_output = CustomReflectionOutput(reflection_data)
    Error Handling

    The JSON parsing for custom LLMs lacks proper error handling. If json.loads fails or clean_json_output returns invalid JSON, the exception will propagate without context about the reflection process.

    reflection_data = json.loads(cleaned_json)
    Test Coverage

    The test script only covers the happy path scenario. It should include edge cases like malformed JSON responses, network failures, or invalid reflection outputs to ensure robustness.

    try:
        response = agent.start("What is 2+2? Explain briefly.")
        print(f"\nFinal response: {response}")


    claude bot commented Jul 10, 2025

    Claude finished @MervinPraison's task.


    Code Review for Gemini Self-Reflection Fix

    Todo List:

    • Read and analyze the changed files
    • Assess backward compatibility
    • Check if any features were removed
    • Review the implementation approach
    • Provide comprehensive feedback

    Summary

    Yes, this is a better fix that maintains backward compatibility and doesn't remove any features. The solution properly addresses the JSON parsing issues while preserving existing functionality for both OpenAI and custom LLM providers.

    Detailed Review

    ✅ Strengths of the Solution:

    1. Backward Compatibility Preserved

      • The fix conditionally handles custom LLMs (like Gemini) without affecting OpenAI models
      • OpenAI models continue to use structured output parsing (response_format=ReflectionOutput)
      • Custom LLMs get special handling to avoid streaming-related JSON parsing errors
      • No features are removed; all existing functionality remains intact
    2. Root Cause Properly Addressed

      • Disabling streaming (stream=False) for custom LLMs during reflection ensures complete JSON responses
      • This prevents the "Expecting property name enclosed in double quotes" error caused by incomplete JSON chunks
    3. Robust JSON Handling

      • Uses clean_json_output method to extract JSON from markdown-wrapped responses
      • Manual JSON parsing with json.loads() provides better control for custom LLM responses
      • CustomReflectionOutput class mimics the expected structure, ensuring compatibility
    4. Comprehensive Fix

      • Handles both initial reflection (line 1281) and reflection regeneration (line 1354-1355)
      • Error handling remains in place with appropriate fallback behavior

    ⚠️ Minor Suggestions for Improvement:

    1. Error Context Enhancement (src/praisonai-agents/praisonaiagents/agent/agent.py:1292)

      • Consider adding more specific error handling for JSON parsing:
      try:
          reflection_data = json.loads(cleaned_json)
      except json.JSONDecodeError as e:
          logging.error(f"Failed to parse reflection JSON: {e}")
          logging.debug(f"Raw response: {reflection_text}")
          logging.debug(f"Cleaned JSON: {cleaned_json}")
          raise Exception(f"Invalid JSON in reflection response: {e}")
    2. Type Safety (src/praisonai-agents/praisonaiagents/agent/agent.py:1295-1300)

      • The inline CustomReflectionOutput class could be moved to module level or use a TypedDict for better maintainability
    3. Test Coverage (src/praisonai-agents/test_gemini_fix.py)

      • The test script is good for manual verification but consider adding automated unit tests
      • Test edge cases like malformed JSON responses or network failures

    ✅ Key Benefits:

    • Fixes the immediate problem without breaking existing functionality
    • Maintains separation between OpenAI and custom LLM handling
    • Preserves all features while adding necessary fixes

    🎯 Conclusion:
    This is a well-thought-out fix that successfully addresses the JSON parsing issue for Gemini and other custom LLMs during self-reflection. The solution is backward compatible, doesn't remove any features, and properly handles the streaming-related parsing problems. The implementation is clean and follows the existing code patterns.



    qodo-code-review bot commented Jul 10, 2025

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    Category | Suggestion | Impact
    Possible issue
    Add JSON parsing error handling

    Add error handling around JSON parsing to prevent crashes when the cleaned JSON
    is still malformed. This is critical since custom LLMs may return unpredictable
    formats even after cleaning.

    src/praisonai-agents/praisonaiagents/agent/agent.py [1291-1292]

     # Parse the JSON manually
    -reflection_data = json.loads(cleaned_json)
    +try:
    +    reflection_data = json.loads(cleaned_json)
    +except json.JSONDecodeError as e:
    +    logging.error(f"Failed to parse reflection JSON: {e}")
    +    raise Exception(f"Invalid JSON format in reflection response: {cleaned_json}")
    Suggestion importance[1-10]: 8


    Why: The suggestion correctly identifies that json.loads can fail if the LLM output is malformed despite cleaning, and adding a try-except block is a critical improvement for robustness.

    Medium
    General
    Validate satisfactory field values

    Validate that satisfactory contains only 'yes' or 'no' values to prevent logic
    errors in reflection processing. Invalid values could cause the reflection loop
    to behave unexpectedly.

    src/praisonai-agents/praisonaiagents/agent/agent.py [1295-1298]

     class CustomReflectionOutput:
         def __init__(self, data):
             self.reflection = data.get('reflection', '')
    -        self.satisfactory = data.get('satisfactory', 'no').lower()
    +        satisfactory_raw = data.get('satisfactory', 'no').lower()
    +        self.satisfactory = satisfactory_raw if satisfactory_raw in ['yes', 'no'] else 'no'
    Suggestion importance[1-10]: 6


    Why: This suggestion improves data sanitization by ensuring the satisfactory attribute is always one of the expected values ('yes' or 'no'), making the reflection logic more robust against unexpected LLM outputs.

    Low


    @gemini-code-assist bot left a comment


    Code Review

    This pull request effectively addresses a JSON parsing error for Gemini models with self-reflection by disabling streaming and implementing manual parsing. The core logic changes in agent.py are sound, and the addition of a test script in test_gemini_fix.py is a great step towards ensuring the fix is robust. My review includes a few suggestions to enhance maintainability and adhere to Python best practices, such as using more specific exception types, reusing existing data structures for consistency, and properly structuring the new test script.

    reflection_response = self._chat_completion(messages, temperature=temperature, tools=None, stream=False, reasoning_steps=False)

    if not reflection_response or not reflection_response.choices:
        raise Exception("No response from reflection request")

    medium

    Using the generic Exception class is generally discouraged as it can catch unexpected errors and make debugging more difficult. It's better to use a more specific exception type to allow for more granular error handling. In this case, ValueError would be more appropriate, as the error condition relates to an invalid or missing response from the reflection request.

    Suggested change
    -raise Exception("No response from reflection request")
    +raise ValueError("No response from reflection request")

    Comment on lines +1294 to +1300

    # Create a reflection output object manually
    class CustomReflectionOutput:
        def __init__(self, data):
            self.reflection = data.get('reflection', '')
            self.satisfactory = data.get('satisfactory', 'no').lower()

    reflection_output = CustomReflectionOutput(reflection_data)

    medium

    Defining a class inside a method is generally not recommended as it can be inefficient (the class is redefined on every call) and can harm readability.

    For better consistency and maintainability, I suggest using the existing ReflectionOutput Pydantic model, which is already used for the OpenAI-specific path. This ensures a consistent data structure for reflection output across all LLM providers and leverages Pydantic's validation capabilities.

    Suggested change
    -# Create a reflection output object manually
    -class CustomReflectionOutput:
    -    def __init__(self, data):
    -        self.reflection = data.get('reflection', '')
    -        self.satisfactory = data.get('satisfactory', 'no').lower()
    -reflection_output = CustomReflectionOutput(reflection_data)
    +# Create a reflection output object using the existing ReflectionOutput model
    +# for consistency and validation.
    +reflection_data['satisfactory'] = reflection_data.get('satisfactory', 'no').lower()
    +if 'reflection' not in reflection_data:
    +    reflection_data['reflection'] = ''
    +reflection_output = ReflectionOutput.model_validate(reflection_data)

    Comment on lines +23 to +32
    # Test with a simple prompt
    print("Testing Gemini with self-reflection...")
    try:
    response = agent.start("What is 2+2? Explain briefly.")
    print(f"\nFinal response: {response}")
    print("\nTest completed successfully!")
    except Exception as e:
    print(f"\nError occurred: {e}")
    import traceback
    traceback.print_exc() No newline at end of file

    medium

    It's a Python best practice to wrap script execution logic in a if __name__ == '__main__': block. This prevents the code from running automatically if the script is imported into another module, making it more reusable and aligning with standard Python script structure.

    Suggested change
    # Test with a simple prompt
    print("Testing Gemini with self-reflection...")
    try:
    response = agent.start("What is 2+2? Explain briefly.")
    print(f"\nFinal response: {response}")
    print("\nTest completed successfully!")
    except Exception as e:
    print(f"\nError occurred: {e}")
    import traceback
    traceback.print_exc()
    if __name__ == "__main__":
    # Test with a simple prompt
    print("Testing Gemini with self-reflection...")
    try:
    response = agent.start("What is 2+2? Explain briefly.")
    print(f"\nFinal response: {response}")
    print("\nTest completed successfully!")
    except Exception as e:
    print(f"\nError occurred: {e}")
    import traceback
    traceback.print_exc()

    cursor[bot]

    This comment was marked as outdated.


    @coderabbitai bot left a comment


    Actionable comments posted: 2

    🔭 Outside diff range comments (1)
    src/praisonai-agents/praisonaiagents/agent/agent.py (1)

    1386-1396: Avoid duplicating the JSON‐cleanup logic

    The clean_json_output method is already implemented in praisonaiagents.agents.agents. Please remove the local duplicate in agent.py and import the existing one:

    • File: src/praisonai-agents/praisonaiagents/agent/agent.py
    – Delete the clean_json_output definition at lines 1386–1396
    – At the top of the file, add:
      from praisonaiagents.agents.agents import clean_json_output
    – Replace any calls to self.clean_json_output(...) with clean_json_output(...)

    Example diff:

    --- a/src/praisonai-agents/praisonaiagents/agent/agent.py
    +++ b/src/praisonai-agents/praisonaiagents/agent/agent.py
    @@ -1,6 +1,7 @@
     from praisonaiagents.knowledge import Knowledge
    +from praisonaiagents.agents.agents import clean_json_output
    
     class YourAgent:
    @@ -1383,14 +1383,6 @@ class YourAgent:
         def some_method(self, reflection_text: str):
    -    def clean_json_output(self, output: str) -> str:
    -        """Clean and extract JSON from response text."""
    -        cleaned = output.strip()
    -        if cleaned.startswith("```json"):
    -            cleaned = cleaned[len("```json"):].strip()
    -        if cleaned.startswith("```"):
    -            cleaned = cleaned[len("```"):].strip()
    -        if cleaned.endswith("```"):
    -            cleaned = cleaned[:-3].strip()
    -        return cleaned  
     
             # before: cleaned_json = self.clean_json_output(reflection_text)
    -        cleaned_json = self.clean_json_output(reflection_text)
    +        cleaned_json = clean_json_output(reflection_text)
    📜 Review details

    Configuration used: CodeRabbit UI
    Review profile: CHILL
    Plan: Pro

    📥 Commits

    Reviewing files that changed from the base of the PR and between b71009b and 1869e72.

    📒 Files selected for processing (2)
    • src/praisonai-agents/praisonaiagents/agent/agent.py (2 hunks)
    • src/praisonai-agents/test_gemini_fix.py (1 hunks)
    🧬 Code Graph Analysis (1)
    src/praisonai-agents/praisonaiagents/agent/agent.py (4)
    src/praisonai-agents/praisonaiagents/agents/agents.py (1)
    • clean_json_output (253-261)
    src/praisonai-agents/praisonaiagents/main.py (2)
    • json (390-393)
    • ReflectionOutput (375-377)
    src/praisonai-agents/praisonaiagents/llm/openai_client.py (1)
    • sync_client (244-246)
    src/praisonai-agents/praisonaiagents/llm/llm.py (1)
    • response (1945-2029)
    ⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
    • GitHub Check: Cursor BugBot
    • GitHub Check: test-core (3.11)
    • GitHub Check: quick-test
    🔇 Additional comments (1)
    src/praisonai-agents/praisonaiagents/agent/agent.py (1)

    1354-1355: Good fix for ensuring complete reflection responses.

    Disabling streaming for custom LLMs during reflection regeneration is the correct approach to prevent incomplete JSON chunks. The conditional logic properly preserves the original streaming behavior for non-custom LLMs.

    Comment on lines +1277 to +1300
    # Check if we're using a custom LLM (like Gemini)
    if self._using_custom_llm or self._openai_client is None:
        # For custom LLMs, we need to handle reflection differently
        # Use non-streaming to get complete JSON response
        reflection_response = self._chat_completion(messages, temperature=temperature, tools=None, stream=False, reasoning_steps=False)

        if not reflection_response or not reflection_response.choices:
            raise Exception("No response from reflection request")

        reflection_text = reflection_response.choices[0].message.content.strip()

        # Clean the JSON output
        cleaned_json = self.clean_json_output(reflection_text)

        # Parse the JSON manually
        reflection_data = json.loads(cleaned_json)

        # Create a reflection output object manually
        class CustomReflectionOutput:
            def __init__(self, data):
                self.reflection = data.get('reflection', '')
                self.satisfactory = data.get('satisfactory', 'no').lower()

        reflection_output = CustomReflectionOutput(reflection_data)
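The diff above leans on `clean_json_output` to strip the markdown fences that LLMs often wrap around JSON; its body is not shown in this thread (only its location, `agents.py` lines 253-261). As a rough illustration of what such a helper does, here is a hypothetical standalone sketch, not the actual method:

```python
import json

def clean_json_output(text: str) -> str:
    """Strip markdown code fences (```json ... ```) that LLMs often wrap around JSON.

    Hypothetical sketch; the real clean_json_output in agents.py may differ.
    """
    cleaned = text.strip()
    if cleaned.startswith("```"):
        lines = cleaned.split("\n")
        # Drop the opening fence line (``` or ```json)
        if lines[0].startswith("```"):
            lines = lines[1:]
        # Drop the closing fence line if present
        if lines and lines[-1].strip() == "```":
            lines = lines[:-1]
        cleaned = "\n".join(lines)
    return cleaned.strip()

raw = '```json\n{"reflection": "Looks good", "satisfactory": "yes"}\n```'
data = json.loads(clean_json_output(raw))
print(data["satisfactory"])  # → yes
```

With the fences removed, `json.loads` can parse the response whether or not the model wrapped it in markdown.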

    🛠️ Refactor suggestion

    Refactor inline class definition and improve error handling.

    The inline CustomReflectionOutput class should be defined at the module level for better maintainability and reusability. Additionally, the JSON parsing should validate required fields and provide more specific error messages.

    Define the class at the module level near other imports:

    class CustomReflectionOutput:
        """Custom reflection output for non-OpenAI LLMs."""
        def __init__(self, data: dict):
            if 'reflection' not in data:
                raise ValueError("Missing required field 'reflection' in reflection response")
            if 'satisfactory' not in data:
                raise ValueError("Missing required field 'satisfactory' in reflection response")
            self.reflection = data['reflection']
            self.satisfactory = data.get('satisfactory', 'no').lower()

    Then update the reflection parsing:

    -# Create a reflection output object manually
    -class CustomReflectionOutput:
    -    def __init__(self, data):
    -        self.reflection = data.get('reflection', '')
    -        self.satisfactory = data.get('satisfactory', 'no').lower()
    -
    -reflection_output = CustomReflectionOutput(reflection_data)
    +try:
    +    reflection_output = CustomReflectionOutput(reflection_data)
    +except ValueError as e:
    +    raise Exception(f"Invalid reflection response format: {e}")
    🤖 Prompt for AI Agents
    In src/praisonai-agents/praisonaiagents/agent/agent.py around lines 1277 to
    1300, move the inline CustomReflectionOutput class definition to the module
    level near other imports for better maintainability. Update the class
    constructor to validate that the required fields 'reflection' and 'satisfactory'
    exist in the input data and raise ValueError with clear messages if missing.
    Then, in the reflection parsing code, instantiate this module-level class and
    handle JSON parsing errors with specific error messages to improve error
    handling.
    

    Comment on lines +1 to +32
    #!/usr/bin/env python3
    """Test script to verify the Gemini JSON parsing fix."""

    from praisonaiagents import Agent

    # Test with minimal configuration to isolate the issue
    llm_config = {
        "model": "gemini/gemini-1.5-flash-latest",
        "temperature": 0.7,
        "max_tokens": 500,
    }

    # Create agent with self-reflection enabled
    agent = Agent(
        instructions="You are a helpful assistant. Be concise and clear.",
        llm=llm_config,
        verbose=True,
        self_reflect=True,
        max_reflect=2,
        min_reflect=1
    )

    # Test with a simple prompt
    print("Testing Gemini with self-reflection...")
    try:
        response = agent.start("What is 2+2? Explain briefly.")
        print(f"\nFinal response: {response}")
        print("\nTest completed successfully!")
    except Exception as e:
        print(f"\nError occurred: {e}")
        import traceback
        traceback.print_exc()

    💡 Verification agent

    🧩 Analysis chain

    Move test to proper directory and add comprehensive test coverage.

    According to the project's testing conventions from the learnings, test files should be placed in the tests/ directory. Additionally, the test lacks assertions to verify that self-reflection actually works correctly with the fix.

    Consider moving this file to tests/test_gemini_reflection_fix.py and enhance it with proper assertions:

    #!/usr/bin/env python3
    """Test script to verify the Gemini JSON parsing fix for self-reflection."""
    
    import json
    import logging
    from unittest.mock import patch, MagicMock
    from praisonaiagents import Agent
    
    # Enable debug logging to capture reflection attempts
    logging.basicConfig(level=logging.DEBUG)
    
    def test_gemini_self_reflection_with_valid_json():
        """Test that Gemini handles self-reflection with properly formatted JSON."""
        llm_config = {
            "model": "gemini/gemini-1.5-flash-latest",
            "temperature": 0.7,
            "max_tokens": 500,
        }
        
        agent = Agent(
            instructions="You are a helpful assistant. Be concise and clear.",
            llm=llm_config,
            verbose=True,
            self_reflect=True,
            max_reflect=2,
            min_reflect=1
        )
        
        # Test with a simple prompt
        response = agent.start("What is 2+2? Explain briefly.")
        
        # Assertions
        assert response is not None, "Agent should return a response"
        assert len(response) > 0, "Response should not be empty"
        assert "4" in response or "four" in response.lower(), "Response should contain the correct answer"
        print(f"✓ Valid JSON test passed: {response}")
    
    def test_gemini_self_reflection_with_malformed_json():
        """Test that Gemini handles malformed JSON in reflection responses gracefully."""
        # This would require mocking the LLM response to return malformed JSON
        # Implementation depends on the internal structure
        pass
    
    def test_gemini_without_self_reflection():
        """Test that Gemini works without self-reflection as a baseline."""
        llm_config = {
            "model": "gemini/gemini-1.5-flash-latest",
            "temperature": 0.7,
            "max_tokens": 500,
        }
        
        agent = Agent(
            instructions="You are a helpful assistant. Be concise and clear.",
            llm=llm_config,
            verbose=True,
            self_reflect=False  # Disabled
        )
        
        response = agent.start("What is 2+2?")
        assert response is not None, "Agent should return a response without self-reflection"
        print(f"✓ No reflection test passed: {response}")
    
    if __name__ == "__main__":
        print("Testing Gemini with self-reflection JSON parsing fix...\n")
        
        try:
            test_gemini_self_reflection_with_valid_json()
            test_gemini_without_self_reflection()
            print("\n✅ All tests completed successfully!")
        except AssertionError as e:
            print(f"\n❌ Test failed: {e}")
            import traceback
            traceback.print_exc()
        except Exception as e:
            print(f"\n❌ Unexpected error: {e}")
            import traceback
            traceback.print_exc()

    🏁 Script executed:

    #!/bin/bash
    # Description: Check if tests directory exists and verify testing conventions
    
    # Check for tests directory
    echo "Checking for tests directory:"
    fd -t d "tests" src/praisonai-agents/ --max-depth 2
    
    # Check existing test files for patterns
    echo -e "\nChecking existing test files:"
    fd -e py "test" src/praisonai-agents/tests/ 2>/dev/null || echo "No test files found in tests/"
    
    # Check if there are other test files in the root
    echo -e "\nChecking for other test files in src/praisonai-agents/:"
    fd -e py "^test_" src/praisonai-agents/ --max-depth 1

    Length of output: 1917


    Move and strengthen Gemini JSON parsing fix tests

    The test_gemini_fix.py file is currently mis-located and lacks coverage for key behaviors. To align with project conventions and ensure robustness:

    • Relocate

    • Move src/praisonai-agents/test_gemini_fix.py to src/praisonai-agents/tests/test_gemini_reflection_fix.py

    • Enhance assertions

    • Verify that agent.start(...) returns a non-empty response containing “4” (or “four”).
    • Assert that self-reflection steps were actually invoked (e.g., by inspecting logged reflection prompts or mocking the LLM).

    • Add edge-case tests

    • Mock the LLM to return malformed JSON during reflection and assert the agent handles it gracefully (no uncaught exceptions, fallback behavior).
    • Include a baseline test with self_reflect=False to confirm normal operation.

    • Example structure (in tests/test_gemini_reflection_fix.py):

    def test_reflection_valid_json():
        # …assert response is correct and self-reflection occurred…
    
    def test_reflection_malformed_json(monkeypatch):
        # Monkeypatch LLM to return invalid JSON…
        # Assert graceful handling
    
    def test_without_self_reflection():
        # self_reflect=False → assert normal response

    By moving the file into tests/ and expanding coverage as above, we’ll both satisfy the project’s testing conventions and validate the Gemini JSON parsing fix under varied scenarios.

    🤖 Prompt for AI Agents
    In src/praisonai-agents/test_gemini_fix.py lines 1 to 32, move the test file to
    src/praisonai-agents/tests/test_gemini_reflection_fix.py to follow project
    conventions. Enhance the tests by adding assertions that verify the
    agent.start(...) returns a non-empty response containing "4" or "four" and
    confirm that self-reflection steps are invoked, for example by inspecting logs
    or mocking the LLM. Add edge-case tests that mock the LLM to return malformed
    JSON during reflection and assert the agent handles it gracefully without
    uncaught exceptions. Also include a baseline test with self_reflect=False to
    confirm normal operation. Structure the new tests as separate functions for
    valid reflection, malformed JSON handling, and no self-reflection scenarios.
    

    @MervinPraison MervinPraison merged commit 016f020 into main Jul 10, 2025
    8 of 10 checks passed
    @MervinPraison
    Owner Author

    @claude again I get this error.

    Implement a fix and create a new PR branch

    code

    from praisonaiagents import AutoAgents
    from langchain_community.utilities.tavily_search import TavilySearchAPIWrapper
    
    def tavily(query: str):
        api_wrapper = TavilySearchAPIWrapper()
        results = api_wrapper.results(query=query, max_results=5)
        return results
        
    # Create AutoAgents instance
    agents = AutoAgents(
        instructions="Search for information about AI Agents",
        tools=[tavily],
        process="sequential",
        verbose=True,
        max_agents=3  # Maximum number of agents to create
    )
    
    # Start the agents
    result = agents.start()
    print(result)
    
    ## or
    
    from praisonaiagents import AutoAgents
    from praisonaiagents.tools import (
        evaluate, solve_equation, convert_units,
        calculate_statistics, calculate_financial
    )
    
    
    def get_stock_price(company_name: str) -> str:
        """
        Get the stock price of a company
        
        Args:
            company_name (str): The name of the company
            
        Returns:
            str: The stock price of the company
        """
    
        if company_name.lower() == "apple" or company_name.lower() == "aapl":
            return f"The stock price of {company_name} is 100"
        elif company_name.lower() == "google" or company_name.lower() == "googl":
            return f"The stock price of {company_name} is 200"
        else:
            return f"The stock price of {company_name} is 50"
    
    # Create AutoAgents instance  
    agents = AutoAgents(
        instructions="Get the stock price of google and compare it to the stock price of apple",
        tools=[evaluate, get_stock_price],
        process="sequential",
        llm="gemini/gemini-2.5-flash-lite-preview-06-17",
        self_reflect=False,
        verbose=False,
        max_agents=3  # Maximum number of agents to create
    )
    
    # Start the agents
    result = agents.start()
    print(result)
    
    ....
    ....
    ...
    ERROR:root:Error generating configuration: 4 validation errors for AutoAgentsConfig
    agents.0.tasks.0
      Input should be a valid dictionary or instance of TaskConfig [type=model_type, input_value='Get the current stock price for Google (GOOG).', input_type=str]
        For further information visit https://errors.pydantic.dev/2.10/v/model_type
    agents.0.tasks.1
      Input should be a valid dictionary or instance of TaskConfig [type=model_type, input_value='Get the current stock price for Apple (AAPL).', input_type=str]
        For further information visit https://errors.pydantic.dev/2.10/v/model_type
    agents.1.tasks.0
      Input should be a valid dictionary or instance of TaskConfig [type=model_type, input_value='Compare the stock price ...e stock price of Apple.', input_type=str]
        For further information visit https://errors.pydantic.dev/2.10/v/model_type
    agents.1.tasks.1
      Input should be a valid dictionary or instance of TaskConfig [type=model_type, input_value='Determine which stock has a higher price.', input_type=str]
        For further information visit https://errors.pydantic.dev/2.10/v/model_type
    Traceback (most recent call last):
      File "/Users/praison/praisonai-package/src/praisonai-agents/autoagents-tools.py", line 50, in <module>
        agents = AutoAgents(
                 ^^^^^^^^^^^
      File "/Users/praison/praisonai-package/src/praisonai-agents/praisonaiagents/agents/autoagents.py", line 124, in __init__
        config = self._generate_config()
                 ^^^^^^^^^^^^^^^^^^^^^^^
      File "/Users/praison/praisonai-package/src/praisonai-agents/praisonaiagents/agents/autoagents.py", line 306, in _generate_config
        config = AutoAgentsConfig(**config_dict)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/Users/praison/miniconda3/envs/praisonai-package/lib/python3.11/site-packages/pydantic/main.py", line 214, in __init__
        validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    pydantic_core._pydantic_core.ValidationError: 4 validation errors for AutoAgentsConfig
    agents.0.tasks.0
      Input should be a valid dictionary or instance of TaskConfig [type=model_type, input_value='Get the current stock price for Google (GOOG).', input_type=str]
        For further information visit https://errors.pydantic.dev/2.10/v/model_type
    agents.0.tasks.1
      Input should be a valid dictionary or instance of TaskConfig [type=model_type, input_value='Get the current stock price for Apple (AAPL).', input_type=str]
        For further information visit https://errors.pydantic.dev/2.10/v/model_type
    agents.1.tasks.0
      Input should be a valid dictionary or instance of TaskConfig [type=model_type, input_value='Compare the stock price ...e stock price of Apple.', input_type=str]
        For further information visit https://errors.pydantic.dev/2.10/v/model_type
    agents.1.tasks.1
      Input should be a valid dictionary or instance of TaskConfig [type=model_type, input_value='Determine which stock has a higher price.', input_type=str]
        For further information visit https://errors.pydantic.dev/2.10/v/model_type
    
    

    @claude
    Contributor

    claude bot commented Jul 10, 2025

    Claude finished @MervinPraison's task. View job: claude/pr-800-20250710_225454. Create PR ➔


    Todo List

    • Read and analyze the AutoAgents implementation
    • Identify where tasks are being generated as strings
    • Fix the task generation to create proper TaskConfig objects
    • Test the fix locally if possible
    • Commit the changes
    • Provide PR creation link

    Solution Implemented

    I've successfully fixed the AutoAgents validation error. The issue was that some LLMs (like Gemini) were returning tasks as simple strings instead of proper TaskConfig objects with required fields.

    Changes made:

    1. Added _normalize_config method that:

      • Detects when tasks are provided as strings
      • Converts string tasks to proper TaskConfig dictionaries with all required fields
      • Handles incomplete dictionary tasks by adding missing fields
    2. Updated _generate_config method to normalize configurations before validation

      • Ensures all tasks are properly formatted regardless of LLM response format
    3. Created test script to verify the fix handles both string and dictionary tasks correctly

    The fix maintains backward compatibility while handling edge cases where LLMs return tasks in unexpected formats.
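    The normalization step described above can be sketched as a small helper. This is an illustrative sketch only: the function name `normalize_tasks` and the field names (`name`, `description`, `expected_output`) are assumptions for the example, not the actual `TaskConfig` schema from the repository.

    ```python
    def normalize_tasks(tasks: list) -> list:
        """Coerce tasks an LLM returned as bare strings into TaskConfig-shaped dicts.

        Illustrative sketch; real TaskConfig field names may differ.
        """
        normalized = []
        for task in tasks:
            if isinstance(task, str):
                # A bare string becomes the task description, with defaults for the rest
                normalized.append({
                    "name": task[:50],
                    "description": task,
                    "expected_output": "Complete the task successfully",
                })
            elif isinstance(task, dict):
                # Fill in any missing required fields on partial dicts
                task.setdefault("description", task.get("name", ""))
                task.setdefault("expected_output", "Complete the task successfully")
                normalized.append(task)
        return normalized

    tasks = normalize_tasks(["Get the current stock price for Google (GOOG)."])
    print(tasks[0]["description"])  # → Get the current stock price for Google (GOOG).
    ```

    Running this coercion before pydantic validation means `AutoAgentsConfig(**config_dict)` sees dicts rather than bare strings, regardless of how the LLM formatted its response.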

    Create PR



    @cursor cursor bot left a comment


    Bug: LLM Reflection Fails on Non-String Values

    The reflection logic for custom LLMs calls .lower() on the 'satisfactory' field of the parsed JSON response. This causes an AttributeError if the LLM returns a non-string value (e.g., null, boolean, or number) for this field, as these types lack a .lower() method.

    src/praisonai-agents/praisonaiagents/agent/agent.py#L1297-L1298

    self.reflection = data.get('reflection', '')
    self.satisfactory = data.get('satisfactory', 'no').lower()
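    A defensive variant (a sketch, not the merged code) would normalize the field before calling string methods on it, so booleans, null, or numbers from the LLM don't raise AttributeError:

    ```python
    def parse_satisfactory(data: dict) -> str:
        """Safely normalize the 'satisfactory' field to 'yes' or 'no'.

        Handles booleans, None, and numbers that an LLM might emit instead of a string.
        Illustrative sketch of a possible fix, not the repository's implementation.
        """
        value = data.get('satisfactory', 'no')
        if isinstance(value, bool):
            return 'yes' if value else 'no'
        if value is None:
            return 'no'
        return str(value).strip().lower()

    print(parse_satisfactory({'satisfactory': True}))   # → yes
    print(parse_satisfactory({'satisfactory': 'Yes'}))  # → yes
    print(parse_satisfactory({}))                       # → no
    ```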



    Development

    Successfully merging this pull request may close these issues.

    Expecting property name enclosed in double issue

    1 participant