
Commit d723fb7

author
samuele
committed
vers 1.2.0
1 parent 003f17f commit d723fb7

35 files changed

Lines changed: 1110 additions & 128 deletions

CHANGELOG.md

Lines changed: 21 additions & 0 deletions
@@ -14,3 +14,24 @@ All notable changes to this project will be documented in this file.
1414
- Modified the parameters passed between various functions to a dynamic dictionary, allowing data transfer in the flow even between non-sequential functions.
1515
- Added the `<finalAnswerDataLog>` tag for data extraction and writing in memory logs, enhancing the management of more complex tasks.
1616
- General improvements in error handling.
17+
18+
## [1.2.0] - 2025-02-11
19+
### Added
20+
- **Function Validator:**
21+
- Introduced a comprehensive function validator that leverages Python's `ast` module to analyze and verify function code before execution.
22+
- The validator performs multiple checks including:
23+
- **Syntax Validation:** Ensures the code is syntactically correct.
24+
- **Import Restrictions:** Verifies that only allowed libraries are imported and disallows relative (local) imports.
25+
- **Parameter Signature Checks:** Validates that function definitions adhere to expected parameter rules (e.g., no varargs, proper defaults for keyword-only arguments, and correct naming for subtask-specific parameters).
26+
- **Assignment Verification:** Confirms the presence of mandatory assignments like `updated_dict = previous_output.copy()` for subtasks with index > 0.
27+
- **Undefined Name Detection:** Identifies any use of undefined names within the function.
28+
- **Nesting Depth Control:** Ensures function nesting does not exceed one level (primary function plus optional helper functions).
29+
- **Dangerous Function Call Prevention:** Blocks execution of dangerous functions (e.g., `eval`, `exec`, `compile`, `__import__`, `os.system`, etc.).
30+
- **updated_dict.get Key Validation:** Checks that keys used in `updated_dict.get(...)` calls exist in the provided `previous_output`.
31+
- Upon successful validation, the function is automatically renamed to match the expected subtask name.
32+
- **RAG Llama Index Integration:**
33+
- Added LlamaIndex retrieval and ingestion capabilities, expanding the system's data ingestion and search functionality.
34+
- **Subtask Regeneration Function:**
35+
- Implemented a new function that automatically regenerates a subtask if:
36+
- The function validator fails during pre-execution checks, or
37+
- An error is encountered during execution.
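
A minimal sketch of how three of the checks listed above (syntax validation, import restrictions, and dangerous call prevention) might be written with Python's `ast` module. The names `validate_function_code`, `ALLOWED_LIBS`, and `DANGEROUS_CALLS` are hypothetical and are not taken from this commit; the project's actual validator may be structured differently.

```python
import ast

# Hypothetical allow-list and deny-list; the project's real configuration may differ.
ALLOWED_LIBS = {"math", "json", "numpy", "geopy"}
DANGEROUS_CALLS = {"eval", "exec", "compile", "__import__", "os.system"}


def _call_name(node: ast.Call) -> str:
    """Return a dotted name such as 'os.system' for a call node, or '' if unknown."""
    func = node.func
    if isinstance(func, ast.Name):
        return func.id
    if isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name):
        return f"{func.value.id}.{func.attr}"
    return ""


def validate_function_code(source: str) -> list:
    """Collect validation errors for a code string; an empty list means it passed."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg} (line {exc.lineno})"]

    errors = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # Only the top-level package name is compared against the allow-list.
            for alias in node.names:
                if alias.name.split(".")[0] not in ALLOWED_LIBS:
                    errors.append(f"import not allowed: {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            if node.level > 0:
                errors.append("relative imports are not allowed")
            elif (node.module or "").split(".")[0] not in ALLOWED_LIBS:
                errors.append(f"import not allowed: {node.module}")
        elif isinstance(node, ast.Call) and _call_name(node) in DANGEROUS_CALLS:
            errors.append(f"dangerous call: {_call_name(node)}")
    return errors
```

Returning a list of messages rather than raising on the first problem lets a caller pass every finding to whatever regenerates the subtask.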

README.md

Lines changed: 24 additions & 12 deletions
@@ -1,5 +1,5 @@
11
# AutoCodeAgent - AI Agent for Complex Task Resolution
2-
![version](https://img.shields.io/badge/version-1.1.0-blue)
2+
![version](https://img.shields.io/badge/version-1.2.0-blue)
33

44
## Introduction
55
Welcome to the project! This section provides a general overview of the project, its goals, and its main features.
@@ -41,6 +41,9 @@ Detailed description of the Simple RAG technique.
4141
Detailed description of the Hybrid Vector Graph RAG technique.
4242
[Go to Hybrid Vector Graph RAG](#hybrid-vector-graph-rag)
4343

44+
### LlamaIndex RAG
45+
Discover how to use the LlamaIndex RAG.
46+
[Go to LlamaIndex RAG](#llama-index-rag)
4447

4548

4649
## Introduction <a name="introduction"></a>
@@ -57,6 +60,7 @@ AutoCodeAgent allows you to handle complex tasks such as:
5760

5861
- *"Please visit Booking.com and search for a Hotel in Milan that is available from June 1st to June 10th. Extract the name and price of the first hotel in the result. Then save it on simple rag database, send an email to (your_email) with the hotel's name and price."*
5962

63+
- *"Calculate the area of the triangle formed by Paris, Moscow, and Rome in square kilometers, and send me an email at samuele.giampieri1@gmail.com with the coordinates of the cities and the calculated area."*
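
For reference, the geometry behind the triangle prompt above can be sketched in a few lines. This is an illustrative calculation, not code generated by the agent or contained in this commit: it assumes a spherical Earth with a mean radius of 6371 km, hard-codes approximate city-centre coordinates instead of geocoding them with geopy, and uses the hypothetical helper names `central_angle` and `spherical_triangle_area_km2`.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (spherical approximation)


def central_angle(p, q):
    """Angular distance in radians between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    # Haversine formula
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * math.asin(math.sqrt(h))


def spherical_triangle_area_km2(a, b, c):
    """Area from L'Huilier's theorem: spherical excess multiplied by R squared."""
    x, y, z = central_angle(b, c), central_angle(a, c), central_angle(a, b)
    s = (x + y + z) / 2
    t = (math.tan(s / 2) * math.tan((s - x) / 2)
         * math.tan((s - y) / 2) * math.tan((s - z) / 2))
    excess = 4 * math.atan(math.sqrt(max(t, 0.0)))  # clamp tiny negative rounding
    return excess * EARTH_RADIUS_KM ** 2


# Approximate coordinates (latitude, longitude) in degrees.
paris, moscow, rome = (48.8566, 2.3522), (55.7558, 37.6173), (41.9028, 12.4964)
area = spherical_triangle_area_km2(paris, moscow, rome)
print(f"Triangle area: {area:,.0f} km^2")
```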
6064

6165
AutoCodeAgent 2.0 introduces RAG (Retrieval-Augmented Generation) capabilities, equipping the system with multiple RAG techniques, each with its own ingestion and retrieval tools.
6266
The system uses several persistent databases integrated in Docker, such as the ChromaDB vector database, the Neo4j graph database, and others.
@@ -92,9 +96,7 @@ Library-Based Tools: Easily integrate Python libraries by specifying their names
9296
Custom Function Tools: Define specific functions as tools. The agent identifies these and avoids auto-generating code for them, ensuring custom implementations remain intact and reliable.
9397

9498
### Iterative Evaluation Loop:
95-
A dedicated Evaluation Agent monitors execution logs, assesses whether the process ran successfully, and decides whether re-execution is necessary.
96-
If errors or strange outputs are detected, the agent regenerates a new JSON plan with improved code and repeats the execution process.
97-
The iterative loop continues until a satisfactory result is achieved or a maximum number of iterations is reached.
99+
A dedicated Evaluation Agent monitors execution logs, assesses success, and, if necessary, re-plans or regenerates subtasks until a satisfactory result is achieved.
98100

99101
### Memory Logging & Error Handling:
100102
Integrates a robust logging system to capture detailed execution logs. This allows for precise debugging and refinement of the agent's behavior.
@@ -105,8 +107,10 @@ The framework encourages reusability and modularity by consolidating related ope
105107
Designed to integrate seamlessly with various Python libraries, allowing for flexible tool expansion without significant modifications to the core agent logic.
106108

107109
### Safe and Secure Execution:
108-
Uses controlled namespaces and captures standard output to prevent unintended side effects during code execution.
109-
Sanitizes outputs from AI model responses to ensure robust JSON parsing and prevent syntax issues in dynamically generated code.
110+
Uses controlled namespaces and captures standard output to prevent unintended side effects.
111+
112+
### Python Function Validation & Task Regeneration:
113+
A function validator inspects each subtask’s code (via AST analysis) for syntax, dangerous constructs, parameter correctness, allowed libraries and other issues *before execution*. If validation or execution errors occur, the agent automatically regenerates the subtask to ensure successful task completion.
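
The validate, execute, and regenerate cycle described above can be sketched as follows. This is a hedged illustration under stated assumptions: `run_with_regeneration`, the `validate` and `regenerate` callables, and `max_attempts` are hypothetical names, and in the real agent regeneration would presumably be delegated to the language model.

```python
from typing import Callable


def run_with_regeneration(
    code: str,
    validate: Callable[[str], list],        # hypothetical: returns a list of error strings
    regenerate: Callable[[str, str], str],  # hypothetical: (code, error) -> new code
    max_attempts: int = 3,
) -> dict:
    """Validate and execute subtask code, regenerating it after any failure."""
    last_error = ""
    for _ in range(max_attempts):
        errors = validate(code)
        if errors:
            # Pre-execution check failed: skip execution entirely.
            last_error = "; ".join(errors)
        else:
            namespace: dict = {}
            try:
                exec(code, namespace)  # run in a separate namespace
                return {"ok": True, "namespace": namespace, "code": code}
            except Exception as exc:
                last_error = f"{type(exc).__name__}: {exc}"
        # Either validation or execution failed: ask for a new version.
        code = regenerate(code, last_error)
    return {"ok": False, "error": last_error, "code": code}
```

Capping the number of attempts keeps a subtask that never validates from looping forever.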
110114

111115
### RAG retrieval / ingestion
112116
- The agent now uses a vector database (ChromaDB) to store and retrieve information.
@@ -288,6 +292,7 @@ Some rules to follow:
288292
- The tool name (tool_name) must be unique.
289293
- Use exactly the same JSON structure you see, for example, for geopy.
290294
- For the function, always use this schema:
295+
- Add default parameters to the function signature only if the function needs fixed values.
291296

292297
```python
293298
def function_name(previous_output):
@@ -337,7 +342,7 @@ git clone https://github.com/samugit83/AutoCodeAgent
337342
cd AutoCodeAgent
338343
```
339344

340-
3. Build the Docker image:
345+
3. Build the Docker image:
341346
```bash
342347
docker-compose build
343348
```
@@ -389,6 +394,9 @@ HYBRID_VECTOR_GRAPH_RAG_QUERY_MAX_DEPTH=3 # max depth for hybrid vector graph r
389394
HYBRID_VECTOR_GRAPH_RAG_QUERY_TOP_K=3 # top k for hybrid vector graph rag
390395
HYBRID_VECTOR_GRAPH_RAG_QUERY_MAX_CONTEXT_LENGTH=10000 # max context length for hybrid vector graph rag
391396

397+
GMAILUSER=your_email@gmail.com
398+
PASSGMAILAPP=your_password
399+
392400
TOOL_HELPER_MODEL=gpt-4o # tool helper model
393401
JSON_PLAN_MODEL=gpt-4o # json plan model
394402
EVALUATION_MODEL=gpt-4o # evaluation model
@@ -466,7 +474,6 @@ In this section, we dive into the diverse RAG techniques that AutoCodeAgent 2.0
466474

467475
Think of these RAG techniques as your personal data assistants, ready to fetch, store, and process information at your command. With AutoCodeAgent 2.0, you’re not just working with data—you’re orchestrating it. Let’s explore how each RAG technique can transform the way you interact with information, making your tasks smarter, faster, and more intuitive.
468476

469-
---
470477

471478
### Simple RAG <a name="simple-rag"></a>
472479
Simple RAG is your go-to tool for straightforward data retrieval and ingestion tasks. It leverages vector embeddings to store and retrieve text chunks efficiently, making it ideal for scenarios where quick access to relevant information is crucial. Whether you're saving web search results or retrieving documents based on a query, Simple RAG ensures that your data is always within reach.
@@ -485,7 +492,7 @@ Simple RAG is your go-to tool for straightforward data retrieval and ingestion t
485492
- *"Search for the latest news on AI advancements and save it in the database using the tool: `ingest_simple_rag`."*
486493
- *"Retrieve information about climate change from the database using the tool: `retrieve_simple_rag`."*
487494

488-
---
495+
489496

490497
### Hybrid Vector Graph RAG <a name="hybrid-vector-graph-rag"></a>
491498
[Hybrid Vector Graph RAG Video Demo](https://youtu.be/a9Ul6CxYsFM).
@@ -582,7 +589,7 @@ Checks if the currently gathered context is sufficient to answer the question by
582589
End Retrieval: Marks the completion of the retrieval process, signaling that the system has successfully answered the user's question.
583590

584591
- No: Expand BFS to Next Depth: continues the BFS into Neo4j to find additional related chunks. Retrieve Neighbors Above Threshold: fetches neighboring chunks connected via SIMILAR_TO relationships whose similarity scores meet or exceed a defined threshold. Update Visited and Queue: updates the set of visited chunks and adds the new chunks to the BFS queue, so the system tracks what has already been examined. Check Max Depth: ensures the BFS does not exceed the maximum allowed depth; once it is reached, the system generates the final answer even if the context is not fully sufficient. Loop Back to "Is Context Enough?": re-evaluates whether the newly accumulated context is sufficient to answer the question.
585-
592+
586593
6. Generate Final Answer
587594
After gathering sufficient context or reaching the maximum BFS depth, the system compiles the final answer using the collected information. This step synthesizes all relevant data into a coherent response.
588595

@@ -592,10 +599,13 @@ Delivers the generated answer to the user. The system presents the final respons
592599
8. End Retrieval
593600
Concludes the retrieval process by finalizing all tasks and ensuring that all data is properly stored and connections are closed. It confirms that the system has successfully processed the user's query.
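
The breadth-first expansion in step 5 (neighbors above a similarity threshold, a visited set, a depth limit, and a sufficiency check) can be sketched as follows. This is an in-memory illustration only: the `graph` dictionary stands in for the SIMILAR_TO relationships stored in Neo4j, and `bfs_collect_context` and `is_enough` are hypothetical names, not functions from this repository.

```python
from collections import deque


def bfs_collect_context(graph, seeds, threshold, max_depth, is_enough):
    """Expand from seed chunks over similarity edges until the context suffices.

    graph: dict mapping chunk id -> list of (neighbor id, similarity score),
           a stand-in for SIMILAR_TO relationships in a graph database.
    is_enough: callable that receives the collected chunk ids and returns a bool.
    """
    visited = set(seeds)
    queue = deque((chunk, 0) for chunk in seeds)
    context = list(seeds)
    while queue and not is_enough(context):
        chunk, depth = queue.popleft()
        if depth >= max_depth:
            continue  # do not expand beyond the maximum depth
        for neighbor, score in graph.get(chunk, []):
            # Follow only edges at or above the similarity threshold.
            if score >= threshold and neighbor not in visited:
                visited.add(neighbor)
                context.append(neighbor)
                queue.append((neighbor, depth + 1))
    return context


graph = {"a": [("b", 0.9), ("c", 0.2)], "b": [("d", 0.8)], "d": [("e", 0.95)]}
context = bfs_collect_context(graph, ["a"], threshold=0.5, max_depth=2,
                              is_enough=lambda ctx: False)
```

In this example the low-scoring edge to `c` is skipped, and `e` is never reached because `d` already sits at the maximum depth.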
594601

595-
---
596602

597-
With these RAG techniques, AutoCodeAgent 2.0 transforms the way you interact with data, making it easier than ever to store, retrieve, and analyze information. Whether you're working on simple tasks or tackling complex data challenges, these tools are here to empower your workflow and unlock new possibilities.
603+
### Llama Index RAG <a name="llama-index-rag"></a>
604+
In addition to the techniques above, the agent now integrates Llama Index for more advanced data retrieval and ingestion, improving its ability to work with complex datasets. Llama Index is included as a default tool, so you can customize the ingestion and retrieval code by passing additional parameters described in the Llama Index documentation.
605+
Example prompt for retrieval: "Find the latest market trends using the Llama Index."
606+
Example prompt for ingestion: "Find the latest market trends from the web and save it in the database using the Llama Index."
598607

608+
With these RAG techniques, AutoCodeAgent 2.0 transforms the way you interact with data, making it easier than ever to store, retrieve, and analyze information. Whether you're working on simple tasks or tackling complex data challenges, these tools are here to empower your workflow and unlock new possibilities.
599609

600610

601611
## Contribution Guidelines
@@ -620,3 +630,5 @@ We welcome contributions from the community! If you'd like to contribute, please
620630
By contributing, you agree that your changes will be licensed under the same license as the project.
621631

622632
Thank you for helping improve this project! 🚀
633+
634+

app.py

Lines changed: 2 additions & 2 deletions
@@ -30,11 +30,11 @@ def run_code_agent():
3030
tools = [
3131
{
3232
"tool_name": "numpy",
33-
"lib_name": ["numpy"]
33+
"lib_names": ["numpy"]
3434
},
3535
{
3636
"tool_name": "geopy",
37-
"lib_name": ["geopy"],
37+
"lib_names": ["geopy"],
3838
"instructions": "A library to get the coordinates of a given location.",
3939
"code_example": """
4040

Binary files not shown (7): 0 Bytes, 10.1 KB, 23 Bytes, 1.48 KB, 24.7 KB, -5 Bytes, 3.97 KB
