CHANGELOG.md: 21 additions & 0 deletions
@@ -14,3 +14,24 @@ All notable changes to this project will be documented in this file.
14
14
- Changed the parameters passed between functions to a dynamic dictionary, allowing data to flow through the pipeline even between non-sequential functions.
15
15
- Added the `<finalAnswerDataLog>` tag for data extraction and writing in memory logs, enhancing the management of more complex tasks.
16
16
- General improvements in error handling.
17
+
18
+
## [1.2.0] - 2025-02-11
19
+
### Added
20
+
- **Function Validator:**
21
+
- Introduced a comprehensive function validator that leverages Python's `ast` module to analyze and verify function code before execution.
22
+
- The validator performs multiple checks including:
23
+
- **Syntax Validation:** Ensures the code is syntactically correct.
24
+
- **Import Restrictions:** Verifies that only allowed libraries are imported and disallows relative (local) imports.
25
+
- **Parameter Signature Checks:** Validates that function definitions adhere to expected parameter rules (e.g., no varargs, proper defaults for keyword-only arguments, and correct naming for subtask-specific parameters).
26
+
- **Assignment Verification:** Confirms the presence of mandatory assignments like `updated_dict = previous_output.copy()` for subtasks with index > 0.
27
+
- **Undefined Name Detection:** Identifies any use of undefined names within the function.
28
+
- **Nesting Depth Control:** Ensures function nesting does not exceed one level (primary function plus optional helper functions).
29
+
- **Dangerous Function Call Prevention:** Blocks execution of dangerous functions (e.g., `eval`, `exec`, `compile`, `__import__`, `os.system`).
30
+
- **`updated_dict.get` Key Validation:** Checks that keys used in `updated_dict.get(...)` calls exist in the provided `previous_output`.
31
+
- Upon successful validation, the function is automatically renamed to match the expected subtask name.
32
+
- **RAG Llama Index Integration:**
33
+
- Added LlamaIndex retrieval and ingestion capabilities, expanding the system's data ingestion and search functionality.
34
+
- **Subtask Regeneration Function:**
35
+
- Implemented a new function that automatically regenerates a subtask if:
36
+
- The function validator fails during pre-execution checks, or
Welcome to the project! This section provides a general overview of the project, its goals, and its main features.
@@ -41,6 +41,9 @@ Detailed description of the Simple RAG technique.
41
41
Detailed description of the Hybrid Vector Graph RAG technique.
42
42
[Go to Hybrid Vector Graph RAG](#hybrid-vector-graph-rag)
43
43
44
+
### LlamaIndex RAG
45
+
Discover how to use the LlamaIndex RAG.
46
+
[Go to LlamaIndex RAG](#llama-index-rag)
44
47
45
48
46
49
## Introduction <a name="introduction"></a>
@@ -57,6 +60,7 @@ AutoCodeAgent allows you to handle complex tasks such as:
57
60
58
61
- *"Please visit Booking.com and search for a hotel in Milan that is available from June 1st to June 10th. Extract the name and price of the first hotel in the results. Then save it in the simple RAG database and send an email to (your_email) with the hotel's name and price."*
59
62
63
+
- *"Calculate the area of the triangle formed by Paris, Moscow, and Rome in square kilometers, and send me an email at samuele.giampieri1@gmail.com with the coordinates of the cities and the calculated area."*
60
64
61
65
AutoCodeAgent 2.0 introduces RAG (Retrieval-Augmented Generation) capabilities, giving the system multiple RAG techniques, each with its own ingestion and retrieval tools.
62
66
The system uses several persistent databases integrated in Docker, such as ChromaDB (vector), Neo4j (graph), and others.
@@ -92,9 +96,7 @@ Library-Based Tools: Easily integrate Python libraries by specifying their names
92
96
Custom Function Tools: Define specific functions as tools. The agent identifies these and avoids auto-generating code for them, ensuring custom implementations remain intact and reliable.
93
97
94
98
### Iterative Evaluation Loop:
95
-
A dedicated Evaluation Agent monitors execution logs, assesses whether the process ran successfully, and decides whether re-execution is necessary.
96
-
If errors or strange outputs are detected, the agent regenerates a new JSON plan with improved code and repeats the execution process.
97
-
The iterative loop continues until a satisfactory result is achieved or a maximum number of iterations is reached.
99
+
A dedicated Evaluation Agent monitors execution logs, assesses success, and, if necessary, re-plans or regenerates subtasks until a satisfactory result is achieved.
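The loop described here can be pictured with a minimal sketch. The names `run_with_evaluation`, `execute`, `is_satisfactory`, and `max_iterations` are hypothetical stand-ins, not the project's API: `execute` represents plan generation plus execution, `is_satisfactory` represents the Evaluation Agent's verdict on the logs, and the iteration cap is an assumption added so the loop always terminates.

```python
from typing import Callable


def run_with_evaluation(
    execute: Callable[[int], str],
    is_satisfactory: Callable[[str], bool],
    max_iterations: int = 3,
) -> tuple[bool, list[str]]:
    """Re-run `execute` until its logs are accepted or the attempt budget runs out."""
    history = []
    for attempt in range(max_iterations):
        logs = execute(attempt)  # stand-in for re-planning and executing subtasks
        history.append(logs)
        if is_satisfactory(logs):  # stand-in for the Evaluation Agent's decision
            return True, history
    return False, history
```

Returning the full log history alongside the verdict keeps every attempt available for debugging, whichever attempt finally succeeds.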
98
100
99
101
### Memory Logging & Error Handling:
100
102
Integrates a robust logging system to capture detailed execution logs. This allows for precise debugging and refinement of the agent's behavior.
@@ -105,8 +107,10 @@ The framework encourages reusability and modularity by consolidating related ope
105
107
Designed to integrate seamlessly with various Python libraries, allowing for flexible tool expansion without significant modifications to the core agent logic.
106
108
107
109
### Safe and Secure Execution:
108
-
Uses controlled namespaces and captures standard output to prevent unintended side effects during code execution.
109
-
Sanitizes outputs from AI model responses to ensure robust JSON parsing and prevent syntax issues in dynamically generated code.
110
+
Uses controlled namespaces and captures standard output to prevent unintended side effects.
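As a rough illustration of the idea, the sketch below runs generated code in its own namespace while capturing anything it prints. The helper name `run_in_namespace` is hypothetical, and this is only an assumption about how such isolation might look. It limits accidental side effects on the caller's variables but is not a security sandbox by itself.

```python
import contextlib
import io


def run_in_namespace(code: str, allowed_globals: dict) -> tuple[dict, str]:
    """Execute `code` in a separate namespace and return (namespace, captured stdout)."""
    # Copy the dictionary so executed code cannot rebind the caller's entries.
    namespace = dict(allowed_globals)
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code, namespace)
    return namespace, buffer.getvalue()
```

For example, `run_in_namespace("print('hi')\nresult = base + 1", {"base": 41})` returns a namespace containing `result` together with the captured text, and the dictionary passed in is left unchanged.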
111
+
112
+
### Python Function Validation & Task Regeneration:
113
+
A function validator inspects each subtask’s code (via AST analysis) for syntax, dangerous constructs, parameter correctness, allowed libraries and other issues *before execution*. If validation or execution errors occur, the agent automatically regenerates the subtask to ensure successful task completion.
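A few of these checks can be sketched with Python's standard `ast` module. This is a simplified illustration, not the project's validator: the function name `validate_function_code` and the `ALLOWED_IMPORTS` and `DANGEROUS_CALLS` sets are assumptions made for the example, and only syntax, import, varargs, and dangerous-call checks are covered.

```python
import ast

# Hypothetical policy for this sketch; the project's real lists may differ.
ALLOWED_IMPORTS = {"math", "json"}
DANGEROUS_CALLS = {"eval", "exec", "compile", "__import__", "os.system"}


def _call_name(node: ast.Call) -> str:
    """Return a dotted name such as 'os.system' for a call node, or ''."""
    func = node.func
    if isinstance(func, ast.Name):
        return func.id
    if isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name):
        return f"{func.value.id}.{func.attr}"
    return ""


def validate_function_code(source: str) -> list[str]:
    """Collect validation errors for `source`; an empty list means it passed."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]

    errors = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] not in ALLOWED_IMPORTS:
                    errors.append(f"import not allowed: {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            if node.level > 0:
                errors.append("relative import not allowed")
            elif (node.module or "").split(".")[0] not in ALLOWED_IMPORTS:
                errors.append(f"import not allowed: {node.module}")
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if node.args.vararg or node.args.kwarg:
                errors.append(f"varargs not allowed in: {node.name}")
        elif isinstance(node, ast.Call):
            name = _call_name(node)
            if name in DANGEROUS_CALLS:
                errors.append(f"dangerous call: {name}")
    return errors
```

Because the code is only parsed and never executed, a non-empty error list can trigger regeneration of the subtask before anything runs.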
110
114
111
115
### RAG retrieval / ingestion
112
116
- The agent now uses a vector database (ChromaDB) to store and retrieve information.
@@ -288,6 +292,7 @@ Some rules to follow:
288
292
- The tool name (tool_name) must be unique.
289
293
- Use exactly the same JSON structure you see, for example, for geopy.
290
294
- For the function, always use this schema:
295
+
- Add default parameters to the function signature only if you need fixed values inside the function.
@@ -389,6 +394,9 @@ HYBRID_VECTOR_GRAPH_RAG_QUERY_MAX_DEPTH=3 # max depth for hybrid vector graph r
389
394
HYBRID_VECTOR_GRAPH_RAG_QUERY_TOP_K=3 # top k for hybrid vector graph rag
390
395
HYBRID_VECTOR_GRAPH_RAG_QUERY_MAX_CONTEXT_LENGTH=10000 # max context length for hybrid vector graph rag
391
396
397
+
GMAILUSER=your_email@gmail.com
398
+
PASSGMAILAPP=your_password
399
+
392
400
TOOL_HELPER_MODEL=gpt-4o # tool helper model
393
401
JSON_PLAN_MODEL=gpt-4o # json plan model
394
402
EVALUATION_MODEL=gpt-4o # evaluation model
@@ -466,7 +474,6 @@ In this section, we dive into the diverse RAG techniques that AutoCodeAgent 2.0
466
474
467
475
Think of these RAG techniques as your personal data assistants, ready to fetch, store, and process information at your command. With AutoCodeAgent 2.0, you’re not just working with data—you’re orchestrating it. Let’s explore how each RAG technique can transform the way you interact with information, making your tasks smarter, faster, and more intuitive.
468
476
469
-
---
470
477
471
478
### Simple RAG <a name="simple-rag"></a>
472
479
Simple RAG is your go-to tool for straightforward data retrieval and ingestion tasks. It leverages vector embeddings to store and retrieve text chunks efficiently, making it ideal for scenarios where quick access to relevant information is crucial. Whether you're saving web search results or retrieving documents based on a query, Simple RAG ensures that your data is always within reach.
@@ -485,7 +492,7 @@ Simple RAG is your go-to tool for straightforward data retrieval and ingestion t
485
492
- *"Search for the latest news on AI advancements and save it in the database using the tool: `ingest_simple_rag`."*
486
493
- *"Retrieve information about climate change from the database using the tool: `retrieve_simple_rag`."*
[Hybrid Vector Graph RAG Video Demo](https://youtu.be/a9Ul6CxYsFM).
@@ -582,7 +589,7 @@ Checks if the currently gathered context is sufficient to answer the question by
582
589
End Retrieval Marks the completion of the retrieval process, signaling that the system has successfully answered the user's question.
583
590
584
591
- No: Expand BFS to Next Depth. Continues the BFS to explore more related chunks in Neo4j. This involves searching deeper into the graph database to find additional relevant information.
  - Retrieve Neighbors Above Threshold: Fetches neighboring chunks connected via SIMILAR_TO relationships with similarity scores above a defined threshold. Only connections that meet or exceed this similarity level are considered for further exploration.
  - Update Visited and Queue: Updates the set of visited chunks and adds new chunks to the BFS queue for further exploration. This ensures that the system efficiently tracks which pieces of information have been examined.
  - Check Max Depth: Ensures that the BFS does not exceed the maximum allowed depth. If the maximum depth is reached, the system proceeds to generate the final answer regardless of whether the context is fully sufficient.
  - Loop Back to "Is Context Enough?": Re-evaluates whether the newly accumulated context meets the sufficiency criteria.
585
-
592
+
586
593
6. Generate Final Answer
587
594
After gathering sufficient context or reaching the maximum BFS depth, the system compiles the final answer using the collected information. This step synthesizes all relevant data into a coherent response.
588
595
@@ -592,10 +599,13 @@ Delivers the generated answer to the user. The system presents the final respons
592
599
8. End Retrieval
593
600
Concludes the retrieval process by finalizing all tasks and ensuring that all data is properly stored and connections are closed. It confirms that the system has successfully processed the user's query.
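The BFS expansion in the steps above can be sketched over an in-memory graph. This is an illustration under stated assumptions, not the project's Neo4j code: the `Graph` shape, the function name `bfs_collect_context`, and the `is_context_enough` callback are hypothetical, and the sufficiency check here runs before each node is expanded rather than once per depth level.

```python
from collections import deque
from typing import Callable

# Hypothetical graph shape: chunk id -> list of (neighbor id, similarity score).
Graph = dict[str, list[tuple[str, float]]]


def bfs_collect_context(
    graph: Graph,
    seeds: list[str],
    is_context_enough: Callable[[list[str]], bool],
    threshold: float = 0.8,
    max_depth: int = 3,
) -> list[str]:
    """Expand from `seeds` breadth-first until the context suffices or max_depth is hit."""
    visited = set(seeds)
    context = list(seeds)
    queue = deque((seed, 0) for seed in seeds)
    while queue and not is_context_enough(context):
        chunk, depth = queue.popleft()
        if depth >= max_depth:
            continue  # never expand beyond the depth limit
        for neighbor, score in graph.get(chunk, []):
            # Follow only SIMILAR_TO-style edges at or above the threshold.
            if score >= threshold and neighbor not in visited:
                visited.add(neighbor)
                context.append(neighbor)
                queue.append((neighbor, depth + 1))
    return context
```

The `threshold` and `max_depth` parameters play the role of the similarity threshold and maximum depth settings; whatever context has been gathered when the loop ends is what the final answer would be generated from.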
594
601
595
-
---
596
602
597
-
With these RAG techniques, AutoCodeAgent 2.0 transforms the way you interact with data, making it easier than ever to store, retrieve, and analyze information. Whether you're working on simple tasks or tackling complex data challenges, these tools are here to empower your workflow and unlock new possibilities.
603
+
### Llama Index RAG <a name="llama-index-rag"></a>
604
+
In addition to the techniques above, the agent now integrates LlamaIndex for more advanced data retrieval and ingestion, improving its ability to work with complex datasets. LlamaIndex is included as a default tool, so the generated ingestion and retrieval code can be customized with additional parameters described in the LlamaIndex documentation.
605
+
Example prompt for retrieval: "Find the latest market trends using the Llama Index."
606
+
Example prompt for ingestion: "Find the latest market trends from the web and save it in the database using the Llama Index."
598
607
608
+
With these RAG techniques, AutoCodeAgent 2.0 transforms the way you interact with data, making it easier than ever to store, retrieve, and analyze information. Whether you're working on simple tasks or tackling complex data challenges, these tools are here to empower your workflow and unlock new possibilities.
599
609
600
610
601
611
## Contribution Guidelines
@@ -620,3 +630,5 @@ We welcome contributions from the community! If you'd like to contribute, please
620
630
By contributing, you agree that your changes will be licensed under the same license as the project.