Conversation
Sync branches
…hain, openai, pydantic, and numpy
…onment.yml and requirements.txt
The interpreter tool previously relied on codeinterpreterapi (remote sandbox) which is no longer compatible with the current dependency stack. This replaces it with langchain_experimental PythonREPL (local execution) while preserving the same agent contract. Key changes: - tool_interpreter.py: rewrite using PythonREPL — injects file previews (columns, dtypes, 3-row sample) into the LLM prompt so generated code uses exact column names; truncates REPL stdout at 2000 chars to prevent context overflow; wires shared llm_instance from the agent instead of instantiating a separate model - agent.py: passes llm_instance to tool constructor via import_tools introspection - params.ini: bump default OpenAI models from gpt-4o / gpt-4o-mini to gpt-5.4 / gpt-5.4-mini - requirements.txt / environment.yml: pin langchain_experimental==0.3.4 and remove codeinterpreterapi / codeboxapi dependencies - Remove custom_sqlite_file.py (unused after langgraph checkpoint refactor) Security note: PythonREPL executes LLM-generated code in-process without sandboxing. A warning comment has been added; restrict to trusted deployment environments.
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughMigrate many tools from LangChain pydantic shims to native pydantic, add explicit type annotations, replace in-process code-interpreter with LLM-driven script generation + sandboxed subprocess execution, add trusted-mode and session/file security, remove a custom SQLite checkpointer, update memory fallback, add tests, and bump dependencies. Changes
Sequence Diagram(s)sequenceDiagram
actor User
participant Agent as InterpreterTool
participant DB as Database
participant LLM as OpenAI Chat LLM
participant Sandbox as SandboxRunner
participant FS as FileSystem
User->>Agent: invoke InterpreterTool(input, session_id)
alt no explicit file paths
Agent->>DB: query session files
DB-->>Agent: file_paths (JSON)
else explicit paths provided
Agent-->>Agent: parse quoted/unquoted paths
end
Agent->>FS: read previews (truncated)
FS-->>Agent: previews
Agent->>LLM: request single fenced Python script
LLM-->>Agent: returns ```python ... ```
Agent->>Sandbox: run script via sandbox_runner (json config)
Sandbox-->>Agent: result {success, stdout, stderr, error}
Agent->>FS: snapshot session dir, detect new/changed files
Agent->>DB: persist outputs and discovered file paths
Agent-->>User: return truncated output + files_created
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Review Summary by QodoReplace code interpreter with PythonREPL and fix deprecated pydantic imports
WalkthroughsDescription• Replace codeinterpreterapi with langchain_experimental PythonREPL for local code execution - Inject file previews into LLM prompts for accurate column name usage - Truncate REPL output at 2000 chars to prevent context overflow - Wire shared llm_instance from agent instead of separate model instantiation • Fix deprecated pydantic imports across all tool files - Replace langchain.pydantic_v1 with native pydantic imports - Add type annotations to args_schema and class variables • Update dependencies to latest versions - langchain, langchain_core, langchain_openai, langgraph, litellm, openai, pydantic - Add langchain_experimental==0.3.4 for PythonREPL support • Update default OpenAI models from gpt-4o to gpt-5.4 in configuration • Remove custom_sqlite_file.py and use langgraph's built-in MemorySaver • Improve evaluation script with dataset creation and better error handling Diagramflowchart LR
A["codeinterpreterapi<br/>remote sandbox"] -->|"replaced with"| B["PythonREPL<br/>local execution"]
B --> C["File previews<br/>injected to LLM"]
C --> D["Generated code<br/>with exact columns"]
D --> E["Output truncated<br/>at 2000 chars"]
F["Deprecated<br/>langchain.pydantic_v1"] -->|"replaced with"| G["Native pydantic<br/>imports"]
H["gpt-4o"] -->|"updated to"| I["gpt-5.4"]
J["custom_sqlite_file.py"] -->|"removed"| K["langgraph MemorySaver"]
File Changes1. app/core/agents/interpreter/tool_interpreter.py
|
Code Review by Qodo
1.
|
PR Reviewer Guide 🔍Here are some key observations to aid the review process:
|
PR Code Suggestions ✨No code suggestions found for the PR. |
|
Tip For best results, initiate chat on the files or code changes.
Just let me know what you need! |
|
@CodeRabbit review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
app/core/tests/evaluation.py (1)
141-143:⚠️ Potential issue | 🟡 MinorUpdate
project_metadata.modelto reflect the actual default model.The hardcoded
"gpt-4o"in the project metadata is outdated. The workflow usesllm_owhich is now configured asgpt-5.4inparams.ini. Either read the actual model name from themodelsdictionary (e.g., extract frommodels["llm_o"]) or update the literal to"gpt-5.4"to ensure LangSmith runs are tagged with the correct model.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/core/tests/evaluation.py` around lines 141 - 143, The project_metadata currently hardcodes "gpt-4o"; update it to reflect the actual default model by using the configured models map or the new literal. Replace the literal with the value from the models dictionary (e.g., project_metadata={"model": models["llm_o"]} or models.get("llm_o", "gpt-5.4")) or directly set it to "gpt-5.4"; ensure the symbol models and key "llm_o" are in scope where project_metadata is constructed so LangSmith runs are tagged with the correct model.
🧹 Nitpick comments (2)
app/core/memory/database_manager.py (1)
29-36: Minor: redundantif class_path:guard.
class_pathis guaranteed truthy here because the outerifalready requiresos.getenv("MEMORY_DATABASE_MANAGER_CLASS")(and the same applies intools_database). You can drop the inner branch, which also removes the implicitNonereturn if it ever fell through. Same nit applies to lines 13-18.♻️ Suggested simplification
- if os.getenv('DATABASE_URL') and os.getenv("MEMORY_DATABASE_MANAGER_CLASS"): - class_path = os.getenv('MEMORY_DATABASE_MANAGER_CLASS') - if class_path: - # Dynamically import the module and class based on the environment variable - module_name, class_name = class_path.rsplit('.', 1) - module = importlib.import_module(module_name) - manager_class = getattr(module, class_name) - return manager_class() - else: - # Use langgraph's built-in MemorySaver which implements the current checkpoint API - from langgraph.checkpoint.memory import MemorySaver - return MemorySaver() + class_path = os.getenv("MEMORY_DATABASE_MANAGER_CLASS") + if os.getenv("DATABASE_URL") and class_path: + module_name, class_name = class_path.rsplit(".", 1) + module = importlib.import_module(module_name) + manager_class = getattr(module, class_name) + return manager_class() + + # Use langgraph's built-in MemorySaver which implements the current checkpoint API + from langgraph.checkpoint.memory import MemorySaver + return MemorySaver()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/core/memory/database_manager.py` around lines 29 - 36, Remove the redundant inner guard around class_path in the environment-driven dynamic import logic: the outer condition already checks os.getenv("MEMORY_DATABASE_MANAGER_CLASS"), so delete the inner "if class_path:" branch and directly perform module_name, class_name = class_path.rsplit('.', 1), module = importlib.import_module(module_name), manager_class = getattr(module, class_name) and return manager_class(); apply the same simplification to the analogous tools_database block (the earlier block that checks the same env var) to avoid the unnecessary implicit None return path.app/core/agents/enpkg/tool_smiles.py (1)
4-39: LGTM on the pydantic migration.
BaseModel/Fielddirectly frompydanticand the typedargs_schemaare correct for the 0.3.x stack.Nit (pre-existing):
from langchain.tools import BaseToolis imported twice (lines 3 and 14). Can be cleaned up while touching this file.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/core/agents/enpkg/tool_smiles.py` around lines 4 - 39, The file imports BaseTool twice; remove the redundant import and keep a single import of BaseTool (referenced by the class SMILESResolver) so there is no duplicate import statement; ensure the remaining imports still include BaseModel and Field from pydantic, Optional if used elsewhere, CallbackManagerForToolRun if needed, and setup_logger, then run linters/import-sort to confirm clean import ordering.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/core/agents/interpreter/tool_interpreter.py`:
- Around line 144-160: The current diff only records newly created files using
existing_files and new_files, so files that were overwritten or modified in
place (saved to session_dir with the same name) are missed; to fix, before
calling PythonREPL.run(code) capture a mapping of existing files in session_dir
to their diagnostics (e.g., stat().st_mtime and st_size or a content hash) using
session_dir.iterdir() (existing_files -> existing_meta), then after execution
enumerate session_dir again and compare each path's new mtime/size/hash to the
saved meta to build a set of changed_or_new_files, replace the new_files
calculation with that set, and populate filepaths from changed_or_new_files so
modified files (not just newly created ones) are included in the DB entry and
logs (referencing session_dir, existing_files/existing_meta, new_files,
PythonREPL.run, and filepaths).
- Around line 147-153: The PythonREPL.run call is missing a timeout, which can
cause the agent to hang; update the call to use the timeout parameter (e.g.,
repl.run(code, timeout=30) or a configurable value) and wrap the call in a
try/except to catch the timeout exception (TimeoutError or the specific
exception thrown by langchain_experimental) and any other execution errors, set
execution_output to a clear timeout/error message, and log the incident via
logger.info/logger.error so the agent handles timeouts gracefully (refer to
PythonREPL.run, execution_output, and logger).
In `@app/core/memory/database_manager.py`:
- Around line 22-40: The current memory_database() falls back to
langgraph.checkpoint.memory.MemorySaver (ephemeral) which makes state
non-persistent for Streamlit and the compiled workflow; decide whether
persistence is intended and either (A) change the default fallback to a
persistent saver such as langgraph.checkpoint.sqlite.SqliteSaver (or SqliteSaver
from langgraph-checkpoint-sqlite) so memory_database() returns a persistent
checkpointer when MEMORY_DATABASE_MANAGER_CLASS is not set, or (B) make
persistence explicit via a new env var (e.g., MEMORY_PERSISTENT=true) and, when
set, instantiate the persistent class (SqliteSaver) otherwise keep MemorySaver;
update memory_database(), reference to MEMORY_DATABASE_MANAGER_CLASS,
MemorySaver and tools_database for consistency with tools default
(SqliteToolsDatabaseManager) so behavior is consistent across environments.
In `@app/core/tests/evaluation.py`:
- Around line 118-128: In evaluate_result, avoid using `_input.get("question")
or _input["messages"][0]["content"]` because an empty string will incorrectly
fall through and can cause KeyError/IndexError; instead explicitly check for
None/absence of the "question" key (e.g., `question is not None`) and only then
use it, otherwise validate that `_input` contains a non-empty "messages" list
and that `messages[0]["content"]` exists; if validation fails raise a clear
ValueError describing the missing field so callers of evaluate_result get a
meaningful error before constructing the HumanMessage and calling app.invoke.
- Around line 44-77: Narrow the broad except around client.read_dataset to only
catch the LangSmith "not found" error (e.g., LangSmithNotFoundError) instead of
Exception, preserve original failure causes when re-raising, and change the
FileNotFoundError raise to use "raise ... from None" to avoid chained context;
after parsing rows with json.loads into inputs/outputs (used by
client.create_examples and dataset.id), guard against calling
client.create_examples if inputs is empty (log and abort dataset creation) and
fix the double space in the dataset description ("MetaboT Benchmark") so the
created dataset uses the corrected description.
In `@app/tests/installation_test.py`:
- Around line 2-7: The import of main occurs before the REPO_ROOT sys.path
bootstrap, causing ModuleNotFoundError when running the test directly; move the
sys.path mutation (REPO_ROOT calculation and insertion into sys.path) above the
import and then import main (from app.core.main import main) after ensuring
REPO_ROOT is inserted, so adjust the file to compute REPO_ROOT, insert it into
sys.path if missing, and only then import main.
---
Outside diff comments:
In `@app/core/tests/evaluation.py`:
- Around line 141-143: The project_metadata currently hardcodes "gpt-4o"; update
it to reflect the actual default model by using the configured models map or the
new literal. Replace the literal with the value from the models dictionary
(e.g., project_metadata={"model": models["llm_o"]} or models.get("llm_o",
"gpt-5.4")) or directly set it to "gpt-5.4"; ensure the symbol models and key
"llm_o" are in scope where project_metadata is constructed so LangSmith runs are
tagged with the correct model.
---
Nitpick comments:
In `@app/core/agents/enpkg/tool_smiles.py`:
- Around line 4-39: The file imports BaseTool twice; remove the redundant import
and keep a single import of BaseTool (referenced by the class SMILESResolver) so
there is no duplicate import statement; ensure the remaining imports still
include BaseModel and Field from pydantic, Optional if used elsewhere,
CallbackManagerForToolRun if needed, and setup_logger, then run
linters/import-sort to confirm clean import ordering.
In `@app/core/memory/database_manager.py`:
- Around line 29-36: Remove the redundant inner guard around class_path in the
environment-driven dynamic import logic: the outer condition already checks
os.getenv("MEMORY_DATABASE_MANAGER_CLASS"), so delete the inner "if class_path:"
branch and directly perform module_name, class_name = class_path.rsplit('.', 1),
module = importlib.import_module(module_name), manager_class = getattr(module,
class_name) and return manager_class(); apply the same simplification to the
analogous tools_database block (the earlier block that checks the same env var)
to avoid the unnecessary implicit None return path.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: da086e6d-0f23-4e28-838b-a779bdfe6d75
⛔ Files ignored due to path filters (4)
app/config/params.iniis excluded by!**/*.iniapp/data/big_benchmark.csvis excluded by!**/*.csv,!**/*.csvapp/data/evaluation_dataset.csvis excluded by!**/*.csv,!**/*.csvenvironment.ymlis excluded by!**/*.yml
📒 Files selected for processing (18)
app/core/agents/enpkg/tool_chemicals.pyapp/core/agents/enpkg/tool_smiles.pyapp/core/agents/enpkg/tool_target.pyapp/core/agents/enpkg/tool_taxon.pyapp/core/agents/entry/tool_filesparser.pyapp/core/agents/interpreter/agent.pyapp/core/agents/interpreter/tool_interpreter.pyapp/core/agents/interpreter/tool_spectrum.pyapp/core/agents/sparql/tool_merge_result.pyapp/core/agents/sparql/tool_sparql.pyapp/core/agents/sparql/tool_wikidata_query.pyapp/core/agents/validator/tool_validator.pyapp/core/memory/custom_sqlite_file.pyapp/core/memory/database_manager.pyapp/core/tests/evaluation.pyapp/core/workflow/langraph_workflow.pyapp/tests/installation_test.pyrequirements.txt
💤 Files with no reviewable changes (1)
- app/core/memory/custom_sqlite_file.py
| def memory_database(): | ||
| """ | ||
| Get the database manager based on the environment variables | ||
|
|
||
| :return: The database manager | ||
| """ | ||
|
|
||
| if os.getenv('DATABASE_URL') and os.getenv("MEMORY_DATABASE_MANAGER_CLASS"): | ||
| class_path = os.getenv('MEMORY_DATABASE_MANAGER_CLASS') | ||
| if class_path: | ||
| # Dynamically import the module and class based on the environment variable | ||
| module_name, class_name = class_path.rsplit('.', 1) | ||
| module = importlib.import_module(module_name) | ||
| manager_class = getattr(module, class_name) | ||
| return manager_class() | ||
| else: | ||
| # Use langgraph's built-in MemorySaver which implements the current checkpoint API | ||
| from langgraph.checkpoint.memory import MemorySaver | ||
| return MemorySaver() No newline at end of file |
There was a problem hiding this comment.
Default memory is now ephemeral — confirm this is intentional for Streamlit/production.
Falling back to MemorySaver means LangGraph checkpoints live only in the current Python process. For the two call sites of memory_database():
streamlit_webapp/streamlit_app.pystores the saver inst.session_state.memory. Any server restart (or Streamlit worker recycle) will drop all conversation state silently — previously the SQLite checkpointer would persist across restarts.app/core/workflow/langraph_workflow.pyuses it as the compiled workflow's checkpointer, so thread state won't survive restarts either.
This also creates an inconsistency with tools_database() above, which still defaults to the persistent SqliteToolsDatabaseManager. If ephemeral memory is desired for dev, great; if not, consider defaulting to a persistent LangGraph checkpointer (e.g. SqliteSaver from langgraph-checkpoint-sqlite) to preserve the previous persistence guarantee.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@app/core/memory/database_manager.py` around lines 22 - 40, The current
memory_database() falls back to langgraph.checkpoint.memory.MemorySaver
(ephemeral) which makes state non-persistent for Streamlit and the compiled
workflow; decide whether persistence is intended and either (A) change the
default fallback to a persistent saver such as
langgraph.checkpoint.sqlite.SqliteSaver (or SqliteSaver from
langgraph-checkpoint-sqlite) so memory_database() returns a persistent
checkpointer when MEMORY_DATABASE_MANAGER_CLASS is not set, or (B) make
persistence explicit via a new env var (e.g., MEMORY_PERSISTENT=true) and, when
set, instantiate the persistent class (SqliteSaver) otherwise keep MemorySaver;
update memory_database(), reference to MEMORY_DATABASE_MANAGER_CLASS,
MemorySaver and tools_database for consistency with tools default
(SqliteToolsDatabaseManager) so behavior is consistent across environments.
| try: | ||
| client.read_dataset(dataset_name=dataset_name) | ||
| logger.info(f"Dataset '{dataset_name}' found in LangSmith workspace.") | ||
| except Exception: | ||
| logger.info(f"Dataset '{dataset_name}' not found. Attempting to create it from local file...") | ||
|
|
||
|
|
||
| if not os.path.exists(local_data_path): | ||
| raise FileNotFoundError( | ||
| f"Could not find '{local_data_path}'. Ensure the local dataset file is shared alongside this script.") | ||
|
|
||
|
|
||
| # Load local data and create the dataset in the user's LangSmith account | ||
| df = pd.read_csv(local_data_path) | ||
| dataset = client.create_dataset(dataset_name=dataset_name, description="MetaboT Benchmark") | ||
|
|
||
| inputs = [] | ||
| outputs = [] | ||
|
|
||
| for _, row in df.iterrows(): | ||
| try: | ||
| # The CSV stores the JSON arrays as strings, so we need to parse them back into Python objects | ||
| parsed_messages = json.loads(row["messages"]) | ||
| parsed_end_state = json.loads(row["__end__"]) | ||
|
|
||
| inputs.append({"messages": parsed_messages}) | ||
| outputs.append({"__end__": parsed_end_state}) | ||
|
|
||
| except json.JSONDecodeError as e: | ||
| logger.warning(f"Skipping a row due to JSON parsing error: {e}") | ||
| continue | ||
|
|
||
| client.create_examples(inputs=inputs, outputs=outputs, dataset_id=dataset.id) | ||
| logger.info(f"Successfully created dataset '{dataset_name}' in LangSmith.") |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
langsmith Python client exception class raised when read_dataset does not find a dataset
💡 Result:
The exception class raised by the LangSmith Python client when read_dataset does not find a dataset is LangSmithNotFoundError.
Citations:
🏁 Script executed:
find . -name "evaluation.py" -path "*/core/tests/*" | head -5Repository: HolobiomicsLab/MetaboT
Length of output: 97
🏁 Script executed:
wc -l $(find . -name "evaluation.py" -path "*/core/tests/*" | head -1)Repository: HolobiomicsLab/MetaboT
Length of output: 101
🏁 Script executed:
sed -n '40,80p' $(find . -name "evaluation.py" -path "*/core/tests/*" | head -1) | cat -nRepository: HolobiomicsLab/MetaboT
Length of output: 1972
🏁 Script executed:
sed -n '115,130p' $(find . -name "evaluation.py" -path "*/core/tests/*" | head -1) | cat -nRepository: HolobiomicsLab/MetaboT
Length of output: 632
🏁 Script executed:
sed -n '135,150p' $(find . -name "evaluation.py" -path "*/core/tests/*" | head -1) | cat -nRepository: HolobiomicsLab/MetaboT
Length of output: 508
🏁 Script executed:
grep -n "from langsmith" $(find . -name "evaluation.py" -path "*/core/tests/*" | head -1)Repository: HolobiomicsLab/MetaboT
Length of output: 165
🏁 Script executed:
python3 << 'EOF'
try:
from langsmith.utils import LangSmithNotFoundError
print("✓ LangSmithNotFoundError import from langsmith.utils: SUCCESS")
except ImportError as e:
print(f"✗ LangSmithNotFoundError import from langsmith.utils: FAILED - {e}")
try:
from langsmith import LangSmithNotFoundError
print("✓ LangSmithNotFoundError import from langsmith: SUCCESS")
except ImportError as e:
print(f"✗ LangSmithNotFoundError import from langsmith: FAILED - {e}")
EOFRepository: HolobiomicsLab/MetaboT
Length of output: 242
🏁 Script executed:
rg "LangSmithNotFoundError|langsmith.*Error" --type py | head -20Repository: HolobiomicsLab/MetaboT
Length of output: 48
🏁 Script executed:
rg "\.get\(.*\)\s+or\s+" --type py -A 2 | head -40Repository: HolobiomicsLab/MetaboT
Length of output: 410
🏁 Script executed:
rg "read_dataset|create_dataset" --type py -B 2 -A 5 | head -50Repository: HolobiomicsLab/MetaboT
Length of output: 1125
Dataset-bootstrap block: narrow the exception, preserve the cause, and guard against an all-bad CSV.
Several robustness concerns in this newly added bootstrap path:
-
Line 47 — blind
except Exception. Any error (network, auth, permissions) is silently treated as "dataset missing" and triggers a create attempt. Narrow this to the LangSmith "not found" case (e.g.,LangSmithNotFoundError) so real failures surface. (Ruff BLE001) -
Lines 52–53 — re-raise inside
except. Useraise ... from Noneto avoid the misleading "During handling of the above exception…" chain. (Ruff B904) -
Lines 60–76 — empty-examples edge case. If every row fails
json.loads,inputs/outputsstay empty andcreate_examplesis still invoked, leaving an empty dataset created in the user's workspace. Skip the call wheninputsis empty. -
Line 58 — minor: double space in
"MetaboT Benchmark".
🛠️ Proposed fix
+from langsmith import LangSmithNotFoundError
+
try:
client.read_dataset(dataset_name=dataset_name)
logger.info(f"Dataset '{dataset_name}' found in LangSmith workspace.")
-except Exception:
+except LangSmithNotFoundError:
logger.info(f"Dataset '{dataset_name}' not found. Attempting to create it from local file...")
if not os.path.exists(local_data_path):
raise FileNotFoundError(
- f"Could not find '{local_data_path}'. Ensure the local dataset file is shared alongside this script.")
+ f"Could not find '{local_data_path}'. Ensure the local dataset file is shared alongside this script."
+ ) from None
# Load local data and create the dataset in the user's LangSmith account
df = pd.read_csv(local_data_path)
- dataset = client.create_dataset(dataset_name=dataset_name, description="MetaboT Benchmark")
+ dataset = client.create_dataset(dataset_name=dataset_name, description="MetaboT Benchmark")
inputs = []
outputs = []
for _, row in df.iterrows():
try:
# The CSV stores the JSON arrays as strings, so we need to parse them back into Python objects
parsed_messages = json.loads(row["messages"])
parsed_end_state = json.loads(row["__end__"])
inputs.append({"messages": parsed_messages})
outputs.append({"__end__": parsed_end_state})
except json.JSONDecodeError as e:
logger.warning(f"Skipping a row due to JSON parsing error: {e}")
continue
+ if not inputs:
+ raise ValueError(
+ f"No valid examples parsed from '{local_data_path}'; aborting dataset creation."
+ )
+
client.create_examples(inputs=inputs, outputs=outputs, dataset_id=dataset.id)
logger.info(f"Successfully created dataset '{dataset_name}' in LangSmith.")🧰 Tools
🪛 Ruff (0.15.11)
[warning] 47-47: Do not catch blind exception: Exception
(BLE001)
[warning] 52-53: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@app/core/tests/evaluation.py` around lines 44 - 77, Narrow the broad except
around client.read_dataset to only catch the LangSmith "not found" error (e.g.,
LangSmithNotFoundError) instead of Exception, preserve original failure causes
when re-raising, and change the FileNotFoundError raise to use "raise ... from
None" to avoid chained context; after parsing rows with json.loads into
inputs/outputs (used by client.create_examples and dataset.id), guard against
calling client.create_examples if inputs is empty (log and abort dataset
creation) and fix the double space in the dataset description ("MetaboT
Benchmark") so the created dataset uses the corrected description.
| def evaluate_result(_input, thread_id: int = 1): | ||
| """ | ||
| Evaluate the result based on input. | ||
|
|
||
| Args: | ||
| _input (dict): Input containing messages to process | ||
|
|
||
| Returns: | ||
| dict: The evaluation result | ||
| """ | ||
|
|
||
| # Prepare the message | ||
| """Evaluate the result based on input.""" | ||
| # Note: Adjust the 'question' key below if your dataset uses a different input key | ||
| input_text = _input.get("question") or _input["messages"][0]["content"] | ||
|
|
||
| message = { | ||
| "messages": [ | ||
| HumanMessage(content=_input["messages"][0]["content"]) | ||
| ] | ||
| "messages": [HumanMessage(content=input_text)] | ||
| } | ||
|
|
||
|
|
||
| response = app.invoke(message, | ||
| {"configurable": {"thread_id": thread_id}},) | ||
|
|
||
| response = app.invoke(message, {"configurable": {"thread_id": thread_id}}) | ||
| return {"output": response} |
There was a problem hiding this comment.
Falsy question falls through to messages[0]["content"].
_input.get("question") or _input["messages"][0]["content"] also triggers the fallback when question is present but an empty string, which will then KeyError if messages isn't populated. Use an explicit None check, and guard the messages indexing so a malformed example raises a clear error rather than a cryptic KeyError/IndexError.
🛠️ Proposed fix
-def evaluate_result(_input, thread_id: int = 1):
- """Evaluate the result based on input."""
- # Note: Adjust the 'question' key below if your dataset uses a different input key
- input_text = _input.get("question") or _input["messages"][0]["content"]
-
- message = {
- "messages": [HumanMessage(content=input_text)]
- }
+def evaluate_result(_input, thread_id: int = 1):
+ """Evaluate the result based on input."""
+ question = _input.get("question")
+ if question is not None:
+ input_text = question
+ else:
+ messages = _input.get("messages") or []
+ if not messages or "content" not in messages[0]:
+ raise ValueError(f"Example is missing both 'question' and a usable 'messages' entry: {_input!r}")
+ input_text = messages[0]["content"]
+
+ message = {"messages": [HumanMessage(content=input_text)]}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@app/core/tests/evaluation.py` around lines 118 - 128, In evaluate_result,
avoid using `_input.get("question") or _input["messages"][0]["content"]` because
an empty string will incorrectly fall through and can cause KeyError/IndexError;
instead explicitly check for None/absence of the "question" key (e.g., `question
is not None`) and only then use it, otherwise validate that `_input` contains a
non-empty "messages" list and that `messages[0]["content"]` exists; if
validation fails raise a clear ValueError describing the missing field so
callers of evaluate_result get a meaningful error before constructing the
HumanMessage and calling app.invoke.
| from pathlib import Path | ||
| from app.core.main import main | ||
|
|
||
| REPO_ROOT = Path(__file__).resolve().parents[2] | ||
| if str(REPO_ROOT) not in sys.path: | ||
| sys.path.insert(0, str(REPO_ROOT)) |
There was a problem hiding this comment.
Import order defeats the sys.path bootstrap.
from app.core.main import main on line 3 runs before the REPO_ROOT insertion on lines 5–7. If the repo root isn't already on sys.path (e.g., running python app/tests/installation_test.py directly, which is the scenario this bootstrap is meant to support), the import will fail with ModuleNotFoundError and the path fix below is never reached. Perform the sys.path mutation before importing from app.
🛠️ Proposed fix
import sys
from pathlib import Path
-from app.core.main import main
-
-REPO_ROOT = Path(__file__).resolve().parents[2]
-if str(REPO_ROOT) not in sys.path:
- sys.path.insert(0, str(REPO_ROOT))
+
+REPO_ROOT = Path(__file__).resolve().parents[2]
+if str(REPO_ROOT) not in sys.path:
+ sys.path.insert(0, str(REPO_ROOT))
+
+from app.core.main import main # noqa: E402 (import after sys.path bootstrap)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@app/tests/installation_test.py` around lines 2 - 7, The import of main occurs
before the REPO_ROOT sys.path bootstrap, causing ModuleNotFoundError when
running the test directly; move the sys.path mutation (REPO_ROOT calculation and
insertion into sys.path) above the import and then import main (from
app.core.main import main) after ensuring REPO_ROOT is inserted, so adjust the
file to compute REPO_ROOT, insert it into sys.path if missing, and only then
import main.
Fix node name mismatch
…mode - Added Interpreter Agent functionality to execute user-generated code in a controlled environment. - Introduced security measures to restrict imports and filesystem access within the interpreter. - Configured trusted mode to enable or disable Interpreter Agent features based on environment variables. - Updated load_config to conditionally include Interpreter Agent based on trusted mode. - Enhanced Streamlit app to reflect trusted mode status and manage file uploads accordingly. - Created sandbox_runner for executing code with resource limits and timeout handling. - Added security module for managing trusted mode and session path validation. - Implemented tests for security features and configuration loading behavior. - Documented changes and security implications in the agentic engineering log.
Improve interpreter free-text filepath extraction to handle quoted paths, spaces, and dotted filenames, and track files updated in place as well as newly created outputs. Add focused tests for path parsing, DB fallback behavior, and changed-file detection.
- Introduced SessionFilePreparationError for better error management during file staging. - Enhanced _prepare_session_files function to handle various file-related errors. - Added unit tests for session file preparation, covering valid file copying and error scenarios. - Documented changes in the agentic engineering log.
There was a problem hiding this comment.
Actionable comments posted: 8
🧹 Nitpick comments (5)
streamlit_webapp/streamlit_app.py (2)
29-29: Duplicateimport os.
osis already imported at line 4 inside the session-bootstrap block. The re-import at line 29 is harmless but redundant.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@streamlit_webapp/streamlit_app.py` at line 29, Remove the redundant "import os" statement (the duplicate import) so only the original import in the session-bootstrap block remains; locate the second occurrence of the import statement (the extra "import os") and delete it, then run linting/formatting to confirm no unused-import warnings remain.
373-400: Style: drop== True/== Falsecomparisons (Ruff E712).Static analysis flags lines 373, 374, 390, and 400 for explicit boolean equality. Direct truthiness checks read better and align with the rest of the file:
♻️ Suggested change
-if st.session_state.openai_key_success == True and st.session_state.endpoint_url_success == True: - if st.session_state.langgraph_app_created == False: +if st.session_state.openai_key_success and st.session_state.endpoint_url_success: + if not st.session_state.langgraph_app_created: @@ - if st.session_state.trusted_mode_enabled == False: + if not st.session_state.trusted_mode_enabled: @@ - if st.session_state.trusted_mode_enabled == True: + if st.session_state.trusted_mode_enabled:🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@streamlit_webapp/streamlit_app.py` around lines 373 - 400, Replace explicit boolean equality checks like st.session_state.openai_key_success == True, st.session_state.endpoint_url_success == True, st.session_state.langgraph_app_created == False, and st.session_state.trusted_mode_enabled == False/== True with direct truthiness/negation checks (e.g., if st.session_state.openai_key_success:, if not st.session_state.langgraph_app_created:, if not st.session_state.trusted_mode_enabled:, if st.session_state.trusted_mode_enabled:) in the streamlit_app control flow around llm_creation and create_workflow so the logic is cleaner and Ruff E712 warnings are resolved.app/core/security.py (1)
22-38: Optional: handle invalid env var values gracefully.If any of
METABOT_INTERPRETER_TIMEOUT_SECONDS,METABOT_INTERPRETER_CPU_SECONDS,METABOT_INTERPRETER_MEMORY_MB, orMETABOT_INTERPRETER_MAX_FILE_MBare set to a non-integer (e.g.,"15s"),int(...)will raiseValueErrorat call-time, surfacing as a confusing error far from the configuration source. Consider wrapping the parse with a try/except that logs the offending var and falls back to the default.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/core/security.py` around lines 22 - 38, The parsing of environment variables in get_interpreter_timeout_seconds, get_interpreter_cpu_seconds, get_interpreter_memory_bytes, and get_interpreter_max_file_bytes can raise ValueError for non-integer values; update each function to catch parsing errors (ValueError) around int(...) calls, log the variable name and its invalid value (or use a provided logger), and fall back to the same default used now (e.g., "15" for timeout, use timeout value for CPU, "512" MB for memory, "64" MB for max file) before applying the min/max constraints and unit conversions so callers still get a valid int instead of an exception.app/core/agents/supervisor/prompt.py (1)
16-27: Note: Interpreter_agent is already absent from supervisor's enum when disabled.Defensive prompting is fine, but worth noting that when trusted mode is off,
Interpreter_agentis excluded frommembers(and hence from theroutefunction'senuminapp/core/agents/supervisor/agent.py:32). The model is structurally prevented from selecting it, so these prompt gates are belt-and-suspenders. Consider whether the multiple repetitions ("only when Interpreter_agent is available", etc.) actually help the model or just inflate the system prompt — a single paragraph plus the original instructions may suffice.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/core/agents/supervisor/prompt.py` around lines 16 - 27, The prompt repeats availability checks for Interpreter_agent which are redundant because Interpreter_agent is already removed from the supervisor members enum when disabled; simplify the wording to a single clear paragraph that: states only call Interpreter_agent when it appears in members/route enum, otherwise use SPARQL answer or state visualization is disabled; preserves behaviors around delegation from ENPKG_agent to Sparql_query_runner and when to mark FINISH; and references the relevant symbols Interpreter_agent, Sparql_query_runner, ENPKG_agent and the supervisor route/members enum so readers can locate and update the prompt in app/core/agents/supervisor/prompt.py and the route logic in app/core/agents/supervisor/agent.py.app/core/utils.py (1)
44-52: Fragile positional insertions — prefer semantic placement.Two positional assumptions baked in here:
- Line 45:
trusted_config["agents"].insert(3, interpreter_agent)assumes a specific ordering and at least 3 existing agents. Iflanggraph.jsonis reordered or shrinks, the interpreter agent ends up in an unexpected slot (or this raises unexpectedly when the list is shorter than 3 — actuallyinsertis forgiving, but ordering meaning is lost).- Line 52:
edge["targets"].insert(-1, interpreter_target)assumes the last target is alwaysFINISH/__end__. If a future edit reorders targets, the interpreter target gets placed before some other live target instead.Since insertion order doesn't matter for correctness (the routing dict in
langraph_workflow.pyis built from a comprehension),appendis safer:♻️ Suggested refactor
- if interpreter_agent not in trusted_config["agents"]: - trusted_config["agents"].insert(3, interpreter_agent) + if interpreter_agent not in trusted_config["agents"]: + trusted_config["agents"].append(interpreter_agent) @@ - for edge in trusted_config["conditional_edges"]: - if edge["source"] == "supervisor" and interpreter_target not in edge["targets"]: - edge["targets"].insert(-1, interpreter_target) + for edge in trusted_config["conditional_edges"]: + if edge["source"] == "supervisor" and interpreter_target not in edge["targets"]: + edge["targets"].append(interpreter_target)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/core/utils.py` around lines 44 - 52, The current code uses fragile positional inserts (trusted_config["agents"].insert(3, interpreter_agent) and edge["targets"].insert(-1, interpreter_target]) which rely on list ordering; change both to semantic appends so placement is not order-dependent: when adding the interpreter agent/edge/target in the block that checks trusted_config (symbols: trusted_config, interpreter_agent, interpreter_edge, interpreter_target, trusted_config["conditional_edges"], edge["targets"], trusted_config["agents"], trusted_config["edges"]) replace the positional inserts with .append(...) for agents/edges and append the interpreter_target to edge["targets"] instead of insert(-1) to ensure safe, order-independent addition.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/core/agents/interpreter/sandbox_runner.py`:
- Around line 49-63: The AST validation in validate_code currently only blocks
names but can be bypassed via attribute access; update validate_code to also
walk and reject ast.Attribute nodes where node.attr exists and node.attr starts
and ends with double underscores (node.attr.startswith("__") and
node.attr.endswith("__")), raising ValueError with a clear message (e.g.,
"Dunder attribute access is not allowed."); keep the existing
Import/ImportFrom/Name checks and add a brief comment in the function noting
this is defense-in-depth rather than a security boundary.
- Around line 248-258: The current try/finally around exec in execute_user_code
lets exceptions escape the stdout/stderr redirect so captured output is lost;
wrap the exec(...) call in a try/except BaseException block (inside the existing
redirect context) and on any exception return a structured result that includes
success: False, the stdout_buffer.getvalue(), stderr_buffer.getvalue(), and the
exception string (or traceback) so main() receives printed output even when user
code raises; keep the existing signal.alarm(0) in the finally block to clear the
timer.
In `@app/core/agents/interpreter/tool_interpreter.py`:
- Around line 203-214: The function extract_paths_from_db currently assumes
payload_json and its "output" field are dicts and can call .get directly, which
can raise AttributeError for malformed payloads; update extract_paths_from_db to
validate types before accessing keys: after json.loads(payload) check that
payload_json is a dict, retrieve output = payload_json.get("output") and if
isinstance(output, dict) return output.get("paths", []); if output is None or
not a dict, attempt to safely extract a top-level "paths" list (if
payload_json.get("paths") is a list) otherwise log a warning via logger.warning
about the unexpected payload shape (include tool_name and type info) and return
[] to fail softly; keep the existing JSONDecodeError handling and ensure no
AttributeErrors propagate.
- Around line 303-341: The result tempfile result_path is only unlinked on
timeout and after successful read, leaking it on other exceptions; modify the
try/finally around the subprocess invocation (the block that calls
subprocess.Popen and process.communicate) to ensure
result_path.unlink(missing_ok=True) runs in the same finally that currently
removes payload_path, and only skip deletion on the explicit success path after
you have read and processed the result file (e.g., set a local success flag or
delete conditionally after reading). Update references around subprocess.Popen,
process.communicate, payload_path.unlink, and any early-return branches so
result_path is always cleaned up on error paths.
- Around line 124-128: Wrap the llm.invoke call (where
SystemMessage/HumanMessage are passed) in a try/except to catch any exceptions
and log them, then return a user-friendly error string consistent with other
tool error paths; after a successful call, ensure response.content is normalized
to a str (e.g., if it's a list/dict/other, serialize or join into a string)
before passing it to extract_python_code so you avoid TypeError when
response.content is not a string. Keep references to llm.invoke, SystemMessage,
HumanMessage, response.content, and extract_python_code when implementing the
fix.
In `@app/core/agents/supervisor/prompt.py`:
- Line 33: Fix the typo "aproppriate" → "appropriate" in the supervisor prompt
text in app/core/agents/supervisor/prompt.py; locate the prompt string (e.g.,
the PROMPT_TEMPLATE or function build_supervisor_prompt / variable holding the
prompt) that contains the sentence about delegating interpretation to the
interpreter agent and correct the misspelling so the sentence reads "appropriate
agent".
In `@app/core/tests/test_interpreter_tool.py`:
- Around line 104-118: The test
test_find_changed_or_new_files_includes_overwritten_file is flaky on filesystems
with coarse mtime granularity because capture_session_file_metadata compares
(stat.st_mtime_ns, stat.st_size); ensure the file metadata actually changes by
modifying the test to either increase the sleep duration to >= 1 second between
writes or write content with different byte lengths (so stat.st_size differs)
before calling capture_session_file_metadata, referencing the test function name
and the methods capture_session_file_metadata and find_changed_or_new_files to
locate the code to update.
In `@streamlit_webapp/streamlit_app.py`:
- Around line 401-413: Sanitize client-supplied filenames before writing to
session_state.user_input_dir: do not trust file.name directly (from
st.file_uploader). Replace the direct join with a sanitized basename (e.g.,
Path(file.name).name or os.path.basename(file.name)) or pass the candidate path
through the existing resolve_session_path(..., must_exist=False) helper to
ensure path traversal is rejected, then open and write to that safe path; update
references around user_files loop where file_path is computed so uploads cannot
escape user_input_dir.
---
Nitpick comments:
In `@app/core/agents/supervisor/prompt.py`:
- Around line 16-27: The prompt repeats availability checks for
Interpreter_agent which are redundant because Interpreter_agent is already
removed from the supervisor members enum when disabled; simplify the wording to
a single clear paragraph that: states only call Interpreter_agent when it
appears in members/route enum, otherwise use SPARQL answer or state
visualization is disabled; preserves behaviors around delegation from
ENPKG_agent to Sparql_query_runner and when to mark FINISH; and references the
relevant symbols Interpreter_agent, Sparql_query_runner, ENPKG_agent and the
supervisor route/members enum so readers can locate and update the prompt in
app/core/agents/supervisor/prompt.py and the route logic in
app/core/agents/supervisor/agent.py.
In `@app/core/security.py`:
- Around line 22-38: The parsing of environment variables in
get_interpreter_timeout_seconds, get_interpreter_cpu_seconds,
get_interpreter_memory_bytes, and get_interpreter_max_file_bytes can raise
ValueError for non-integer values; update each function to catch parsing errors
(ValueError) around int(...) calls, log the variable name and its invalid value
(or use a provided logger), and fall back to the same default used now (e.g.,
"15" for timeout, use timeout value for CPU, "512" MB for memory, "64" MB for
max file) before applying the min/max constraints and unit conversions so
callers still get a valid int instead of an exception.
In `@app/core/utils.py`:
- Around line 44-52: The current code uses fragile positional inserts
(trusted_config["agents"].insert(3, interpreter_agent) and
edge["targets"].insert(-1, interpreter_target]) which rely on list ordering;
change both to semantic appends so placement is not order-dependent: when adding
the interpreter agent/edge/target in the block that checks trusted_config
(symbols: trusted_config, interpreter_agent, interpreter_edge,
interpreter_target, trusted_config["conditional_edges"], edge["targets"],
trusted_config["agents"], trusted_config["edges"]) replace the positional
inserts with .append(...) for agents/edges and append the interpreter_target to
edge["targets"] instead of insert(-1) to ensure safe, order-independent
addition.
In `@streamlit_webapp/streamlit_app.py`:
- Line 29: Remove the redundant "import os" statement (the duplicate import) so
only the original import in the session-bootstrap block remains; locate the
second occurrence of the import statement (the extra "import os") and delete it,
then run linting/formatting to confirm no unused-import warnings remain.
- Around line 373-400: Replace explicit boolean equality checks like
st.session_state.openai_key_success == True,
st.session_state.endpoint_url_success == True,
st.session_state.langgraph_app_created == False, and
st.session_state.trusted_mode_enabled == False/== True with direct
truthiness/negation checks (e.g., if st.session_state.openai_key_success:, if
not st.session_state.langgraph_app_created:, if not
st.session_state.trusted_mode_enabled:, if
st.session_state.trusted_mode_enabled:) in the streamlit_app control flow around
llm_creation and create_workflow so the logic is cleaner and Ruff E712 warnings
are resolved.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 2f10a01f-a2ee-46b8-931a-28b4590ac276
⛔ Files ignored due to path filters (1)
app/config/langgraph.jsonis excluded by!**/*.json
📒 Files selected for processing (11)
app/core/agents/agents_factory.pyapp/core/agents/interpreter/prompt.pyapp/core/agents/interpreter/sandbox_runner.pyapp/core/agents/interpreter/tool_interpreter.pyapp/core/agents/supervisor/agent.pyapp/core/agents/supervisor/prompt.pyapp/core/security.pyapp/core/tests/test_interpreter_tool.pyapp/core/tests/test_security.pyapp/core/utils.pystreamlit_webapp/streamlit_app.py
✅ Files skipped from review due to trivial changes (1)
- app/core/agents/interpreter/prompt.py
Improve interpreter sandbox validation and failure handling by blocking dunder attribute access, preserving captured output on execution errors, softening malformed DB payload parsing, and cleaning up subprocess tempfiles on all error paths. Also sanitize uploaded filenames in Streamlit, fix the supervisor prompt typo, and add focused regression tests for the sandbox runner and interpreter tool.
add CLI file input support for FILE_ANALYZER tool when running locally
|
@qodo-code-review |
…aunch instructions; add test for Streamlit app rendering
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (7)
app/core/tests/test_sparql_tool.py (2)
5-8: Minor:constructfallback will likely never hit on supported pydantic versions.The project is on pydantic v2 (langchain/openai stack in this PR), where
model_constructalways exists. Theconstructfallback is effectively dead code unless you explicitly support pydantic v1 in tests. Not blocking — just worth simplifying if v1 support isn't required.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/core/tests/test_sparql_tool.py` around lines 5 - 8, The helper _construct_tool contains a fallback to GraphSparqlQAChain.construct which is unnecessary on pydantic v2; simplify by removing the fallback and always using tool_sparql.GraphSparqlQAChain.model_construct(**kwargs) (update the _construct_tool function to call model_construct directly and remove the hasattr check and construct branch).
55-84: Optional: add coverage for theOPENAI_API_KEYenv fallback.
_construct_toolusesmodel_construct/constructwhich bypassesGraphSparqlQAChain.__init__, so theopenai_key or os.getenv("OPENAI_API_KEY")fallback attool_sparql.py:261is never exercised. Consider an additional test that instantiates via the real__init__(with mocked LLMChain) or at least pre-setsself.openai_key = Noneand monkeypatchesos.environto assert the env fallback feedsOpenAIEmbeddings.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/core/tests/test_sparql_tool.py` around lines 55 - 84, Add a new test that verifies the OPENAI_API_KEY env fallback by ensuring OpenAIEmbeddings receives the env value: instantiate the real GraphSparqlQAChain (or construct the same Tool instance but set its self.openai_key = None) instead of using _construct_tool so the __init__ path runs, monkeypatch os.environ["OPENAI_API_KEY"] to a test key, monkeypatch tool_sparql.OpenAIEmbeddings to capture the api_key, and call the search method (e.g., search_nodes) to assert captured["api_key"] equals the env value; reference GraphSparqlQAChain.__init__, _construct_tool, and tool_sparql.OpenAIEmbeddings to locate where to change the test.app/core/agents/sparql/tool_sparql.py (1)
237-261: Optional: clarify api_key fallback handlingThe current code safely falls back to the
OPENAI_API_KEYenvironment variable whenopenai_key=Noneis passed toOpenAIEmbeddings(). However, for better code clarity and to make the fallback intent explicit, consider only passingapi_keywhen a value is present:Suggested refactor
- embeddings = OpenAIEmbeddings(api_key=self.openai_key) + embeddings = ( + OpenAIEmbeddings(api_key=self.openai_key) + if self.openai_key + else OpenAIEmbeddings() + )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/core/agents/sparql/tool_sparql.py` around lines 237 - 261, The constructor currently sets self.openai_key = openai_key or os.getenv("OPENAI_API_KEY"); update usages so the OpenAIEmbeddings instantiation (and any other OpenAI client creation) only includes the api_key parameter when self.openai_key is truthy—i.e., pass api_key=self.openai_key conditionally instead of always passing api_key=None—so that the fallback to the environment variable remains explicit and avoids sending a None value; locate this in code paths that construct OpenAIEmbeddings or OpenAI clients referenced from this class (use the __init__ and self.openai_key symbols to find the relevant spots).app/core/tests/test_main.py (2)
52-56: Replaceassert False+try/exceptwithpytest.raises(Ruff B011).
assert Falseis stripped underpython -O, which silently turns these negative-path tests into pass-throughs (Ruff B011). The idiomatic pytest pattern ispytest.raises, which also lets you assert the message via thematch=regex and removes the manualtry/exceptboilerplate at lines 52‑56, 76‑80, and 92‑96.♻️ Proposed refactor (apply to all three negative-path tests)
+import pytest @@ - try: - main_module._prepare_session_files("session-id", [str(source_dir)]) - assert False, "Expected SessionFilePreparationError" - except main_module.SessionFilePreparationError as exc: - assert "not a file" in str(exc) + with pytest.raises(main_module.SessionFilePreparationError, match="not a file"): + main_module._prepare_session_files("session-id", [str(source_dir)]) @@ - try: - main_module._prepare_session_files("session-id", [str(first_file), str(second_file)]) - assert False, "Expected SessionFilePreparationError" - except main_module.SessionFilePreparationError as exc: - assert "would overwrite" in str(exc) + with pytest.raises(main_module.SessionFilePreparationError, match="would overwrite"): + main_module._prepare_session_files("session-id", [str(first_file), str(second_file)]) @@ - try: - main_module._prepare_session_files("session-id", [str(staged_file)]) - assert False, "Expected SessionFilePreparationError" - except main_module.SessionFilePreparationError as exc: - assert "already staged" in str(exc) + with pytest.raises(main_module.SessionFilePreparationError, match="already staged"): + main_module._prepare_session_files("session-id", [str(staged_file)])Also applies to: 76-80, 92-96
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/core/tests/test_main.py` around lines 52 - 56, Replace the try/except + assert False negative-path tests with pytest.raises: for each occurrence testing main_module._prepare_session_files that expects main_module.SessionFilePreparationError (including the other two blocks at lines 76-80 and 92-96), use pytest.raises(main_module.SessionFilePreparationError, match="not a file") to both assert the exception is raised and verify the message; remove the manual try/except and assert False boilerplate and import pytest if not already imported.
100-100: Redundantsys.argvsave/restore —monkeypatchalready handles cleanup.
monkeypatch.setattr(sys, "argv", ...)is automatically reverted at test teardown, so savingoriginal_argvand re-monkeypatching it insidefinallyis unnecessary. Callingmonkeypatch.setattrfrom afinallyblock also adds an undo entry that fires after the test exits — harmless here, but noisy. You can drop theoriginal_argvbookkeeping and thetry/finallywrappingmain_module.main()in both tests.♻️ Proposed cleanup
-def test_main_passes_cli_api_key_and_reconfigures_logger(monkeypatch): - original_argv = sys.argv[:] - state = {"session_initialized": False, "setup_logger_states": []} +def test_main_passes_cli_api_key_and_reconfigures_logger(monkeypatch): + state = {"session_initialized": False, "setup_logger_states": []} @@ - try: - main_module.main() - finally: - monkeypatch.setattr(sys, "argv", original_argv) + main_module.main() @@ -def test_main_prints_user_friendly_error_for_bad_staged_file(monkeypatch, capsys): - original_argv = sys.argv[:] - - monkeypatch.setattr(sys, "argv", ["prog", "-c", "hello", "-f", "/missing/file.csv"]) +def test_main_prints_user_friendly_error_for_bad_staged_file(monkeypatch, capsys): + monkeypatch.setattr(sys, "argv", ["prog", "-c", "hello", "-f", "/missing/file.csv"]) @@ - try: - main_module.main() - finally: - monkeypatch.setattr(sys, "argv", original_argv) + main_module.main()If you also drop
import sys(no longer referenced), the module gets slightly tidier.Also applies to: 134-137, 148-148, 166-169
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/core/tests/test_main.py` at line 100, Remove the redundant save/restore of sys.argv and the try/finally around main_module.main(): drop the original_argv = sys.argv[:] variable and the finally block that calls monkeypatch.setattr(sys, "argv", original_argv), since monkeypatch.setattr(sys, "argv", ...) already auto-reverts at teardown; update both tests that wrap main_module.main() (and the other occurrences referenced) to call main_module.main() directly under the monkeypatch, and remove the now-unused import sys from the test module.app/core/main.py (1)
10-10: MoveChatOpenAIimport tolangchain_openaipackage.
ChatOpenAIis deprecated inlangchain_community.chat_modelsand has moved tolangchain_openai(already in your dependencies at v0.3.35). Importing from the old path triggers aLangChainDeprecationWarning. Split the imports soChatOpenAIcomes fromlangchain_openaiandChatLiteLLMremains inlangchain_community.Proposed import change
-from langchain_community.chat_models import ChatOpenAI, ChatLiteLLM +from langchain_openai import ChatOpenAI +from langchain_community.chat_models import ChatLiteLLM🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/core/main.py` at line 10, Update the imports so ChatOpenAI is imported from langchain_openai while ChatLiteLLM remains imported from langchain_community.chat_models: locate the import line that currently references both ChatOpenAI and ChatLiteLLM and split it into two imports, using ChatOpenAI from the langchain_openai package and ChatLiteLLM from langchain_community.chat_models to eliminate the deprecation warning.app/core/tests/test_sandbox_runner.py (1)
6-11: Preferpytest.raisesoverassert False(Ruff B011).
assert Falseis stripped underpython -O, which would silently turn this test into a no-op. Usingpytest.raisesalso gives clearer failure output.♻️ Suggested change
+import pytest + def test_validate_code_rejects_dunder_attribute_access(): - try: - validate_code("x = object().__class__") - assert False, "Expected ValueError" - except ValueError as exc: - assert "Dunder attribute access" in str(exc) + with pytest.raises(ValueError, match="Dunder attribute access"): + validate_code("x = object().__class__")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/core/tests/test_sandbox_runner.py` around lines 6 - 11, The test test_validate_code_rejects_dunder_attribute_access uses a try/except with assert False which is removed under python -O; replace that pattern with pytest.raises(ValueError) as excinfo and assert the error message contains "Dunder attribute access" using str(excinfo.value). Update the test to import pytest if needed and call validate_code("x = object().__class__") inside the pytest.raises context to ensure the ValueError is properly asserted.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/core/tests/test_sandbox_runner.py`:
- Around line 14-35: The test calls execute_user_code directly which mutates
global process state (os.chdir, apply_resource_limits setting RLIMITs,
disable_network monkey-patching, and tempfile.tempdir) and therefore must
instead spawn the real runner as a separate process; update the test to invoke
the sandbox entrypoint (the same invocation path used by
tool_interpreter.execute_in_subprocess that runs python -I sandbox_runner.py
...) with the same payload, capture stdout/stderr and exit status, and assert
the captured output/error/traceback from the subprocess result rather than
calling execute_user_code in-process, ensuring no global state (disable_network,
tempfile.tempdir, RLIMITs, CWD) is modified in the pytest worker.
---
Nitpick comments:
In `@app/core/agents/sparql/tool_sparql.py`:
- Around line 237-261: The constructor currently sets self.openai_key =
openai_key or os.getenv("OPENAI_API_KEY"); update usages so the OpenAIEmbeddings
instantiation (and any other OpenAI client creation) only includes the api_key
parameter when self.openai_key is truthy—i.e., pass api_key=self.openai_key
conditionally instead of always passing api_key=None—so that the fallback to the
environment variable remains explicit and avoids sending a None value; locate
this in code paths that construct OpenAIEmbeddings or OpenAI clients referenced
from this class (use the __init__ and self.openai_key symbols to find the
relevant spots).
In `@app/core/main.py`:
- Line 10: Update the imports so ChatOpenAI is imported from langchain_openai
while ChatLiteLLM remains imported from langchain_community.chat_models: locate
the import line that currently references both ChatOpenAI and ChatLiteLLM and
split it into two imports, using ChatOpenAI from the langchain_openai package
and ChatLiteLLM from langchain_community.chat_models to eliminate the
deprecation warning.
In `@app/core/tests/test_main.py`:
- Around line 52-56: Replace the try/except + assert False negative-path tests
with pytest.raises: for each occurrence testing
main_module._prepare_session_files that expects
main_module.SessionFilePreparationError (including the other two blocks at lines
76-80 and 92-96), use pytest.raises(main_module.SessionFilePreparationError,
match="not a file") to both assert the exception is raised and verify the
message; remove the manual try/except and assert False boilerplate and import
pytest if not already imported.
- Line 100: Remove the redundant save/restore of sys.argv and the try/finally
around main_module.main(): drop the original_argv = sys.argv[:] variable and the
finally block that calls monkeypatch.setattr(sys, "argv", original_argv), since
monkeypatch.setattr(sys, "argv", ...) already auto-reverts at teardown; update
both tests that wrap main_module.main() (and the other occurrences referenced)
to call main_module.main() directly under the monkeypatch, and remove the
now-unused import sys from the test module.
In `@app/core/tests/test_sandbox_runner.py`:
- Around line 6-11: The test test_validate_code_rejects_dunder_attribute_access
uses a try/except with assert False which is removed under python -O; replace
that pattern with pytest.raises(ValueError) as excinfo and assert the error
message contains "Dunder attribute access" using str(excinfo.value). Update the
test to import pytest if needed and call validate_code("x = object().__class__")
inside the pytest.raises context to ensure the ValueError is properly asserted.
In `@app/core/tests/test_sparql_tool.py`:
- Around line 5-8: The helper _construct_tool contains a fallback to
GraphSparqlQAChain.construct which is unnecessary on pydantic v2; simplify by
removing the fallback and always using
tool_sparql.GraphSparqlQAChain.model_construct(**kwargs) (update the
_construct_tool function to call model_construct directly and remove the hasattr
check and construct branch).
- Around line 55-84: Add a new test that verifies the OPENAI_API_KEY env
fallback by ensuring OpenAIEmbeddings receives the env value: instantiate the
real GraphSparqlQAChain (or construct the same Tool instance but set its
self.openai_key = None) instead of using _construct_tool so the __init__ path
runs, monkeypatch os.environ["OPENAI_API_KEY"] to a test key, monkeypatch
tool_sparql.OpenAIEmbeddings to capture the api_key, and call the search method
(e.g., search_nodes) to assert captured["api_key"] equals the env value;
reference GraphSparqlQAChain.__init__, _construct_tool, and
tool_sparql.OpenAIEmbeddings to locate where to change the test.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 33cc3106-5fc7-4586-a12d-c4a9c3a2f897
📒 Files selected for processing (14)
README.mdapp/core/agents/interpreter/sandbox_runner.pyapp/core/agents/interpreter/tool_interpreter.pyapp/core/agents/sparql/agent.pyapp/core/agents/sparql/tool_sparql.pyapp/core/agents/supervisor/prompt.pyapp/core/main.pyapp/core/tests/test_interpreter_tool.pyapp/core/tests/test_main.pyapp/core/tests/test_sandbox_runner.pyapp/core/tests/test_sparql_tool.pystreamlit_webapp/README.mdstreamlit_webapp/streamlit_app.pystreamlit_webapp/test_streamlit_app.py
✅ Files skipped from review due to trivial changes (2)
- README.md
- streamlit_webapp/README.md
🚧 Files skipped from review as they are similar to previous changes (1)
- app/core/agents/supervisor/prompt.py
Stabilize Streamlit startup/import handling, add a render smoke test, and update the README with v1.1.0 launch guidance. Also harden follow-up test coverage by moving sandbox-runner execution checks into a subprocess, cleaning up pytest assertions, and refining SPARQL/OpenAI key handling to avoid env-only regressions and deprecation warnings.
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (7)
app/core/tests/test_sparql_tool.py (1)
42-47: Optional: relax dict-equality to a subset check.
assert captured["tool_kwargs"] == {...}will break the momentcreate_agentforwards an extra kwarg (e.g., ifllm_instanceever flows intotool_parameters). Consider asserting individual keys or using a subset comparison so this test stays focused on theopenai_keyplumbing it's actually verifying.♻️ Suggested adjustment
- assert captured["tool_kwargs"] == { - "graph": "graph", - "llm": {sparql_agent.MODEL_CHOICE: "model"}, - "session_id": "session-123", - "openai_key": "cli-key", - } + assert captured["tool_kwargs"]["graph"] == "graph" + assert captured["tool_kwargs"]["llm"] == {sparql_agent.MODEL_CHOICE: "model"} + assert captured["tool_kwargs"]["session_id"] == "session-123" + assert captured["tool_kwargs"]["openai_key"] == "cli-key"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/core/tests/test_sparql_tool.py` around lines 42 - 47, The test currently asserts exact dict equality on captured["tool_kwargs"], which is brittle; change the assertion to only check the relevant keys (e.g., assert captured["tool_kwargs"]["openai_key"] == "cli-key" and assert captured["tool_kwargs"]["session_id"] == "session-123") or use a subset comparison (e.g., for each key in {"openai_key","session_id","graph"} assert key in captured["tool_kwargs"] and values match) so create_agent forwarding extra kwargs (like llm_instance) won't break the test.app/core/agents/sparql/tool_sparql.py (1)
397-399: Optional: simplify by always forwardingapi_key=self.openai_key.
OpenAIEmbeddingsinlangchain_openaitreatsapi_key=Noneas "use environment variable", so the conditional kwargs is functionally equivalent to passing the argument directly:♻️ Optional simplification
- embedding_kwargs = {"api_key": self.openai_key} if self.openai_key else {} - embeddings = OpenAIEmbeddings(**embedding_kwargs) + embeddings = OpenAIEmbeddings(api_key=self.openai_key)Keep the current form if you prefer being explicit about the conditional logic.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/core/agents/sparql/tool_sparql.py` around lines 397 - 399, The code currently builds embedding_kwargs conditionally before instantiating OpenAIEmbeddings and loading FAISS; simplify by passing api_key=self.openai_key directly to OpenAIEmbeddings (replace the embedding_kwargs + OpenAIEmbeddings(**embedding_kwargs) pattern with OpenAIEmbeddings(api_key=self.openai_key)), leaving FAISS.load_local(db_path, embeddings, allow_dangerous_deserialization=True) unchanged; update references to embedding_kwargs/embeddings accordingly and remove the now-unneeded embedding_kwargs variable.README.md (1)
109-116: Reconcile the two Streamlit launch recipes.Lines 109-116 advertise
python -m streamlit run streamlit_webapp/streamlit_app.pyfrom the repo root as the recommended/tested workflow, but lines 171-179 still document the legacyexport PYTHONPATH="$(pwd):${PYTHONPATH}"; streamlit run ...recipe. Now thatstreamlit_app.pyinserts the repo root intosys.pathitself, thePYTHONPATHworkaround is obsolete and risks confusing readers. Consider deleting the lines 175-178 block (or merging it into the v1.1.0-aligned one) so there is a single source of truth.Also applies to: 171-179
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@README.md` around lines 109 - 116, Remove the legacy PYTHONPATH launch recipe and keep a single, up-to-date Streamlit instruction: delete the export PYTHONPATH block (the legacy recipe that uses export PYTHONPATH="$(pwd):${PYTHONPATH}"; streamlit run ...) and ensure README only documents the repo-root invocation using python -m streamlit run streamlit_webapp/streamlit_app.py, since streamlit_app.py already injects the repo root into sys.path; alternatively, if you prefer consolidation, merge the legacy text into the v1.1.0-aligned paragraph and note that contributor/admin keys can still be set via environment variables.app/core/main.py (2)
347-357: Stage/validate files before initializing LLMs to fail fast.
llm_creation(...)constructsChatOpenAI/ChatLiteLLMclients (and validates API key configuration) before_prepare_session_filesruns. If the user passes a bad-fpath, they currently pay the full LLM-init cost (and may get a confusing key-related error first if the env is misconfigured) before hitting the file-staging error. Reordering so that staging happens immediately after argument parsing — or at minimum beforellm_creation— gives cleaner, faster CLI feedback.♻️ Proposed reorder
- # Initialize language models - models = llm_creation(api_key=args.api_key) - - # Stage user-provided files into the session's input directory if args.file: try: _prepare_session_files(session_id, args.file) except SessionFilePreparationError as exc: logger.error(str(exc)) print(f"Error: {exc}") return + + # Initialize language models + models = llm_creation(api_key=args.api_key)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/core/main.py` around lines 347 - 357, The LLM clients are being initialized by llm_creation(...) before staging/validating user files, which can cause costly/ confusing failures; move the _prepare_session_files(session_id, args.file) call (and its SessionFilePreparationError handling) to run immediately after argument parsing and before calling llm_creation(...), so the CLI validates and fails fast on bad -f paths; ensure args.file is checked the same way and that models = llm_creation(api_key=args.api_key) only executes after successful _prepare_session_files.
214-278: Solid staging logic; one optional hardening suggestion.The validation chain (exists → is_file → basename collision → already-staged → destination conflict → copy with error wrapping) is well-ordered and the test suite covers the main branches.
Path.nameplusresolve(strict=False)correctly neutralizes..traversal attempts in user-supplied paths.Optional:
shutil.copy2follows symlinks on the source by default, which matches typical staging semantics, but if you want to refuse symlinked inputs (defense in depth, given the staged files are later read by the interpreter agent in trusted mode), you could addif src.is_symlink(): raise SessionFilePreparationError(...)before the copy. Not blocking.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/core/main.py` around lines 214 - 278, Add a defensive check to refuse symlinked inputs in _prepare_session_files: after resolving src and validating existence/is_file, if src.is_symlink() raise SessionFilePreparationError(src, f"Input file is a symlink and is not allowed: {src}") (place this before the shutil.copy2 call and before the src == dest.resolve(...) check so symlinks are rejected early); this prevents staging symlinked sources even though shutil.copy2 would follow them.app/core/tests/test_main.py (2)
91-132: LGTM — verifies the new--api-keyandsession_idplumbing.Capturing both
llm_api_keyandworkflow_api_keyensures the CLI flag flows intollm_creationandcreate_workflow(matchingmain.py:348andmain.py:362-365). Thesetup_logger_states == [True]assertion neatly enforces that logger reconfiguration happens afterinitialize_session_context, which is the whole point of the global rebinding atmain.py:334-335.One small caveat to keep in mind: the
create_user_sessionmock (line 119) omits theuser_session_dirparameter that exists on the real signature — fine today, but ifmain()ever starts requestinguser_session_dir=True, this lambda will raiseTypeError. Consider**kwargsfor resilience.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/core/tests/test_main.py` around lines 91 - 132, The test's mock for create_user_session in test_main_passes_cli_api_key_and_reconfigures_logger does not match the real signature (it omits user_session_dir), risking TypeError if main() passes that kwarg; update the monkeypatched mock (the lambda passed to main_module.create_user_session) to accept the same optional parameters or use a catch-all like session_id=None, **kwargs and return "session-123" so it remains resilient if main() starts supplying user_session_dir or other kwargs.
143-150: Optional: replace the(_ for _ in ()).throw(...)idiom with a small helper.The generator-throw trick works but is hard to read at a glance. A named helper makes intent obvious and lets future maintainers grep for it.
♻️ Suggested refactor
+ def _raise_missing(session_id, file_paths): + raise main_module.SessionFilePreparationError( + Path(file_paths[0]), f"File not found: {file_paths[0]}" + ) + - monkeypatch.setattr( - main_module, - "_prepare_session_files", - lambda session_id, file_paths: (_ for _ in ()).throw( - main_module.SessionFilePreparationError(Path(file_paths[0]), f"File not found: {file_paths[0]}") - ), - ) - monkeypatch.setattr(main_module, "create_workflow", lambda *args, **kwargs: (_ for _ in ()).throw(AssertionError("workflow should not run"))) + def _fail_create_workflow(*args, **kwargs): + raise AssertionError("workflow should not run") + + monkeypatch.setattr(main_module, "_prepare_session_files", _raise_missing) + monkeypatch.setattr(main_module, "create_workflow", _fail_create_workflow)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/core/tests/test_main.py` around lines 143 - 150, The test uses the hard-to-read "(_ for _ in ()).throw(...)" trick when monkeypatching _prepare_session_files and create_workflow; add a small named helper (e.g., raise_when_called or raising) that takes an exception and returns a callable which raises that exception when invoked, then replace the generator-throw usages with calls to that helper for main_module._prepare_session_files (raising SessionFilePreparationError(...)) and main_module.create_workflow (raising AssertionError("workflow should not run")) so intent is clear and easy to grep.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/core/tests/test_sandbox_runner.py`:
- Around line 43-61: The subprocess.run invocation that calls the sandbox runner
(the call with sys.executable and str(Path(sandbox_runner.__file__).resolve())
using payload_path and result_path, executed in cwd=session_dir) needs an outer
timeout to avoid hanging the pytest worker if the sandbox's internal timeout
fails; add a timeout parameter (e.g., timeout=8 or a few seconds greater than
the sandbox's timeout_seconds: 5) to the subprocess.run call and ensure the test
treats subprocess.TimeoutExpired as a failure so the test fails fast rather than
hanging.
In `@README.md`:
- Line 32: Remove the stray Git merge-conflict marker by deleting the literal
line "=======" from README.md (the unresolved conflict delimiter), then verify
surrounding content for any leftover conflict markers like "<<<<<<<" or
">>>>>>>" and ensure the README renders correctly; commit the cleaned file with
a clear message indicating the merge conflict was resolved.
In `@streamlit_webapp/README.md`:
- Line 121: The README currently contains a literal placeholder "[Include
support contact information here]"; remove that placeholder or replace it with
the real support channel (e.g., a GitHub Issues URL, maintainer email, or
mailing list) so the published README doesn't show an unfilled template; update
the README line accordingly or delete it if no public contact is intended.
---
Nitpick comments:
In `@app/core/agents/sparql/tool_sparql.py`:
- Around line 397-399: The code currently builds embedding_kwargs conditionally
before instantiating OpenAIEmbeddings and loading FAISS; simplify by passing
api_key=self.openai_key directly to OpenAIEmbeddings (replace the
embedding_kwargs + OpenAIEmbeddings(**embedding_kwargs) pattern with
OpenAIEmbeddings(api_key=self.openai_key)), leaving FAISS.load_local(db_path,
embeddings, allow_dangerous_deserialization=True) unchanged; update references
to embedding_kwargs/embeddings accordingly and remove the now-unneeded
embedding_kwargs variable.
In `@app/core/main.py`:
- Around line 347-357: The LLM clients are being initialized by
llm_creation(...) before staging/validating user files, which can cause costly/
confusing failures; move the _prepare_session_files(session_id, args.file) call
(and its SessionFilePreparationError handling) to run immediately after argument
parsing and before calling llm_creation(...), so the CLI validates and fails
fast on bad -f paths; ensure args.file is checked the same way and that models =
llm_creation(api_key=args.api_key) only executes after successful
_prepare_session_files.
- Around line 214-278: Add a defensive check to refuse symlinked inputs in
_prepare_session_files: after resolving src and validating existence/is_file, if
src.is_symlink() raise SessionFilePreparationError(src, f"Input file is a
symlink and is not allowed: {src}") (place this before the shutil.copy2 call and
before the src == dest.resolve(...) check so symlinks are rejected early); this
prevents staging symlinked sources even though shutil.copy2 would follow them.
In `@app/core/tests/test_main.py`:
- Around line 91-132: The test's mock for create_user_session in
test_main_passes_cli_api_key_and_reconfigures_logger does not match the real
signature (it omits user_session_dir), risking TypeError if main() passes that
kwarg; update the monkeypatched mock (the lambda passed to
main_module.create_user_session) to accept the same optional parameters or use a
catch-all like session_id=None, **kwargs and return "session-123" so it remains
resilient if main() starts supplying user_session_dir or other kwargs.
- Around line 143-150: The test uses the hard-to-read "(_ for _ in
()).throw(...)" trick when monkeypatching _prepare_session_files and
create_workflow; add a small named helper (e.g., raise_when_called or raising)
that takes an exception and returns a callable which raises that exception when
invoked, then replace the generator-throw usages with calls to that helper for
main_module._prepare_session_files (raising SessionFilePreparationError(...))
and main_module.create_workflow (raising AssertionError("workflow should not
run")) so intent is clear and easy to grep.
In `@app/core/tests/test_sparql_tool.py`:
- Around line 42-47: The test currently asserts exact dict equality on
captured["tool_kwargs"], which is brittle; change the assertion to only check
the relevant keys (e.g., assert captured["tool_kwargs"]["openai_key"] ==
"cli-key" and assert captured["tool_kwargs"]["session_id"] == "session-123") or
use a subset comparison (e.g., for each key in
{"openai_key","session_id","graph"} assert key in captured["tool_kwargs"] and
values match) so create_agent forwarding extra kwargs (like llm_instance) won't
break the test.
In `@README.md`:
- Around line 109-116: Remove the legacy PYTHONPATH launch recipe and keep a
single, up-to-date Streamlit instruction: delete the export PYTHONPATH block
(the legacy recipe that uses export PYTHONPATH="$(pwd):${PYTHONPATH}"; streamlit
run ...) and ensure README only documents the repo-root invocation using python
-m streamlit run streamlit_webapp/streamlit_app.py, since streamlit_app.py
already injects the repo root into sys.path; alternatively, if you prefer
consolidation, merge the legacy text into the v1.1.0-aligned paragraph and note
that contributor/admin keys can still be set via environment variables.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: e7ef6a79-580d-4f04-93d1-fd8a636ff49f
📒 Files selected for processing (7)
README.mdapp/core/agents/sparql/tool_sparql.pyapp/core/main.pyapp/core/tests/test_main.pyapp/core/tests/test_sandbox_runner.pyapp/core/tests/test_sparql_tool.pystreamlit_webapp/README.md
Add how to run evaluation.py (LangSmith api key is needed)
User description
MetaboT v1.1 0 Update dependencies, replace code interpreter, someother improvements
PR Type
enhancement, bug_fix, dependencies, tests
Description
Replace
codeinterpreterapiwithPythonREPLfor local executionUpdate dependencies including
langchain,openai,pydantic, andnumpyFix deprecated
pydanticimports and update requirementsAdd
big_benchmark.csvfor automated evaluationDiagram Walkthrough
File Walkthrough
14 files
Update `pydantic` import and type annotationsUpdate `pydantic` import and type annotationsUpdate `pydantic` import and type annotationsUpdate `pydantic` import and type annotationsAdd `FileAnalyzerInput` schema and update importsPass `llm_instance` to tool constructorReplace `codeinterpreterapi` with `PythonREPL`Update `pydantic` import and type annotationsUpdate `pydantic` import and type annotationsUpdate `pydantic` import and type annotationsUpdate `pydantic` import and type annotationsUpdate `pydantic` import and type annotationsUpdate memory database manager to use `MemorySaver`Improve error logging with exception info1 files
Remove unused custom SQLite checkpoint file2 files
Add automated evaluation setup and checksUpdate installation test setup1 files
Update default OpenAI models to gpt-5.41 files
Update and add dependencies for new features