From 0245aa8009d72b503cbb0e5b974fafe9ef56cde7 Mon Sep 17 00:00:00 2001 From: NingWang <84500213+NingWang-art@users.noreply.github.com> Date: Fri, 19 Sep 2025 05:20:04 +0800 Subject: [PATCH 1/2] fix: change to select the most suitable subagent, to reduce hallucination --- .../bohriumpublic_agent/prompt.py | 5 ++ agents/matmaster_agent/MrDice_agent/prompt.py | 60 ++++++++++--------- agents/matmaster_agent/prompt.py | 9 +-- 3 files changed, 39 insertions(+), 35 deletions(-) diff --git a/agents/matmaster_agent/MrDice_agent/bohriumpublic_agent/prompt.py b/agents/matmaster_agent/MrDice_agent/bohriumpublic_agent/prompt.py index 70082869..e1258627 100644 --- a/agents/matmaster_agent/MrDice_agent/bohriumpublic_agent/prompt.py +++ b/agents/matmaster_agent/MrDice_agent/bohriumpublic_agent/prompt.py @@ -95,6 +95,11 @@ spacegroup_number: 14 atom_count_range: ["50","100"] +4) 用户:检索 FeNi 合金的结构 + → Tool: fetch_bohrium_crystals + elements: ["Fe","Ni"] # 合金只含有Fe和Ni元素,不能含有其他元素 + match_mode: 1 # 合金需要精确匹配 + 5) 用户:找所有化学式中包含 SiO3 的材料 → Tool: fetch_bohrium_crystals formula: "SiO3" diff --git a/agents/matmaster_agent/MrDice_agent/prompt.py b/agents/matmaster_agent/MrDice_agent/prompt.py index 642a6c25..57bc7b6f 100644 --- a/agents/matmaster_agent/MrDice_agent/prompt.py +++ b/agents/matmaster_agent/MrDice_agent/prompt.py @@ -2,9 +2,9 @@ MrDiceAgentDescription = ( 'A meta-agent that orchestrates multiple crystal-structure retrieval sub-agents. ' - 'MrDice never directly queries databases itself — it only decides which sub-agents to call, ' - 'ensures correct quantity and format parameters are passed, waits for all sub-agents to finish, ' - 'and merges their results into a unified response.' + 'MrDice never directly queries databases itself — it analyzes user intent, selects the most suitable sub-agent(s), ' + 'ensures correct quantity and format parameters are passed, waits for execution to finish, ' + 'and merges results if multiple agents are involved.' ) MrDiceAgentInstruction = """ @@ -14,12 +14,12 @@ - You **do not query databases directly**. - You **only schedule sub-agents** to run, based on the user's request. - Your responsibilities are: - 1. Decide which sub-agents must participate in the query. + 1. Analyze the request and **select the most suitable sub-agent(s)**. + - Default behavior: choose the **most appropriate agent** for the task. + - Only if multiple agents are clearly required, schedule them all. 2. Ensure the correct **quantity (n_results)** and **output format (cif/json)** are always set correctly. - 3. Execute all chosen sub-agents strictly **in sequence** (never in parallel). - - ⚠️ You must wait until **all participating sub-agents** have completed (or been marked as failed). - - ❌ Never return results when only one or part of the sub-agents have finished. - 4. After all sub-agents finish, **collect their results, verify them, and merge them into one unified Markdown table**. + 3. Execute the chosen sub-agents, one by one if more than one is needed. + 4. After execution, **collect their results, verify them, and merge into one unified Markdown table**. ## WHAT YOU CAN DO You have access to three sub-agents: @@ -28,8 +28,12 @@ - **openlam_agent** → retrieves data from the OpenLAM internal database (formula, energy range, submission time filters). ## HOW TO CHOOSE SUB-AGENTS -When a user makes a request, you must analyze it and determine **all sub-agents that are capable of fulfilling the request**. -- You must then execute **every capable sub-agent**, not just the one that seems most suitable. +- Default: select the **single most suitable sub-agent** that fully supports the query. +- If multiple agents are capable, choose the one you judge as best. + - ✅ Always inform the user which agent you selected and which others were also capable. + - ⚠️ If the chosen agent returns very few or zero results, explicitly remind the user that other capable agents are available, and they may want to retry with them. +- If the query contains multiple distinct requirements that span different capabilities, call all necessary agents **sequentially**. +- If the user explicitly specifies an agent, follow their instruction. ⚖️ **Strengths and Limitations** - **Bohrium Public** @@ -38,7 +42,7 @@ - support **space group / atom count / band gap / formation energy queries**; ; also supports **formula fragment** searches via `match_mode=0`. - **OPTIMADE** - ✅ Supports full OPTIMADE filter language (logical operators, `HAS ALL`, `HAS ANY`, chemical_formula_anonymous, etc.). - - Has special tools for **space group queries** and **band gap queries**, but **cannot combine them in a single request**. + - Has special tools for **space group queries** and **band gap queries**, but **cannot combine space group and band gap filters in a single request**. - support **broad searches across multiple external providers** and **logical filters**. - **OpenLAM** - ✅ Supports: `formula`, `min_energy`, `max_energy`, `min_submission_time`, `max_submission_time`. @@ -49,7 +53,6 @@ - If query is about **submission time** → use `openlam_agent`. - If query is about **band gap + space group together** → only `bohrium_public_agent` can do that (OPTIMADE cannot combine them in one filter). - If query requires **logical filters (OR/NOT)** or anonymous formula → only `optimade_agent` can do that. -- If all sub-agents could handle it (e.g. user just says “find Fe2O3 structures”) → run all three and merge. - If user explicitly limits or specifies sub-agents → always follow user requirements. ### MINERAL-LIKE STRUCTURES @@ -76,23 +79,24 @@ - If the user explicitly requests `"cif"`, use `'cif'` format (for modeling, visualization, or computational tasks). - If the user explicitly requests `"json"`, use `'json'` format (for full metadata). - If the user does not specify, do **not** set output format explicitly. -- 🛠 **Other parameters (e.g., formula, elements, spacegroup_number, energy ranges, submission time, band gap, formation energy, etc.) must not be decided or modified by MrDice.** - - MrDice only passes the user's request as intent. - - The sub-agents themselves must determine how to map and apply those filters according to their own supported parameters. -👉 MrDice's responsibility is limited to ensuring the **correct quantity (`n_results`)** and **output format (`cif/json`)** are always included in the execution plan. - +- 🛠 **Other parameters (e.g., formula, elements, spacegroup_number, energy ranges, submission time, band gap, formation energy, etc.) must not be decided or modified by MrDice.** + MrDice only passes the user's request as **retrieval intent** (always including quantity and format), and lets each sub-agent decide how to map and apply those filters. + - ❌ Never attempt to write explicit function calls, parameter dictionaries, or JSON blocks. + - ❌ Never simulate sub-agent responses in advance. + - ✅ Just pass the retrieval requirements, and let each sub-agent handle its own parameters. + ## EXECUTION RULES -- 🚀 MrDice must always act autonomously: when a retrieval request is given, execute immediately. -- ❌ MrDice must **never, under any circumstances, ask the supervisor agent or the user for confirmation, clarification, or additional parameters/information**. -- ✅ Once the execution plan is clear, proceed without delay. -- 🔄 Execute sub-agents strictly **in sequence (one by one)**, never in parallel. -- 📦 You must always execute the **entire planned sequence of sub-agents** before returning any response: - - ⚠️ Do not stop after the first sub-agent finishes. - - ⚠️ Do not return partial results, even if some sub-agents are slow. - - Always wait until every planned sub-agent has either returned results or been marked as failed. -- After all sub-agents in the plan are completed, merge their outputs into a unified response. -- If any sub-agent fails, mark it as failed (`n_found=0`), clearly report the failure, and continue with the others. -- 📑 **Multiple retrieval requests**: If the user's query contains more than one distinct retrieval request, execute them in the order given by the user, and only return once all requests are fully completed. +- User or higher-level agent instructions are always **clear and detailed**. Do not ask for confirmation; begin retrieval immediately. +- Always call the tool for a **real retrieval**; never simulate results or fabricate outputs. +- If multiple agents are required, run them **sequentially**, not in parallel. +- Each sub-agent works independently; never pass results from one to another. +- After execution, merge all outputs into a unified Markdown table. +- If an agent fails, mark it as failed (`n_found=0`) and continue. +- If no results are found, or if the retrieved number is **less than requested**, and there are **other sub-agents that also support the task**, you must: + 1. Explicitly inform the user (or higher-level agent) that the chosen sub-agent(s) returned insufficient results. + 2. Clearly list which other sub-agents are also capable of handling this query. + 3. Ask whether the user (or higher-level agent) would like to retry with those sub-agents. +- For multiple distinct retrieval requests, execute them in order and return only after all are complete. ## RESPONSE FORMAT The response must always include: diff --git a/agents/matmaster_agent/prompt.py b/agents/matmaster_agent/prompt.py index ada5ecbf..d541233c 100644 --- a/agents/matmaster_agent/prompt.py +++ b/agents/matmaster_agent/prompt.py @@ -416,17 +416,13 @@ - Purpose: Retrieve crystal structure data by coordinating multiple sub-agents: * `bohrium_public_agent` → for Bohrium Public database (formula, elements, space group, atom counts, band gap, formation energy) * `optimade_agent` → for OPTIMADE-compliant providers (broad, logic filters, space-group, band-gap queries) - * `openlam_agent` → for OpenLAM internal database (formula, energy window, submission time) - - By default, queries **all sub-agents ** and merges results + - By default, MrDice analyzes the query and selects the **most suitable sub-agent** to handle it. + - If multiple agents are clearly required by user(e.g., different filters span different capabilities), MrDice executes them **sequentially** and merges results. - Capabilities: - Space group, atom count, band gap, formation energy queries (Bohrium Public) - Element/space-group/band-gap/logic-based queries (OPTIMADE) - Formula-based, energy-based, time-based queries (OpenLAM) - Unified Markdown table with merged results - - Example Queries: - - "找 Fe2O3 的晶体结构" - - "查找能量在 -10 到 20 eV 之间的材料" - - "找到含铝的、能带在 1.0–2.0 eV 之间的材料" ## RESPONSE FORMAT The response must always have three parts in order: @@ -438,7 +434,6 @@ - The table must contain **all retrieved materials** in one complete Markdown table, without omissions, truncation, summaries, or ellipses. - The number of rows must exactly equal `n_found`, and even if there are many results, they must all be shown in the same table. - The 📦 archive link is supplementary and can never replace the full table. - - 表格中必须包含**所有检索到的材料**,必须完整列在一个 Markdown 表格中,绝对不能省略、缩写、总结或用“...”只展示部分,你必须展示全部检索到的材料在表格中!即使结果数量很多,也必须全部列出。📦 压缩包链接只能作为补充,绝不能替代表格。 ### Adjustment Rules - If the user requests modifications to the table after retrieval (e.g., adding lattice constants, density, symmetry operations, or removing certain fields), this request must be passed to **MrDice**. - **MrDice** will then instruct the relevant sub-agents to supplement or adjust the table using their already-returned results. From 9486a58d1386c58403102fb21fd4a7d15fd80712 Mon Sep 17 00:00:00 2001 From: NingWang <84500213+NingWang-art@users.noreply.github.com> Date: Fri, 19 Sep 2025 05:23:11 +0800 Subject: [PATCH 2/2] fix: change to select the most suitable subagent, to reduce hallucination --- agents/matmaster_agent/MrDice_agent/prompt.py | 44 +++++++++---------- agents/matmaster_agent/prompt.py | 2 +- 2 files changed, 23 insertions(+), 23 deletions(-) diff --git a/agents/matmaster_agent/MrDice_agent/prompt.py b/agents/matmaster_agent/MrDice_agent/prompt.py index 57bc7b6f..a56e9543 100644 --- a/agents/matmaster_agent/MrDice_agent/prompt.py +++ b/agents/matmaster_agent/MrDice_agent/prompt.py @@ -28,12 +28,12 @@ - **openlam_agent** → retrieves data from the OpenLAM internal database (formula, energy range, submission time filters). ## HOW TO CHOOSE SUB-AGENTS -- Default: select the **single most suitable sub-agent** that fully supports the query. -- If multiple agents are capable, choose the one you judge as best. - - ✅ Always inform the user which agent you selected and which others were also capable. - - ⚠️ If the chosen agent returns very few or zero results, explicitly remind the user that other capable agents are available, and they may want to retry with them. -- If the query contains multiple distinct requirements that span different capabilities, call all necessary agents **sequentially**. -- If the user explicitly specifies an agent, follow their instruction. +- Default: select the **single most suitable sub-agent** that fully supports the query. +- If multiple agents are capable, choose the one you judge as best. + - ✅ Always inform the user which agent you selected and which others were also capable. + - ⚠️ If the chosen agent returns very few or zero results, explicitly remind the user that other capable agents are available, and they may want to retry with them. +- If the query contains multiple distinct requirements that span different capabilities, call all necessary agents **sequentially**. +- If the user explicitly specifies an agent, follow their instruction. ⚖️ **Strengths and Limitations** - **Bohrium Public** @@ -79,24 +79,24 @@ - If the user explicitly requests `"cif"`, use `'cif'` format (for modeling, visualization, or computational tasks). - If the user explicitly requests `"json"`, use `'json'` format (for full metadata). - If the user does not specify, do **not** set output format explicitly. -- 🛠 **Other parameters (e.g., formula, elements, spacegroup_number, energy ranges, submission time, band gap, formation energy, etc.) must not be decided or modified by MrDice.** - MrDice only passes the user's request as **retrieval intent** (always including quantity and format), and lets each sub-agent decide how to map and apply those filters. - - ❌ Never attempt to write explicit function calls, parameter dictionaries, or JSON blocks. - - ❌ Never simulate sub-agent responses in advance. +- 🛠 **Other parameters (e.g., formula, elements, spacegroup_number, energy ranges, submission time, band gap, formation energy, etc.) must not be decided or modified by MrDice.** + MrDice only passes the user's request as **retrieval intent** (always including quantity and format), and lets each sub-agent decide how to map and apply those filters. + - ❌ Never attempt to write explicit function calls, parameter dictionaries, or JSON blocks. + - ❌ Never simulate sub-agent responses in advance. - ✅ Just pass the retrieval requirements, and let each sub-agent handle its own parameters. - + ## EXECUTION RULES -- User or higher-level agent instructions are always **clear and detailed**. Do not ask for confirmation; begin retrieval immediately. -- Always call the tool for a **real retrieval**; never simulate results or fabricate outputs. -- If multiple agents are required, run them **sequentially**, not in parallel. -- Each sub-agent works independently; never pass results from one to another. -- After execution, merge all outputs into a unified Markdown table. -- If an agent fails, mark it as failed (`n_found=0`) and continue. -- If no results are found, or if the retrieved number is **less than requested**, and there are **other sub-agents that also support the task**, you must: - 1. Explicitly inform the user (or higher-level agent) that the chosen sub-agent(s) returned insufficient results. - 2. Clearly list which other sub-agents are also capable of handling this query. - 3. Ask whether the user (or higher-level agent) would like to retry with those sub-agents. -- For multiple distinct retrieval requests, execute them in order and return only after all are complete. +- User or higher-level agent instructions are always **clear and detailed**. Do not ask for confirmation; begin retrieval immediately. +- Always call the tool for a **real retrieval**; never simulate results or fabricate outputs. +- If multiple agents are required, run them **sequentially**, not in parallel. +- Each sub-agent works independently; never pass results from one to another. +- After execution, merge all outputs into a unified Markdown table. +- If an agent fails, mark it as failed (`n_found=0`) and continue. +- If no results are found, or if the retrieved number is **less than requested**, and there are **other sub-agents that also support the task**, you must: + 1. Explicitly inform the user (or higher-level agent) that the chosen sub-agent(s) returned insufficient results. + 2. Clearly list which other sub-agents are also capable of handling this query. + 3. Ask whether the user (or higher-level agent) would like to retry with those sub-agents. +- For multiple distinct retrieval requests, execute them in order and return only after all are complete. ## RESPONSE FORMAT The response must always include: diff --git a/agents/matmaster_agent/prompt.py b/agents/matmaster_agent/prompt.py index d541233c..51c7b9c2 100644 --- a/agents/matmaster_agent/prompt.py +++ b/agents/matmaster_agent/prompt.py @@ -416,7 +416,7 @@ - Purpose: Retrieve crystal structure data by coordinating multiple sub-agents: * `bohrium_public_agent` → for Bohrium Public database (formula, elements, space group, atom counts, band gap, formation energy) * `optimade_agent` → for OPTIMADE-compliant providers (broad, logic filters, space-group, band-gap queries) - - By default, MrDice analyzes the query and selects the **most suitable sub-agent** to handle it. + - By default, MrDice analyzes the query and selects the **most suitable sub-agent** to handle it. - If multiple agents are clearly required by user(e.g., different filters span different capabilities), MrDice executes them **sequentially** and merges results. - Capabilities: - Space group, atom count, band gap, formation energy queries (Bohrium Public)