
Feat/sear xng tool integration#953

Open
illuminate97 wants to merge 13 commits into main from feat/searXNG-tool-integration
Collaborator

@illuminate97 illuminate97 commented May 8, 2026

Summary

What changed?

Added a new InternetSearch tool for the core service.

  • Searches via the configured SearXNG instance.
  • Returns sourced results with title, URL, and snippet.
  • Adds INTERNET_SEARCH configuration support.
  • Registers the tool only when INTERNET_SEARCH.SEARXNG_URL is configured.
  • Configured the stack example/local config for https://searxng-test.muenchen.de/.
  • Added focused unit tests for URL building, placeholder handling, result formatting, and tool metadata.
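
The URL building and result formatting listed above might look roughly like this. It is a minimal sketch, not the code from this PR: `build_search_url` and `format_results` are hypothetical helper names, and the result keys (`title`, `url`, `content`) are assumed to match the JSON that the configured SearXNG instance returns.

```python
from urllib.parse import urlencode, urljoin


def build_search_url(base_url: str, query: str, language: str = "de", safesearch: int = 1) -> str:
    """Build a SearXNG JSON search URL (hypothetical helper, not the PR's code)."""
    # Normalise the base so urljoin appends "search" instead of replacing a path segment.
    base = base_url.rstrip("/") + "/"
    params = {"q": query, "format": "json", "language": language, "safesearch": safesearch}
    return urljoin(base, "search") + "?" + urlencode(params)


def format_results(results: list[dict], max_results: int = 5) -> str:
    """Render results as a numbered list with title, URL, and snippet."""
    lines = []
    for index, item in enumerate(results[:max_results], start=1):
        title = item.get("title") or "(no title)"
        url = item.get("url") or ""
        snippet = (item.get("content") or "").strip()
        lines.append(f"{index}. {title}\n   {url}\n   {snippet}")
    return "\n".join(lines) if lines else "No results found."
```

Keeping both helpers free of network access is what makes them easy to cover with focused unit tests.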

Why

Why is this change needed?

The assistant currently has no built-in way to retrieve current external information. This tool enables internet search through the organization-controlled SearXNG instance, allowing answers to include up-to-date sourced information without relying only on model training data.

Validation

How was this verified?

  • Automated tests
  • Manual verification
  • Not applicable

Verified with:

  • uv run ruff check app/agent/tools/internet_search.py app/agent/tools/tools.py app/config/settings.py tests/unit/test_internet_search_tool.py
  • uv run pytest tests/unit

Result: 38 passed, 5 skipped.

Live endpoint verification was not completed because the SearXNG instance is only reachable from the corporate network.

UI Changes (If applicable)

Please add screenshots or screencasts of any visual changes.

No direct UI changes. The tool appears through the existing tool discovery/selection mechanism once configured.

Change Areas

  • Frontend
  • Core service
  • Assistant service
  • Migrations / DB
  • Infrastructure / Docker / Compose / Keycloak
  • CI / CD
  • Docs

Risks / Rollout Notes

Is there anything reviewers or operators should pay attention to?
Examples: breaking changes, config changes, migration order, deployment considerations, rollback concerns.

  • New optional config section: INTERNET_SEARCH.
  • The tool is only registered when INTERNET_SEARCH.SEARXNG_URL is set.
  • The configured SearXNG instance must expose JSON search via /search?format=json.
  • The configured test endpoint is only reachable from the corporate network.
  • Search results depend on SearXNG availability, network access, and enabled engines.
  • No database migration required.
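
The "only registered when INTERNET_SEARCH.SEARXNG_URL is set" rule can be illustrated as follows. This is a sketch under assumptions: `is_search_configured` and `collect_tools` are hypothetical names, tools are represented by plain name strings, and treating angle-bracket placeholders as "not configured" is an assumption about what the placeholder handling does.

```python
from typing import Optional


def is_search_configured(searxng_url: Optional[str]) -> bool:
    """Treat empty values and angle-bracket placeholders as 'not configured'."""
    url = (searxng_url or "").strip()
    if not url or (url.startswith("<") and url.endswith(">")):
        return False
    return url.startswith(("http://", "https://"))


def collect_tools(searxng_url: Optional[str], base_tools: list[str]) -> list[str]:
    """Return tool names, adding InternetSearch only when a usable URL is set."""
    tools = list(base_tools)
    if is_search_configured(searxng_url):
        tools.append("InternetSearch")
    return tools
```

With this shape, an operator who leaves the setting empty gets exactly the previous tool list, which keeps the rollout opt-in.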

References

Related: #
Closes: #

Summary by CodeRabbit

  • New Features

    • Internet search (SearXNG-backed) added as an agent tool with streaming support
    • Tools listing endpoint gains a force_reload option to refresh tool metadata
  • Improvements

    • More robust tool loader with improved cache/lock handling, partial-result caching, and per-source failure tracking
    • Tool collection now conditionally includes internet search and refines agent state selection
  • Tests

    • Unit tests added for the internet search tool
  • Chores

    • Added internet search settings and updated config examples

Review Change Stack

Contributor

coderabbitai Bot commented May 8, 2026


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: eb84ee59-e104-4b77-ba60-f1babcc4ec9e

📥 Commits

Reviewing files that changed from the base of the PR and between f4e9b61 and cc5e3f7.

📒 Files selected for processing (4)
  • mucgpt-core-service/app/agent/prompt_pool/atlassian_scope_router.md
  • mucgpt-core-service/app/agent/prompt_pool/default_instructions.md
  • mucgpt-core-service/app/agent/tools/internet_search.py
  • mucgpt-core-service/tests/unit/test_internet_search_tool.py
📝 Walkthrough

Walkthrough

Adds a SearXNG-backed InternetSearch tool with Pydantic config and cached accessor, integrates it into the tool collection and API (force_reload), refactors MCP tool loading/caching with bounded cache-warmup and partial-result short-TTL, and adds unit tests and config examples.

Changes

Internet Search Feature and Tool System Enhancements

| Layer | File(s) | Summary |
|---|---|---|
| Internet Search Configuration & Models | `mucgpt-core-service/app/config/settings.py` | InternetSearchConfig model with SearXNG parameters, MCPToolDescription model, MCPConfig.FORCE_RELOAD flag, and get_internet_search_settings() cached accessor. |
| Internet Search Core Implementation | `mucgpt-core-service/app/agent/tools/internet_search.py` | Configuration validation, result formatting, core internet_search() with HTTP request/error handling and optional LangGraph streaming, and make_internet_search_tool() LangChain factory with MCP metadata. |
| Internet Search Unit Tests | `mucgpt-core-service/tests/unit/test_internet_search_tool.py` | Mocks (FakeClient, FakeResponse), tests for unconfigured placeholder handling, HTTP request parameter assertions, returned formatting, and tool metadata default (mcp_group: "default"). |
| MCP Tool Loading: Force Reload & Caching Strategy | `mucgpt-core-service/app/agent/tools/mcp.py` | Adds force_reload to load_mcp_tools(), centralizes fetching, tracks per-source failures, caches partial results with short TTL on partial failure, and waits (bounded retries) for cache warm-up on lock contention rather than returning empty. |
| Tool Collection: Internet Search Integration | `mucgpt-core-service/app/agent/tools/tools.py` | Conditionally constructs and surfaces InternetSearch, list_tool_metadata() accepts force_reload, localized metadata entries for InternetSearch, and MCP loading gets force_reload. |
| API Endpoint & Configuration Examples | `mucgpt-core-service/app/api/routers/tools_router.py`, `mucgpt-core-service/config.yaml.example`, `stack/core.config.yaml.example` | list_tools endpoint adds force_reload query param; example configs include INTERNET_SEARCH block (SEARXNG URL, timeout, max results, language, safesearch). |
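
The MCP loading strategy summarized above (force reload, per-source failure tracking, short TTL for partial results) can be sketched as below. This is an illustration only: `ToolCache`, `load_tools`, and both TTL values are hypothetical, tools are plain strings, and the lock handling and bounded warm-up wait are left out.

```python
import time
from typing import Callable, Optional

FULL_TTL_SECONDS = 300.0    # hypothetical value for a complete load
PARTIAL_TTL_SECONDS = 15.0  # hypothetical value when some source failed


class ToolCache:
    """Single-entry cache whose lifetime depends on whether the load was complete."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._tools: Optional[list[str]] = None
        self._expires_at = 0.0

    def get(self) -> Optional[list[str]]:
        if self._tools is not None and self._clock() < self._expires_at:
            return list(self._tools)
        return None

    def store(self, tools: list[str], failed_sources: list[str]) -> None:
        # Partial results expire quickly so the failed sources are retried soon.
        ttl = PARTIAL_TTL_SECONDS if failed_sources else FULL_TTL_SECONDS
        self._tools = list(tools)
        self._expires_at = self._clock() + ttl


def load_tools(
    cache: ToolCache,
    fetchers: dict[str, Callable[[], list[str]]],
    force_reload: bool = False,
) -> tuple[list[str], list[str]]:
    """Return (tools, failed_sources); force_reload bypasses the cache."""
    if not force_reload:
        cached = cache.get()
        if cached is not None:
            return cached, []
    tools: list[str] = []
    failed: list[str] = []
    for source, fetch in fetchers.items():
        try:
            tools.extend(fetch())
        except Exception:
            failed.append(source)  # one broken source must not hide the others
    cache.store(tools, failed)
    return tools, failed
```

Injecting the clock keeps the expiry logic testable without sleeping.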

Sequence Diagram

sequenceDiagram
  participant Client
  participant internet_search as internet_search()
  participant SearXNG as SearXNG_HTTP
  participant Writer as StreamWriter
  Client->>internet_search: call with query, max_results, language
  internet_search->>Writer: emit STARTED (if writer present)
  internet_search->>SearXNG: GET /search with params (format=json, q, language, safesearch)
  SearXNG-->>internet_search: JSON results
  internet_search->>internet_search: parse, filter, format numbered results
  internet_search->>Writer: emit ENDED (if writer present)
  internet_search->>Client: return formatted string
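
The flow in the diagram can be approximated in plain Python as follows. This is a hedged sketch rather than the PR's implementation: the HTTP call is replaced by an injected `fetch_json` callable, the event dictionaries passed to `writer` are a made-up shape, and emitting ENDED in a `finally` block is a design choice of this sketch.

```python
from typing import Any, Callable, Optional


def internet_search(
    query: str,
    fetch_json: Callable[[dict[str, Any]], dict[str, Any]],
    writer: Optional[Callable[[dict[str, str]], None]] = None,
    max_results: int = 5,
    language: str = "de",
    safesearch: int = 1,
) -> str:
    """Follow the diagram: STARTED event, GET /search, format, ENDED event."""
    if writer:
        writer({"tool": "InternetSearch", "state": "STARTED"})
    try:
        params = {"format": "json", "q": query, "language": language, "safesearch": safesearch}
        payload = fetch_json(params)  # stands in for the HTTP GET against /search
        # Drop entries without a URL, since they cannot be cited as a source.
        results = [r for r in payload.get("results", []) if r.get("url")][:max_results]
        lines = [
            f"{i}. {r.get('title', '')}\n   {r['url']}\n   {r.get('content', '')}"
            for i, r in enumerate(results, start=1)
        ]
        return "\n".join(lines) if lines else "No results found."
    finally:
        # Emit ENDED even when the request fails, so a stream is not left hanging.
        if writer:
            writer({"tool": "InternetSearch", "state": "ENDED"})
```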

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hopped through configs and code so neat,

SEARXNG traces the web for crumbs to eat,
Cache waits a heartbeat, tools line the way,
Results neatly formatted, ready to relay,
A rabbit's small cheer for a feature complete.

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 22.39% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Title check | ❓ Inconclusive | The title 'Feat/sear xng tool integration' is partially related to the changeset but contains a formatting issue and lacks clarity about the primary change being a SearXNG-backed InternetSearch tool. | Clarify the title to be more specific and professional, e.g., 'Add InternetSearch tool with SearXNG integration' or 'Integrate SearXNG for internet search capabilities'. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description check | ✅ Passed | The pull request description is well-structured, follows the template, and comprehensively covers all major sections including Summary, Why, Validation, Change Areas, Risks, and References. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/searXNG-tool-integration

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 18

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@mucgpt-core-service/app/agent/agent_executor.py`:
- Line 143: The code is injecting a hardcoded "agent_state": {"current_scope":
"general"} into the request payload for both streaming and non-streaming paths,
which makes scope routing never run in the non-streaming (graph-bypassed) path
and creates a false impression of scope awareness; update the logic around where
the payload/dict is assembled (the place that currently adds the "agent_state"
key in agent_executor.py and the duplicate occurrence later) to only include
agent_state when the streaming/graph path is taken (or remove it entirely for
non-streaming), so non-streaming requests do not carry a misleading scope and
scope routing behavior remains correct.
- Around line 276-279: The non-streaming branch is bypassing the
MUCGPTReActAgent graph and therefore skips tools, data-source injection, scope
routing and ContextMiddleware; replace the raw LLM sync call (the block using
self.agent.model.with_config(config).invoke(msgs)) with a synchronous graph call
so the full ReAct loop runs (use self.agent.graph.invoke or the agent's sync
invocation API with the same config and msgs), ensuring enabled_tools,
data_sources, and agent_state/context middleware are honored during
non-streaming execution.

In `@mucgpt-core-service/app/agent/middleware.py`:
- Around line 216-221: The code is currently echoing raw exception text into the
ToolMessage (using str(exc)), which can leak sensitive info; update the two
spots where ToolMessage is constructed after exceptions (the block using
logger.exception(...) and returning ToolMessage(...
tool_call_id=request.tool_call["id"]) around the logger.exception call and the
similar block at lines 232-237) to: keep logger.exception(...) as-is to record
full details, but change the ToolMessage content to a generic error string
(e.g., "Tool execution failed" or "An internal error occurred while executing
the tool") without including exc or any exception details, while still
preserving tool_call id from request.tool_call["id"].
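
The middleware finding above amounts to logging the full exception while returning only a generic message to the model. A minimal sketch, assuming a stand-in `ToolMessage` dataclass (the real class comes from LangChain) and a hypothetical `run_tool_safely` wrapper:

```python
import logging
from dataclasses import dataclass
from typing import Any, Callable

logger = logging.getLogger(__name__)

GENERIC_TOOL_ERROR = "Tool execution failed"


@dataclass
class ToolMessage:
    """Stand-in for LangChain's ToolMessage so the sketch runs without dependencies."""

    content: str
    tool_call_id: str


def run_tool_safely(tool_call: dict[str, Any], handler: Callable[[dict[str, Any]], str]) -> ToolMessage:
    """Run a tool handler and never echo exception text back to the caller."""
    try:
        return ToolMessage(content=handler(tool_call), tool_call_id=tool_call["id"])
    except Exception:
        # Full details, including the traceback, go to the log only.
        logger.exception("Tool %s failed", tool_call.get("name"))
        return ToolMessage(content=GENERIC_TOOL_ERROR, tool_call_id=tool_call["id"])
```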

In `@mucgpt-core-service/app/agent/prompt_pool/atlassian_scope_router.md`:
- Around line 14-15: The classifier instruction string in
atlassian_scope_router.md contains a typo "fokus" that should be corrected to
"focus"; update the sentence "Important: fokus on the latest scope of the
conversation..." to "Important: focus on the latest scope of the
conversation..." so the user-facing agent instruction is correct and consistent.

In `@mucgpt-core-service/app/agent/prompt_pool/default_instructions.md`:
- Line 6: Fix the typo in the default instruction string "Responde in the same
language as the request." by changing "Responde" to "Respond" so the line reads
"Respond in the same language as the request." Update the text inside
default_instructions.md where that exact sentence appears to ensure generated
instructions use the correct spelling.

In `@mucgpt-core-service/app/agent/react_agent.py`:
- Around line 143-145: The code reads messages[0].content without ensuring
messages is non-empty in _prepare_run; update the logic where system_prompt is
set (the block using DEFAULT_INSTRUCTIONS and messages[0].content) and the
similar block at the later location so you first check that messages is
truthy/len(messages) > 0 before accessing messages[0]; if messages is empty,
treat the system_prompt branch as None (or the safe default) to avoid
IndexError. Ensure you reference _prepare_run, the messages variable, and
DEFAULT_INSTRUCTIONS when making the change.
- Around line 69-73: The _select_tools method currently returns an empty list
when enabled_tools is None, dropping all tools; change it so that when
enabled_tools is None it returns the full toolset (e.g., self.tools or a shallow
copy) instead of [], and keep the existing filtering behavior when enabled_tools
is provided; update the method _select_tools in react_agent.py to check for None
specifically and return the agent's tools attribute in that case so callers that
omit enabled_tools retain InternetSearch/MCP access.
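
Both react_agent.py findings above come down to distinguishing "missing" from "empty". The sketch below shows one way to do that, assuming tools are identified by name strings and messages are plain dictionaries; `select_tools` and `first_message_content` are hypothetical stand-ins for the logic in `_select_tools` and `_prepare_run`.

```python
from typing import Optional, Sequence


def select_tools(all_tools: Sequence[str], enabled_tools: Optional[Sequence[str]]) -> list[str]:
    """None means 'caller did not restrict tools'; an empty list means 'no tools'."""
    if enabled_tools is None:
        return list(all_tools)
    enabled = set(enabled_tools)
    return [tool for tool in all_tools if tool in enabled]


def first_message_content(messages: Sequence[dict]) -> Optional[str]:
    """Return the first message's content, or None when there are no messages."""
    if not messages:
        return None
    return messages[0].get("content")
```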

In `@mucgpt-core-service/app/agent/state_models/atlassian_state.py`:
- Around line 11-14: The four required fields (current_scope, locked_scope,
initial_scope_checked, scope_confidence) in the Atlassian state model are
causing ValidationError when partial updates are supplied by
AtlassianScopePolicy; make them safe by providing sensible defaults: set
current_scope and locked_scope default to "general", initial_scope_checked
default to False, and scope_confidence default to 0.0 on the model (the
attributes current_scope, locked_scope, initial_scope_checked, scope_confidence)
so partial construction succeeds.

In `@mucgpt-core-service/app/agent/state_models/default_state.py`:
- Line 13: AgentState currently declares data_sources as a required field which
causes TypedDict validation failures; change the declaration of data_sources in
the AgentState TypedDict to be optional by wrapping it with
typing_extensions.NotRequired (or typing.NotRequired if available) so the field
is not required in all state dicts—update the symbol data_sources in the
AgentState definition in default_state.py to use NotRequired to fix the
TypedDict violation.
- Line 3: Update the import so it uses LangChain's public re-export: replace the
current import of AgentState from langchain.agents.middleware with importing
AgentState from langchain.agents (i.e., ensure the top of default_state.py
imports AgentState via "from langchain.agents import AgentState") to follow
LangChain 1.2.x standard conventions and avoid fragile internal import paths.

In `@mucgpt-core-service/app/agent/tools/internet_search.py`:
- Around line 171-172: The InternetSearch tool is incorrectly placed in the
Atlassian tool group via internet_search_tool.metadata = {"mcp_group":
"atlassian"}; remove or change that metadata so InternetSearch is not assigned
to the "atlassian" group (e.g., delete the metadata assignment or set a neutral
group) to prevent it from opt‑ing into AtlassianAgentState and the Atlassian
scope router.

In `@mucgpt-core-service/app/agent/tools/policies.py`:
- Around line 132-139: _tool_scope currently reads metadata["mcp_scope"] but the
loader (McpLoader) and make_internet_search_tool() populate
metadata["mcp_group"], so update _tool_scope() to read the same key(s): first
check metadata.get("mcp_group") and if missing fallback to
metadata.get("mcp_scope"); normalize the value (str(...).strip().lower()) and
return it when it matches the allowed set {"jira","confluence"} so tools grouped
via mcp_group are correctly scope-filtered.
- Around line 165-167: The current logger.info(f"Request: {request}") prints the
entire ModelRequest (including conversation text and injected docs); replace
that full-request logging with a minimal summary: log the message count (e.g.,
len(request.messages)) and any relevant state flags such as whether injected
documents are present (e.g., bool(getattr(request, "documents", None)) or a
similar field), and remove or redact any direct conversation/content fields;
keep the surrounding flow (the early return and the existing logger.info("No
messages in request; skipping initial scope check.") and return request) intact.
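
The two policies.py findings above can be illustrated together. This is a sketch under assumptions: `tool_scope` mirrors the suggested `mcp_group` then `mcp_scope` lookup, while `summarize_request` and its `has_documents` flag are hypothetical and only show what a content-free log line might contain.

```python
from typing import Any, Mapping, Optional

ALLOWED_SCOPES = {"jira", "confluence"}


def tool_scope(metadata: Optional[Mapping[str, Any]]) -> Optional[str]:
    """Read the scope from mcp_group first, then fall back to mcp_scope."""
    if not metadata:
        return None
    raw = metadata.get("mcp_group") or metadata.get("mcp_scope")
    if raw is None:
        return None
    scope = str(raw).strip().lower()
    return scope if scope in ALLOWED_SCOPES else None


def summarize_request(messages: list, has_documents: bool) -> str:
    """Build a log line that carries counts and flags but no conversation text."""
    return f"messages={len(messages)} has_documents={has_documents}"
```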

In `@mucgpt-core-service/app/agent/tools/tools.py`:
- Around line 37-44: The function select_agent_state_schema collects tool
metadata groups but doesn't handle the case where no tools have metadata,
causing list(tool_groups)[0] to crash; update select_agent_state_schema to first
check if tool_groups is empty and immediately return DefaultAgentState when so,
otherwise preserve the existing logic that returns DefaultAgentState when
multiple groups exist or looks up AGENT_STATE_SCHEMA_REGISTRY for the single
group key.
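
The empty-group guard described above could be written as follows. The sketch assumes metadata is passed as plain mappings and that the registry maps a group name to a state class; the two state classes are empty placeholders, and the registry contents are illustrative only.

```python
from typing import Any, Iterable, Mapping, Optional


class DefaultAgentState(dict):
    """Placeholder for the real default state schema."""


class AtlassianAgentState(dict):
    """Placeholder for the real Atlassian state schema."""


# Illustrative registry contents; the real mapping lives in the state_models package.
AGENT_STATE_SCHEMA_REGISTRY = {"atlassian": AtlassianAgentState}


def select_agent_state_schema(tool_metadata: Iterable[Optional[Mapping[str, Any]]]) -> type:
    """Pick a state schema; fall back to the default for zero or multiple groups."""
    groups = {meta["mcp_group"] for meta in tool_metadata if meta and meta.get("mcp_group")}
    if len(groups) != 1:
        # Covers both the empty case (no tool has metadata) and mixed groups.
        return DefaultAgentState
    (group,) = groups
    return AGENT_STATE_SCHEMA_REGISTRY.get(group, DefaultAgentState)
```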

In `@mucgpt-core-service/config.yaml.example`:
- Around line 76-85: The INTERNET_SEARCH example block is active by default and
points to a corporate SearXNG URL; update the example so the entire
INTERNET_SEARCH block (keys SEARXNG_URL, TIMEOUT, MAX_RESULTS, LANGUAGE,
SAFESEARCH) is commented out and add a short inline note that this feature is
optional and only enabled when SEARXNG_URL is set (or via
MUCGPT_CORE_INTERNET_SEARCH__SEARXNG_URL env var). Locate the INTERNET_SEARCH
block in the example config and convert each line to a comment variant so users
copying the file don't unknowingly enable a broken corporate URL.

In `@mucgpt-core-service/tests/unit/test_internet_search_tool.py`:
- Around line 8-12: Add explicit type hints to the untyped test function and
mock methods: annotate raise_for_status() as -> None and json() as -> Dict[str,
Any]; import typing names (Dict, Any) at top of the test module. Also annotate
all test function signatures referenced (the functions around lines 27-37 and
42-76) with appropriate parameter and return types (e.g., def test_xxx() ->
None) to comply with the repository rule. Ensure the mock class methods keep
self parameter types implicit but include return types exactly as above and
update any other untyped functions in that file similarly.
- Around line 75-79: The unit test
test_make_internet_search_tool_has_default_metadata expects metadata
{"mcp_group": "default"} but the current
internet_search.make_internet_search_tool produces {"mcp_group": "atlassian"},
so update the assertion or the tool constructor to be consistent; either change
the test's expected value to {"mcp_group": "atlassian"} (update the assertion on
tool.metadata) or modify make_internet_search_tool to set metadata to
{"mcp_group": "default"} so tool.metadata matches the test.

In `@stack/core.config.yaml.example`:
- Line 72: Replace the hard-coded corporate URL in the example config for
SEARXNG_URL with a neutral placeholder or empty string so the example does not
enable an environment-specific search endpoint by default; update the
SEARXNG_URL entry in stack/core.config.yaml.example to something like an empty
value or "https://example.local/" and add a short comment indicating users must
explicitly set/opt-in to a real search endpoint for their environment.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 64d884af-ea0d-44ba-b548-1afd3732a53f

📥 Commits

Reviewing files that changed from the base of the PR and between d3b47c2 and 3aa1a81.

⛔ Files ignored due to path filters (1)
  • mucgpt-core-service/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (27)
  • mucgpt-core-service/app/agent/agent.py
  • mucgpt-core-service/app/agent/agent_executor.py
  • mucgpt-core-service/app/agent/middleware.py
  • mucgpt-core-service/app/agent/prompt_pool/atlassian_confluence.md
  • mucgpt-core-service/app/agent/prompt_pool/atlassian_general.md
  • mucgpt-core-service/app/agent/prompt_pool/atlassian_jira.md
  • mucgpt-core-service/app/agent/prompt_pool/atlassian_scope_router.md
  • mucgpt-core-service/app/agent/prompt_pool/default_instructions.md
  • mucgpt-core-service/app/agent/prompt_pool/tool_instructions.md
  • mucgpt-core-service/app/agent/react_agent.py
  • mucgpt-core-service/app/agent/state_models/__init__.py
  • mucgpt-core-service/app/agent/state_models/atlassian_state.py
  • mucgpt-core-service/app/agent/state_models/default_state.py
  • mucgpt-core-service/app/agent/state_models/registry.py
  • mucgpt-core-service/app/agent/tools/internet_search.py
  • mucgpt-core-service/app/agent/tools/mcp.py
  • mucgpt-core-service/app/agent/tools/policies.py
  • mucgpt-core-service/app/agent/tools/tools.py
  • mucgpt-core-service/app/api/routers/tools_router.py
  • mucgpt-core-service/app/config/settings.py
  • mucgpt-core-service/app/init_app.py
  • mucgpt-core-service/config.yaml.example
  • mucgpt-core-service/pyproject.toml
  • mucgpt-core-service/tests/unit/test_agent.py
  • mucgpt-core-service/tests/unit/test_init_app.py
  • mucgpt-core-service/tests/unit/test_internet_search_tool.py
  • stack/core.config.yaml.example
💤 Files with no reviewable changes (2)
  • mucgpt-core-service/app/agent/agent.py
  • mucgpt-core-service/tests/unit/test_agent.py

Comment thread mucgpt-core-service/app/agent/agent_executor.py Outdated
Comment thread mucgpt-core-service/app/agent/agent_executor.py Outdated
Comment thread mucgpt-core-service/app/agent/middleware.py
Comment thread mucgpt-core-service/app/agent/prompt_pool/atlassian_scope_router.md Outdated
Comment thread mucgpt-core-service/app/agent/prompt_pool/default_instructions.md Outdated
Comment thread mucgpt-core-service/app/agent/tools/tools.py Outdated
Comment thread mucgpt-core-service/config.yaml.example Outdated
Comment thread mucgpt-core-service/tests/unit/test_internet_search_tool.py Outdated
Comment thread mucgpt-core-service/tests/unit/test_internet_search_tool.py Outdated
Comment thread stack/core.config.yaml.example Outdated
Contributor

@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
mucgpt-core-service/app/agent/tools/tools.py (1)

206-236: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

InternetSearch metadata missing for français, bairisch, and ukrainisch.

Brainstorming and Vereinfachen have entries in every supported language map, but InternetSearch is only present in deutsch and english. Users with the other locales will fall back to the raw tool name/description from make_internet_search_tool, producing inconsistent UX compared to other tools.

Add InternetSearch entries (or document the intended fallback) so localized labels render correctly across all listed languages.
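
If the fallback route is chosen instead of adding translations, the lookup could be made explicit along these lines. This is a sketch only: `TOOL_LABELS`, `localized_label`, and `missing_locales` are hypothetical, and the label strings are placeholders rather than the texts used in tools.py.

```python
from typing import Optional

# Placeholder label data; only the locale names are taken from the review comment.
TOOL_LABELS: dict[str, dict[str, dict[str, str]]] = {
    "deutsch": {"InternetSearch": {"name": "Internetsuche", "description": "Sucht im Internet."}},
    "english": {"InternetSearch": {"name": "Internet search", "description": "Searches the internet."}},
    "français": {},
    "bairisch": {},
    "ukrainisch": {},
}


def localized_label(tool_id: str, lang: str, fallback_lang: str = "english") -> Optional[dict[str, str]]:
    """Return the label for lang, else the fallback language, else None."""
    for candidate in (lang, fallback_lang):
        entry = TOOL_LABELS.get(candidate, {}).get(tool_id)
        if entry is not None:
            return entry
    return None


def missing_locales(tool_id: str) -> list[str]:
    """List locales that have no entry for the given tool."""
    return [lang for lang, tools in TOOL_LABELS.items() if tool_id not in tools]
```

A helper like `missing_locales` could also back a unit test that fails whenever a tool is added to only some language maps.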

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@mucgpt-core-service/app/agent/tools/tools.py` around lines 206 - 236, The
language maps are missing localized entries for the InternetSearch tool for the
locales "français", "bairisch", and "ukrainisch", causing those locales to fall
back to the raw make_internet_search_tool labels; add an "InternetSearch" key
alongside "Brainstorming" and "Vereinfachen" in each of those locale
dictionaries with appropriate localized "name" and "description" strings (or
alternatively add a clear comment documenting the intended fallback behavior),
updating the same structure used for other tools so the localized
label/description for InternetSearch renders consistently across the locales
referenced in tools.py.
♻️ Duplicate comments (1)
stack/core.config.yaml.example (1)

70-77: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Replace hard-coded corporate SearXNG URL with a placeholder.

The example still ships https://searxng-test.muenchen.de/ as the default SEARXNG_URL. Outside the corporate network this either fails resolution or blocks tool calls until the timeout expires, which is a poor first-run experience. Keep the example opt-in by using a placeholder.

♻️ Suggested change
 INTERNET_SEARCH:
-  SEARXNG_URL: "https://searxng-test.muenchen.de/"
+  SEARXNG_URL: "<your-searxng-url>"  # Leave empty or set to your SearXNG instance to enable
   TIMEOUT: 10.0
   MAX_RESULTS: 5
   LANGUAGE: "de"
   SAFESEARCH: 1
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@stack/core.config.yaml.example` around lines 70 - 77, The example config
currently hard-codes a corporate SearXNG URL under the
INTERNET_SEARCH.SEARXNG_URL key which causes failures for external users; change
SEARXNG_URL to a neutral placeholder (e.g. "<YOUR_SEARXNG_URL>" or empty string)
and keep TIMEOUT, MAX_RESULTS, LANGUAGE, SAFESEARCH as-is so the InternetSearch
tool remains opt-in; update any nearby comment to instruct users to replace the
placeholder with their own SearXNG instance if they want to enable the tool.
🧹 Nitpick comments (3)
mucgpt-core-service/app/agent/tools/tools.py (2)

39-43: ⚡ Quick win

Add type hints to _metadata_value signature.

metadata and default are untyped. Both parameters can accept arbitrary tool-metadata containers and default values, so Any is appropriate here.

♻️ Proposed type hints
-def _metadata_value(metadata, key: str, default=None):
+def _metadata_value(metadata: Any, key: str, default: Any = None) -> Any:
     if isinstance(metadata, dict):
         return metadata.get(key, default)
     return getattr(metadata, key, default)

You'll also need from typing import Any (currently absent from this file's imports).

As per coding guidelines: "Use mandatory type hints for all function signatures in Python code."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@mucgpt-core-service/app/agent/tools/tools.py` around lines 39 - 43, Add
explicit type hints to the _metadata_value signature: annotate metadata: Any,
default: Any, and the return type: Any, and import Any from typing at the top of
the module; update the function declaration for _metadata_value to use these
types so both callers and static checkers understand that metadata and default
can be arbitrary values and the function returns Any.

147-160: 💤 Low value

Document the new force_reload parameter in the docstring.

list_tool_metadata gained a force_reload argument that propagates into McpLoader.load_mcp_tools, but the docstring still only describes user_info and lang. Adding a brief note prevents confusion for callers tracing this from the API layer.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@mucgpt-core-service/app/agent/tools/tools.py` around lines 147 - 160, Update
the docstring for list_tool_metadata to include the new force_reload parameter:
state that force_reload (bool, default False) forces reloading of MCP tools from
the source by propagating into McpLoader.load_mcp_tools, describe its effect on
caching/refresh behavior, and add it to the Args section alongside user_info and
lang so callers can see its purpose and default value.
mucgpt-core-service/app/config/settings.py (1)

370-377: ⚡ Quick win

Consider validating SAFESEARCH and TIMEOUT bounds.

SearXNG accepts safesearch as 0, 1, or 2; arbitrary integers will be silently sent to the engine and may produce undefined behavior. Likewise TIMEOUT should be strictly positive. Tightening these at the config layer surfaces misconfiguration at startup rather than at request time.

♻️ Proposed tightening
-from pydantic import (
+from pydantic import (
     BaseModel,
     Field,
     HttpUrl,
+    PositiveFloat,
     PositiveInt,
     PrivateAttr,
     SecretStr,
     TypeAdapter,
     field_validator,
     model_validator,
 )
@@
 class InternetSearchConfig(BaseModel):
     """Internet search configuration (nested under INTERNET_SEARCH key in YAML)."""

     SEARXNG_URL: str = ""
-    TIMEOUT: float = 10.0
+    TIMEOUT: PositiveFloat = 10.0
     MAX_RESULTS: PositiveInt = 5
     LANGUAGE: str = "de"
-    SAFESEARCH: int = 1
+    SAFESEARCH: int = Field(default=1, ge=0, le=2)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@mucgpt-core-service/app/config/settings.py` around lines 370 - 377, The
InternetSearchConfig model should validate SAFESEARCH and TIMEOUT bounds: change
SAFESEARCH from a plain int to a constrained type (e.g., Literal[0,1,2] or
conint(ge=0, le=2)) so only 0/1/2 are accepted, and make TIMEOUT a strictly
positive float (e.g., PositiveFloat or confloat(gt=0)) and/or add pydantic
validators on InternetSearchConfig to raise clear validation errors with
contextual messages when values are out of range; reference the
InternetSearchConfig class and the SAFESEARCH and TIMEOUT fields when
implementing these changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@mucgpt-core-service/app/agent/tools/tools.py`:
- Around line 206-236: The language maps are missing localized entries for the
InternetSearch tool for the locales "français", "bairisch", and "ukrainisch",
causing those locales to fall back to the raw make_internet_search_tool labels;
add an "InternetSearch" key alongside "Brainstorming" and "Vereinfachen" in each
of those locale dictionaries with appropriate localized "name" and "description"
strings (or alternatively add a clear comment documenting the intended fallback
behavior), updating the same structure used for other tools so the localized
label/description for InternetSearch renders consistently across the locales
referenced in tools.py.

---

Duplicate comments:
In `@stack/core.config.yaml.example`:
- Around line 70-77: The example config currently hard-codes a corporate SearXNG
URL under the INTERNET_SEARCH.SEARXNG_URL key which causes failures for external
users; change SEARXNG_URL to a neutral placeholder (e.g. "<YOUR_SEARXNG_URL>" or
empty string) and keep TIMEOUT, MAX_RESULTS, LANGUAGE, SAFESEARCH as-is so the
InternetSearch tool remains opt-in; update any nearby comment to instruct users
to replace the placeholder with their own SearXNG instance if they want to
enable the tool.

---

Nitpick comments:
In `@mucgpt-core-service/app/agent/tools/tools.py`:
- Around line 39-43: Add explicit type hints to the _metadata_value signature:
annotate metadata: Any, default: Any, and the return type: Any, and import Any
from typing at the top of the module; update the function declaration for
_metadata_value to use these types so both callers and static checkers
understand that metadata and default can be arbitrary values and the function
returns Any.
- Around line 147-160: Update the docstring for list_tool_metadata to include
the new force_reload parameter: state that force_reload (bool, default False)
forces reloading of MCP tools from the source by propagating into
McpLoader.load_mcp_tools, describe its effect on caching/refresh behavior, and
add it to the Args section alongside user_info and lang so callers can see its
purpose and default value.

In `@mucgpt-core-service/app/config/settings.py`:
- Around line 370-377: The InternetSearchConfig model should validate SAFESEARCH
and TIMEOUT bounds: change SAFESEARCH from a plain int to a constrained type
(e.g., Literal[0,1,2] or conint(ge=0, le=2)) so only 0/1/2 are accepted, and
make TIMEOUT a strictly positive float (e.g., PositiveFloat or confloat(gt=0))
and/or add pydantic validators on InternetSearchConfig to raise clear validation
errors with contextual messages when values are out of range; reference the
InternetSearchConfig class and the SAFESEARCH and TIMEOUT fields when
implementing these changes.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 63a2f922-d5ad-4085-954e-dad1df20897d

📥 Commits

Reviewing files that changed from the base of the PR and between 3aa1a81 and 8f53583.

📒 Files selected for processing (4)
  • mucgpt-core-service/app/agent/tools/tools.py
  • mucgpt-core-service/app/config/settings.py
  • mucgpt-core-service/config.yaml.example
  • stack/core.config.yaml.example
✅ Files skipped from review due to trivial changes (1)
  • mucgpt-core-service/config.yaml.example

Meteord and others added 2 commits May 28, 2026 16:21
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
@coderabbitai
Contributor

coderabbitai Bot commented May 28, 2026

Actionable comments posted: 0

@coderabbitai
Contributor

coderabbitai Bot commented May 28, 2026

Actionable comments posted: 0

Meteord and others added 5 commits May 28, 2026 16:38
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
…er.md

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Contributor

@Meteord Meteord left a comment

LGTM! I changed the group for the websearch tool. Please review again if that is okay @illuminate97
