chore: sync with main and stabilize MCP server by visahak · Pull Request #59 · AgentToolkit/altk-evolve

visahak · 2026-02-16T20:06:01Z

Description:

Sync & Conflict Resolution: Merged latest changes from main and resolved conflicts in sqlite_manager.py, mcp_server.py, and pyproject.toml.
MCP Stability: Implemented lazy initialization for the KaizenClient in the MCP server to prevent race conditions with the SSE transport.
Test Isolation: Fixed data leakage in tests by correctly using tmp_path and the KAIZEN_DATA_DIR environment variable.
Code Quality: Formatted the codebase with ruff and fixed all mypy type checking errors (including installing types-requests).

Verification:

All 18 tests (MCP E2E and Milvus Unit) passed successfully.
ruff format and ruff check pass.
mypy passes with 0 errors.

Summary by CodeRabbit

New Features
- Database path can be set via environment with an automatic fallback.
- Entity creation now validates metadata JSON and returns structured errors on invalid input.
- Improved client initialization for more reliable startup and namespace setup.
- Milvus storage now auto-creates indexes and refreshes collections for faster, consistent searches.
Bug Fixes
- Fixed TOML configuration syntax.
Tests
- Expanded parameterized backend tests covering multiple storage backends and teardown.

- Resolved merge conflicts with origin/main. - Implemented lazy initialization in MCP server to fix SSE race conditions. - Added environment variable support for SQLite path and data directory. - Simplified MCP E2E tests by switching to filesystem backend and ensuring isolation.

coderabbitai · 2026-02-16T20:06:19Z

📝 Walkthrough

Walkthrough

Introduces lazy, thread-safe Kaizen client and one-time namespace initialization in the MCP server; makes SQLiteManager accept an optional db_path with environment fallback; adds Milvus embedding-index/search and index creation; parameterizes E2E tests for Milvus/filesystem backends; minor dependency/TOML fixes and test mocks.

Changes

Cohort / File(s)	Summary
SQLite Configuration `kaizen/db/sqlite_manager.py`	Constructor changed to `db_path: str
MCP Server Refactor `kaizen/frontend/mcp/mcp_server.py`	Adds thread-safe lazy client init with `_client_init_lock` and `_namespace_initialized` gate; inlines namespace creation into `get_client()`; removes per-call `ensure_namespace()` usages; adds metadata JSON parsing, structured error responses, and broad exception handling in `create_entity()`; other flows rely on centralized init.
Milvus Backend Enhancements `kaizen/backend/milvus.py`	On namespace/collection creation, creates an embedding index (IP + FLAT) for the embedding field; `update_entities()` now flushes and reloads collections; `search_entities()` uses embedding search when query provided, builds filter strings and normalizes hits; `delete_entity_by_id()` enforces numeric ID validation.
E2E Tests (parameterized backends) `tests/e2e/test_mcp.py`	Replaces single fixture with parameterized fixture for `["milvus","filesystem"]`; sets per-backend env vars (KAIZEN_BACKEND, KAIZEN_SQLITE_PATH/KAIZEN_DATA_DIR), resets settings and MCP server state between runs, and adds backend-specific setup/teardown (Milvus URI swap, server lifecycle, file cleanup).
Unit Test Updates `tests/unit/test_milvus_backend.py`	Adds a mock `search(...)` to exercise Milvus vector-search path in tests, wiring `milvus_backend.milvus.search` to the mock.
Project Config `pyproject.toml`	Adds `types-requests>=2.32.4.20260107` to dev deps and fixes TOML syntax for `ignore_missing_imports` under mypy overrides.

Sequence Diagram(s)

sequenceDiagram
    participant ClientCaller as Client (API call)
    participant MCP as MCP Server (mcp_server.get_client)
    participant Kaizen as KaizenClient
    participant Backend as Backend (Milvus / SQLite / FS)

    ClientCaller->>MCP: call create_entity(...)
    MCP->>MCP: acquire _client_init_lock
    alt client not initialized
        MCP->>Kaizen: KaizenClient(...)
        Kaizen-->>MCP: client instance
        MCP->>Backend: ensure namespace / create collection
        Backend-->>MCP: namespace created
        MCP-->>MCP: set _namespace_initialized = true
    end
    MCP->>Kaizen: use client to update_entities(...)
    Kaizen->>Backend: persist entities / index (Milvus index/create)
    Backend-->>Kaizen: ack (flush/reload if Milvus)
    Kaizen-->>MCP: update result
    MCP-->>ClientCaller: return created entity / error

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Add create_entity and delete_entity MCP tools #35 — Changes to kaizen/frontend/mcp/mcp_server.py adding lazy client/namespace init and create_entity/delete flow updates.
Add milvus backend unit test #40 — Modifications to kaizen/backend/milvus.py and related tests for search/indexing behavior.
Refactor: Rename 'kata' to 'entity' #33 — Adjusts SQLite default DB behavior and relates to kaizen/db/sqlite_manager.py path handling.

Suggested reviewers

vinodmut
illeatmyhat

Poem

🐰 I hopped in code at break of day,

Namespaces woke when I did say,
Embeddings hummed in Milvus' glade,
Paths resolved where env was laid,
A rabbit's patchwork, tested and play! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 59.38% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'chore: sync with main and stabilize MCP server' accurately describes the main changes in the pull request, including merging with main, resolving conflicts, and stabilizing the MCP server with lazy initialization and test improvements.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings (stacked PR)
📝 Generate docstrings (commit on current branch)

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Tip

Issue Planner is now in beta. Read the docs and try it out! Share your feedback on Discord.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

tests/e2e/test_mcp.py (1)

9-9: 🛠️ Refactor suggestion | 🟠 Major

Unused import: milvus_client_settings.

This import is a leftover from the previous Milvus-based test setup and is not referenced anywhere in the file. Ruff would flag this.

-from kaizen.config.milvus import milvus_client_settings

🤖 Prompt for AI Agents

Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.
In `@tests/e2e/test_mcp.py` at line 9, Remove the unused import
milvus_client_settings from the top of the test file: locate the import
statement that references milvus_client_settings and delete it, ensuring no
other references to milvus_client_settings remain in the file so the test no
longer contains an unused import.

🤖 Fix all issues with AI agents

Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.

In `@kaizen/db/sqlite_manager.py`:
- Around line 28-30: The inline import of os inside SQLiteManager.__init__
should be moved to the module top-level with the other stdlib imports (datetime,
logging, sqlite3, threading) in sqlite_manager.py; remove the inline "import os"
from the __init__ method and add a single top-level "import os" near the other
imports so self.db_path = db_path or os.environ.get("KAIZEN_SQLITE_PATH",
"entities.sqlite.db") continues to work without per-instantiation imports.

In `@kaizen/frontend/mcp/mcp_server.py`:
- Line 10: Remove the unused import asynccontextmanager from the top of the
file: the import statement "from contextlib import asynccontextmanager" in
mcp_server.py is not referenced anywhere and should be deleted to satisfy
linting (Ruff) and avoid unused imports.
- Around line 200-205: In the exception handler inside create_entity (the except
Exception as e: block) remove the redundant traceback import and the
traceback.print_exc() call and rely only on logger.exception(...) to log the
error and traceback; keep the logger.exception(f"CRASH IN CREATE_ENTITY: {e}")
and the subsequent return json.dumps({"error": f"Server Error: {str(e)}"}) as
the error response.
- Around line 28-61: The broad except in get_client surrounding
_client.get_namespace_details(kaizen_config.namespace_id) should be narrowed and
the original error preserved: only catch exceptions that indicate the namespace
might not exist (e.g., NamespaceNotFoundException) or specific recoverable
errors you expect, and for any other Exception log the original exception with
full context (including the exception instance) and re-raise instead of blindly
calling _client.create_namespace; update the get_client function to log the
caught error (use the exception variable) before any retry/create attempt and
only attempt create_namespace when it makes sense.
- Around line 34-58: The lazy initialization of the global _client and
_namespace_initialized in get_client()/mcp_server.py is not
thread-safe—concurrent callers can race and create multiple KaizenClient
instances or duplicate namespace work; protect the initialization with a
threading.Lock (e.g., a module-level _init_lock) and use double-checked locking
around the blocks that create _client and set _namespace_initialized, wrapping
both the client creation (KaizenClient()) and namespace checks/creation
(get_namespace_details, create_namespace with kaizen_config.namespace_id) inside
the lock to ensure only one thread performs initialization, or alternatively
document the single-threaded assumption if you prefer not to add locking.

In `@tests/e2e/test_mcp.py`:
- Around line 15-46: The mcp fixture currently mutates os.environ directly
(lines setting 'KAIZEN_BACKEND', 'KAIZEN_NAMESPACE_ID', 'KAIZEN_DATA_DIR') and
never restores them; replace those direct environment assignments with pytest's
monkeypatch.setenv calls inside the mcp fixture (use the fixture parameter name
monkeypatch) so environment variables are automatically restored after the test,
keep the subsequent kaizen_config.__init__() and filesystem_settings.__init__()
calls, and leave the existing MCP reset logic (mcp_server_module._client and
_namespace_initialized) as-is.
- Around line 37-40: Replace the broad try/except around
kaizen_client.create_namespace('test') so it only catches the expected
NamespaceAlreadyExistsException; allow any other exceptions to propagate (or
re-raise them) so setup failures are not silently swallowed. Specifically,
change the except Exception: pass to an except NamespaceAlreadyExistsException:
pass (import or reference the exception from the same module that defines it)
and do not suppress other errors thrown by kaizen_client.create_namespace.
- Around line 106-119: Replace the manual sys.modules patching for
fs_backend.resolve_conflicts (where code assigns original_resolve, defines
mock_resolve_func, and sets fs_backend.resolve_conflicts) with pytest's
monkeypatch fixture: use monkeypatch.setattr to replace
kaizen.backend.filesystem.resolve_conflicts with the mock_resolve_func so
teardown is automatic; ensure the mock function signature and return
(EntityUpdate list using first_entity_id) remain the same and remove the manual
restore of original_resolve.
- Line 9: Remove the unused import milvus_client_settings from the top of the
test file: locate the import statement that references milvus_client_settings
and delete it, ensuring no other references to milvus_client_settings remain in
the file so the test no longer contains an unused import.

🧹 Nitpick comments (6)

🤖 Fix all nitpicks with AI agents

Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.

In `@kaizen/db/sqlite_manager.py`:
- Around line 28-30: The inline import of os inside SQLiteManager.__init__
should be moved to the module top-level with the other stdlib imports (datetime,
logging, sqlite3, threading) in sqlite_manager.py; remove the inline "import os"
from the __init__ method and add a single top-level "import os" near the other
imports so self.db_path = db_path or os.environ.get("KAIZEN_SQLITE_PATH",
"entities.sqlite.db") continues to work without per-instantiation imports.

In `@kaizen/frontend/mcp/mcp_server.py`:
- Around line 200-205: In the exception handler inside create_entity (the except
Exception as e: block) remove the redundant traceback import and the
traceback.print_exc() call and rely only on logger.exception(...) to log the
error and traceback; keep the logger.exception(f"CRASH IN CREATE_ENTITY: {e}")
and the subsequent return json.dumps({"error": f"Server Error: {str(e)}"}) as
the error response.
- Around line 28-61: The broad except in get_client surrounding
_client.get_namespace_details(kaizen_config.namespace_id) should be narrowed and
the original error preserved: only catch exceptions that indicate the namespace
might not exist (e.g., NamespaceNotFoundException) or specific recoverable
errors you expect, and for any other Exception log the original exception with
full context (including the exception instance) and re-raise instead of blindly
calling _client.create_namespace; update the get_client function to log the
caught error (use the exception variable) before any retry/create attempt and
only attempt create_namespace when it makes sense.
- Around line 34-58: The lazy initialization of the global _client and
_namespace_initialized in get_client()/mcp_server.py is not
thread-safe—concurrent callers can race and create multiple KaizenClient
instances or duplicate namespace work; protect the initialization with a
threading.Lock (e.g., a module-level _init_lock) and use double-checked locking
around the blocks that create _client and set _namespace_initialized, wrapping
both the client creation (KaizenClient()) and namespace checks/creation
(get_namespace_details, create_namespace with kaizen_config.namespace_id) inside
the lock to ensure only one thread performs initialization, or alternatively
document the single-threaded assumption if you prefer not to add locking.

In `@tests/e2e/test_mcp.py`:
- Around line 37-40: Replace the broad try/except around
kaizen_client.create_namespace('test') so it only catches the expected
NamespaceAlreadyExistsException; allow any other exceptions to propagate (or
re-raise them) so setup failures are not silently swallowed. Specifically,
change the except Exception: pass to an except NamespaceAlreadyExistsException:
pass (import or reference the exception from the same module that defines it)
and do not suppress other errors thrown by kaizen_client.create_namespace.
- Around line 106-119: Replace the manual sys.modules patching for
fs_backend.resolve_conflicts (where code assigns original_resolve, defines
mock_resolve_func, and sets fs_backend.resolve_conflicts) with pytest's
monkeypatch fixture: use monkeypatch.setattr to replace
kaizen.backend.filesystem.resolve_conflicts with the mock_resolve_func so
teardown is automatic; ensure the mock function signature and return
(EntityUpdate list using first_entity_id) remain the same and remove the manual
restore of original_resolve.

tests/e2e/test_mcp.py (2)

37-40: Broad except Exception: pass silently swallows setup failures.

If the namespace already exists, a NamespaceAlreadyExistsException is expected — but any other error (e.g., misconfigured backend, filesystem permissions) would be silently hidden, making test failures hard to diagnose.

Proposed fix — catch only the expected exception

+    from kaizen.schema.exceptions import NamespaceAlreadyExistsException
     try:
         kaizen_client.create_namespace('test')
-    except Exception:
+    except NamespaceAlreadyExistsException:
         pass

🤖 Prompt for AI Agents

Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.
In `@tests/e2e/test_mcp.py` around lines 37 - 40, Replace the broad try/except
around kaizen_client.create_namespace('test') so it only catches the expected
NamespaceAlreadyExistsException; allow any other exceptions to propagate (or
re-raise them) so setup failures are not silently swallowed. Specifically,
change the except Exception: pass to an except NamespaceAlreadyExistsException:
pass (import or reference the exception from the same module that defines it)
and do not suppress other errors thrown by kaizen_client.create_namespace.

106-119: Direct sys.modules attribute patching is fragile — consider monkeypatch.setattr.

The manual save-replace-restore pattern works (and the finally block is good), but pytest's monkeypatch fixture would handle the cleanup automatically and be more idiomatic. This is a minor suggestion since the current approach is functional and correctly wrapped in try/finally.

🤖 Prompt for AI Agents

Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.
In `@tests/e2e/test_mcp.py` around lines 106 - 119, Replace the manual sys.modules
patching for fs_backend.resolve_conflicts (where code assigns original_resolve,
defines mock_resolve_func, and sets fs_backend.resolve_conflicts) with pytest's
monkeypatch fixture: use monkeypatch.setattr to replace
kaizen.backend.filesystem.resolve_conflicts with the mock_resolve_func so
teardown is automatic; ensure the mock function signature and return
(EntityUpdate list using first_entity_id) remain the same and remove the manual
restore of original_resolve.

kaizen/frontend/mcp/mcp_server.py (3)

200-205: Redundant traceback logging — logger.exception already includes the traceback.

traceback.print_exc() writes to stderr, and logger.exception(...) on the next line also logs the full traceback. This produces duplicate output. Remove the traceback import and print_exc() call, and rely solely on logger.exception.

Proposed fix

     except Exception as e:
-        import traceback
-        traceback.print_exc()
-        logger.exception(f"CRASH IN CREATE_ENTITY: {e}")
+        logger.exception(f"Error in create_entity: {e}")
         return json.dumps({"error": f"Server Error: {str(e)}"})

🤖 Prompt for AI Agents

Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.
In `@kaizen/frontend/mcp/mcp_server.py` around lines 200 - 205, In the exception
handler inside create_entity (the except Exception as e: block) remove the
redundant traceback import and the traceback.print_exc() call and rely only on
logger.exception(...) to log the error and traceback; keep the
logger.exception(f"CRASH IN CREATE_ENTITY: {e}") and the subsequent return
json.dumps({"error": f"Server Error: {str(e)}"}) as the error response.

28-61: Broad fallback in get_client() (lines 50–57) may mask real errors.

The generic except Exception on line 50 catches any failure from get_namespace_details (connectivity errors, auth failures, serialization bugs, etc.) and blindly retries with create_namespace. This can hide the real root cause — if the backend is unreachable, the second call will also fail but with a potentially less informative error.

Consider narrowing the catch to only exceptions that meaningfully indicate "namespace may not exist" or at minimum logging the original error before retrying:

Proposed fix

         except NamespaceNotFoundException:
             logger.info(f"Creating namespace '{kaizen_config.namespace_id}'")
             _client.create_namespace(kaizen_config.namespace_id)
             logger.info(f"Namespace '{kaizen_config.namespace_id}' created successfully")
-        except Exception as e:
-            logger.error(f"Error ensuring namespace: {e}")
-            try:
-                _client.create_namespace(kaizen_config.namespace_id)
-                logger.info(f"Namespace '{kaizen_config.namespace_id}' created after error")
-            except Exception as create_error:
-                logger.error(f"Failed to create namespace: {create_error}")
-                raise
+        except Exception:
+            logger.exception(f"Unexpected error checking namespace '{kaizen_config.namespace_id}'")
+            raise

🤖 Prompt for AI Agents

Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.
In `@kaizen/frontend/mcp/mcp_server.py` around lines 28 - 61, The broad except in
get_client surrounding _client.get_namespace_details(kaizen_config.namespace_id)
should be narrowed and the original error preserved: only catch exceptions that
indicate the namespace might not exist (e.g., NamespaceNotFoundException) or
specific recoverable errors you expect, and for any other Exception log the
original exception with full context (including the exception instance) and
re-raise instead of blindly calling _client.create_namespace; update the
get_client function to log the caught error (use the exception variable) before
any retry/create attempt and only attempt create_namespace when it makes sense.

34-58: Lazy init with bare globals is not thread-safe.

If get_client() is called concurrently (e.g., two simultaneous SSE requests), both callers can see _client is None and create separate KaizenClient instances, or race on _namespace_initialized. In a single-threaded async event loop this is likely fine since the tools are synchronous, but if the server ever handles requests on multiple threads this will break.

Worth documenting the single-threaded assumption or adding a threading.Lock guard if the intent is to be thread-safe.

🤖 Prompt for AI Agents

Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.
In `@kaizen/frontend/mcp/mcp_server.py` around lines 34 - 58, The lazy
initialization of the global _client and _namespace_initialized in
get_client()/mcp_server.py is not thread-safe—concurrent callers can race and
create multiple KaizenClient instances or duplicate namespace work; protect the
initialization with a threading.Lock (e.g., a module-level _init_lock) and use
double-checked locking around the blocks that create _client and set
_namespace_initialized, wrapping both the client creation (KaizenClient()) and
namespace checks/creation (get_namespace_details, create_namespace with
kaizen_config.namespace_id) inside the lock to ensure only one thread performs
initialization, or alternatively document the single-threaded assumption if you
prefer not to add locking.

kaizen/db/sqlite_manager.py (1)

28-30: Move import os to the module top-level.

os is a standard-library module and there's no circular-import concern here. Placing it at the top of the file alongside the other stdlib imports (datetime, logging, sqlite3, threading) is more conventional and avoids a per-instantiation import lookup.

Proposed fix

At the top of the file (after line 4):

 import threading
+import os

Then in __init__:

     def __init__(self, db_path: str | None = None):
-        import os
         self.db_path = db_path or os.environ.get("KAIZEN_SQLITE_PATH", "entities.sqlite.db")

🤖 Prompt for AI Agents

Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.
In `@kaizen/db/sqlite_manager.py` around lines 28 - 30, The inline import of os
inside SQLiteManager.__init__ should be moved to the module top-level with the
other stdlib imports (datetime, logging, sqlite3, threading) in
sqlite_manager.py; remove the inline "import os" from the __init__ method and
add a single top-level "import os" near the other imports so self.db_path =
db_path or os.environ.get("KAIZEN_SQLITE_PATH", "entities.sqlite.db") continues
to work without per-instantiation imports.

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

Verify each finding against the current code and only fix it if needed.


In `@kaizen/backend/milvus.py`:
- Around line 197-226: The conditional checking for empty search results is
redundant: replace the combined check "if not results or len(results) == 0" with
a single falsy check "if not results" where the Milvus search response is
handled (around the code that calls self.milvus.search and assigns to results),
keeping the subsequent early return of an empty list and leaving the hit-parsing
logic that uses parse_milvus_entity intact.
- Around line 177-179: The current call pattern — invoking
self.milvus.flush(namespace_id) and self.milvus.load_collection(namespace_id) on
every update_entities call — will become a throughput bottleneck; change
update_entities in the Milvus client wrapper so it does not synchronously
flush+load on every write but instead batches or defers those operations: add a
short-lived buffer or queue keyed by namespace_id and trigger flush+load either
when the buffer reaches a configurable batch_size or on a periodic background
timer (or when a force_flush flag is passed), and expose configuration flags
(e.g., batch_size, flush_interval, force_flush) so callers can opt into
immediate consistency when required; keep the flush+load calls but move them to
the batching/worker path and ensure thread-safe enqueueing in update_entities.
- Around line 228-235: The except handler in delete_entity_by_id should
explicitly chain the original ValueError: change the except to "except
ValueError as e:" and re-raise the KaizenException with "from e" (i.e., raise
KaizenException(f\"Invalid entity ID: {entity_id}. Entity IDs must be
numeric.\") from e) so the original ValueError context is preserved for callers
and logs.

In `@kaizen/db/sqlite_manager.py`:
- Around line 29-32: Move the local import of os out of the __init__ method and
add it to the module-level imports alongside datetime and logging; then simplify
__init__ to use the already-imported os when resolving db_path (db_path or
os.environ.get("KAIZEN_SQLITE_PATH", "entities.sqlite.db")). Ensure no other
uses of os remain inside methods so the dependency is visible at the top of the
file.

In `@kaizen/frontend/mcp/mcp_server.py`:
- Around line 188-193: Remove the redundant traceback.print_exc() and the inline
import inside the except block in the create_entity error handler: move the
import traceback to the top-level imports of mcp_server.py (so the module is
imported once), delete the traceback.print_exc() line in the except Exception as
e block, and keep the existing logger.exception(f"CRASH IN CREATE_ENTITY: {e}")
and json.dumps({"error": ...}) return as-is so the full traceback is emitted via
logger.exception().
- Around line 49-56: The current broad "except Exception as e" around the
namespace check can mask real connection/back-end errors and then try to create
the namespace; change this to only attempt creation when the failure is a
namespace-missing error (catch the specific NamespaceNotFoundException from your
client) and otherwise log the original exception with full context and re-raise;
if you cannot import that specific exception type, at minimum replace
logger.error(f"Error ensuring namespace: {e}") with logger.exception(...) to
include traceback and then re-raise when the exception is not a
NamespaceNotFoundException; update the block that calls
_client.create_namespace(kaizen_config.namespace_id) so it runs only under the
namespace-not-found branch and keep the create_error handling unchanged.

In `@tests/e2e/test_mcp.py`:
- Around line 28-32: The milvus temp DB file is being created in CWD because
milvus_db_file is a relative filename; change creation to place the file inside
pytest's tmp_path (e.g., build the path using tmp_path /
f"test_{uuid.uuid4().hex[:8]}.db") and set milvus_client_settings.uri to the
stringified tmp_path path so Milvus uses the tmp-managed directory (refer to
backend_type check, milvus_db_file variable, and milvus_client_settings.uri);
this ensures pytest auto-cleans the DB file and simplifies teardown.
- Around line 165-172: The test reads os.environ["KAIZEN_BACKEND"] at runtime to
choose patch_target, which couples the test to an env-var side effect; change
the test to accept the backend explicitly from the test fixtures instead: have
the mcp fixture (or a new fixture) return the backend type (or a namedtuple
including backend) or access request.param in a parametrized fixture, then use
that backend value to set patch_target
("kaizen.backend.filesystem.resolve_conflicts" vs
"kaizen.backend.milvus.resolve_conflicts") and call patch(patch_target) — update
references to KAIZEN_BACKEND, patch_target, and the mcp fixture/request.param
accordingly.

🧹 Nitpick comments (7)

🤖 Fix all nitpicks with AI agents

Verify each finding against the current code and only fix it if needed.


In `@kaizen/backend/milvus.py`:
- Around line 197-226: The conditional checking for empty search results is
redundant: replace the combined check "if not results or len(results) == 0" with
a single falsy check "if not results" where the Milvus search response is
handled (around the code that calls self.milvus.search and assigns to results),
keeping the subsequent early return of an empty list and leaving the hit-parsing
logic that uses parse_milvus_entity intact.
- Around line 177-179: The current call pattern — invoking
self.milvus.flush(namespace_id) and self.milvus.load_collection(namespace_id) on
every update_entities call — will become a throughput bottleneck; change
update_entities in the Milvus client wrapper so it does not synchronously
flush+load on every write but instead batches or defers those operations: add a
short-lived buffer or queue keyed by namespace_id and trigger flush+load either
when the buffer reaches a configurable batch_size or on a periodic background
timer (or when a force_flush flag is passed), and expose configuration flags
(e.g., batch_size, flush_interval, force_flush) so callers can opt into
immediate consistency when required; keep the flush+load calls but move them to
the batching/worker path and ensure thread-safe enqueueing in update_entities.
- Around line 228-235: The except handler in delete_entity_by_id should
explicitly chain the original ValueError: change the except to "except
ValueError as e:" and re-raise the KaizenException with "from e" (i.e., raise
KaizenException(f\"Invalid entity ID: {entity_id}. Entity IDs must be
numeric.\") from e) so the original ValueError context is preserved for callers
and logs.

In `@kaizen/db/sqlite_manager.py`:
- Around line 29-32: Move the local import of os out of the __init__ method and
add it to the module-level imports alongside datetime and logging; then simplify
__init__ to use the already-imported os when resolving db_path (db_path or
os.environ.get("KAIZEN_SQLITE_PATH", "entities.sqlite.db")). Ensure no other
uses of os remain inside methods so the dependency is visible at the top of the
file.

In `@kaizen/frontend/mcp/mcp_server.py`:
- Around line 188-193: Remove the redundant traceback.print_exc() and the inline
import inside the except block in the create_entity error handler: move the
import traceback to the top-level imports of mcp_server.py (so the module is
imported once), delete the traceback.print_exc() line in the except Exception as
e block, and keep the existing logger.exception(f"CRASH IN CREATE_ENTITY: {e}")
and json.dumps({"error": ...}) return as-is so the full traceback is emitted via
logger.exception().
- Around line 49-56: The current broad "except Exception as e" around the
namespace check can mask real connection/back-end errors and then try to create
the namespace; change this to only attempt creation when the failure is a
namespace-missing error (catch the specific NamespaceNotFoundException from your
client) and otherwise log the original exception with full context and re-raise;
if you cannot import that specific exception type, at minimum replace
logger.error(f"Error ensuring namespace: {e}") with logger.exception(...) to
include traceback and then re-raise when the exception is not a
NamespaceNotFoundException; update the block that calls
_client.create_namespace(kaizen_config.namespace_id) so it runs only under the
namespace-not-found branch and keep the create_error handling unchanged.

In `@tests/e2e/test_mcp.py`:
- Around line 165-172: The test reads os.environ["KAIZEN_BACKEND"] at runtime to
choose patch_target, which couples the test to an env-var side effect; change
the test to accept the backend explicitly from the test fixtures instead: have
the mcp fixture (or a new fixture) return the backend type (or a namedtuple
including backend) or access request.param in a parametrized fixture, then use
that backend value to set patch_target
("kaizen.backend.filesystem.resolve_conflicts" vs
"kaizen.backend.milvus.resolve_conflicts") and call patch(patch_target) — update
references to KAIZEN_BACKEND, patch_target, and the mcp fixture/request.param
accordingly.

kaizen/backend/milvus.py (3)

197-226: Vector search path handles empty filters and varied hit formats well — one minor redundancy.

Line 211: if not results or len(results) == 0: — not results already covers None, [], and other falsy values, making len(results) == 0 redundant. Minor nit.

The hit-parsing logic at lines 217–224 correctly handles both hit['entity'] and flat-dict formats across PyMilvus versions.
Minor simplification
-            if not results or len(results) == 0:
+            if not results:
                 return []
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@kaizen/backend/milvus.py` around lines 197 - 226, The conditional checking
for empty search results is redundant: replace the combined check "if not
results or len(results) == 0" with a single falsy check "if not results" where
the Milvus search response is handled (around the code that calls
self.milvus.search and assigns to results), keeping the subsequent early return
of an empty list and leaving the hit-parsing logic that uses parse_milvus_entity
intact.
177-179: flush + load_collection after every update_entities call may become a performance bottleneck under high write throughput.

This guarantees read-after-write consistency, which is correct for the current use case. If write frequency increases, consider batching or deferring these calls.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@kaizen/backend/milvus.py` around lines 177 - 179, The current call pattern —
invoking self.milvus.flush(namespace_id) and
self.milvus.load_collection(namespace_id) on every update_entities call — will
become a throughput bottleneck; change update_entities in the Milvus client
wrapper so it does not synchronously flush+load on every write but instead
batches or defers those operations: add a short-lived buffer or queue keyed by
namespace_id and trigger flush+load either when the buffer reaches a
configurable batch_size or on a periodic background timer (or when a force_flush
flag is passed), and expose configuration flags (e.g., batch_size,
flush_interval, force_flush) so callers can opt into immediate consistency when
required; keep the flush+load calls but move them to the batching/worker path
and ensure thread-safe enqueueing in update_entities.
228-235: Numeric ID validation in delete_entity_by_id — consider chaining the original ValueError.

The raise KaizenException(...) inside the except ValueError block will implicitly chain the exception. Using explicit chaining (from) is more intentional:
Proposed fix
         except ValueError:
-            raise KaizenException(f"Invalid entity ID: {entity_id}. Entity IDs must be numeric.")
+            raise KaizenException(f"Invalid entity ID: {entity_id}. Entity IDs must be numeric.") from None
Using from None suppresses the noisy implicit chain if the ValueError context isn't useful to callers, or use from e if you want to preserve it explicitly.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@kaizen/backend/milvus.py` around lines 228 - 235, The except handler in
delete_entity_by_id should explicitly chain the original ValueError: change the
except to "except ValueError as e:" and re-raise the KaizenException with "from
e" (i.e., raise KaizenException(f\"Invalid entity ID: {entity_id}. Entity IDs
must be numeric.\") from e) so the original ValueError context is preserved for
callers and logs.

kaizen/frontend/mcp/mcp_server.py (2)

188-193: Redundant traceback.print_exc() alongside logger.exception(); move import traceback to the top of the file.

logger.exception() already emits the full traceback to the log. The additional traceback.print_exc() duplicates output to stderr and can interleave with structured logs. Also, importing traceback inside the except block is unconventional for a stdlib module.

Proposed fix

At the top of the file:

 import json
 import logging
+import traceback
 import uuid

In the except block:

     except Exception as e:
-        import traceback
-
-        traceback.print_exc()
         logger.exception(f"CRASH IN CREATE_ENTITY: {e}")
         return json.dumps({"error": f"Server Error: {str(e)}"})

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@kaizen/frontend/mcp/mcp_server.py` around lines 188 - 193, Remove the
redundant traceback.print_exc() and the inline import inside the except block in
the create_entity error handler: move the import traceback to the top-level
imports of mcp_server.py (so the module is imported once), delete the
traceback.print_exc() line in the except Exception as e block, and keep the
existing logger.exception(f"CRASH IN CREATE_ENTITY: {e}") and
json.dumps({"error": ...}) return as-is so the full traceback is emitted via
logger.exception().

49-56: Generic except fallback may mask connection errors as namespace-creation attempts.

When get_namespace_details fails for a non-NamespaceNotFoundException reason (e.g., connection timeout, backend unavailable), the code falls through to attempt create_namespace. This could produce confusing error messages or silently swallow the real issue. Consider logging the original exception type or narrowing the catch.

Proposed fix — log the original error with full context

         except Exception as e:
-            logger.error(f"Error ensuring namespace: {e}")
+            logger.error(f"Unexpected error checking namespace (type={type(e).__name__}): {e}")
             try:
                 _client.create_namespace(kaizen_config.namespace_id)
                 logger.info(f"Namespace '{kaizen_config.namespace_id}' created after error")
             except Exception as create_error:
-                logger.error(f"Failed to create namespace: {create_error}")
+                logger.error(f"Failed to create namespace after prior error: {create_error}")
                 raise

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@kaizen/frontend/mcp/mcp_server.py` around lines 49 - 56, The current broad
"except Exception as e" around the namespace check can mask real
connection/back-end errors and then try to create the namespace; change this to
only attempt creation when the failure is a namespace-missing error (catch the
specific NamespaceNotFoundException from your client) and otherwise log the
original exception with full context and re-raise; if you cannot import that
specific exception type, at minimum replace logger.error(f"Error ensuring
namespace: {e}") with logger.exception(...) to include traceback and then
re-raise when the exception is not a NamespaceNotFoundException; update the
block that calls _client.create_namespace(kaizen_config.namespace_id) so it runs
only under the namespace-not-found branch and keep the create_error handling
unchanged.

kaizen/db/sqlite_manager.py (1)

29-32: Move import os to the top of the file.

os is a standard-library module with no risk of circular imports. Placing it inside __init__ is unconventional and hides a dependency. Moving it to the top alongside datetime, logging, etc. is more idiomatic.
Proposed fix

At the top of the file, add os to the existing imports:
 import datetime
 import logging
+import os
 import sqlite3
 import threading
Then simplify __init__:
     def __init__(self, db_path: str | None = None):
-        import os
-
         self.db_path = db_path or os.environ.get("KAIZEN_SQLITE_PATH", "entities.sqlite.db")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@kaizen/db/sqlite_manager.py` around lines 29 - 32, Move the local import of
os out of the __init__ method and add it to the module-level imports alongside
datetime and logging; then simplify __init__ to use the already-imported os when
resolving db_path (db_path or os.environ.get("KAIZEN_SQLITE_PATH",
"entities.sqlite.db")). Ensure no other uses of os remain inside methods so the
dependency is visible at the top of the file.

tests/e2e/test_mcp.py (1)

165-172: Dynamic patch target selection is correct but fragile.

Reading os.environ["KAIZEN_BACKEND"] inside the test body works with the current fixture, but couples the test to an env-var side effect. Consider accessing request.param via a separate fixture or returning the backend type from the mcp fixture (e.g., as part of a named tuple) for explicitness.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/e2e/test_mcp.py` around lines 165 - 172, The test reads
os.environ["KAIZEN_BACKEND"] at runtime to choose patch_target, which couples
the test to an env-var side effect; change the test to accept the backend
explicitly from the test fixtures instead: have the mcp fixture (or a new
fixture) return the backend type (or a namedtuple including backend) or access
request.param in a parametrized fixture, then use that backend value to set
patch_target ("kaizen.backend.filesystem.resolve_conflicts" vs
"kaizen.backend.milvus.resolve_conflicts") and call patch(patch_target) — update
references to KAIZEN_BACKEND, patch_target, and the mcp fixture/request.param
accordingly.

vinodmut · 2026-02-18T16:35:32Z

@visahak has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 5 minutes and 22 seconds before requesting another review.

Wow! 👏

coderabbitai

♻️ Duplicate comments (1)

tests/e2e/test_mcp.py (1)

15-105: ⚠️ Potential issue | 🟠 Major

Restore env vars after the fixture to avoid cross‑test leakage.

Direct os.environ mutation isn’t restored on teardown, so later tests can inherit KAIZEN_* values. Using pytest’s monkeypatch makes teardown automatic.

✅ Proposed fix (use monkeypatch)

 `@pytest.fixture`(params=["milvus", "filesystem"])
-def mcp(request, tmp_path):
+def mcp(request, tmp_path, monkeypatch):
     backend_type = request.param
-    os.environ["KAIZEN_NAMESPACE_ID"] = "test"
-    os.environ["KAIZEN_BACKEND"] = backend_type
+    monkeypatch.setenv("KAIZEN_NAMESPACE_ID", "test")
+    monkeypatch.setenv("KAIZEN_BACKEND", backend_type)

     # Common SQLite setup
-    os.environ["KAIZEN_SQLITE_PATH"] = str(tmp_path / f"test_{backend_type}.sqlite.db")
+    monkeypatch.setenv("KAIZEN_SQLITE_PATH", str(tmp_path / f"test_{backend_type}.sqlite.db"))

     # Backend-specific setup
     milvus_db_file = None
     original_milvus_uri = None

     if backend_type == "milvus":
         # Use a unique DB file inside tmp_path for each test to avoid socket/locking issues and ensure cleanup
         milvus_db_file = str(tmp_path / f"test_{uuid.uuid4().hex[:8]}.db")
         original_milvus_uri = milvus_client_settings.uri
         milvus_client_settings.uri = milvus_db_file
         # Note: currently milvus-lite creates a file, not a dir for the uri
         # We set KAIZEN_URI just in case, though settings init should pick it up if we did it before
     elif backend_type == "filesystem":
-        os.environ["KAIZEN_DATA_DIR"] = str(tmp_path)
+        monkeypatch.setenv("KAIZEN_DATA_DIR", str(tmp_path))

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/e2e/test_mcp.py` around lines 15 - 105, The fixture mcp mutates
os.environ directly which leaks into other tests; change the fixture signature
to accept pytest's monkeypatch (def mcp(request, tmp_path, monkeypatch):) and
replace every os.environ[...] = ... with monkeypatch.setenv(...) for
KAIZEN_NAMESPACE_ID, KAIZEN_BACKEND, KAIZEN_SQLITE_PATH and KAIZEN_DATA_DIR
(only for the filesystem branch); leave the milvus_client_settings.uri override
as-is but keep restoring original_milvus_uri at teardown, and remove any manual
env cleanup since monkeypatch will automatically restore env vars after the
fixture ends; ensure references to the fixture (mcp) and module resets
(mcp_server_module._client, mcp_server_module._namespace_initialized) remain
unchanged.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@tests/e2e/test_mcp.py`:
- Around line 15-105: The fixture mcp mutates os.environ directly which leaks
into other tests; change the fixture signature to accept pytest's monkeypatch
(def mcp(request, tmp_path, monkeypatch):) and replace every os.environ[...] =
... with monkeypatch.setenv(...) for KAIZEN_NAMESPACE_ID, KAIZEN_BACKEND,
KAIZEN_SQLITE_PATH and KAIZEN_DATA_DIR (only for the filesystem branch); leave
the milvus_client_settings.uri override as-is but keep restoring
original_milvus_uri at teardown, and remove any manual env cleanup since
monkeypatch will automatically restore env vars after the fixture ends; ensure
references to the fixture (mcp) and module resets (mcp_server_module._client,
mcp_server_module._namespace_initialized) remain unchanged.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 49e615e and b118e3d.

📒 Files selected for processing (2)

kaizen/frontend/mcp/mcp_server.py
tests/e2e/test_mcp.py

coderabbitai Bot reviewed Feb 16, 2026

View reviewed changes

Comment thread kaizen/frontend/mcp/mcp_server.py Outdated

Comment thread tests/e2e/test_mcp.py Outdated

This comment was marked as outdated.

Sign in to view

visahak force-pushed the fix/mcp-sync-and-stabilization branch from fcafc16 to 77d4a35 Compare February 16, 2026 20:29

visahak added 3 commits February 16, 2026 15:31

style: fix ruff formatting and linting issues

0569ef1

fix: resolve mypy type checking errors

f8a187f

fix: Milvus backend stability and parameterized E2E tests

49e615e

AgentToolkit deleted a comment from coderabbitai Bot Feb 16, 2026

coderabbitai Bot reviewed Feb 16, 2026

View reviewed changes

Comment thread tests/e2e/test_mcp.py

visahak added 2 commits February 16, 2026 17:22

fix: make MCP server initialization thread-safe

45d4b22

test: isolate Milvus temporary database inside tmp_path

30fa85e

vinodmut approved these changes Feb 18, 2026

View reviewed changes

Merge branch 'main' into fix/mcp-sync-and-stabilization

b118e3d

coderabbitai Bot reviewed Feb 23, 2026

View reviewed changes

visahak merged commit cf6171a into AgentToolkit:main Feb 23, 2026
15 checks passed

This was referenced Feb 23, 2026

Feat/kaizen policies #68

Merged

feat: Add tip provenance tracking and metadata #73

Merged

visahak deleted the fix/mcp-sync-and-stabilization branch April 1, 2026 23:42

coderabbitai Bot mentioned this pull request Apr 30, 2026

feat(mcp): user facts tools, SSE transport hardening, and warmup #238

Merged

9 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

chore: sync with main and stabilize MCP server#59

chore: sync with main and stabilize MCP server#59
visahak merged 7 commits into
AgentToolkit:mainfrom
visahak:fix/mcp-sync-and-stabilization

visahak commented Feb 16, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Feb 16, 2026 •

edited

Loading

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

This comment was marked as outdated.

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

vinodmut commented Feb 18, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

visahak commented Feb 16, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description:

Verification:

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Feb 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

This comment was marked as outdated.

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

vinodmut commented Feb 18, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

visahak commented Feb 16, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Feb 16, 2026 •

edited

Loading