Ensure namespace and db conflict resolve and auto-creating with postgres#197
Conversation
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 23 minutes and 35 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (7)
📝 WalkthroughWalkthroughDefaults changed to filesystem backend; Milvus made optional; Postgres backend gained optional auto-create/bootstrap logic and metadata-aware filters; MCP SSE handling and namespace initialization simplified; docs, env example, tests, and dependency declarations updated. Changes
Sequence Diagram(s)sequenceDiagram
participant PEB as PostgresEntityBackend
participant TDB as Target DB (EVOLVE_PG_DBNAME)
participant BDB as Bootstrap DB (bootstrap_db, template1, user)
PEB->>TDB: attempt _connect(target_dbname)
TDB--xPEB: fails (missing-database)
alt auto_create_db == true
PEB->>PEB: detect missing-database
PEB->>BDB: _create_database() try candidate bootstrap_db
alt CREATE DATABASE succeeds
BDB-->>PEB: success
PEB->>TDB: reconnect to target_dbname
TDB-->>PEB: connection established
else candidate fails
PEB->>BDB: try next candidate (template1, user)
BDB-->>PEB: success or exhausted
alt exhausted
PEB-->>PEB: raise EvolveException (tried candidates + last error)
end
end
else auto_create_db == false
PEB-->>PEB: re-raise missing-database error
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
docs/guides/configuration.md (1)
42-110:⚠️ Potential issue | 🟡 MinorDocumentation contradicts the new default backend.
EvolveConfig.backendnow defaults to"filesystem"(peraltk_evolve/config/evolve.pyline 7 and the newdocs/guides/backend-configuration.md), but this file still describesmilvusas the default in three places:
- Line 44:
EVOLVE_BACKENDdefault column saysmilvus.- Line 95:
**Milvus** (default)in the Storage Backends table.- Lines 102–103:
# Use Milvus backend (default)comment.These should be updated to point at
filesystemto match the code change and avoid confusing users.📝 Proposed fix
-| `EVOLVE_BACKEND` | Backend provider (`milvus`, `filesystem`, or `postgres`) | `milvus` | +| `EVOLVE_BACKEND` | Backend provider (`milvus`, `filesystem`, or `postgres`) | `filesystem` |-| **Milvus** (default) | Vector database with embeddings | Semantic similarity | Production | -| **Filesystem** | JSON files, no embeddings | Text substring match | Development/testing | +| **Filesystem** (default) | JSON files, no embeddings | Text substring match | Development/testing | +| **Milvus** | Vector database with embeddings | Semantic similarity | Production |-# Use Milvus backend (default) +# Use Milvus backend export EVOLVE_BACKEND=milvus -# Use Filesystem backend +# Use Filesystem backend (default) export EVOLVE_BACKEND=filesystem🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/guides/configuration.md` around lines 42 - 110, The doc incorrectly states Milvus is the default; update the three spots to reflect the code default "filesystem": change the EVOLVE_BACKEND default column value from `milvus` to `filesystem` (referencing the EVOLVE_BACKEND variable), update the Storage Backends table row header from "**Milvus** (default)" to remove "(default)" and instead mark "**Filesystem** (default)" or annotate Filesystem as the default, and update the example comment line `# Use Milvus backend (default)` to `# Use Filesystem backend (default)` so the documentation matches EvolveConfig.backend and the backend-configuration.md changes.
🧹 Nitpick comments (1)
tests/unit/test_mcp_server.py (1)
106-128: Consider a fixture for the global-state reset.The try/finally restoration of
_clientand_namespace_initializedworks, but if another test also exercisesget_client()it will share state with this one. Wrapping the save/reset/restore in a pytest fixture (or usingmonkeypatch.setattrfor the module attributes, which auto-reverts) would be more robust and reusable.-def test_get_client_uses_idempotent_namespace_bootstrap(monkeypatch): - original_client = mcp_server_module._client - original_initialized = mcp_server_module._namespace_initialized - - fake_client = MagicMock() - ... - try: - mcp_server_module._client = None - mcp_server_module._namespace_initialized = False - monkeypatch.setattr(mcp_server_module, "EvolveClient", lambda: fake_client) - ... - finally: - mcp_server_module._client = original_client - mcp_server_module._namespace_initialized = original_initialized +def test_get_client_uses_idempotent_namespace_bootstrap(monkeypatch): + fake_client = MagicMock() + fake_client.ensure_namespace.return_value = Namespace( + id="evolve", created_at=datetime.datetime.now(datetime.UTC) + ) + monkeypatch.setattr(mcp_server_module, "_client", None) + monkeypatch.setattr(mcp_server_module, "_namespace_initialized", False) + monkeypatch.setattr(mcp_server_module, "EvolveClient", lambda: fake_client) + + client = mcp_server_module.get_client() + assert client is fake_client + fake_client.ensure_namespace.assert_called_once_with( + mcp_server_module.evolve_config.namespace_id + ) + fake_client.get_namespace_details.assert_not_called() + fake_client.create_namespace.assert_not_called()Otherwise, the assertions correctly validate the new idempotent namespace-bootstrap behavior. As per coding guidelines, this test is already marked via the module-level
pytestmark = pytest.mark.unit.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/test_mcp_server.py` around lines 106 - 128, Replace the manual try/finally save-and-restore of mcp_server_module._client and mcp_server_module._namespace_initialized in the test with a pytest fixture or monkeypatch.setattr so the module globals are automatically reverted; specifically, remove the direct assignments around the call to get_client() and instead use monkeypatch.setattr(mcp_server_module, "_client", None) and monkeypatch.setattr(mcp_server_module, "_namespace_initialized", False) (or create a fixture that does the save/reset/restore) while keeping the monkeypatch replacement of EvolveClient and the existing assertions on get_client(), ensure_namespace, get_namespace_details, and create_namespace.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@altk_evolve/backend/postgres.py`:
- Around line 111-123: The CREATE DATABASE race needs to be treated as success:
in _connect_target_db when auto_create_db triggers an admin CREATE DATABASE via
admin_conn and the cursor.execute raises an exception, inspect the exception's
SQLSTATE/pgcode and if it equals '42P04' (duplicate_database) treat it as
success by attempting to connect to the target DB and returning rather than
treating it as a bootstrap failure; update the exception handling around the
admin_conn cursor/execute (use the existing last_error and logger.debug) to
detect pgcode == '42P04' and proceed to reconnect to the target DB, and add a
unit test that mocks admin_conn cursor.execute to raise an exception with pgcode
'42P04' while admin connection succeeds and asserts that _connect_target_db
connects to the target DB successfully.
In `@altk_evolve/config/evolve.py`:
- Line 7: The change of the default backend from "milvus" to "filesystem" can
silently break upgrades; update the release/changelog to call out this breaking
change and, in code where the config `backend` default is set (the `backend:
Literal["milvus","filesystem","postgres"] = "filesystem"` in
altk_evolve/config/evolve.py), add a one-time startup warning when the
environment variable `EVOLVE_BACKEND` is not set to inform users that the
default has changed (log a clear message indicating prior default was "milvus"
and advise setting EVOLVE_BACKEND to "milvus" to keep previous behavior). Ensure
the warning is emitted only once per process (use a static flag or startup hook)
so it is visible on upgrade without spamming logs.
In `@docs/guides/backend-configuration.md`:
- Around line 45-51: Update the pros list entry in backend-configuration.md to
indicate that automatic DB creation is opt-in: clarify the line "Automatic
database creation with fallback logic" by adding a parenthetical noting it's
opt-in via EVOLVE_PG_AUTO_CREATE_DB=true, and reference the config default in
altk_evolve/config/postgres.py (auto_create_db defaults to False) so readers
don't assume it is enabled by default.
---
Outside diff comments:
In `@docs/guides/configuration.md`:
- Around line 42-110: The doc incorrectly states Milvus is the default; update
the three spots to reflect the code default "filesystem": change the
EVOLVE_BACKEND default column value from `milvus` to `filesystem` (referencing
the EVOLVE_BACKEND variable), update the Storage Backends table row header from
"**Milvus** (default)" to remove "(default)" and instead mark "**Filesystem**
(default)" or annotate Filesystem as the default, and update the example comment
line `# Use Milvus backend (default)` to `# Use Filesystem backend (default)` so
the documentation matches EvolveConfig.backend and the backend-configuration.md
changes.
---
Nitpick comments:
In `@tests/unit/test_mcp_server.py`:
- Around line 106-128: Replace the manual try/finally save-and-restore of
mcp_server_module._client and mcp_server_module._namespace_initialized in the
test with a pytest fixture or monkeypatch.setattr so the module globals are
automatically reverted; specifically, remove the direct assignments around the
call to get_client() and instead use monkeypatch.setattr(mcp_server_module,
"_client", None) and monkeypatch.setattr(mcp_server_module,
"_namespace_initialized", False) (or create a fixture that does the
save/reset/restore) while keeping the monkeypatch replacement of EvolveClient
and the existing assertions on get_client(), ensure_namespace,
get_namespace_details, and create_namespace.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1b50e973-84f9-4f25-8a16-07a1fd0bbc5b
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (12)
.env.example.secrets.baselineREADME.mdaltk_evolve/backend/postgres.pyaltk_evolve/config/evolve.pyaltk_evolve/config/postgres.pyaltk_evolve/frontend/mcp/mcp_server.pydocs/guides/backend-configuration.mddocs/guides/configuration.mdpyproject.tomltests/unit/test_mcp_server.pytests/unit/test_postgres_backend.py
c9880b8 to
ca840f0
Compare
Postgres searches were treating metadata-backed filters like task_id as physical columns, which broke save_trajectory lookups. The MCP SSE launcher was also bypassing the disconnect-safe HTTP transport, so client disconnect races surfaced as ASGI exception groups during response flush. This change routes unknown Postgres filters through JSONB metadata, preserves explicit schema-field filtering, and launches SSE through the resilient Starlette transport so benign disconnects are swallowed at the transport boundary. The tests lock the filter parameterization and the launcher wiring to prevent the same regressions from returning. Constraint: Keep backend filter semantics aligned across filesystem, Milvus, and Postgres without changing MCP tool call shapes Constraint: Preserve FastMCP stdio startup behavior while changing only the SSE launch path Rejected: Special-case task_id inside save_trajectory | would leave Postgres filter semantics inconsistent for other metadata keys Rejected: Catch ClosedResourceError higher in the ASGI stack | broader than necessary and duplicates the existing transport hardening point Confidence: high Scope-risk: moderate Reversibility: clean Directive: Route new non-schema Postgres filters through metadata unless a real column is added and covered across backends Tested: uv run pytest -v tests/unit/test_postgres_backend.py tests/unit/test_mcp_server.py Tested: uv run pytest -v tests/unit/test_mcp_http_transport.py tests/unit/test_mcp_launcher.py Not-tested: Live SSE session with a real disconnecting client against uvicorn Related: save_trajectory MCP SSE transport
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (2)
altk_evolve/backend/postgres.py (1)
108-137: Bootstrap fallback will mask non-raceCREATE DATABASEfailures and produce misleading errors.The outer
except Exception as eat L127 catches both connection failures and re-raisedCREATE DATABASEerrors from the inner block (e.g., insufficient privileges, invalid encoding, name collisions that aren't duplicate_database). In those cases we silently move on to the next bootstrap candidate and retry the same failing CREATE, eventually raising a generic "Tried bootstrap databases: …; Last error: …" that obscures the real root cause (which was usually the first attempt, not the last).Consider splitting the error handling: only fall through to the next candidate on connect failures to the admin DB; for
CREATE DATABASEerrors other than duplicate_database, raise immediately.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@altk_evolve/backend/postgres.py` around lines 108 - 137, The current loop catches all exceptions (including CREATE DATABASE failures) in the outer except and treats them as bootstrap connection failures; change the control flow so only connection errors from self._connect cause a continue to the next bootstrap_db while CREATE DATABASE errors (except duplicate_database / pgcode "42P04") are raised immediately. Concretely: call self._connect(bootstrap_db) inside a small try/except that only handles connection-related exceptions (or catch specific connection exception types) and sets last_error/continues; keep the CREATE DATABASE block (with create_error handling for pgcode "42P04") separate so non-duplicate create_error is re-raised and not swallowed by the outer loop; preserve use of admin_conn, create_error, last_error and raise EvolveException only if all bootstrap connection attempts truly failed.tests/unit/test_mcp_launcher.py (1)
63-72: Assert the resilient factory receives the launcher MCP server.The test would still pass if
run_sse_server()calledcreate_resilient_sse_app()with the wrong server object. Capturing the factory argument closes that gap.Proposed test strengthening
def test_run_sse_server_uses_resilient_http_transport(monkeypatch) -> None: resilient_app = object() captured: list[tuple[object, str, int, str]] = [] + factory_calls: list[object] = [] - monkeypatch.setattr(launcher, "create_resilient_sse_app", lambda server: resilient_app) + def fake_create_resilient_sse_app(server): + factory_calls.append(server) + return resilient_app + + monkeypatch.setattr(launcher, "create_resilient_sse_app", fake_create_resilient_sse_app) monkeypatch.setattr( launcher.uvicorn, "run", lambda app, host, port, log_level: captured.append((app, host, port, log_level)), ) launcher.run_sse_server(host="127.0.0.1", port=8201) + assert factory_calls == [launcher.mcp] assert captured == [(resilient_app, "127.0.0.1", 8201, "warning")]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/test_mcp_launcher.py` around lines 63 - 72, The test must also verify that create_resilient_sse_app was called with the launcher MCP server: modify the monkeypatch for launcher.create_resilient_sse_app so that its replacement appends the received server argument into captured before returning resilient_app, keep the monkeypatch of launcher.uvicorn.run as-is, call launcher.run_sse_server(...), then assert that the captured list contains the factory-captured server object (the item you appended) and that the uvicorn.run call received resilient_app (i.e. verify the factory argument equals the server object produced/passed by run_sse_server and the uvicorn run tuple remains (resilient_app, "127.0.0.1", 8201, "warning")).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@altk_evolve/backend/postgres.py`:
- Around line 293-305: The new filter routing silently treats any key not in
self._schema_filter_fields as a metadata query (building metadata_key and
appending metadata @> ...), which can break callers that expect bare keys like
"task_id" or "user_id" to match top-level columns; update the filter handling in
the method that constructs where_parts/params so that unknown bare keys are
rejected (raise ValueError) or require callers to prefix metadata keys with
"metadata." (validate key.startswith("metadata.") before compiling jsonb
predicates), and add a clear log/error message explaining accepted keys
(referencing self._schema_filter_fields, filters, metadata_key, where_parts,
params) and update docs/release notes or add a migration note for existing
writers to move fields into metadata if you choose the metadata-prefixed
behavior.
In `@tests/unit/test_mcp_launcher.py`:
- Line 59: This module's tests are missing the pytest unit marker; add a
module-level pytest marker by importing pytest and setting pytestmark =
pytest.mark.unit at the top of the test_mcp_launcher.py file so the test
function test_run_sse_server_uses_resilient_http_transport and the rest of the
module are marked as unit tests.
In `@tests/unit/test_postgres_backend.py`:
- Around line 256-296: The test and production code must use psycopg3's SQLSTATE
attribute: change the duplicate-database detection in PostgresEntityBackend (the
error handling logic that currently checks exception.pgcode) to check
exception.sqlstate == "42P04" instead, and update the test
DuplicateDatabaseError mock to expose sqlstate = "42P04" (remove or replace
pgcode) so the race-condition branch is exercised; ensure any other checks in
PostgresEntityBackend that reference pgcode are switched to sqlstate for
compatibility with psycopg3.
---
Nitpick comments:
In `@altk_evolve/backend/postgres.py`:
- Around line 108-137: The current loop catches all exceptions (including CREATE
DATABASE failures) in the outer except and treats them as bootstrap connection
failures; change the control flow so only connection errors from self._connect
cause a continue to the next bootstrap_db while CREATE DATABASE errors (except
duplicate_database / pgcode "42P04") are raised immediately. Concretely: call
self._connect(bootstrap_db) inside a small try/except that only handles
connection-related exceptions (or catch specific connection exception types) and
sets last_error/continues; keep the CREATE DATABASE block (with create_error
handling for pgcode "42P04") separate so non-duplicate create_error is re-raised
and not swallowed by the outer loop; preserve use of admin_conn, create_error,
last_error and raise EvolveException only if all bootstrap connection attempts
truly failed.
In `@tests/unit/test_mcp_launcher.py`:
- Around line 63-72: The test must also verify that create_resilient_sse_app was
called with the launcher MCP server: modify the monkeypatch for
launcher.create_resilient_sse_app so that its replacement appends the received
server argument into captured before returning resilient_app, keep the
monkeypatch of launcher.uvicorn.run as-is, call launcher.run_sse_server(...),
then assert that the captured list contains the factory-captured server object
(the item you appended) and that the uvicorn.run call received resilient_app
(i.e. verify the factory argument equals the server object produced/passed by
run_sse_server and the uvicorn run tuple remains (resilient_app, "127.0.0.1",
8201, "warning")).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 2fb7bd1f-4c0f-4fdc-8089-25236cfcecd1
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (16)
.env.example.secrets.baselineREADME.mdaltk_evolve/backend/postgres.pyaltk_evolve/config/evolve.pyaltk_evolve/config/postgres.pyaltk_evolve/frontend/mcp/__main__.pyaltk_evolve/frontend/mcp/http_transport.pyaltk_evolve/frontend/mcp/mcp_server.pydocs/guides/backend-configuration.mddocs/guides/configuration.mdpyproject.tomltests/unit/test_mcp_http_transport.pytests/unit/test_mcp_launcher.pytests/unit/test_mcp_server.pytests/unit/test_postgres_backend.py
✅ Files skipped from review due to trivial changes (6)
- README.md
- altk_evolve/config/postgres.py
- .secrets.baseline
- docs/guides/configuration.md
- docs/guides/backend-configuration.md
- tests/unit/test_mcp_server.py
🚧 Files skipped from review as they are similar to previous changes (3)
- altk_evolve/config/evolve.py
- .env.example
- pyproject.toml
Postgres search filters now reject unknown bare keys instead of silently routing them into JSON metadata predicates, which keeps top-level column matching explicit and aligns internal callers on the metadata.<key> contract. The same pass also switches duplicate-database race handling to psycopg3's sqlstate attribute and marks the MCP launcher test module as unit-scoped. Constraint: Existing metadata filters need an explicit metadata. prefix in the Postgres backend Constraint: Pre-commit updates .secrets.baseline when tracked line numbers change Rejected: Keep accepting bare unknown keys as metadata lookups | silently breaks callers expecting schema-column matching Confidence: high Scope-risk: moderate Reversibility: clean Directive: Preserve explicit schema-vs-metadata filter routing across backends instead of reintroducing fallback inference Tested: uv run pytest -v tests/unit/test_postgres_backend.py -k 'search_entities'; uv run pytest -v -m e2e tests/e2e/test_mcp.py::test_save_trajectory_and_retrieve_guidelines; uv run pytest -v tests/unit/test_mcp_launcher.py; uv run pytest -v tests/unit/test_postgres_backend.py -k 'race_condition or auto_creates_missing_database or missing_database_without_auto_create_raises'; uv run ruff check altk_evolve/backend/postgres.py altk_evolve/frontend/mcp/mcp_server.py tests/unit/test_postgres_backend.py tests/unit/test_mcp_launcher.py Not-tested: Cross-backend consistency changes for filesystem and Milvus filter validation
|
I didn't check, but we May need to update the documentation on milvus and postgres. |
Summary by CodeRabbit