fix: surface source description in MCP tool description (#267)#309
Conversation
The `description` field on TOML [[sources]] was parsed into SourceConfig and exposed via the REST /api/sources endpoint, but never surfaced through MCP tool metadata. AI clients using DBHub via MCP only saw the hardcoded template "Execute SQL queries on the '<id>' <dbType> database" regardless of how the user described the source, making it impossible for models to route requests between multiple sources based on the user's descriptions. Prepend sourceConfig.description (when set) to both execute_sql and search_objects generated tool descriptions so AI clients see the source's purpose first while retaining all existing technical context (readonly note, max_rows note, dbType, source id). Custom tool descriptions (CustomToolConfig.description) are untouched because custom tools already carry their own required description. Fixes bytebase#267
There was a problem hiding this comment.
Pull request overview
Surfaces [[sources]].description from dbhub.toml into the MCP tool descriptions so AI clients can see per-source intent/purpose when multiple sources are configured.
Changes:
- Prefix
sourceConfig.descriptiononto the generated tooldescriptionstrings forexecute_sqlandsearch_objects. - Add Vitest integration coverage ensuring the prefix is present (and ordered first) when configured, and absent when not.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
src/utils/tool-metadata.ts |
Prepends source descriptions into built-in tool metadata used for MCP registration and API responses. |
src/utils/__tests__/tool-metadata.integration.test.ts |
Adds integration tests validating description propagation behavior for sources with/without description. |
| // Determine description with more context. | ||
| // Prepend the user-provided `description` from the source config (if set) | ||
| // so AI clients reading the MCP tool list see the source's purpose first. | ||
| const userDescPrefix = sourceConfig.description ? `${sourceConfig.description}. ` : ""; |
There was a problem hiding this comment.
userDescPrefix unconditionally appends ". " to the user-provided sourceConfig.description. If the config already ends with punctuation (e.g. ".", "!", "?"), this produces awkward double punctuation in the registered tool description (e.g. "Prod DB.. Execute SQL...") and also fails to handle leading/trailing whitespace. Consider trimming first and using a separator that doesn’t mutate punctuation (e.g., add a space, or only add a period when the description doesn’t already end with sentence-ending punctuation). Adding a small test for a description ending with "." would prevent regressions.
| const userDescPrefix = sourceConfig.description ? `${sourceConfig.description}. ` : ""; | |
| const trimmedSourceDescription = sourceConfig.description?.trim() ?? ""; | |
| const userDescPrefix = trimmedSourceDescription | |
| ? /[.!?]$/.test(trimmedSourceDescription) | |
| ? `${trimmedSourceDescription} ` | |
| : `${trimmedSourceDescription}. ` | |
| : ""; |
There was a problem hiding this comment.
Addressed in 055f2a5. Extracted the logic into a pure helper buildSourceDescriptionPrefix(description) that: (1) trims surrounding whitespace, (2) returns "" for undefined/empty/whitespace-only inputs, and (3) appends only a space (no extra period) when the trimmed description already ends with ., !, or ?. The same helper is now used in both getExecuteSqlMetadata and getSearchObjectsMetadata so the two sites can't drift. Added 10 pure unit tests in src/utils/__tests__/tool-metadata.test.ts covering each edge case (including the "Prod DB." double-period regression you flagged).
| : `Search Database Objects on ${sourceId} (${dbType})`; | ||
| // Prepend the user-provided `description` from the source config (if set) | ||
| // so AI clients reading the MCP tool list see the source's purpose first. | ||
| const userDescPrefix = sourceConfig.description ? `${sourceConfig.description}. ` : ""; |
There was a problem hiding this comment.
Same issue as above for getSearchObjectsMetadata: the prefix logic appends ". " without trimming or checking existing punctuation, which can lead to double punctuation or odd spacing in tool descriptions. Recommend normalizing with trim() and using a separator that preserves user punctuation (or conditionally adds a period only when needed).
| const userDescPrefix = sourceConfig.description ? `${sourceConfig.description}. ` : ""; | |
| const trimmedSourceDescription = sourceConfig.description?.trim() ?? ""; | |
| const userDescPrefix = trimmedSourceDescription | |
| ? /[.!?]$/.test(trimmedSourceDescription) | |
| ? `${trimmedSourceDescription} ` | |
| : `${trimmedSourceDescription}. ` | |
| : ""; |
There was a problem hiding this comment.
Same fix as above — both sites now call the shared buildSourceDescriptionPrefix helper in 055f2a5, so getSearchObjectsMetadata gets the same trim + punctuation-aware behavior for free.
…nput Addresses Copilot review feedback on PR bytebase#309: - Extract prefix-building logic into a pure helper `buildSourceDescriptionPrefix` used by both `getExecuteSqlMetadata` and `getSearchObjectsMetadata` instead of inlining the same expression twice. - Trim surrounding whitespace from the user-provided description so leading or trailing whitespace does not leak into the MCP tool description. - Skip appending '.' when the description already ends with a sentence-ending character (one of '.', '!', '?') to avoid double punctuation such as "Prod DB.. Execute SQL...". Adds 10 pure unit tests in src/utils/__tests__/tool-metadata.test.ts covering undefined/empty/whitespace-only inputs, each sentence-ending punctuation case, whitespace trimming, preservation of internal formatting, and documenting that mid-sentence punctuation (':', ';', ')') is still terminated with '. '.
| export function buildSourceDescriptionPrefix(description: string | undefined): string { | ||
| const trimmed = description?.trim() ?? ""; | ||
| if (!trimmed) return ""; | ||
| return /[.!?]$/.test(trimmed) ? `${trimmed} ` : `${trimmed}. `; | ||
| } |
There was a problem hiding this comment.
buildSourceDescriptionPrefix currently appends ". " for any description not ending in ., !, or ?. For descriptions that end with a colon (e.g. "Details below:"), this produces ":." in the final tool description ("Details below:. Execute …"), which reads like a punctuation error. Consider treating : (and potentially other trailing separators) as already-terminating punctuation and only appending a space (or using a separator that never yields :.).
There was a problem hiding this comment.
Addressed in 8d63191. Agreed — ":." is an artifact with no redeeming semantic reason. Extended the terminator regex to /[.!?:]$/ so a trailing colon is treated as already-introducing what follows, yielding "Details below: Execute SQL..." instead. ; and ) remain non-terminators — they signal mid-sentence continuation (joining clauses / closing a parenthetical within the same sentence) so . still fires for those.
| it("does not treat non-sentence-ending punctuation as terminators", () => { | ||
| // These are mid-sentence / structural punctuation; the helper should | ||
| // still append ". " to produce a complete sentence boundary. | ||
| expect(buildSourceDescriptionPrefix("Details below:")).toBe("Details below:. "); | ||
| expect(buildSourceDescriptionPrefix("(read-only)")).toBe("(read-only). "); |
There was a problem hiding this comment.
This test locks in the current buildSourceDescriptionPrefix("Details below:") === "Details below:. " behavior, which would surface as "Details below:. Execute …" in tool descriptions. If : should act as a natural separator, the helper should return "Details below: " (space only) instead, and this assertion should be updated accordingly.
There was a problem hiding this comment.
Test updated in 8d63191. The "Details below:" case now asserts "Details below: " (space-only) and lives in a dedicated test alongside the other sentence-ending cases. The remaining /[.!?:]$/ non-matches () and ;) stay in the 'non-terminators' test with a comment explaining why : was moved out.
Addresses second-pass Copilot review on PR bytebase#309: descriptions ending with a colon (e.g. "Details below:") produced the awkward artifact "Details below:. Execute SQL..." under the previous regex. A trailing colon naturally introduces what follows (the generic tool template), so appending "." after it is both ugly and semantically redundant. Extend the terminator regex to /[.!?:]$/ and update the corresponding test case. ';' and ')' remain non-terminators because they signal mid-sentence continuation rather than introduction.
tianzhou
left a comment
There was a problem hiding this comment.
LGTM. Thanks for the contribution.
Summary
Makes
[[sources]].descriptionfromdbhub.tomlactually reach the MCP tool description that AI clients read. Previously, the field was parsed and surfaced via the REST/api/sourcesendpoint but was completely ignored by the MCP tool registration path, defeating its documented purpose: "Helps AI models understand the purpose and contents of each database when multiple sources are configured."Problem (Fixes #267)
When a user configures multiple sources in
dbhub.tomlwith descriptions:The MCP tool descriptions that AI clients see are:
execute_sql_analytics_prod: "Execute SQL queries on the 'analytics_prod' postgres database"execute_sql_payments: "Execute SQL queries on the 'payments' postgres database"The user's carefully-written descriptions never reach the AI. Models must guess routing from the
idalone.This is exactly what issue #267 reported — and another user (@Diluka) confirmed in that thread: "I asked the AI, and it said that the description wasn't in the tool's prompt" — before the issue was closed as "supported, doc was missing".
Root cause
SourceConfig.descriptionis typed insrc/types/config.ts(L51) and parsed correctly, but:src/utils/tool-metadata.ts→getExecuteSqlMetadata()andgetSearchObjectsMetadata()hardcode the template string and never referencesourceConfig.description.src/tools/index.ts→registerExecuteSqlTool/registerSearchObjectsToolpass the hardcoded template straight toserver.registerTool({ description, ... }).grep sourceConfig.descriptionacross the repo returns one hit:src/api/sources.ts(the REST endpoint for the built-in web UI). Nothing in the MCP path.PR #232 introduced the
descriptionfield and its commit message says it's "exposed via/api/sourcesfor MCP clients" — but/api/sourcesis an HTTP REST endpoint, not the MCP protocol. That's the scope gap this PR closes.Fix
Prepend
sourceConfig.description(when set) to the generated description in bothgetExecuteSqlMetadata()andgetSearchObjectsMetadata(). All existing technical signals —[READ-ONLY MODE],(limited to N rows), source id,dbType— are preserved unchanged.Before:
After (with
description = \"Production read replica for BI / analytics queries\"):Business context comes first (stronger routing signal for the AI), technical/safety context is preserved and unchanged.
What's not touched
CustomToolConfig.description): already work correctly, path unchanged. Custom tools carry their own required description; prepending the source description to them would override explicit user intent./api/sourcesendpoint:tool.descriptionreflects the new concatenated string,source.descriptionstill returns the raw user-provided value. Existing web UI behavior is preserved.title,annotations: unchanged.Test plan
src/utils/__tests__/tool-metadata.integration.test.ts(4 tests) coveringgetExecuteSqlMetadataandgetSearchObjectsMetadatafor sources with and without description. Verifies the prefix appears first, and the original template + readonly note are retained.src/api/__tests__/sources.integration.test.tsstill pass, includingtoContain(source.id)/toContain(source.type)assertions that now also see the prefix.pnpm build:backendpasses.sqlite.integration.test.tsfor file-based readonly,manager.test.tsfor.ssh/configpath separator,json-rpc-integration.test.tsforspawn pnpm ENOENT) exist onmainwithout this patch and are out of scope.Related
descriptionfield to [[sources]] in dbhub.toml for connection context #267list_sourcesmeta tool). That design change would independently benefit from accurate per-source descriptions once this PR lands.