[DO NOT MERGE - WIP] Add new Cosmos DB tools#2644

Open
vcolin7 wants to merge 7 commits into main from cosmos/new-tools

Conversation

@vcolin7
Contributor

@vcolin7 vcolin7 commented May 15, 2026

What does this PR do?

Adds new Cosmos DB tools.

GitHub issue number?

Pending

Pre-merge Checklist

  • Required for All PRs
    • Read contribution guidelines
    • PR title clearly describes the change
    • Commit history is clean with descriptive messages (cleanup guide)
    • Added comprehensive tests for new/modified functionality
    • Created a changelog entry if the change falls among the following: new feature, bug fix, UI/UX update, breaking change, or updated dependencies. Follow the changelog entry guide
  • For MCP tool changes:
    • One tool per PR: This PR adds or modifies only one MCP tool for faster review cycles
    • Updated servers/Azure.Mcp.Server/README.md and/or servers/Fabric.Mcp.Server/README.md documentation
    • Validate README.md changes by running the script ./eng/scripts/Process-PackageReadMe.ps1. See Package README
    • For new or modified tool descriptions, ran ToolDescriptionEvaluator and obtained a score of 0.4 or more and a top 3 ranking for all related test prompts
    • For tools with new names, including new tools or renamed tools, update consolidated-tools.json
    • For renamed tools, follow the Tool Rename Checklist and tag the PR with the breaking-change label
    • For new tools associated with Azure services or publicly available tools/APIs/products, add URL to documentation in the PR description
  • Extra steps for Azure MCP Server tool changes:
    • Updated command list in servers/Azure.Mcp.Server/docs/azmcp-commands.md
    • Ran ./eng/scripts/Update-AzCommandsMetadata.ps1 to update tool metadata in azmcp-commands.md (required for CI)
    • Updated test prompts in servers/Azure.Mcp.Server/docs/e2eTestPrompts.md
    • 👉 For Community (non-Microsoft team member) PRs:
      • Security review: Reviewed code for security vulnerabilities, malicious code, or suspicious activities before running tests (crypto mining, spam, data exfiltration, etc.)
      • Manual tests run: added comment /azp run mcp - pullrequest - live to run Live Test Pipeline

vcolin7 added 7 commits May 13, 2026 16:28
Adds five new MCP tools to Azure.Mcp.Tools.Cosmos:
- container_schema_get: infer approximate schema from sampled documents
- item_list-recent: return the N most recently modified documents
- item_get: look up a document by id (point read when partition key supplied)
- item_text-search: FullTextContains-based search on a property
- item_vector-search: VectorDistance-based search with optional Azure OpenAI embedding generation
Copilot AI review requested due to automatic review settings May 15, 2026 04:32
@vcolin7 vcolin7 requested review from a team and xiangyan99 as code owners May 15, 2026 04:32
@vcolin7 vcolin7 changed the title from "Added changelog entry" to "[DO NOT MERGE - WIP] Add new Cosmos DB tools" May 15, 2026
Contributor

Copilot AI left a comment

Pull request overview

This PR expands the Cosmos DB MCP toolset by adding new container/item commands (schema inference, get by id, list recent, text search, vector search), along with service implementations, option definitions, tests, and documentation/changelog updates.

Changes:

  • Added new Cosmos commands and registered them in CosmosSetup (container schema get; item get/list-recent/text-search/vector-search).
  • Extended ICosmosService/CosmosService with schema inference, item retrieval, search operations, and Azure OpenAI embedding generation.
  • Added unit tests plus updated azmcp-commands.md, e2eTestPrompts.md, and a changelog entry.

Invoking Livetests

Copilot submitted PRs are not trustworthy by default. Users with write access to the repo need to validate the contents of this PR before leaving a comment with the text /azp run mcp - pullrequest - live. This will trigger the necessary livetest workflows to complete required validation.

Reviewed changes

Copilot reviewed 28 out of 28 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| tools/Azure.Mcp.Tools.Cosmos/tests/Azure.Mcp.Tools.Cosmos.UnitTests/Item/ItemVectorSearchCommandTests.cs | Adds unit tests for the new vector-search command behavior and validation. |
| tools/Azure.Mcp.Tools.Cosmos/tests/Azure.Mcp.Tools.Cosmos.UnitTests/Item/ItemTextSearchCommandTests.cs | Adds unit tests for the new text-search command and property validation. |
| tools/Azure.Mcp.Tools.Cosmos/tests/Azure.Mcp.Tools.Cosmos.UnitTests/Item/ItemListRecentCommandTests.cs | Adds unit tests for listing recent items and count validation. |
| tools/Azure.Mcp.Tools.Cosmos/tests/Azure.Mcp.Tools.Cosmos.UnitTests/Item/ItemGetCommandTests.cs | Adds unit tests for getting an item by id (including not-found behavior). |
| tools/Azure.Mcp.Tools.Cosmos/tests/Azure.Mcp.Tools.Cosmos.UnitTests/Container/ContainerSchemaGetCommandTests.cs | Adds unit tests for inferring container schema and handling service errors. |
| tools/Azure.Mcp.Tools.Cosmos/src/Validation/PropertyValidator.cs | Introduces identifier validation for property names interpolated into Cosmos SQL fragments. |
| tools/Azure.Mcp.Tools.Cosmos/src/Services/ICosmosService.cs | Extends the Cosmos service contract with schema, item operations, searches, and embedding generation. |
| tools/Azure.Mcp.Tools.Cosmos/src/Services/CosmosService.cs | Implements schema inference, recent listing, item get, full-text search, vector search, and embedding generation. |
| tools/Azure.Mcp.Tools.Cosmos/src/Options/Item/ItemVectorSearchOptions.cs | Adds option model for vector-search command parameters. |
| tools/Azure.Mcp.Tools.Cosmos/src/Options/Item/ItemTextSearchOptions.cs | Adds option model for text-search command parameters. |
| tools/Azure.Mcp.Tools.Cosmos/src/Options/Item/ItemListRecentOptions.cs | Adds option model for list-recent command parameters. |
| tools/Azure.Mcp.Tools.Cosmos/src/Options/Item/ItemGetOptions.cs | Adds option model for get-by-id command parameters. |
| tools/Azure.Mcp.Tools.Cosmos/src/Options/CosmosOptionDefinitions.cs | Adds new CLI option definitions used by the new Cosmos commands. |
| tools/Azure.Mcp.Tools.Cosmos/src/Options/Container/ContainerSchemaGetOptions.cs | Adds option model for container schema get parameters. |
| tools/Azure.Mcp.Tools.Cosmos/src/Models/SchemaProperty.cs | Adds schema property model used in schema inference results. |
| tools/Azure.Mcp.Tools.Cosmos/src/Models/EmbeddingRequest.cs | Adds embedding request model for Azure OpenAI embedding generation. |
| tools/Azure.Mcp.Tools.Cosmos/src/Models/ContainerSchema.cs | Adds container schema model returned by the service. |
| tools/Azure.Mcp.Tools.Cosmos/src/CosmosSetup.cs | Registers new commands and adds the new schema command group. |
| tools/Azure.Mcp.Tools.Cosmos/src/Commands/Item/ItemVectorSearchCommand.cs | Adds the vector-search command including option validation and response shape. |
| tools/Azure.Mcp.Tools.Cosmos/src/Commands/Item/ItemTextSearchCommand.cs | Adds the text-search command including option validation and response shape. |
| tools/Azure.Mcp.Tools.Cosmos/src/Commands/Item/ItemListRecentCommand.cs | Adds the list-recent command including count validation and response shape. |
| tools/Azure.Mcp.Tools.Cosmos/src/Commands/Item/ItemGetCommand.cs | Adds the item get command (point read when partition key provided). |
| tools/Azure.Mcp.Tools.Cosmos/src/Commands/CosmosJsonContext.cs | Registers new command result types for System.Text.Json source generation. |
| tools/Azure.Mcp.Tools.Cosmos/src/Commands/Container/ContainerSchemaGetCommand.cs | Adds the schema get command including sample-size option and response shape. |
| tools/Azure.Mcp.Tools.Cosmos/src/Azure.Mcp.Tools.Cosmos.csproj | Adds Azure.AI.OpenAI dependency needed for embedding generation. |
| servers/Azure.Mcp.Server/docs/e2eTestPrompts.md | Adds e2e prompt entries for the new Cosmos commands. |
| servers/Azure.Mcp.Server/docs/azmcp-commands.md | Adds CLI documentation entries for the new Cosmos commands. |
| servers/Azure.Mcp.Server/changelog-entries/cosmos-mcp-toolkit-tools.yaml | Adds a changelog entry describing the new Cosmos tools. |
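
For readers unfamiliar with why PropertyValidator.cs matters: property names cannot be passed as query parameters, so a name that ends up interpolated into a Cosmos SQL fragment should be checked against a strict allow-list first. The following is a minimal sketch of that idea only. The class name, method name, and the exact pattern are assumptions for illustration, and the PR's actual validator may accept a different set of names.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not the PR's PropertyValidator.
public static class PropertyNameCheck
{
    // Assumed rule: one or more dot-separated segments, each starting with a
    // letter or underscore and continuing with letters, digits, or underscores.
    // \A and \z anchor the whole string, so a trailing newline is rejected too.
    private static readonly Regex Identifier = new Regex(
        @"\A[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)*\z",
        RegexOptions.CultureInvariant);

    // Returns true only when the name is safe to interpolate as c.<name>.
    public static bool IsSafePropertyPath(string name) =>
        !string.IsNullOrEmpty(name) && Identifier.IsMatch(name);
}
```

Under this assumed rule, `title` and `address.city` pass, while anything containing spaces, quotes, or parentheses is rejected before it can reach the query text.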

Comment on lines +41 to +43
if (size < 1 || size > 100)
{
    result.AddError("--sample-size must be between 1 and 100.");

$"--{CountName}"
)
{
    Description = "Maximum number of documents to return (1-20). Defaults to 10.",

[Theory]
[InlineData("0")]
[InlineData("101")]
Contributor

@jongio jongio left a comment

Solid set of additions for Cosmos schema inspection and search. The Copilot bot already flagged the --sample-size and --count validation range mismatches, so I won't repeat those. A couple of additional items below.

The test boundary values in the new test files consistently use "101" for over-range checks, but the commands validate at much lower thresholds (1-20 for most, 1-50 for vector search). Testing at the actual boundary would catch off-by-one regressions.

Also noting this is marked WIP with CI failing across all targets, so I'm keeping this as a comment review. Happy to do another pass once it's ready.


[Theory]
[InlineData("0")]
[InlineData("101")]
Contributor

[MEDIUM] Test boundary doesn't match command validation.

The command rejects count > 20, but this test uses "101". That'll catch values way past the limit but won't verify the boundary itself. Consider testing "21" to catch off-by-one issues if the validation threshold changes.

The same pattern shows up in the other new test files too (vector search tests use "101" but validate at 50).
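
One way to act on this, and on the range mismatches flagged earlier, is to keep the limits in a single place and derive the description, the validation, and the test inputs from them. The sketch below assumes the 1-20 range and default of 10 quoted in the option description; the class and member names are hypothetical and do not come from the PR.

```csharp
using System.Collections.Generic;

// Hypothetical names for illustration; limits mirror the 1-20 range quoted above.
public static class CountLimits
{
    public const int Min = 1;
    public const int Max = 20;
    public const int Default = 10;

    // The description is built from the same constants the validator checks,
    // so the two cannot drift apart.
    public static string Description =>
        $"Maximum number of documents to return ({Min}-{Max}). Defaults to {Default}.";

    public static bool IsValid(int count) => count >= Min && count <= Max;

    // The values on either side of each edge: these are the inputs a
    // boundary test should cover instead of a far-out value such as 101.
    public static IEnumerable<(int Value, bool Expected)> BoundaryCases()
    {
        yield return (Min - 1, false); // 0: just below the range
        yield return (Min, true);      // 1: lowest accepted value
        yield return (Max, true);      // 20: highest accepted value
        yield return (Max + 1, false); // 21: just above the range
    }
}
```

In the existing xUnit tests this would translate to `[InlineData("0")]` and `[InlineData("21")]` for the rejection cases, with `"51"` as the upper case for vector search if its limit is 50 as noted above.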

var client = await GetCosmosClientAsync(accountName, subscription, authMethod, tenant, retryPolicy, cancellationToken);
var container = client.GetContainer(databaseName, containerName);

var queryDef = new QueryDefinition($"SELECT TOP {count} * FROM c ORDER BY c._ts DESC");
Contributor

[LOW] Minor inconsistency: GetRecentItems interpolates count directly into the query string, while VectorSearch parameterizes it as @topN. Both are safe since count is always an int, but the parameterized approach is more defensive. Up to you whether it's worth making consistent.
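
For reference, a parameterized version of the list-recent query could look like the sketch below. It relies on `QueryDefinition.WithParameter` from the Microsoft.Azure.Cosmos SDK and on the Cosmos DB NoSQL query language accepting a parameter name after `TOP`. The `@count` parameter name and the wrapping helper class are assumptions for illustration, not code from the PR.

```csharp
using Microsoft.Azure.Cosmos;

// Hypothetical helper for illustration; the PR builds its query inside CosmosService.
public static class RecentItemsQuery
{
    // Same query as above, with the TOP value bound as a parameter
    // instead of being interpolated into the query text.
    public static QueryDefinition Build(int count) =>
        new QueryDefinition("SELECT TOP @count * FROM c ORDER BY c._ts DESC")
            .WithParameter("@count", count);
}
```

The query text then stays constant across calls, which matches how the comment describes the vector search path handling `@topN`.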

Projects

Status: Untriaged

3 participants