server : auto-insert media marker in embedding / multimodal prompts #25093
Open
TheOneWhoWill wants to merge 2 commits into
Pull request overview
This PR fixes multimodal /embedding requests in the server by ensuring the mtmd media marker is present in the text prompt before tokenization, aligning server behavior with the multimodal CLI and preventing marker/bitmap count mismatches.
Changes:
- Query the active marker from the mtmd context (mtmd_get_marker()).
- Auto-prepend media markers to the prompt before calling mtmd_tokenize() when markers are missing.
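The marker handling described above can be sketched as follows. This is a minimal illustration, not the PR's actual diff: the helper names count_markers and ensure_markers are hypothetical, and the marker is passed in as a plain string (in the server it would come from the mtmd context). The sketch counts the markers already in the prompt and prepends only the missing ones, so the total matches the number of attached files.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: count non-overlapping occurrences of `marker` in `text`.
static std::size_t count_markers(const std::string & text, const std::string & marker) {
    assert(!marker.empty()); // an empty marker would match everywhere
    std::size_t count = 0;
    for (std::size_t pos = text.find(marker);
         pos != std::string::npos;
         pos = text.find(marker, pos + marker.size())) {
        ++count;
    }
    return count;
}

// Hypothetical helper: prepend only as many markers as are missing,
// so that the marker count equals `n_files`.
static std::string ensure_markers(const std::string & prompt,
                                  const std::string & marker,
                                  std::size_t         n_files) {
    const std::size_t present = count_markers(prompt, marker);
    if (present >= n_files) {
        // Nothing to add. A surplus is left untouched so that the
        // tokenizer can still report the mismatch.
        return prompt;
    }
    std::string prefix;
    for (std::size_t i = present; i < n_files; ++i) {
        prefix += marker;
    }
    return prefix + prompt;
}
```

With a placeholder marker such as "<m>" (an assumption for illustration only), a prompt with no markers and one file becomes "<m>" followed by the original prompt, while a prompt that already carries the right number of markers is returned unchanged.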
33ac645 to
e50014f
Compare
The /embedding (and /embeddings, /v1/embeddings) endpoints failed with "number of media markers in text (0) does not match number of bitmaps (1)" when passing multimodal data via the "content" object format.
The server initializes the mtmd context with a randomized media marker (via get_media_marker()), but process_mtmd_prompt() passed the raw prompt string to mtmd_tokenize() without ensuring it contained the required markers. The CLI (mtmd-cli.cpp) already handles this by auto-prepending markers, but the server did not.
Fix: query the actual marker from the mtmd context via mtmd_get_marker() and auto-insert one per file if the prompt lacks them.

server: auto-insert missing media markers in process_mtmd_prompt
Fixes the /embedding endpoint when multimodal data is provided without corresponding media markers in the prompt string. Counts existing markers and prepends only the missing number so the count matches files.size().
Assisted-by: GitHub Copilot

Potential fix for pull request finding
This just makes the wording more accurate
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
aln730
reviewed
Jun 28, 2026
Forgot to remove merge conflict headers
Co-authored-by: AGawas <94751172+aln730@users.noreply.github.com>
Overview
Fixes #25088
Essentially, calls to the /embedding endpoint were failing because the process_mtmd_prompt function in tools/server/server-common.cpp passed the raw text of the user's prompt without including the placeholder marker from mtmd_default_marker(), and one marker is required for each attached image. I added a simple check for existing markers and inserted one per image.
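To see why a prompt without markers is rejected, here is a rough sketch of the kind of consistency check that would produce the error quoted in this PR. It is an assumed reconstruction for illustration, not the actual mtmd_tokenize() code: the function name marker_mismatch_error is hypothetical, and the marker is passed in as a plain string.

```cpp
#include <cstddef>
#include <string>

// Hypothetical check: returns an empty string when the number of markers
// in `text` equals `n_bitmaps`, otherwise an error message in the same
// wording as the one reported in this PR.
static std::string marker_mismatch_error(const std::string & text,
                                         const std::string & marker,
                                         std::size_t         n_bitmaps) {
    std::size_t n_markers = 0;
    if (!marker.empty()) {
        // Count non-overlapping occurrences of the marker.
        for (std::size_t pos = text.find(marker);
             pos != std::string::npos;
             pos = text.find(marker, pos + marker.size())) {
            ++n_markers;
        }
    }
    if (n_markers == n_bitmaps) {
        return "";
    }
    return "number of media markers in text (" + std::to_string(n_markers) +
           ") does not match number of bitmaps (" + std::to_string(n_bitmaps) + ")";
}
```

A raw prompt with one attached image and no marker fails this check with counts 0 and 1, which is the situation the /embedding endpoint ran into before the marker was inserted.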
Requirements