Commit 46cb6ea

server : auto-insert media marker in embedding / multimodal prompts
The /embedding (and /embeddings, /v1/embeddings) endpoints failed with "number of media markers in text (0) does not match number of bitmaps (1)" when passing multimodal data via the "content" object format.

The server initializes the mtmd context with a randomized media marker (via get_media_marker()), but process_mtmd_prompt() passed the raw prompt string to mtmd_tokenize() without ensuring it contained the required markers. The CLI (mtmd-cli.cpp) already handles this by auto-prepending markers, but the server did not.

Fix: query the actual marker from the mtmd context via mtmd_get_marker() and auto-insert one per file if the prompt lacks them.

server: auto-insert missing media markers in process_mtmd_prompt

Fixes the /embedding endpoint when multimodal data is provided without corresponding media markers in the prompt string. Counts existing markers and prepends only the missing number so the count matches files.size().

Assisted-by: GitHub Copilot

Potential fix for pull request finding

This just makes the wording more accurate

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
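The counting-and-prepending step described in the message can be sketched in isolation. This is a minimal sketch, not the code from the commit: the helper names count_markers and add_missing_markers are hypothetical (the commit inlines the logic in process_mtmd_prompt()), and the marker is passed in as a plain string rather than queried from an mtmd context.

```cpp
#include <cstddef>
#include <string>

// Count non-overlapping occurrences of `marker` in `prompt`.
static size_t count_markers(const std::string & prompt, const std::string & marker) {
    if (marker.empty()) {
        return 0; // guard: searching for an empty marker would never advance
    }
    size_t n   = 0;
    size_t pos = 0;
    while ((pos = prompt.find(marker, pos)) != std::string::npos) {
        n++;
        pos += marker.size();
    }
    return n;
}

// Prepend one marker for every file that has no marker in the prompt yet.
// Hypothetical helper name, used here for illustration only.
static std::string add_missing_markers(const std::string & prompt,
                                       const std::string & marker,
                                       size_t              n_files) {
    const size_t n_markers = count_markers(prompt, marker);

    std::string out;
    if (n_markers < n_files) {
        for (size_t i = 0; i < n_files - n_markers; i++) {
            out += marker;
        }
    }
    out += prompt;
    return out;
}
```

With a placeholder marker such as "<m>", a prompt of "describe" and one file would become "<m>describe", while a prompt that already holds enough markers is returned unchanged. The real marker is randomized by the server, so "<m>" is only a stand-in.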
1 parent 27c8bb4 commit 46cb6ea

1 file changed

Lines changed: 27 additions & 1 deletion

tools/server/server-common.cpp

@@ -702,10 +702,36 @@ server_tokens process_mtmd_prompt(mtmd_context * mctx, const std::string & promp
         }
     }
     // process prompt
+    std::string prompt_adj;
+
+    const std::string marker = mtmd_get_marker(mctx);
+    const size_t marker_len = marker.size();
+
+    // count existing media markers in the prompt
+    size_t n_markers = 0;
+    size_t pos = 0;
+    while ((pos = prompt.find(marker, pos)) != std::string::npos) {
+        n_markers++;
+        pos += marker_len;
+    }
+
+    // prepend missing markers so the count matches the number of files
+<<<<<<< HEAD
+    // this mirrors the behavior in mtmd-cli.cpp, but also handles prompts that already contain some markers
+=======
+    // this mirrors the behavior in mtmd-cli.cpp but also handles partial matches
+    if (n_markers < files.size()) {
+>>>>>>> 327c5a188 (server: auto-insert missing media markers in process_mtmd_prompt)
+        for (size_t i = 0; i < files.size() - n_markers; i++) {
+            prompt_adj += marker;
+        }
+    }
+    prompt_adj += prompt;
+
     std::vector<server_tokens> inputs;
     // multimodal
     mtmd_input_text inp_txt = {
-        prompt.c_str(),
+        prompt_adj.c_str(),
         /* add_special */ true,
         /* parse_special */ true,
     };
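
Note that the added lines in this hunk include Git conflict markers (<<<<<<< HEAD, =======, >>>>>>> 327c5a188). If they are literally present in server-common.cpp, the file would not compile. Assuming the intended resolution keeps one of the two comments together with the if guard, that guard is what keeps the loop bound safe: files.size() and n_markers are both unsigned, so the subtraction would wrap around whenever the prompt holds more markers than there are files. A minimal sketch of the guarded computation follows; the helper name n_missing_markers is hypothetical and does not appear in the commit.

```cpp
#include <cstddef>

// Number of markers still to prepend; 0 when the prompt already has enough.
// Hypothetical helper, shown only to isolate the guard from the hunk above.
static size_t n_missing_markers(size_t n_files, size_t n_markers) {
    // size_t is unsigned: without this comparison, n_files - n_markers would
    // wrap to a very large value whenever n_markers > n_files
    return n_markers < n_files ? n_files - n_markers : 0;
}
```

When the prompt contains more markers than files, this yields 0 and the prompt is passed through unchanged. The count mismatch is then presumably still reported by mtmd_tokenize(), as it was before the change.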
