Commit b65bb4b

server: expose prompt token counts in /slots endpoint (#23454)
Add n_prompt_tokens, n_prompt_tokens_processed, and n_prompt_tokens_cache to the /slots JSON response. These fields are already tracked internally but were not exposed, making it impossible for clients to monitor prompt evaluation progress during processing.
1 parent a1a69f7 commit b65bb4b

1 file changed

Lines changed: 3 additions & 0 deletions

tools/server/server-context.cpp

@@ -506,6 +506,9 @@ struct server_slot {
 
         if (ptask) {
             res["id_task"] = ptask->id;
+            res["n_prompt_tokens"] = (int32_t) prompt.tokens.size();
+            res["n_prompt_tokens_processed"] = n_prompt_tokens_processed;
+            res["n_prompt_tokens_cache"] = n_prompt_tokens_cache;
             res["params"] = ptask->params.to_json(only_metrics);
             res["next_token"] = {
                 {
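
As a rough illustration of how a client might use the three new fields to monitor prompt evaluation, the sketch below turns them into a progress fraction. The struct `slot_prompt_counts` and the function `prompt_progress` are hypothetical names and are not part of the server. The sketch also assumes that `n_prompt_tokens_cache` counts tokens reused from the cache and that `n_prompt_tokens_processed` counts only the tokens evaluated for the current task, so that the two add up toward `n_prompt_tokens`. The commit does not state this, so the result is clamped in case the fields overlap.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical client-side mirror of the three fields this commit adds to
// each /slots entry (names match the JSON keys; the struct itself is not
// part of the server).
struct slot_prompt_counts {
    int32_t n_prompt_tokens;            // total tokens in the slot's prompt
    int32_t n_prompt_tokens_processed;  // tokens evaluated so far for this task
    int32_t n_prompt_tokens_cache;      // tokens reused from the prompt cache
};

// Returns prompt-evaluation progress as a fraction in [0.0, 1.0].
// ASSUMPTION: cached and processed tokens are disjoint, so their sum is the
// number of prompt tokens already accounted for. The result is clamped so an
// overlap between the two counters cannot push it above 1.0.
inline double prompt_progress(const slot_prompt_counts & c) {
    if (c.n_prompt_tokens <= 0) {
        return 0.0;  // no prompt loaded yet; avoid dividing by zero
    }
    const int64_t done = (int64_t) c.n_prompt_tokens_cache
                       + (int64_t) c.n_prompt_tokens_processed;
    const double frac = (double) done / (double) c.n_prompt_tokens;
    return std::min(1.0, std::max(0.0, frac));
}
```

A client polling `/slots` could fill `slot_prompt_counts` from each slot's JSON object and display `prompt_progress(...) * 100` as a percentage. If the processed counter turns out to include cached tokens, dropping the cache term from `done` would be the appropriate change.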
