fix: show cold-start loading label during engine warmup #287
Merged
Signed-off-by: Logan Nguyen <lg.131.dev@gmail.com>
Description
While the built-in llama.cpp engine or a local Ollama model is cold-starting, the chat view showed only the bare typing dots with no indication of what was happening. This adds a phased loading label next to the dots that narrates the wait, and marks the engine warmed the instant a real chat request's first token proves it (rather than waiting on a separately-queued warm-up prime).
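As an illustration only, the phased label can be modeled as a small pure state machine. This is a minimal sketch, not the PR's hook: the label strings and the warming signal come from this description, while `SHOW_AFTER_MS`, `CYCLE_MS`, `LabelState`, `advance`, and `labelFor` are hypothetical names and values, and wrapping around within a phase is an assumption.

```typescript
// Hypothetical sketch of the phased loading label. Only the label strings
// are taken from the PR description; every name and timing is assumed.
type Phase = 1 | 2;

interface LabelState {
  phase: Phase;
  phaseStartMs: number; // elapsed time at which the current phase began
}

const PHASE_1 = ["Starting up the model…", "Reading model weights…"];
const PHASE_2 = ["Warming up…", "Bigger models take a little longer…"];

const SHOW_AFTER_MS = 1000; // assumed cutoff for "sub-second" waits
const CYCLE_MS = 3000; // assumed time each filler line stays on screen

const initialLabelState: LabelState = { phase: 1, phaseStartMs: 0 };

// Monotonic transition: phase 1 -> phase 2 when the warming signal has
// been seen, and never back again.
function advance(
  prev: LabelState,
  elapsedMs: number,
  warmingSignalSeen: boolean,
): LabelState {
  if (prev.phase === 2 || !warmingSignalSeen) return prev;
  return { phase: 2, phaseStartMs: elapsedMs };
}

// Returns null while the wait is still too short to narrate.
function labelFor(state: LabelState, elapsedMs: number): string | null {
  if (elapsedMs < SHOW_AFTER_MS) return null;
  const lines = state.phase === 1 ? PHASE_1 : PHASE_2;
  const sincePhaseStart = Math.max(0, elapsedMs - state.phaseStartMs);
  const idx = Math.floor(sincePhaseStart / CYCLE_MS) % lines.length;
  return lines[idx] ?? null;
}
```

Keeping the transition in a pure function makes the "never falls back to phase 1" rule easy to check in isolation. A hook would only need to feed it a timer tick and the warming flag.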
Key changes
- `useEngineLoadingLabel` hook: shows nothing for sub-second waits, then cycles phase-1 filler ("Starting up the model…", "Reading model weights…"), and jumps to phase-2 filler ("Warming up…", "Bigger models take a little longer…") the instant the built-in engine's real `warmup:builtin-warming` signal fires. Progress is monotonic: once phase 2 is entered it never falls back to phase 1.
- `useEngineWarmupStatus` hook: app-root-mounted subscriber to the existing `warmup:builtin-warming` / `-warmed` events, shared with the Settings status line.
- `ConversationView`: wires the label into the existing loading row, deferring to an active search-stage label when one is present. Remote (openai-kind) providers never show a label.
- `stream_builtin_chat` now marks the engine warmed off the real request's own first streamed token (`BuiltinWarmState::mark_warmed_by_real_request`), independent of a proactive prime that may still be queued behind it at the engine's single execution slot. This fixes the Settings "warming" status getting stuck for the duration of a response that raced ahead of its own prime.
Testing
- `bun run test:all:coverage`: frontend 100/100/100/100, backend `--fail-under-lines 100`, all passing.
- `bun run validate-build`: clean (lint, format, typecheck, build).
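For reviewers who want the ordering of the first-token warm marking spelled out, here is a rough TypeScript model of it. The real logic lives in the backend's `BuiltinWarmState`. Everything below (`WarmState`, `streamChat`, the synchronous token iterable) is a hypothetical simplification used only to show that the first streamed token flips the status to warmed and that a late prime completion is then a no-op.

```typescript
// Hypothetical model of the warm-state bookkeeping; not the backend code.
type WarmStatus = "cold" | "warming" | "warmed";

class WarmState {
  private status: WarmStatus = "cold";
  private listeners: Array<(s: WarmStatus) => void> = [];

  get current(): WarmStatus {
    return this.status;
  }

  onChange(fn: (s: WarmStatus) => void): void {
    this.listeners.push(fn);
  }

  // A proactive prime (or a cold request) has started loading the engine.
  markWarming(): void {
    if (this.status === "cold") this.set("warming");
  }

  // The proactive prime finished; a no-op if a real request got there first.
  markWarmedByPrime(): void {
    this.set("warmed");
  }

  // A real chat request produced its first token, which proves the engine is warm.
  markWarmedByRealRequest(): void {
    this.set("warmed");
  }

  private set(next: WarmStatus): void {
    if (next === this.status) return; // notify only on actual changes
    this.status = next;
    for (const fn of this.listeners) fn(next);
  }
}

// Simplified to a synchronous iterable; a real stream would be asynchronous.
function streamChat(
  tokens: Iterable<string>,
  warm: WarmState,
  onToken: (t: string) => void,
): void {
  let first = true;
  for (const t of tokens) {
    if (first) {
      warm.markWarmedByRealRequest();
      first = false;
    }
    onToken(t);
  }
}
```

In this model the status flips before the first token is delivered, so a status line subscribed through `onChange` would stop showing "warming" as soon as the response starts, even if the prime is still queued.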