feat(chat): tiered local AI model strategy for guest users #100
Walkthrough
This PR adds local model inference capabilities to the chat application. It introduces server-side rejection of local model requests, client-side consent management, device capability detection, Web Worker-based inference, and updated model configuration logic to differentiate between local and cloud providers.
Changes
Sequence Diagram(s)
sequenceDiagram
participant User
participant UI as ChatArea<br/>(React)
participant LocalConsent as useLocalModelConsent<br/>(Hook)
participant LocalTransport as localTransport<br/>(Browser)
participant Worker as inference.worker<br/>(Web Worker)
participant HF as HuggingFace<br/>Transformers
User->>UI: Send message (local model)
UI->>LocalConsent: requestConsent()
LocalConsent->>LocalConsent: resolveLocalModelSpec()
LocalConsent->>UI: Show consent dialog
User->>UI: Confirm download
LocalConsent->>LocalConsent: setLocalModelConsent()
UI->>LocalTransport: streamLocalChatRequest(spec, messages)
LocalTransport->>Worker: POST generate message<br/>(modelId, device, dtype, messages)
Worker->>HF: Load/cache pipeline
HF-->>Worker: TextGenerationPipeline ready
Worker->>HF: Stream generation + callbacks
HF-->>Worker: progress events (download %)
Worker-->>LocalTransport: progress message
LocalTransport->>UI: onProgress callback
HF-->>Worker: text chunks
Worker-->>LocalTransport: chunk message
LocalTransport->>UI: onChunk callback + accumulate
HF-->>Worker: completion
Worker-->>LocalTransport: complete message
LocalTransport-->>UI: Resolved string
UI->>User: Display response
sequenceDiagram
participant User
participant UI as ChatArea<br/>(React)
participant API as /api/chat<br/>(Route Handler)
participant Service as chatService<br/>(LangChain)
participant Provider as Cloud Provider<br/>(OpenAI/Anthropic/Google)
User->>UI: Send message (cloud model)
UI->>API: POST /api/chat<br/>(selectedModel, messages)
API->>API: Check isLocalModel()<br/>➜ false, continue
API->>Service: buildChatModel(config)
Service->>Service: Validate API key present
Service->>Service: Create LLM instance
API->>Provider: Stream chat completion
Provider-->>API: Response chunks
API-->>UI: SSE stream
UI->>User: Display response
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~60 minutes
Possibly related PRs
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
When a local model is selected, resolveLocalModelSpec() was called once in requestConsent() and again inside runLocalStream. Now the spec resolved during consent is passed directly to sendMessage and forwarded to runLocalStream, which only falls back to re-resolving if no spec is provided.
…fore first token" This reverts commit 16adb89.
…detection in local capabilities
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
components/modals/settings/index.tsx (1)
208-250: ⚠️ Potential issue | 🟠 Major
Apply the same availability filtering when saving settings.
`onSubmit` only removes models missing provider keys, so `local-auto` survives when any API key is present. Example: with only `local-auto` enabled, adding an Anthropic key saves `selectedModel: "local-auto"` even though `getAvailableModels()` suppresses Local models in that state. Use `getAvailableModels(data)` as the saved enabled list, and disable/hide Local options while keys are present.
🛠️ Proposed fix direction
 import {
   getAvailableModels,
   getConfigIssues,
+  hasAnyApiKey,
   hasRequiredKeyForModel,
 } from "@/lib/chat/config";

-    const enabledModels =
-      issues.enabledModelsMissingKeys.length > 0
-        ? data.enabledModels.filter(
-            (m) => !issues.enabledModelsMissingKeys.includes(m as ModelValue)
-          )
-        : data.enabledModels;
+    const enabledModels = getAvailableModels(data);

     if (enabledModels.length === 0) {
       toast.error("Please enable at least one model.");
       return;
     }

       const hasRequiredKey = hasRequiredKeyForModel(option.value, {
         openAIKey: watchedValues.openAIKey,
         anthropicKey: watchedValues.anthropicKey,
         googleKey: watchedValues.googleKey,
       });
+      const suppressLocal =
+        hasAnyApiKey(watchedValues) && option.provider === "Local";

       return {
         value: option.value,
         label: option.label,
         badge: option.provider,
-        disabled: !hasRequiredKey,
+        disabled: !hasRequiredKey || suppressLocal,
       };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/modals/settings/index.tsx` around lines 208 - 250, The form submit currently filters only models missing keys via getConfigIssues, allowing local-auto to remain enabled when provider keys exist; update onSubmit to compute the saved enabledModels using getAvailableModels(data) (or equivalent availability function) so models suppressed by getAvailableModels are removed before setConfig, and ensure selectedModel is validated against that filtered list; reference onSubmit, getConfigIssues, getAvailableModels, hasRequiredKeyForModel, MODEL_OPTIONS and setConfig when making the change and keep the existing toast/handleClose behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@hooks/useCircleChat.ts`:
- Around line 38-49: The localStorage marker
`enki-model-downloaded-${spec.modelId}` used in useCircleChat.ts inside the
onProgress handler can go out of sync with actual Cache Storage (so downloads
may be misreported as "loading-cache"); update the logic so the marker is kept
in sync with cache operations: either clear that localStorage key whenever
`clearLocalModelCache` (or any cache-clearing function) runs, or replace the
marker check in `onProgress` with an actual existence check against the Cache
Storage (query the relevant cache for the model URL by `spec.modelId`) and call
`options.onModelStatus` based on the real cache result instead of the stale
localStorage flag.
- Around line 16-24: runLocalStream currently auto-resolves a local model spec
(via resolveLocalModelSpec) when options.spec is omitted, which allows callers
(e.g., sendMessage) to trigger a local model download without user consent;
change runLocalStream to require options.spec (do not call
resolveLocalModelSpec) and immediately throw/return an error if spec is
undefined, and update any callers to only call runLocalStream with an explicit,
consent-approved spec (the same guard should be applied to the other local-path
streaming logic in the 88-140 region), ensuring the local-download path cannot
run unless a spec was explicitly provided by the consent flow.
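A minimal sketch of the consent guard described in the second item above, assuming the shapes shown here. `LocalModelSpec`, `RunLocalStreamOptions`, and the injected `StreamFn` are hypothetical stand-ins for the PR's real types and transport call, not the actual implementation:

```typescript
// Hypothetical shapes for illustration; the PR's real types may differ.
interface LocalModelSpec {
  modelId: string;
  device: "webgpu" | "wasm";
  dtype: string;
}

interface RunLocalStreamOptions {
  spec?: LocalModelSpec; // only set after the consent dialog is confirmed
  onChunk?: (text: string) => void;
}

// Stand-in for the transport call (the PR names it streamLocalChatRequest).
type StreamFn = (
  spec: LocalModelSpec,
  prompt: string,
  onChunk?: (text: string) => void
) => Promise<string>;

async function runLocalStream(
  prompt: string,
  options: RunLocalStreamOptions,
  stream: StreamFn
): Promise<string> {
  // No fallback to resolveLocalModelSpec(): a missing spec means no consent.
  if (!options.spec) {
    throw new Error("Local model requires a consent-approved spec");
  }
  return stream(options.spec, prompt, options.onChunk);
}
```

Injecting the stream function keeps the guard testable without a Worker; in the hook itself the transport would presumably be imported directly.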
In `@hooks/useLocalModelConsent.ts`:
- Around line 13-21: Resolve the model first, then scope consent and pending
prompts to that modelId: call resolveLocalModelSpec() and extract a stable model
identifier/version, then replace hasLocalModelConsent() and
setLocalModelConsent() with per-model variants (e.g.,
hasLocalModelConsent(modelId), setLocalModelConsent(modelId, value)). Instead of
a single resolverRef.current, store pending resolvers per modelId (e.g.,
Map<modelId, resolver[]>) so concurrent requestConsent() calls for the same
model enqueue their promises and do not overwrite each other; if consent already
exists for that model return resolved immediately, otherwise setSpec(resolved),
setOpen(true) if not already open for that model, push the resolver into the
model's resolver list, and when the user confirms/denies resolve all queued
resolvers and persist the per-model consent via setLocalModelConsent(modelId,
confirmed).
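The per-model queueing described above could look roughly like the following, written as a plain class rather than a React hook so it stays self-contained. `ConsentQueue`, `request`, and `settle` are hypothetical names, and the in-memory `granted` set stands in for a persisted per-model `setLocalModelConsent(modelId, value)`:

```typescript
type Resolver = (granted: boolean) => void;

class ConsentQueue {
  // Pending promise resolvers, keyed by modelId.
  private pending = new Map<string, Resolver[]>();
  // Assumption: in-memory stand-in for persisted per-model consent.
  private granted = new Set<string>();

  constructor(private readonly openDialog: (modelId: string) => void) {}

  request(modelId: string): Promise<boolean> {
    if (this.granted.has(modelId)) return Promise.resolve(true);
    return new Promise<boolean>((resolve) => {
      const queue = this.pending.get(modelId);
      if (queue) {
        // A dialog is already open for this model: enqueue, do not overwrite.
        queue.push(resolve);
        return;
      }
      this.pending.set(modelId, [resolve]);
      this.openDialog(modelId);
    });
  }

  // Called when the user confirms or denies the dialog for modelId.
  settle(modelId: string, confirmed: boolean): void {
    const queue = this.pending.get(modelId) ?? [];
    this.pending.delete(modelId);
    if (confirmed) this.granted.add(modelId);
    for (const resolve of queue) resolve(confirmed);
  }
}
```

Two concurrent `request("m1")` calls open one dialog and both resolve when `settle("m1", …)` runs; a denial is not remembered, so a later request prompts again.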
In `@lib/local/consent.ts`:
- Around line 3-10: The consent functions should defensively handle missing or
blocked localStorage: update hasLocalModelConsent() to first guard for SSR
(typeof window === "undefined") and wrap localStorage.getItem(CONSENT_KEY) in a
try/catch returning false on error, and update setLocalModelConsent() to include
the same SSR guard (return early if window is undefined) and wrap
localStorage.setItem(CONSENT_KEY, "true") in a try/catch that safely
ignores/storage-errors; reference the existing functions hasLocalModelConsent,
setLocalModelConsent and the CONSENT_KEY constant when making these changes.
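One way to sketch the defensive storage access suggested above. The `CONSENT_KEY` value and the `KeyValueStore` interface are assumptions for illustration, and the store is looked up on `globalThis` instead of `window` so the snippet also type-checks outside a DOM environment:

```typescript
// Assumption: the real key string in lib/local/consent.ts may differ.
const CONSENT_KEY = "local-model-consent";

// Minimal subset of the Web Storage API used here.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Returns null during SSR or when storage access itself throws.
function resolveStore(): KeyValueStore | null {
  try {
    const scope = globalThis as unknown as { localStorage?: KeyValueStore };
    return scope.localStorage ?? null;
  } catch {
    return null;
  }
}

function hasLocalModelConsent(
  store: KeyValueStore | null = resolveStore()
): boolean {
  if (!store) return false;
  try {
    return store.getItem(CONSENT_KEY) === "true";
  } catch {
    return false; // blocked storage is treated as "no consent"
  }
}

function setLocalModelConsent(
  store: KeyValueStore | null = resolveStore()
): void {
  if (!store) return;
  try {
    store.setItem(CONSENT_KEY, "true");
  } catch {
    // Quota or privacy-mode errors are ignored; the user is asked again later.
  }
}
```

The optional `store` parameter exists only to make the behavior easy to exercise with a fake; callers in the app would use the defaults.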
In `@lib/local/inference.worker.ts`:
- Around line 96-106: The pipeline() call that awaits model loading
(TextGenerationPipeline) must be made abort-aware: while awaiting
pipeline(modelId...) monitor the abort signal/message for this request (the same
requestId used in scope.postMessage) and if an abort is received before pipeline
resolves, immediately stop the worker (e.g., post an "aborted" message via
scope.postMessage with requestId and then call self.close()/terminate) so the
long-running model download doesn't continue; ensure the parent knows to
recreate the worker for future requests. In practice: start pipeline(...) and
concurrently listen for an abort (AbortSignal or scope.onmessage "abort" for
requestId), and if aborted before pipeline resolves, skip using the returned
TextGenerationPipeline and terminate this worker instance instead of waiting for
pipeline to finish.
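The race between model loading and an abort could be sketched as below. This is an illustration under assumptions: `loadUnlessAborted` and `LoadOutcome` are hypothetical names, `load` stands in for the promise returned by `pipeline(modelId, ...)`, `aborted` is a promise that resolves when an abort message for the request arrives, and `terminate` stands in for posting an "aborted" message and calling `self.close()`:

```typescript
type LoadOutcome<T> =
  | { status: "ready"; pipeline: T }
  | { status: "aborted" };

async function loadUnlessAborted<T>(
  load: Promise<T>,
  aborted: Promise<void>,
  terminate: () => void
): Promise<LoadOutcome<T>> {
  // Whichever settles first wins; a later result from the loser is ignored.
  const outcome = await Promise.race([
    load.then((pipeline): LoadOutcome<T> => ({ status: "ready", pipeline })),
    aborted.then((): LoadOutcome<T> => ({ status: "aborted" })),
  ]);
  if (outcome.status === "aborted") {
    // Stop the worker so the in-flight download does not keep running.
    terminate();
  }
  return outcome;
}
```

Note that `Promise.race` alone does not cancel the download; only terminating the worker does, which is why the parent then has to create a fresh worker for the next request.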
In `@lib/local/localTransport.ts`:
- Around line 126-137: The promise handling worker communication currently only
listens for "message" and the abort signal, so if the worker errors or emits
"messageerror" the request never settles; add listeners for "error" and
"messageerror" on the Worker instance (the same object referenced as w) that
call the same cleanup logic as handleAbort and reject the pending stream/promise
with a clear Error (include requestId and spec.modelId in the message), and
ensure these error listeners are removed alongside handleMessage and the abort
listener when the request completes; reference the existing handleMessage and
handleAbort handlers, the Worker variable w, and the requestId/spec identifiers
when implementing the rejection and cleanup.
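A sketch of settling the request on worker failure, assuming a simplified message shape. `WorkerLike`, `awaitWorkerResult`, and the `{ requestId, type, text }` payload are hypothetical; the abort-signal handling from the existing code is omitted for brevity:

```typescript
// Minimal subset of the Worker event API used by this sketch.
interface WorkerLike {
  addEventListener(type: string, listener: (event: unknown) => void): void;
  removeEventListener(type: string, listener: (event: unknown) => void): void;
}

// Assumed payload shape; the PR's worker protocol may carry more fields.
interface WorkerPayload {
  requestId?: string;
  type?: string;
  text?: string;
}

function awaitWorkerResult(
  w: WorkerLike,
  requestId: string,
  modelId: string
): Promise<string> {
  return new Promise<string>((resolve, reject) => {
    const handleMessage = (event: unknown): void => {
      const data = (event as { data?: WorkerPayload }).data;
      if (!data || data.requestId !== requestId) return;
      if (data.type === "complete") {
        cleanup();
        resolve(data.text ?? "");
      }
    };
    // Shared by "error" and "messageerror" so the promise always settles.
    const handleFailure = (): void => {
      cleanup();
      reject(
        new Error(
          `Local worker failed for request ${requestId} (model ${modelId})`
        )
      );
    };
    function cleanup(): void {
      w.removeEventListener("message", handleMessage);
      w.removeEventListener("error", handleFailure);
      w.removeEventListener("messageerror", handleFailure);
    }
    w.addEventListener("message", handleMessage);
    w.addEventListener("error", handleFailure);
    w.addEventListener("messageerror", handleFailure);
  });
}
```

All three listeners are removed on every exit path, so a reused worker does not accumulate stale handlers across requests.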
In `@next.config.js`:
- Around line 11-17: The Turbopack aliases in next.config.js are incorrect
because turbopack: {} plus webpack config.resolve.alias entries like sharp$ and
"onnxruntime-node$": false won't disable native modules in Turbopack and
resolveAlias doesn't accept false values; remove the empty turbopack: {} block
(or replace with a validated resolveAlias mapping) and delete the false aliases
from the webpack config.resolve.alias, and instead rely on
serverExternalPackages (if present) or add proper turbopack.resolveAlias
mappings that map the module names to safe stubs; update references in the file
to the turbopack, webpack, config.resolve.alias, sharp$, onnxruntime-node$,
resolveAlias, and serverExternalPackages symbols so the change is located and
validated.
---
Outside diff comments:
In `@components/modals/settings/index.tsx`:
- Around line 208-250: The form submit currently filters only models missing
keys via getConfigIssues, allowing local-auto to remain enabled when provider
keys exist; update onSubmit to compute the saved enabledModels using
getAvailableModels(data) (or equivalent availability function) so models
suppressed by getAvailableModels are removed before setConfig, and ensure
selectedModel is validated against that filtered list; reference onSubmit,
getConfigIssues, getAvailableModels, hasRequiredKeyForModel, MODEL_OPTIONS and
setConfig when making the change and keep the existing toast/handleClose
behavior.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 66e46d8e-3be8-4009-a2cb-a0c6176a4cd7
⛔ Files ignored due to path filters (1)
`bun.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (20)
- app/api/chat/route.ts
- components/Inputs/InputChat/index.tsx
- components/chat/ChatArea/index.tsx
- components/modals/local-model-consent/index.tsx
- components/modals/settings/index.tsx
- constants/models.ts
- hooks/useCircleChat.ts
- hooks/useLocalModelConsent.ts
- lib/chat/config.test.ts
- lib/chat/config.ts
- lib/chat/prompt.ts
- lib/langchain/chatService.ts
- lib/local/capabilities.ts
- lib/local/consent.ts
- lib/local/inference.worker.ts
- lib/local/localTransport.ts
- next.config.js
- package.json
- store/index.ts
- store/slices/configSlice.ts
Summary
- `Local` provider with `local-auto` model that runs entirely in the browser, no API key required
- `@huggingface/transformers` (ONNX), keeping the UI thread unblocked
Test plan
Summary by CodeRabbit
New Features
Improvements