docs: Ask AI chat grounded in the docs vector store by ouiliame · Pull Request #5172 · simstudioai/sim

ouiliame · 2026-06-22T20:46:02Z

Summary

Adds an Ask AI chat to the docs site so readers can ask questions about Sim in natural language and get answers grounded in the documentation.

Floating launcher → chat panel (components/ai/ask-ai.tsx) built on the Vercel AI SDK's useChat. Assistant replies render as markdown via streamdown (the same AI-streaming markdown renderer the main app's chat uses), with source chips linking back to cited pages.
Streaming API route (app/api/chat/route.ts) using streamText with the OpenAI provider. A searchDocs tool runs a vector search over the existing docs_embeddings store and returns source links, so the model answers from real docs rather than memory.
Reuses the OPENAI_API_KEY already in the environment (same key the docs search uses for embeddings). Model defaults to gpt-5.4-mini, overridable via OPENAI_CHAT_MODEL.
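The model override described above could be resolved along these lines. This is a minimal sketch, not the PR's code: the default `gpt-5.4-mini` and the `OPENAI_CHAT_MODEL` variable come from the description, while the helper name `resolveChatModel`, the env-record parameter, and the blank-value handling are assumptions made for illustration.

```typescript
// Hypothetical helper: pick the chat model from an env-like record.
// 'gpt-5.4-mini' and OPENAI_CHAT_MODEL are named in the PR description;
// the function name and the trimming behavior are assumptions.
const DEFAULT_CHAT_MODEL = 'gpt-5.4-mini'

function resolveChatModel(env: Record<string, string | undefined>): string {
  const override = env.OPENAI_CHAT_MODEL?.trim()
  // An unset or blank variable falls back to the default model.
  return override ? override : DEFAULT_CHAT_MODEL
}
```

Passing the environment in as a plain record keeps the helper easy to test; a route would presumably call it once at module load with the process environment.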

Abuse hardening

This endpoint proxies a paid LLM, so an unauthenticated public route is a target for scripted "free inference". Shipped in this PR (cost caps per request):

Max messages, max input size (413), max output tokens, reduced tool-step limit
Lenient same-origin check (rejects obvious cross-origin; DOCS_ALLOWED_ORIGINS to extend)
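A lenient origin filter of the kind listed above might look roughly like the following. It is a sketch under stated assumptions: the function signature, the `requestHost` parameter, and the idea of passing the parsed `DOCS_ALLOWED_ORIGINS` entries as `extraHosts` are hypothetical. It compares hostnames, which matches the review note elsewhere in this thread that the allowlist expects hostnames rather than full origin strings.

```typescript
// Hypothetical sketch of a lenient same-origin filter.
// `requestHost` is the hostname the request was served on (no port);
// `extraHosts` stands in for entries parsed from DOCS_ALLOWED_ORIGINS.
function isAllowedOrigin(
  origin: string | null,
  requestHost: string,
  extraHosts: string[] = []
): boolean {
  // Lenient: many non-browser clients omit Origin, so absence is not rejected.
  if (!origin) return true

  let hostname: string
  try {
    hostname = new URL(origin).hostname.toLowerCase()
  } catch {
    // An Origin header that is not a parseable URL is treated as cross-origin.
    return false
  }

  const allowed = new Set(
    [requestHost, ...extraHosts].map((h) => h.trim().toLowerCase()).filter(Boolean)
  )
  return allowed.has(hostname)
}
```

Because the Origin header is trivially forged outside a browser, a check like this only filters casual cross-site use; it is not an authentication boundary.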

Infra-side follow-ups (dashboard/provisioning, not code) — do before public launch:

OpenAI hard spend cap + alert (backstops worst case regardless of code)
Durable per-IP rate limit (Upstash/Vercel KV — the per-request caps bound cost but not volume)
Vercel Firewall / BotID (or Turnstile) to block headless traffic at the edge

Notes

The LLM-text plumbing (llms.txt, llms-full.txt, .md/.mdx routes, Accept negotiation) already existed — this PR adds only the Ask AI chat.
Embeddings stay on OpenAI (text-embedding-3-small); only the chat completion uses the chat model. Branch is off staging; bun run build and type-check pass for the docs app.

🤖 Generated with Claude Code

Adds an "Ask AI" chat to the docs site. A floating launcher opens a panel backed by the Vercel AI SDK (OpenAI provider, OPENAI_API_KEY from the environment). The chat is grounded via a searchDocs tool that runs a vector search over the existing docs embeddings and returns source links, so answers cite real pages.

- app/api/chat/route.ts: streaming POST handler (streamText + searchDocs tool)
- components/ai/ask-ai.tsx: useChat panel with streamed answers + source chips
- wired into the docs layout; reuses the existing OPENAI_API_KEY and embeddings

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

vercel · 2026-06-22T20:46:08Z

@ouiliame is attempting to deploy a commit to the Sim Team on Vercel.

A member of the Team first needs to authorize it.

vercel · 2026-06-22T20:46:51Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
docs	Ready	Preview, Comment	Jun 22, 2026 9:14pm

Better grounding/instruction-following than gpt-4o-mini at docs-chat volumes; still overridable via OPENAI_CHAT_MODEL.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Markdown: render assistant messages with streamdown (the same AI-streaming markdown component the main app's chat uses), so bold/lists/code render instead of raw **asterisks**. User messages stay plain text.

Abuse guards: the endpoint proxies a paid LLM, so cap the cost of any single request (max messages, max input size, max output tokens, fewer tool steps) and reject obvious cross-origin calls (lenient: Origin is a filter, not a boundary). Durable per-IP rate limiting, a provider spend cap, and edge bot protection are provisioned separately.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The size cap previously measured the whole serialized message array, so a follow-up question failed (413) once the history carried the prior answer plus retrieved doc chunks. Count only user-authored text instead, and loosen all bounds ~20x so normal multi-turn use never hits them; they remain only as a backstop against egregious abuse.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
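Counting only user-authored text, as this commit describes, can be sketched as follows. The name `userInputChars` appears in the review comments on this PR, but the message and part shapes here are simplified assumptions, and the `typeof` guard on `text` is an addition for safety rather than something the source confirms.

```typescript
// Simplified, assumed shapes for UI messages: each message has a role
// and a list of parts; only text parts carry a `text` field.
interface ChatPart {
  type: string
  text?: unknown
}

interface ChatMessage {
  role: string
  parts: ChatPart[]
}

// Sum the characters a user actually typed, ignoring assistant answers
// and tool output that ride along in the conversation history.
function userInputChars(messages: ChatMessage[]): number {
  let total = 0
  for (const message of messages) {
    if (message.role !== 'user') continue
    for (const part of message.parts) {
      // Guard against text parts without a string `text` field.
      if (part.type === 'text' && typeof part.text === 'string') {
        total += part.text.length
      }
    }
  }
  return total
}
```

With this measure, a long assistant reply or a large retrieved chunk in the history does not push a short follow-up question over the cap.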

cursor · 2026-06-22T21:10:13Z

PR Summary

Medium Risk
Introduces a public LLM proxy that spends OpenAI credits and hits the embeddings DB; request caps and origin filtering reduce but do not eliminate scripted abuse without the planned infra rate limits and spend caps.

Overview
Adds an Ask AI experience to the docs site: a floating launcher opens a chat panel on every localized docs page, wired through the shared layout with the active locale.

A new POST /api/chat route streams answers with the Vercel AI SDK (streamText + OpenAI). The model must call a searchDocs tool that vector-searches the existing docs_embeddings store (same embedding path as site search), filters by locale, and returns titles/URLs/content so replies stay doc-grounded. Per-request abuse caps (message count, user text size, total payload, output tokens, tool steps) and a lenient Origin check (DOCS_ALLOWED_ORIGINS) limit cost on this public, unauthenticated endpoint.

The client uses useChat with markdown via streamdown and shows source chips from searchDocs tool output. Docs app dependencies add @ai-sdk/openai, @ai-sdk/react, ai, streamdown, and zod.

Reviewed by Cursor Bugbot for commit d807260. Bugbot is set up for automated code reviews on this repo.

greptile-apps · 2026-06-22T21:15:30Z

Greptile Summary

Adds a floating "Ask AI" chat panel to the docs site, grounded in the existing docs_embeddings vector store via a new /api/chat streaming route built on AI SDK v5 streamText and useChat. The implementation includes thoughtful per-request abuse hardening (message count cap, user-input char limit, total payload backstop, origin filter) with acknowledged infra-level follow-ups before public launch.

app/api/chat/route.ts: New streaming POST route — validates and size-caps incoming messages, runs a pgvector similarity search via the searchDocs tool, and streams the answer back with toUIMessageStreamResponse.
components/ai/ask-ai.tsx: Client-side floating launcher + chat panel using DefaultChatTransport / useChat; correctly splits instant-scroll-on-open from smooth-scroll-on-new-messages; source chips render with rel="noopener noreferrer".
package.json: Adds ai v5, @ai-sdk/openai, @ai-sdk/react, streamdown, and zod to the docs app.

Confidence Score: 5/5

Safe to merge; the two new files are well-scoped to the docs app with no impact on the main Sim application.

The route and component are carefully written with structural validation, size caps, and correct AI SDK v5 patterns. Both findings are non-blocking style/configuration nits with no effect on the happy path.

No files require special attention, though operators configuring DOCS_ALLOWED_ORIGINS should be aware the allowlist expects hostnames rather than full origin strings.

Important Files Changed

Filename	Overview
apps/docs/app/api/chat/route.ts	New streaming chat API route. Solid abuse-hardening (message caps, size limits, origin check). Minor: DOCS_ALLOWED_ORIGINS allowlist compares hostnames but env var name implies full origins — silent misconfiguration footgun.
apps/docs/components/ai/ask-ai.tsx	New floating chat component using AI SDK v5 useChat with DefaultChatTransport. Correctly splits instant-scroll (open) from smooth-scroll (messages) effects. Source chips include rel="noopener noreferrer". Minor: smooth-scroll effect fires even when panel is closed.
apps/docs/app/[lang]/layout.tsx	Minimal addition: mounts the AskAI component inside the existing RootProvider, correctly passing the active locale.
apps/docs/package.json	Adds @ai-sdk/openai, @ai-sdk/react, ai v5, streamdown, and zod dependencies to the docs app.

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant User
    participant AskAI as AskAI Component
    participant Route as /api/chat
    participant OpenAI as OpenAI
    participant DB as docs_embeddings

    User->>AskAI: Types question, presses Enter
    AskAI->>Route: "POST /api/chat {messages, locale}"
    Route->>Route: isAllowedOrigin + validate + size caps
    Route->>OpenAI: streamText with searchDocs tool
    OpenAI-->>Route: Tool call: searchDocs(query)
    Route->>OpenAI: generateSearchEmbedding(query)
    OpenAI-->>Route: embedding vector
    Route->>DB: "SELECT ORDER BY embedding <=> vector LIMIT 24"
    DB-->>Route: top 24 chunks
    Route->>Route: Filter by locale → top 6 chunks
    Route-->>OpenAI: tool result (title, url, content)
    OpenAI-->>Route: streaming answer text
    Route-->>AskAI: UIMessageStream
    AskAI-->>User: Streamed answer with cited doc links
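The "top 24 candidates, filter by locale, keep top 6" step in the diagram can be illustrated with a small pure function. This is only a sketch: the `DocChunk` shape, the `locale` and `distance` fields, and the function name are assumptions, since the diagram does not show how the route stores or compares locales.

```typescript
// Assumed row shape for a retrieved chunk; `distance` is the vector
// distance from the query embedding (smaller means closer).
interface DocChunk {
  title: string
  url: string
  content: string
  locale: string
  distance: number
}

// Keep only chunks in the reader's locale, closest first, capped at `limit`.
function selectChunksForLocale(candidates: DocChunk[], locale: string, limit = 6): DocChunk[] {
  return candidates
    .filter((chunk) => chunk.locale === locale) // filter() copies, so the input is not mutated
    .sort((a, b) => a.distance - b.distance)
    .slice(0, limit)
}
```

Over-fetching (24) before filtering leaves headroom so that a locale with fewer matching chunks can still fill the final list of 6.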


Reviews (3): Last reviewed commit: "docs: validate message shape on Ask AI r..."

waleedlatif1 · 2026-06-22T21:21:49Z

@ouiliame run the review loop and fix the comments and re-run @greptile and @cursor review until 5/5

- Guard req.json() with try/catch → 400 on malformed body (was 500)
- Scope vector search to the reader's locale (mirrors site search); client forwards the active locale to the route
- Backstop the whole serialized payload so assistant/tool parts can't be stuffed past the user-text cap
- Split the scroll effect: instant jump on panel open, smooth on new messages
- Add rel="noopener noreferrer" target="_blank" to source-chip links

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
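The malformed-body fix (400 instead of 500) amounts to catching the parse error and mapping it to a client error. The sketch below works on a raw string so it stays self-contained; the real route guards `req.json()`, and the `ParseResult` type, the helper name, and the error message are assumptions.

```typescript
// Hypothetical result type: either the parsed value or a 400 to return.
type ParseResult =
  | { ok: true; value: unknown }
  | { ok: false; status: 400; error: string }

// Parse a request body, converting a syntax error into a client error
// instead of letting it bubble up as an unhandled 500.
function parseJsonBody(raw: string): ParseResult {
  try {
    return { ok: true, value: JSON.parse(raw) }
  } catch {
    return { ok: false, status: 400, error: 'Invalid JSON body' }
  }
}
```

A handler would return early with the given status when `ok` is false and only then move on to structural validation of the parsed value.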

ouiliame · 2026-06-22T22:32:21Z

@greptile review
@cursor review

A message that's a valid JSON array element but missing parts/role would throw in userInputChars and surface as a 500. Reject malformed messages with a 400. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

ouiliame · 2026-06-22T23:33:32Z

@greptile review
@cursor review

ouiliame · 2026-06-24T20:45:33Z

@waleedlatif1 ready for merge — review loop is at 5/5 ✅

Greptile: 5/5 (Confidence Score) on the latest commit
Cursor: no outstanding bugs; only its standing Medium Risk note, which is the architectural "public unauthenticated LLM proxy" flag — not a code fix. The hardening that clears it (durable rate limit, OpenAI spend cap, Vercel Firewall/BotID) is tracked as the pre-launch infra checklist in the PR body.

All review comments addressed across three rounds: req.json() 500→400, locale-scoped retrieval, 413 cap closed for assistant/tool payload, scroll-effect split, rel="noopener" on source links, and message-shape validation. Branch is merged up to date with staging; type-check + build green.

One call for you: the Cursor Medium Risk stays until those infra controls land. Merge now with the checklist as a fast-follow, or want them in first?

gitguardian · 2026-06-24T20:45:53Z

⚠️ GitGuardian has uncovered 1 secret following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

Since your pull request originates from a forked repository, GitGuardian is not able to associate the secrets uncovered with secret incidents on your GitGuardian dashboard.
Skipping this check run and merging your pull request will create secret incidents on your GitGuardian dashboard.

🔎 Detected hardcoded secret in your pull request

GitGuardian id	GitGuardian status	Secret	Commit	Filename
34197725	Triggered	Basic Auth String	`d807260`	apps/sim/executor/handlers/pi/cloud-backend.test.ts

🛠 Guidelines to remediate hardcoded secrets

Understand the implications of revoking this secret by investigating where it is used in your code.
Replace and store your secret safely, following secret-management best practices.
Revoke and rotate this secret.
If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.

To avoid such incidents in the future, consider:

following best practices for managing and storing secrets, including API keys and other credentials
installing secret detection on pre-commit to catch secrets before they leave your machine and to ease remediation.


cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.


cursor · 2026-06-24T20:46:54Z

+  const result = streamText({
+    model: openai(CHAT_MODEL),
+    system: SYSTEM_PROMPT,
+    messages: convertToModelMessages(messages),


Client messages bypass grounding

Medium Severity

The chat route accepts a client-supplied messages array after minimal structural checks and passes it to convertToModelMessages. Crafted assistant tool parts or extra system messages can inject fake doc search results or instructions, so replies may not reflect actual searchDocs output.
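One possible mitigation for this finding, offered as a sketch and not as what the PR does, is to rebuild the history from client input so that only user and assistant text survives. Forged system messages and tool-result parts are then dropped before the model sees them. All type and function names below are hypothetical.

```typescript
// Assumed, simplified shapes for incoming (untrusted) UI messages.
interface IncomingPart {
  type: string
  text?: unknown
}

interface IncomingMessage {
  role: string
  parts: IncomingPart[]
}

interface SafeMessage {
  role: 'user' | 'assistant'
  parts: { type: 'text'; text: string }[]
}

// Keep only user/assistant turns and only their plain-text parts, so a
// client cannot smuggle in system prompts or fake searchDocs results.
function keepOnlyTextTurns(messages: IncomingMessage[]): SafeMessage[] {
  const safe: SafeMessage[] = []
  for (const message of messages) {
    const role =
      message.role === 'user' ? 'user' : message.role === 'assistant' ? 'assistant' : null
    if (role === null) continue

    const parts: SafeMessage['parts'] = []
    for (const part of message.parts) {
      if (part.type === 'text' && typeof part.text === 'string') {
        parts.push({ type: 'text', text: part.text })
      }
    }
    if (parts.length > 0) safe.push({ role, parts })
  }
  return safe
}
```

The trade-off is that chunks retrieved in earlier turns are no longer replayed from the client, so the model would need to call the search tool again on follow-up questions. Assistant text can still be forged by a client; this only removes the higher-trust channels.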


cursor · 2026-06-24T20:46:54Z

+      if (part.type === 'text') total += part.text.length
+    }
+  }
+  return total


Invalid text parts crash route

Medium Severity

userInputChars assumes every part with type === 'text' has a string text field. A crafted message that passes isValidMessage but omits text throws when reading .length, turning the request into an unhandled server error instead of a 400 response.
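A stricter shape check that would close this gap might look like the following. `isValidMessage` is named in the finding, but its actual implementation is not shown in this thread, so the body below is an assumption about what a sufficient check could be; `isRecord` is a hypothetical helper.

```typescript
// Hypothetical helper: narrow an unknown value to a plain object.
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value)
}

// Accept a message only if it has a string role, an array of parts, and
// every part has a string type; text parts must also carry a string `text`.
function isValidMessage(value: unknown): boolean {
  if (!isRecord(value)) return false
  if (typeof value.role !== 'string') return false
  if (!Array.isArray(value.parts)) return false
  return value.parts.every((part: unknown) => {
    if (!isRecord(part) || typeof part.type !== 'string') return false
    return part.type !== 'text' || typeof part.text === 'string'
  })
}
```

A route using a check like this could answer 400 for any message that fails it, so later code reading `part.text.length` never sees a missing field.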


docs: default Ask AI to gpt-5.4-mini

c9c01c1


vercel Bot deployed to Preview June 22, 2026 20:54 View deployment

vercel Bot deployed to Preview June 22, 2026 21:04 View deployment

ouiliame marked this pull request as ready for review June 22, 2026 21:10

ouiliame requested a review from a team as a code owner June 22, 2026 21:10

cursor Bot reviewed Jun 22, 2026


Comment thread apps/docs/app/api/chat/route.ts Outdated

Comment thread apps/docs/app/api/chat/route.ts

vercel Bot deployed to Preview June 22, 2026 21:14 View deployment

greptile-apps Bot reviewed Jun 22, 2026


Comment thread apps/docs/app/api/chat/route.ts Outdated

Comment thread apps/docs/components/ai/ask-ai.tsx Outdated

Comment thread apps/docs/components/ai/ask-ai.tsx

greptile-apps Bot reviewed Jun 22, 2026


Comment thread apps/docs/app/api/chat/route.ts

docs: validate message shape on Ask AI route (400, not 500)

55c6b0e


Merge remote-tracking branch 'origin/staging' into feat/docs-ask-ai

d807260

cursor Bot reviewed Jun 24, 2026
