
docs: Ask AI chat grounded in the docs vector store #5172

Open
ouiliame wants to merge 7 commits into simstudioai:staging from ouiliame:feat/docs-ask-ai


ouiliame (Contributor) commented Jun 22, 2026

Summary

Adds an Ask AI chat to the docs site so readers can ask questions about Sim in natural language and get answers grounded in the documentation.

  • Floating launcher → chat panel (components/ai/ask-ai.tsx) built on the Vercel AI SDK's useChat. Assistant replies render as markdown via streamdown (the same AI-streaming markdown renderer the main app's chat uses), with source chips linking back to cited pages.
  • Streaming API route (app/api/chat/route.ts) using streamText with the OpenAI provider. A searchDocs tool runs a vector search over the existing docs_embeddings store and returns source links, so the model answers from real docs rather than memory.
  • Reuses the OPENAI_API_KEY already in the environment (same key the docs search uses for embeddings). Model defaults to gpt-5.4-mini, overridable via OPENAI_CHAT_MODEL.

Abuse hardening

This endpoint proxies a paid LLM, so an unauthenticated public route is a target for scripted "free inference". Shipped in this PR (cost caps per request):

  • Max messages, max input size (413), max output tokens, reduced tool-step limit
  • Lenient same-origin check (rejects obvious cross-origin requests; set DOCS_ALLOWED_ORIGINS to extend the allowlist)
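
As an illustration of what a lenient origin filter can look like, here is a minimal sketch. The function name, its signature, and the choice to pass the allowlist in as a parameter are assumptions for this example and are not taken from the PR's route. "Lenient" is read here as: a request with no Origin header is let through, because the header is a filter rather than a security boundary.

```typescript
// Hypothetical helper; the PR's actual isAllowedOrigin may differ.
function isAllowedOrigin(
  origin: string | null,
  requestHost: string,
  extraHosts: string[] = [],
): boolean {
  // No Origin header: do not block. Non-browser clients can omit or forge it anyway.
  if (origin === null) return true

  let originHost: string
  try {
    originHost = new URL(origin).hostname
  } catch {
    // A present but unparseable Origin is treated as cross-origin.
    return false
  }

  // requestHost may carry a port ("docs.example.com:3000"); compare hostnames only.
  // (Simplification: this split does not handle bracketed IPv6 hosts.)
  const selfHost = requestHost.split(':')[0]
  return originHost === selfHost || extraHosts.includes(originHost)
}
```

A check like this only filters browser-originated cross-site calls; a script can simply omit the header, which is why the infra-side follow-ups still matter.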

Infra-side follow-ups (dashboard/provisioning, not code) — do before public launch:

  • OpenAI hard spend cap + alert (backstops worst case regardless of code)
  • Durable per-IP rate limit (Upstash/Vercel KV — the per-request caps bound cost but not volume)
  • Vercel Firewall / BotID (or Turnstile) to block headless traffic at the edge

Notes

  • The LLM-text plumbing (llms.txt, llms-full.txt, .md/.mdx routes, Accept negotiation) already existed — this PR adds only the Ask AI chat.
  • Embeddings stay on OpenAI (text-embedding-3-small); only the chat completion uses the chat model.
  • Branch is off staging; bun run build and type-check pass for the docs app.

🤖 Generated with Claude Code

Adds an "Ask AI" chat to the docs site. A floating launcher opens a panel
backed by the Vercel AI SDK (OpenAI provider, OPENAI_API_KEY from the
environment). The chat is grounded via a searchDocs tool that runs a vector
search over the existing docs embeddings and returns source links, so answers
cite real pages.

- app/api/chat/route.ts — streaming POST handler (streamText + searchDocs tool)
- components/ai/ask-ai.tsx — useChat panel with streamed answers + source chips
- wired into the docs layout; reuses the existing OPENAI_API_KEY and embeddings

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
vercel Bot commented Jun 22, 2026

@ouiliame is attempting to deploy a commit to the Sim Team on Vercel.

A member of the Team first needs to authorize it.

vercel Bot commented Jun 22, 2026

The latest updates on your projects.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| docs | Ready | Preview, Comment | Jun 22, 2026 9:14pm |

Better grounding/instruction-following than gpt-4o-mini at docs-chat volumes;
still overridable via OPENAI_CHAT_MODEL.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Markdown: render assistant messages with streamdown (the same AI-streaming
markdown component the main app's chat uses), so bold/lists/code render
instead of raw **asterisks**. User messages stay plain text.

Abuse guards: the endpoint proxies a paid LLM, so cap the cost of any single
request — max messages, max input size, max output tokens, fewer tool steps —
and reject obvious cross-origin calls (lenient: Origin is a filter, not a
boundary). Durable per-IP rate limiting, a provider spend cap, and edge bot
protection are provisioned separately.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The size cap previously measured the whole serialized message array, so a
follow-up question failed (413) once the history carried the prior answer plus
retrieved doc chunks. Count only user-authored text instead, and loosen all
bounds ~20x so normal multi-turn use never hits them — they remain only as a
backstop against egregious abuse.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
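
To make the difference between the two measurements in the commit message above concrete, here is a small sketch. The message and part types are simplified assumptions, not the AI SDK's real UIMessage type, and the sample history is invented for illustration.

```typescript
// Simplified, assumed shapes; the real messages carry more part types and fields.
type Part = { type: string; text?: string }
type ChatMessage = { role: 'user' | 'assistant' | 'system'; parts: Part[] }

// Count only characters the user typed, ignoring assistant text and tool output.
function userInputChars(messages: ChatMessage[]): number {
  let total = 0
  for (const message of messages) {
    if (message.role !== 'user') continue
    for (const part of message.parts) {
      if (part.type === 'text' && typeof part.text === 'string') total += part.text.length
    }
  }
  return total
}

// A follow-up question whose history already carries a long assistant answer.
const history: ChatMessage[] = [
  { role: 'user', parts: [{ type: 'text', text: 'How do I add a block?' }] },
  { role: 'assistant', parts: [{ type: 'text', text: 'x'.repeat(5000) }] },
  { role: 'user', parts: [{ type: 'text', text: 'And delete one?' }] },
]

const userChars = userInputChars(history) // two short questions
const serializedChars = JSON.stringify(history).length // dominated by the assistant answer

console.log({ userChars, serializedChars })
```

A cap applied to serializedChars grows with every answer the server itself produced, while a cap on userChars tracks only what the reader contributed.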
ouiliame marked this pull request as ready for review June 22, 2026 21:10
ouiliame requested a review from a team as a code owner June 22, 2026 21:10

cursor Bot commented Jun 22, 2026

PR Summary

Medium Risk
Introduces a public LLM proxy that spends OpenAI credits and hits the embeddings DB; request caps and origin filtering reduce but do not eliminate scripted abuse without the planned infra rate limits and spend caps.

Overview
Adds an Ask AI experience to the docs site: a floating launcher opens a chat panel on every localized docs page, wired through the shared layout with the active locale.

A new POST /api/chat route streams answers with the Vercel AI SDK (streamText + OpenAI). The model must call a searchDocs tool that vector-searches the existing docs_embeddings store (same embedding path as site search), filters by locale, and returns titles/URLs/content so replies stay doc-grounded. Per-request abuse caps (message count, user text size, total payload, output tokens, tool steps) and a lenient Origin check (DOCS_ALLOWED_ORIGINS) limit cost on this public, unauthenticated endpoint.

The client uses useChat with markdown via streamdown and shows source chips from searchDocs tool output. Docs app dependencies add @ai-sdk/openai, @ai-sdk/react, ai, streamdown, and zod.

Reviewed by Cursor Bugbot for commit d807260. Bugbot is set up for automated code reviews on this repo. Configure here.

Comment thread apps/docs/app/api/chat/route.ts Outdated
Comment thread apps/docs/app/api/chat/route.ts
greptile-apps Bot (Contributor) commented Jun 22, 2026

Greptile Summary

Adds a floating "Ask AI" chat panel to the docs site, grounded in the existing docs_embeddings vector store via a new /api/chat streaming route built on AI SDK v5 streamText and useChat. The implementation includes thoughtful per-request abuse hardening (message count cap, user-input char limit, total payload backstop, origin filter) with acknowledged infra-level follow-ups before public launch.

  • app/api/chat/route.ts: New streaming POST route — validates and size-caps incoming messages, runs a pgvector similarity search via the searchDocs tool, and streams the answer back with toUIMessageStreamResponse.
  • components/ai/ask-ai.tsx: Client-side floating launcher + chat panel using DefaultChatTransport / useChat; correctly splits instant-scroll-on-open from smooth-scroll-on-new-messages; source chips render with rel="noopener noreferrer".
  • package.json: Adds ai v5, @ai-sdk/openai, @ai-sdk/react, streamdown, and zod to the docs app.

Confidence Score: 5/5

Safe to merge; the two new files are well-scoped to the docs app with no impact on the main Sim application.

The route and component are carefully written with structural validation, size caps, and correct AI SDK v5 patterns. Both findings are non-blocking style/configuration nits with no effect on the happy path.

No files require special attention, though operators configuring DOCS_ALLOWED_ORIGINS should be aware the allowlist expects hostnames rather than full origin strings.
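
One way to soften the hostname-versus-origin footgun noted above is to normalize each allowlist entry to a bare hostname before comparing. The sketch below is an assumption about how that could be done; parseAllowedHosts is a hypothetical name, and the PR's route may parse DOCS_ALLOWED_ORIGINS differently.

```typescript
// Hypothetical normalizer: accepts "docs.example.com", "https://docs.example.com",
// or "preview.example.com:3000" and reduces each entry to a lowercase hostname.
function parseAllowedHosts(raw: string | undefined): string[] {
  if (!raw) return []
  return raw
    .split(',')
    .map((entry) => entry.trim().toLowerCase())
    .filter((entry) => entry.length > 0)
    .map((entry) => {
      // Give scheme-less entries a scheme so the URL parser can handle them.
      const withScheme = entry.includes('://') ? entry : `https://${entry}`
      try {
        return new URL(withScheme).hostname
      } catch {
        return '' // unparseable entries are dropped by the filter below
      }
    })
    .filter((host) => host.length > 0)
}

// In a route this would typically be fed from the environment variable.
console.log(parseAllowedHosts('https://docs.example.com, preview.example.com:3000'))
```

With this shape, an operator who writes full origins and one who writes hostnames end up with the same allowlist.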

Important Files Changed

| Filename | Overview |
| --- | --- |
| apps/docs/app/api/chat/route.ts | New streaming chat API route. Solid abuse-hardening (message caps, size limits, origin check). Minor: DOCS_ALLOWED_ORIGINS allowlist compares hostnames but env var name implies full origins, a silent misconfiguration footgun. |
| apps/docs/components/ai/ask-ai.tsx | New floating chat component using AI SDK v5 useChat with DefaultChatTransport. Correctly splits instant-scroll (open) from smooth-scroll (messages) effects. Source chips include rel="noopener noreferrer". Minor: smooth-scroll effect fires even when panel is closed. |
| apps/docs/app/[lang]/layout.tsx | Minimal addition: mounts the AskAI component inside the existing RootProvider, correctly passing the active locale. |
| apps/docs/package.json | Adds @ai-sdk/openai, @ai-sdk/react, ai v5, streamdown, and zod dependencies to the docs app. |

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant User
    participant AskAI as AskAI Component
    participant Route as /api/chat
    participant OpenAI as OpenAI
    participant DB as docs_embeddings

    User->>AskAI: Types question, presses Enter
    AskAI->>Route: "POST /api/chat {messages, locale}"
    Route->>Route: isAllowedOrigin + validate + size caps
    Route->>OpenAI: streamText with searchDocs tool
    OpenAI-->>Route: Tool call: searchDocs(query)
    Route->>OpenAI: generateSearchEmbedding(query)
    OpenAI-->>Route: embedding vector
    Route->>DB: "SELECT ORDER BY embedding <=> vector LIMIT 24"
    DB-->>Route: top 24 chunks
    Route->>Route: Filter by locale → top 6 chunks
    Route-->>OpenAI: tool result (title, url, content)
    OpenAI-->>Route: streaming answer text
    Route-->>AskAI: UIMessageStream
    AskAI-->>User: Streamed answer with cited doc links
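
The retrieval step in the diagram above (fetch the 24 nearest chunks, then keep the top 6 for the reader's locale) can be sketched as a plain filter over already-ranked rows. The Chunk shape, including a locale field on each row, is an assumption for illustration; the PR may derive the locale differently, for example from the page URL.

```typescript
// Assumed row shape; rows are expected to arrive ordered nearest-first.
type Chunk = { title: string; url: string; content: string; locale: string; distance: number }

// Keep the closest `limit` chunks that match the reader's locale.
function pickChunksForLocale(rows: Chunk[], locale: string, limit = 6): Chunk[] {
  return rows.filter((row) => row.locale === locale).slice(0, limit)
}
```

Over-fetching before filtering leaves headroom when many of the nearest chunks belong to other locales; if fewer than 6 match, the function simply returns fewer.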

Reviews (3): Last reviewed commit: "docs: validate message shape on Ask AI r..." | Re-trigger Greptile

Comment thread apps/docs/app/api/chat/route.ts Outdated
Comment thread apps/docs/components/ai/ask-ai.tsx Outdated
Comment thread apps/docs/components/ai/ask-ai.tsx
waleedlatif1 (Collaborator) commented

@ouiliame run the review loop, fix the comments, and re-run the @greptile and @cursor reviews until both reach 5/5

- Guard req.json() with try/catch → 400 on malformed body (was 500)
- Scope vector search to the reader's locale (mirrors site search); client
  forwards the active locale to the route
- Backstop the whole serialized payload so assistant/tool parts can't be
  stuffed past the user-text cap
- Split the scroll effect: instant jump on panel open, smooth on new messages
- Add rel="noopener noreferrer" target="_blank" to source-chip links

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
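
The first fix in the commit above (malformed body returns 400 instead of 500) comes down to catching the parse error and mapping it to a client error. The sketch below uses a raw string rather than a Request so that it stays self-contained; in a route handler the same try/catch would wrap `await req.json()`. The ParseResult type and the error text are assumptions.

```typescript
type ParseResult =
  | { ok: true; body: unknown }
  | { ok: false; status: 400; error: string }

// Parse a request body, turning a syntax error into a 400-style result
// instead of letting the exception bubble up as a 500.
function parseJsonBody(raw: string): ParseResult {
  try {
    return { ok: true, body: JSON.parse(raw) }
  } catch {
    return { ok: false, status: 400, error: 'Invalid JSON body' }
  }
}
```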
ouiliame (Contributor, Author) commented

@greptile review
@cursor review

Comment thread apps/docs/app/api/chat/route.ts
A message that's a valid JSON array element but missing parts/role would throw
in userInputChars and surface as a 500. Reject malformed messages with a 400.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ouiliame (Contributor, Author) commented

@greptile review
@cursor review

ouiliame (Contributor, Author) commented

@waleedlatif1 ready for merge — review loop is at 5/5

  • Greptile: 5/5 (Confidence Score) on the latest commit
  • Cursor: no outstanding bugs; only its standing Medium Risk note, which is the architectural "public unauthenticated LLM proxy" flag — not a code fix. The hardening that clears it (durable rate limit, OpenAI spend cap, Vercel Firewall/BotID) is tracked as the pre-launch infra checklist in the PR body.

All review comments addressed across three rounds: req.json() 500→400, locale-scoped retrieval, 413 cap closed for assistant/tool payload, scroll-effect split, rel="noopener" on source links, and message-shape validation. Branch is merged up to date with staging; type-check + build green.

One call for you: the Cursor Medium Risk stays until those infra controls land. Merge now with the checklist as a fast-follow, or want them in first?

gitguardian Bot commented Jun 24, 2026

⚠️ GitGuardian has uncovered 1 secret following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

Since your pull request originates from a forked repository, GitGuardian is not able to associate the secrets uncovered with secret incidents on your GitGuardian dashboard.
Skipping this check run and merging your pull request will create secret incidents on your GitGuardian dashboard.

🔎 Detected hardcoded secret in your pull request
| GitGuardian id | GitGuardian status | Secret | Commit | Filename |
| --- | --- | --- | --- | --- |
| 34197725 | Triggered | Basic Auth String | d807260 | apps/sim/executor/handlers/pi/cloud-backend.test.ts |
🛠 Guidelines to remediate hardcoded secrets
  1. Understand the implications of revoking this secret by investigating where it is used in your code.
  2. Replace and store your secret safely. Learn here the best practices.
  3. Revoke and rotate this secret.
  4. If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.


🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Reviewed by Cursor Bugbot for commit d807260.

const result = streamText({
  model: openai(CHAT_MODEL),
  system: SYSTEM_PROMPT,
  messages: convertToModelMessages(messages),


Client messages bypass grounding

Medium Severity

The chat route accepts a client-supplied messages array after minimal structural checks and passes it to convertToModelMessages. Crafted assistant tool parts or extra system messages can inject fake doc search results or instructions, so replies may not reflect actual searchDocs output.
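
One possible mitigation for the issue described above, sketched under assumptions: before converting the history, keep only user and assistant messages and strip every non-text part, so client-supplied tool results cannot pose as searchDocs output. The types and the sanitizeHistory name are hypothetical and do not come from the PR.

```typescript
// Simplified, assumed shapes for client-supplied messages.
type Part = { type: string; text?: string }
type ClientMessage = { role: string; parts: Part[] }

// Trust only plain text from the client: drop system messages, strip non-text
// parts (such as tool results), and discard messages left with no parts.
function sanitizeHistory(messages: ClientMessage[]): ClientMessage[] {
  return messages
    .filter((m) => m.role === 'user' || m.role === 'assistant')
    .map((m) => ({
      role: m.role,
      parts: m.parts.filter((p) => p.type === 'text' && typeof p.text === 'string'),
    }))
    .filter((m) => m.parts.length > 0)
}
```

This is a trade-off rather than a complete fix: forged assistant text still gets through, and dropping tool parts means the model has to search again on follow-up questions instead of reusing earlier results.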


if (part.type === 'text') total += part.text.length
}
}
return total


Invalid text parts crash route

Medium Severity

userInputChars assumes every part with type === 'text' has a string text field. A crafted message that passes isValidMessage but omits text throws when reading .length, turning the request into an unhandled server error instead of a 400 response.
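
A stricter shape check along the lines this finding suggests could look like the sketch below, so that a malformed part is rejected during validation (and answered with a 400) before any code reads `.length`. The helper names are hypothetical; how the PR's isValidMessage is actually written is not shown on this page.

```typescript
// A part is acceptable only if it is an object with a string `type` and,
// when that type is 'text', a string `text` field as well.
function isValidPart(part: unknown): boolean {
  if (typeof part !== 'object' || part === null) return false
  const candidate = part as { type?: unknown; text?: unknown }
  if (typeof candidate.type !== 'string') return false
  if (candidate.type === 'text') return typeof candidate.text === 'string'
  return true
}

// Message-level helper: every part must pass the check above.
function hasOnlyValidParts(message: { parts: unknown[] }): boolean {
  return message.parts.every(isValidPart)
}
```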

