Added: Dashboard with Vercel config#142
Conversation
caohy1988
left a comment
There was a problem hiding this comment.
Full review on head 4705d2d.
I also prepared a connector fix, but I could not push directly to this PR's source branch: GitHub returned Permission to Saherpathan/BigQuery-Agent-Analytics-SDK.git denied to caohy1988. I pushed the fix commit to my fork instead:
- Branch:
caohy1988:fix/pr142-dashboard-connector - Commit:
05aa9f0 fix(dashboard): harden BigQuery data connector
That patch does four things: supports Vercel service-account env vars and local ADC / GOOGLE_APPLICATION_CREDENTIALS, validates Project/Dataset/Table identifiers before building SQL, normalizes BQ AA rows into the shape the UI expects (id, parent_id, type, label, latency, total_tokens), and fixes the local runbook to use npx vercel dev for /api.
Findings:
-
High —
/apibuilds SQL from unsanitized request headers.
dashboard_v2/api/index.tsinterpolatesx-gcp-project-id,x-bq-dataset, andx-bq-tabledirectly into a backticked table path. A malformed header containing a backtick can break out of the identifier and alter the query. Even if this is intended as BYO table input, the server must validate identifiers before composing SQL. The patch branch adds strict identifier validation and returns400for invalid table refs. -
High — the current BigQuery connector breaks local ADC and some Vercel env setups.
The module eagerly constructsnew BigQuery({ projectId, credentials: { client_email, private_key } })even whenGCP_CLIENT_EMAIL/GCP_PRIVATE_KEYare absent. That prevents the local.envguidance (GOOGLE_APPLICATION_CREDENTIALS=...) from working reliably, because the client is given a partial credentials object instead of falling back to ADC. The patch branch only passes explicit credentials when both service-account env vars are set; otherwise it lets the Google client use ADC /GOOGLE_APPLICATION_CREDENTIALS. -
Medium — the API returns raw BQ AA rows, but the UI expects graph-friendly fields that do not exist in
agent_events.
TraceTreekeys nodes bynode.id || node.trace_id, edges byparent_id, and labels/types bytype/label; BQ AA rows usespan_id,parent_span_id,event_type,agent, JSONcontent, JSONattributes, and JSONlatency_ms. As a result, data can load successfully but render as disconnectednode-Nentries with zero tokens/latency. The patch branch normalizes rows server-side so the existing components can render real BQ AA telemetry. -
Medium — local setup docs say
npm run dev, but that starts only Vite and does not run the Vercel/apifunction.
The README's local path sends users tolocalhost:5173, where/apiis not backed by the serverless function. For the BigQuery connector to run locally, the documented path should benpx vercel dev(usuallylocalhost:3000) or the Vite dev server needs an explicit API proxy/server. The patch branch updates the runbook and keepsnpm run devdocumented as UI-only. -
Medium — the public deployment lets any visitor query any table the deployment service account can read.
The frontend sends Project/Dataset/Table in headers and the backend trusts them. Even after identifier validation, a public Vercel URL plus a broadly-permissioned service account becomes a table browser for every BigQuery table that service account can access. For a public demo, either restrict the allowed project/dataset/table via env allowlist, or make the deployment private/authenticated. This can be follow-up if the service account is intentionally scoped to one demo table only, but the security model should be explicit in README. -
Low —
package.jsonhas a deadstartscript.
"start": "node server.cjs"references a file that does not exist. The patch branch changes it tovite preview --host=0.0.0.0.
Verification on the patch branch:
npm ci-> clean, 0 vulnerabilities.npm run lint-> clean (tsc --noEmit).npm run build-> clean; Vite only emits the existing large-chunk warning for the dashboard bundle.git diff --check-> clean.
|
Thanks for the thorough review and the patch, @caohy1988! I’ve merged your fixes for the SQL sanitization, ADC support, and data normalization into this branch. Much appreciated!
|
|
do you have an e2e demo recorded for this workflow? @Saherpathan |
caohy1988
left a comment
There was a problem hiding this comment.
Fresh review of PR #142 head ff4d4fe against current origin/main.
I found several issues that should block merge.
Blockers
B1 — Public API route lets any browser user query any table the Vercel service account can read
dashboard_v2/api/index.ts takes the table target from browser-controlled headers:
const userProject = getHeader(req, 'x-gcp-project-id');
const userDataset = getHeader(req, 'x-bq-dataset');
const userTable = getHeader(req, 'x-bq-table');Then it runs:
SELECT * FROM `${userProject}.${userDataset}.${userTable}`using the server-side Vercel service account. The regex validation prevents SQL injection, but it does not authorize the table. Any user who can reach a deployed dashboard can change localStorage or send headers and query any table the service account is allowed to read.
This is a confused-deputy problem. The frontend should not choose arbitrary BigQuery table refs for a privileged backend credential.
Please fix with one of these patterns before merge:
- Configure the allowed project/dataset/table server-side via env vars and ignore browser-supplied table refs.
- Or maintain a server-side allowlist of table refs / dataset prefixes.
- Or add real user authentication and authorize the requested table against that user.
For this demo dashboard, the simplest safe shape is probably env-configured table refs plus a timespan filter.
B2 — Data-source inputs do not trigger a reload after the initial failure
The command bar stores Project/Dataset/Table in localStorage:
localStorage.setItem('user_gcp_project', e.target.value);
localStorage.setItem('user_bq_dataset', e.target.value);
localStorage.setItem('user_bq_table', e.target.value);But the components only refetch on filters.timespan / filters.agentId:
AuditLog.tsx:useEffect(..., [filters.timespan])FinOpsSummary.tsx:useEffect(..., [filters.timespan])TraceTree.tsx:useEffect(..., [filters.timespan, filters.agentId])
So a first-time user lands on the dashboard, gets Missing Configuration, enters project/dataset/table, and the dashboard does not refetch unless they change timespan or reload the page.
Please move the data-source values into shared React state / URL state, or add an explicit Connect/Refresh button that triggers the query after the values are entered. As written, the primary configuration flow is broken.
B3 — Agent filter is visible but not wired to the backend query
useDashboardFilters() tracks agent_id, and CommandBar exposes an Agent dropdown, but fetchAgentData() only sends timespan; the API route only filters on timestamp:
WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL ...)Selecting an agent changes the URL, but not the BigQuery result set. This is misleading for a dashboard where the first control in the command bar is Agent.
Either wire agent_id through as a validated query parameter and add the corresponding AND agent = @agent_id filter, or remove/disable the Agent filter until it is implemented.
B4 — git diff --check fails
Local check:
git diff --check origin/main...HEADfails with many trailing-whitespace errors, including:
dashboard_v2/api/index.ts:145dashboard_v2/src/App.tsx:14dashboard_v2/src/components/CommandBar.tsx:2dashboard_v2/src/components/AuditLog.tsx:18dashboard_v2/src/components/FinOpsSummary.tsx:20dashboard_v2/src/components/TraceTree.tsx:20dashboard_v2/src/hooks/useDashboardFilters.tsx:28
Please strip trailing whitespace before merge.
Verified locally
After npm ci in dashboard_v2/:
npm run buildpasses.npm run lint(tsc --noEmit) passes.npm audit --omit=devreportsfound 0 vulnerabilities.
GitHub currently only shows cla/google; no dashboard build/typecheck CI is attached to the PR.
The TypeScript app compiles, but the backend authorization and UI data-source flow need fixes before this should merge.
caohy1988
left a comment
There was a problem hiding this comment.
Follow-up to my earlier reviews. Additional blockers I haven't yet seen called out — all checked against pr142-head (ff4d4fe) with npm ci && npm run build locally.
Blockers
B5 — Missing Apache 2.0 copyright headers on every new TS/TSX file
This repo's convention (verified against every Python and shell file under src/ and examples/) is an Apache 2.0 header at the top of every new source file:
# Copyright 2026 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
...
Every new file in dashboard_v2/ is missing this:
dashboard_v2/api/index.tsdashboard_v2/src/App.tsxdashboard_v2/src/main.tsxdashboard_v2/src/services/apiService.tsdashboard_v2/src/hooks/useDashboardFilters.tsxdashboard_v2/src/components/{AuditLog,CommandBar,FinOpsSummary,TraceTree}.tsxdashboard_v2/src/lib/utils.tsdashboard_v2/vite.config.ts
For a Google-owned repo this is non-negotiable. Add the // Copyright 2026 Google LLC Apache 2.0 header to every new .ts / .tsx file (use /* ... */ block form for TypeScript).
B6 — Latent secret-handling foot-gun: vite.config.ts defines process.env.GEMINI_API_KEY for the client bundle
dashboard_v2/vite.config.ts:9-11:
define: {
'process.env.GEMINI_API_KEY': JSON.stringify(env.GEMINI_API_KEY),
},Vite's define is a build-time text-replace: every occurrence of process.env.GEMINI_API_KEY in client code is baked into the JS bundle as the literal string value.
Empirically the current bundle doesn't leak because (a) loadEnv(mode, '.', '') reads from .env* files, not process.env, (b) no .env is checked in, and (c) no code in src/ references process.env.GEMINI_API_KEY yet. But the wiring is a loaded gun:
- The dashboard already declares
@google/genaiindependencies(production install). It's currently unused, but its presence + thedefinestrongly suggests someone planned client-side Gemini calls. - Any developer who creates a local
.envwithGEMINI_API_KEY=...for testing and runsnpm run buildships a bundle with the literal key in the JS source — readable by every visitor. - Some CI setups write
.envfiles from secrets at build time, triggering the same leak in production.
Fix: Remove the define block entirely. If the dashboard ever needs Gemini access, it must go through the /api serverless route using a server-side env var. Also drop @google/genai from dependencies until a server-side caller actually needs it.
B7 — package.json lists vite in BOTH dependencies and devDependencies
dashboard_v2/package.json:31 and dashboard_v2/package.json:40:
"dependencies": {
...
"vite": "^6.2.3"
},
"devDependencies": {
...
"vite": "^6.2.3"
}vite is a build-time tool and belongs in devDependencies only. The double-listing creates two problems: production installs ship Vite (a ~30 MB bundler) unnecessarily, and npm will silently pick one entry over the other depending on npm version, masking future version drift.
Fix: Remove vite from dependencies.
B8 — Four dependencies declared but never imported
Grepping dashboard_v2/src/ and dashboard_v2/api/ for imports:
| Dep | Where declared | Imports found |
|---|---|---|
express |
dependencies |
none |
@types/express |
devDependencies |
none |
dotenv |
dependencies |
none (Vite uses its own loadEnv; Vercel injects env at runtime) |
@google/genai |
dependencies |
none |
The Vercel API handler at api/index.ts uses the bare (req, res) => void signature, not Express. dotenv is redundant on top of loadEnv. @google/genai is dead pending B6.
Fix: Remove all four. npm run build and npm run lint should still pass.
B9 — /api returns raw BigQuery error messages to the client
dashboard_v2/api/index.ts:170-173:
res.status(500).json({
error: error.message || 'Failed to query BigQuery',
code: error.code,
});BigQuery's error.message can contain project IDs, dataset names, column names, and partial query text from server-side error paths. Even with the identifier validation from your earlier patch, an unauthenticated curl against /api can probe for information by deliberately triggering schema-mismatch errors and reading the responses.
Fix: Return a generic message ({ error: "Query failed", request_id: "<uuid>" }) and console.error(error) server-side. Operators read details in Vercel logs; clients see only the request id for support correlation.
Soft blockers
B10 — No Node engines field in package.json
Vercel's default Node version changes over time; without "engines": { "node": ">=20" } (or similar), the next Vercel default bump could silently change behavior. Pin it.
B11 — tsconfig.json has no include / exclude — client and server share one compilation context
Today, anyone in src/ who writes import { BigQuery } from '@google-cloud/bigquery' will type-check cleanly (because api/ shares the same tsconfig and pulls the type into scope), then break at runtime when Vite bundles it into the browser.
Fix: Either split into tsconfig.client.json (with include: ["src/**/*"]) and tsconfig.server.json (with include: ["api/**/*"]), or add exclude: ["api/**"] to the existing tsconfig and have the Vercel function build use its own.
B12 — No security headers in vercel.json
A public dashboard fetching BigQuery data should set at minimum:
"headers": [{
"source": "/(.*)",
"headers": [
{ "key": "X-Content-Type-Options", "value": "nosniff" },
{ "key": "X-Frame-Options", "value": "DENY" },
{ "key": "Referrer-Policy", "value": "strict-origin-when-cross-origin" }
]
}]Bare minimum defense-in-depth for a publicly-accessible URL.
Verification on pr142-head (ff4d4fe)
npm ci→ clean, 0 vulnerabilities.npm run lint→ clean.npm run build→ clean (the existing 840 KB chunk-size warning).- Empirical leak check: built with
GEMINI_API_KEY=PROBE-LEAK-SENTINEL-12345in shell env; sentinel does not appear indist/assets/*.js(becauseloadEnvreads.env*files, notprocess.env). B6 is a latent foot-gun, not an active leak today.
Summary
Your existing reviews covered the SQL identifier validation, ADC fallback, UI ↔ row-shape mismatch, the dev-runbook drift, the public-deployment authorization model, the data-source refetch flow, the wired-but-not-implemented Agent filter, and the trailing-whitespace lint.
The eight findings above (B5–B12) are all unaddressed by those reviews. B5 (copyright headers) and B6 (secret-handling foot-gun) are the hardest blockers; B7–B9 are package hygiene + info-disclosure; B10–B12 are soft. None of them are caught by npm run lint / npm run build so they have to be raised in review.
|
This is the video demo: Local dashboard view How to run locally
http://localhost:3000 |
caohy1988
left a comment
There was a problem hiding this comment.
Re-review — PR #142 head dfd826d
Checked out locally; npm ci clean, npm run lint (tsc) clean, npm run build clean. Several prior findings are genuinely fixed — the URL-state refetch flow and the Vite dev-server API plugin are nice work. But most of the May 12 blockers are still open, and I found the likely root cause of the "live deployment data connection issue" mentioned in the PR description — it's a one-line vercel.json bug, not BigQuery.
New — N1: vercel.json rewrite shadows /api/health, which bricks the entire production dashboard
The PR description says the live deployment "is currently facing a data connection issue with BigQuery." I believe the actual root cause is this, in vercel.json:
"rewrites": [
{ "source": "/api/(.*)", "destination": "/api/index.ts" },
...
]The catch-all sends /api/health to /api/index.ts — the data handler — instead of the dedicated api/health.ts function. The chain in production:
GET /api/health→ rewritten toindex.ts→handleAgentDataRequestsees a GET with nox-gcp-project-id/x-bq-dataset/x-bq-tableheaders → returns 400 "Missing Configuration".fetchDashboardHealth(apiService.ts:36) throws on!response.ok.useDashboardHealthleaveshealth = null→ready: Boolean(null?.ready)→ false.- Every component computes
sourceReady = … && authReady(e.g.AuditLog.tsx:14) → false → renders the not-ready state and never fetches, regardless of how correctly the user configured project/dataset/table.
Local dev works because the Vite plugin (vite.config.ts:20-25) special-cases /api/health before the catch-all — which is exactly the asymmetry that makes this bug invisible in local testing.
Fix: route health before the catch-all:
"rewrites": [
{ "source": "/api/health", "destination": "/api/health.ts" },
{ "source": "/api/(.*)", "destination": "/api/index.ts" },
{ "source": "/api", "destination": "/api/index.ts" },
{ "source": "/(.*)", "destination": "/index.html" }
](or drop the /api/* rewrites entirely and rely on Vercel's file-based function routing, keeping only the SPA fallback). Please redeploy and re-test the live URL after this — I expect the "data connection issue" disappears.
New — N2: .gitignore does not cover service-account.json, and the documented runbook tells users to put a private key exactly there
Your June 5 runbook comment says:
Place a service account JSON at
dashboard_v2/service-account.json
and the code makes that the default credentials path (agentData.ts:16). But dashboard_v2/.gitignore contains only node_modules/, dist/, assets/, .env (duplicated four times, incidentally). service-account.json is not ignored. Any user who follows the official setup and later runs git add . commits a live service-account private key.
Fix: add service-account.json (and *.json key-file patterns like *-credentials.json if you like) to .gitignore, and de-duplicate the file while there. This is a one-line change protecting against the single worst outcome a demo can cause.
New — N3: ADC regression — gcloud auth application-default login no longer works
The round-1 connector patch fell back to bare new BigQuery() so Google's client library could resolve ADC in all its forms. The current getBigQueryClient() (agentData.ts:131-161) only supports two explicit shapes — GCP_CLIENT_EMAIL+GCP_PRIVATE_KEY env vars, or a service-account JSON file read manually — and throws otherwise. A developer who authenticated with gcloud auth application-default login (no key file, credentials at ~/.config/gcloud/application_default_credentials.json) gets "BigQuery authentication is not configured" even though every Google SDK on their machine works.
Fix: add a third branch — if neither explicit shape is configured, return new BigQuery({ projectId }) and let the client library do its standard ADC resolution. Update hasBigQueryAuthConfigured() / getDashboardRuntimeStatus() to reflect that ADC is a valid state (or rename ready to explicit_credentials_configured and have the health UI treat "unknown" as try-anyway).
New — N4: dead "Gemini Key" input collects a secret into localStorage that nothing consumes
CommandBar.tsx:104-113 renders a Gemini API key field and stores it at localStorage['user_gemini_key']. Grepping src/ and api/: nothing reads it. Combined with the still-present define: { 'process.env.GEMINI_API_KEY': ... } in vite.config.ts (B6 below), this looks like half-built client-side Gemini wiring. A UI field that collects a secret and stores it in plaintext localStorage for no purpose is strictly negative value.
Fix: delete the input field and the localStorage key. If Gemini features come later, the key belongs server-side (see B6).
Still open from May 12 (flagged, unaddressed)
- B1 — confused-deputy table access (3rd flag).
agentData.ts:292-294still takes project/dataset/table from browser headers and queries with the server credential. Identifier regexes prevent SQL injection but do not authorize the table: any visitor to the deployed URL can read any table the Vercel service account can access. Fix as before: env-configured table ref(s) server-side, or a server-side allowlist, or real auth. For the demo, env-pinned table refs are the simplest safe shape. - B5 — Apache 2.0 headers still missing from every
.ts/.tsxfile (checked all ofapi/,src/,vite.config.ts). Non-negotiable for this repo. - B6 —
GEMINI_API_KEYdefine foot-gun still invite.config.ts:49. Remove thedefineblock; with N4's dead input gone there is zero reason for it to exist. - B9 — raw BigQuery error pass-through still at
agentData.ts:344:{ error: error.message, code: error.code }. Return a generic message + request id; log details server-side. - B4 — whitespace lint still fails:
git diff --checkflags trailing whitespace insrc/App.tsx:27,31,34(+ more) and new-blank-line-at-EOF in.env.exampleandREADME.md. - B8 — unused dependencies all still declared:
express,@types/express,dotenv,@google/genai— zero imports insrc/orapi/. Remove all four. - B10 — no
enginesfield; B11 — tsconfig has noinclude/exclude(client + server share one compilation context; also notepaths: {"@/*": ["./*"]}disagrees with Vite's@ → ./srcalias — latent only since no@/imports exist yet, but one of the two should change); B12 — no security headers invercel.json.
Verified RESOLVED since ff4d4fe
- B2 (data-source refetch flow) — fixed properly. Project/dataset/table now live in URL search params via
useDashboardFilters,CommandBarcallssetFilters(...)on change, and every component's effect dependency array includesfilters.projectId/datasetId/tableId+sourceReady(e.g.AuditLog.tsx:69). First-run configuration now triggers fetches without a manual reload. Nice. - B3 (agent filter) — mostly fixed.
agent_id/user_id/trace_id/span_idflow through the query string and the API applies them (applyDashboardFilters). One caveat: filtering happens in memory afterLIMIT 1000(agentData.ts:330,337), so on busy tables an agent's rows past the newest 1000 events are silently invisible. Consider pushing at leastagent_idinto the SQL as a parameterizedAND agent = @agent_id, or document the 1000-row window in the UI. - B7 (vite double-listed) — fixed;
viteis devDependencies-only now. - Round-1 #6 (dead
startscript) — fixed (vite preview --host=0.0.0.0). - Round-1 #4 (local runbook drift) — fixed structurally: the new
localApiPlugininvite.config.tsserves/apiinsidenpm run devon port 3000. This is a genuinely good pattern — same handler module (handleAgentDataRequest) shared between the Vercel function and the dev middleware, so local and prod can't drift (modulo N1's rewrite asymmetry). - Round-1 #1/#3 (identifier validation + row normalization) — preserved in the
agentData.tsrefactor (assertValidTableRef,normalizeRow).
Build verification: npm ci → clean; npm run lint → clean; npm run build → clean (existing chunk-size warning only).
Verdict — REQUEST_CHANGES
Priority order:
- N1 — fix the
/api/healthrewrite. One line; very likely repairs the live deployment you've been debugging as a "BigQuery data connection issue." - N2 — add
service-account.jsonto.gitignore. One line; prevents key leakage by anyone following your own runbook. - B1 — the confused-deputy table authorization (third time flagged). This is the gating security issue for any public deployment.
- B5 (headers), B6 + N4 (delete the Gemini define + dead input), B9 (generic errors), B4 (whitespace), B8 (4 dead deps) — all mechanical.
- N3 (ADC fallback), B3's post-LIMIT caveat, B10–B12 — quality-of-life; bundle in or follow up.
The frontend architecture has improved meaningfully across rounds (URL-state filters, shared API handler between dev/prod, health gating). Once the authorization model and the hygiene items land, this is close.
Live Dashboard
Known Issue
The live deployment is currently facing a data connection issue with BigQuery, so analytics data may not load correctly on the hosted version. Local setup should work correctly after configuring valid Google Cloud credentials and environment variables.
Tech Stack
React
TypeScript
Vite
Node.js
Google BigQuery
Vercel