evo2 SAE recipe: feature-explorer dashboard (viz) #1623
📝 Walkthrough
This PR adds a new Evo2 SAE Feature Explorer: a Vite/React dashboard (four tabs) with a DuckDB-WASM/Mosaic front end, sequence embedding/UMAP tooling, generative steering and inspector UIs, a backend /gene_embed endpoint, scripts to build atlas/examples parquets, a dashboard launcher, tests, and docs/sample data.
Changes
Evo2 SAE Feature Explorer Frontend & Backend
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Recipe-level entry point tying together the offline extract/train (#1621), the live inference engine + server + CLI (this PR), and the dashboard (#1623): venv setup, 7B/L26 config, CLI encode/batch, serve, dashboard launch, and the CPU/GPU test commands. codonfm/esm2 have one; evo2 didn't.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Polina Binder <pbinder@nvidia.com>
The recipe-level README documented the dashboard (#1623) and the atlas generator (deferred), neither of which is in this PR, so it is premature here. The inference run instructions live in launch_inference.sh's header. A complete recipe README lands once the dashboard + generator exist.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Polina Binder <pbinder@nvidia.com>
Embeds many sequences -> per-feature vectors for the #1623 Sequence-UMAP panel: each sequence is encode_batch'd, pooled (mean + max) over the DNA region, and returned as base64 float32 [n x n_features] (both pools from one forward, so the client toggles pooling without re-running) plus per-sequence metadata + feature stats. Thin wrapper over the engine's encode_batch. test_server.py: asserts the response decodes to an [ng x nf] matrix. 7 contract tests pass.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Polina Binder <pbinder@nvidia.com>
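As an illustration of the client side of that contract, here is a minimal sketch of decoding a base64 float32 payload into a flat matrix and reading one row. The helper names `decodeFloat32Base64` and `matrixRow` are hypothetical, and the sketch assumes the server emits little-endian, row-major float32 bytes, which the commit message does not state explicitly.

```javascript
// Decode a base64 string of raw float32 bytes into a Float32Array.
// Assumption: bytes are little-endian, matching the byte order of the
// platforms browsers typically run on.
function decodeFloat32Base64(b64) {
  const binary = atob(b64) // global in browsers and in recent Node versions
  const bytes = new Uint8Array(binary.length)
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i)
  if (bytes.length % 4 !== 0) {
    throw new Error('payload length is not a multiple of 4 bytes')
  }
  return new Float32Array(bytes.buffer)
}

// View row `i` (one sequence) of a flat row-major [nRows x nCols] matrix.
// subarray() shares memory with `flat`, so no copy is made.
function matrixRow(flat, nCols, i) {
  return flat.subarray(i * nCols, (i + 1) * nCols)
}
```

Because both pooled matrices arrive in one response, a client can decode each once and switch between them without another request.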
cee51a3 to 2c4e69e
@coderabbitai review
✅ Action performed: Review finished.
Actionable comments posted: 14
🧹 Nitpick comments (5)
bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/SequenceView.jsx (1)
228-267: ⚡ Quick win
Extract duplicated anchor computation logic.
The anchor computation logic is duplicated three times: in the `SequenceView` component (lines 96-102), and twice in this function (lines 240-246 and 254-260). This violates the DRY principle and creates a maintenance burden.
♻️ Proposed refactor to extract shared logic
Add a helper function before `SequenceView`:
+function computeAnchor(acts, alignMode) {
+  let anchor = 0
+  if (alignMode === 'first_activation') {
+    anchor = acts.findIndex(a => a > 0)
+    if (anchor < 0) anchor = 0
+  } else if (alignMode === 'max_activation') {
+    let maxVal = -1
+    acts.forEach((a, i) => { if (a > maxVal) { maxVal = a; anchor = i } })
+  }
+  return anchor
+}
+
 export default function SequenceView({
Then refactor the three call sites to use `computeAnchor(acts, alignMode)` instead of the duplicated logic.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/SequenceView.jsx` around lines 228 - 267, Extract the duplicated anchor calculation into a helper (e.g., computeAnchor(acts, alignMode)) and use it in computeAlignInfo and SequenceView: implement computeAnchor to accept an activation array and alignMode and return the anchor index (handling 'start', 'first_activation', 'max_activation' and defaulting to 0), then replace the repeated blocks inside computeAlignInfo (both the loop that computes maxAnchor and the loop that computes totalLength) and the anchor logic in SequenceView with calls to computeAnchor(acts, alignMode) so all three sites use the single helper.
bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/RegionDetailModal.jsx (2)
114-154: ⚡ Quick win
Add accessibility attributes for the modal dialog.
The modal is missing important accessibility attributes:
- `role="dialog"` and `aria-modal="true"` on the modal container
- `aria-labelledby` pointing to the region label
- Focus trap to prevent tabbing out of the modal
- Focus return to the trigger element when closed
These attributes and behaviors are required for screen reader users to understand and navigate the modal correctly.
♻️ Proposed accessibility improvements
 <div style={styles.backdrop} onClick={onClose}>
-  <div style={styles.modal} onClick={e => e.stopPropagation()}>
+  <div style={styles.modal} onClick={e => e.stopPropagation()}
+    role="dialog" aria-modal="true" aria-labelledby="region-label">
     <div style={styles.closeBtn} onClick={onClose}>x</div>
     <div style={styles.body}>
       <div style={styles.header}>
-        <span style={styles.regionLabel}>{label}</span>
+        <span id="region-label" style={styles.regionLabel}>{label}</span>
       </div>
For a complete solution, consider adding a focus trap library or implementing focus management to capture Tab key presses and cycle focus within the modal.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/RegionDetailModal.jsx` around lines 114 - 154, The modal JSX rendered by the modal variable needs ARIA attributes and focus management: add role="dialog" and aria-modal="true" to the element using styles.modal, give the region label span (styles.regionLabel) a unique id and set aria-labelledby on the modal to that id, and implement focus trapping and focus return around modal open/close (use a ref to store the element that opened the modal, set initial focus to a focusable element inside the modal on mount, intercept Tab/Shift+Tab to cycle focus within the modal, and onClose restore focus to the saved trigger element). Ensure onClose still closes the modal and that focus cleanup happens on unmount.
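The Tab/Shift+Tab cycling described in this comment could be sketched roughly as follows. This is not code from the PR: `nextFocusIndex` and `makeTrapHandler` are hypothetical names, and the sketch assumes the caller supplies the modal's focusable elements in DOM order along with the currently focused element.

```javascript
// Given the number of focusable elements in the modal, the index of the one
// currently focused (-1 when focus is elsewhere), and whether Shift is held,
// return the index to focus next, wrapping around at both ends.
function nextFocusIndex(count, current, shiftKey) {
  if (count <= 0) return -1
  if (current < 0) return shiftKey ? count - 1 : 0
  return (current + (shiftKey ? -1 : 1) + count) % count
}

// Build a keydown handler that keeps Tab focus inside the modal.
// getFocusable(): array of focusable elements inside the modal, DOM order.
// getActive(): the element that currently has focus.
function makeTrapHandler(getFocusable, getActive) {
  return function onKeyDown(event) {
    if (event.key !== 'Tab') return
    const items = getFocusable()
    const next = nextFocusIndex(items.length, items.indexOf(getActive()), event.shiftKey)
    if (next < 0) return
    event.preventDefault() // suppress the browser's default focus move
    items[next].focus()
  }
}
```

In a React component the handler would typically be attached to the dialog element, with `getActive` returning `document.activeElement`. Returning focus to the trigger on close needs a separate ref and is not shown here.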
6-102: ⚡ Quick win
Use CSS custom properties for theme consistency.
Several styles use hardcoded color values instead of CSS custom properties, which will break dark mode support:
- Line 17: `background: '#fff'` should use `var(--bg-card)` or similar
- Line 33: `background: 'rgba(255,255,255,0.9)'` in closeBtn
- Lines 34, 63, 72-76, 86, 96: Various hardcoded grays like `#ddd`, `#222`, `#f9fafb`, `#eee`, `#888`, `#333`, `#fafafa`
Other components in the codebase (e.g., SequenceInspector.jsx, GenerativeSteering.jsx) consistently use CSS custom properties like `var(--bg-card)`, `var(--text)`, `var(--border)` for theme support.
♻️ Proposed fix for theme compatibility
 modal: {
-  background: '#fff',
+  background: 'var(--bg-card)',
   borderRadius: '12px',
   // ... rest
 },
 closeBtn: {
   // ...
-  background: 'rgba(255,255,255,0.9)',
-  border: '1px solid #ddd',
+  background: 'var(--bg-card)',
+  border: '1px solid var(--border)',
   // ...
-  color: '#555',
+  color: 'var(--text-secondary)',
 },
 // Apply similar changes to other style properties
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/RegionDetailModal.jsx` around lines 6 - 102, The styles object in RegionDetailModal.jsx uses hardcoded color literals that break theme/dark-mode; update the color values in the styles constant (backdrop, modal, closeBtn, regionLabel, statBox, statLabel, statValue, sequenceBox and any other color/border/background properties) to use CSS custom properties (e.g. var(--bg-card), var(--bg-overlay) or var(--bg), var(--text), var(--muted), var(--border), var(--mono) as appropriate) so theming and dark mode match other components like SequenceInspector.jsx and GenerativeSteering.jsx; keep the same property keys (backdrop.background, modal.background, closeBtn.background/border, regionLabel.color, statBox.background/border, statLabel.color, statValue.color, sequenceBox.background, etc.) and provide sensible fallbacks if needed.
bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/SequenceUMAPView.jsx (1)
113-131: ⚖️ Poor tradeoff
Consider making UMAP computation non-blocking.
Line 128 calls `buildItems(Gmean, nf, ng, r.genes)`, which synchronously computes UMAP (line 415). For large datasets (many sequences or features), this CPU-intensive computation can freeze the UI.
Consider wrapping the UMAP computation in a `setTimeout(..., 0)` or using a Web Worker to keep the UI responsive during computation.
♻️ Example: Yield to the browser before the heavy computation
 const Gmean = dec(r.G_b64)
 const Gmax = r.Gmax_b64 ? dec(r.Gmax_b64) : Gmean
 const nf = r.n_features, ng = r.n_genes
+await new Promise(resolve => setTimeout(resolve, 16)) // yield to browser
 const items = buildItems(Gmean, nf, ng, r.genes)
 setBundle({ G: Gmean, Gmean, Gmax, nf, ng, meta: r.genes, items, stats: r.feature_stats, saeId: r.sae_id })
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/SequenceUMAPView.jsx` around lines 113 - 131, The embed() handler currently calls buildItems(Gmean, nf, ng, r.genes) synchronously which triggers UMAP computation and can freeze the UI; change it so the heavy work runs asynchronously: either enqueue buildItems on the next tick (e.g., wrap the buildItems + setBundle(...) call in setTimeout(..., 0) or queueMicrotask) or move the UMAP logic into a Web Worker and post the response to the main thread; ensure you still setBusy(true) before and setBusy(false) only after setBundle is applied, and reference the same symbols: embed(), buildItems, setBundle, and the decoded arrays Gmean/Gmax, nf, ng, r.genes when transferring data to the worker or delayed callback.
bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/GenerativeSteering.jsx (1)
183-210: ⚡ Quick win
Use theme-aware colors for DNA sequence readout.
Lines 206 and 209 use hardcoded light-theme colors (`#ffffff`, `#e0e0e0`, `#111`) that will not adapt to dark mode. While the comment on line 162 suggests a "light panel" is intentional for readability, the panel should still respond to the active theme.
Consider using `var(--bg-code)` or similar custom properties, or checking the current theme to provide appropriate contrast.
♻️ Proposed theme compatibility fix
 seqReadout: {
-  background: '#ffffff',
-  border: '1px solid #e0e0e0',
+  background: 'var(--bg-card-expanded)',
+  border: '1px solid var(--border)',
   borderRadius: '6px',
   padding: '8px 10px',
   fontFamily: 'ui-monospace, Menlo, monospace',
   fontSize: '13px',
   lineHeight: 1.7,
 },
 seqLine: { display: 'flex', gap: '8px', alignItems: 'baseline' },
 seqIdx: { color: '#999', fontSize: '11px', minWidth: '40px', textAlign: 'right', whiteSpace: 'pre' },
-seqText: { color: '#111', letterSpacing: '1px', wordBreak: 'break-all' },
+seqText: { color: 'var(--text)', letterSpacing: '1px', wordBreak: 'break-all' },
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/GenerativeSteering.jsx` around lines 183 - 210, S.seqReadout, S.seqText and S.seqIdx use hardcoded light-theme colors; update them to theme-aware CSS variables so the DNA readout adapts to dark mode. Replace seqReadout background and border colors with variables (e.g., var(--bg-code) or var(--bg-panel) and var(--border-muted) or var(--border)), set seqText color to var(--text) (or var(--text-heading)), and set seqIdx to a muted variable like var(--text-muted) instead of fixed hex values; make the changes in the S object entries named seqReadout, seqText and seqIdx.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/index.html`:
- Line 12: Replace invalid CSS token "light" used for font-weight inside the
`@font-face` declarations with a numeric weight (e.g., 300); locate the two
`@font-face` blocks that currently set font-weight: light and change them to
font-weight: 300 (or another valid numeric weight consistent with the font
files) so the font faces are recognized by browsers.
In
`@bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/App.jsx`:
- Around line 943-950: Replace the clickable <span> that opens the metrics modal
with an actual keyboard-accessible control: change the interactive element that
calls setShowMetricsModal(true) to a <button type="button"> with an explicit
accessible label (aria-label or visually hidden text) and preserve the inline
styles and click handler logic; do the same for the other instances referenced
(the similar spans at lines where handlers call setShow* or toggle functions
around the components) so keyboard users can tab to and activate these controls
and screen readers receive the label.
- Around line 14-19: App is mutating the document root class via the useEffect
that toggles document.documentElement.classList based on its local darkMode
state, while Dashboard also controls the root theme causing cross-component
drift; remove the root DOM mutation from App and instead accept darkMode and
setDarkMode as props (or lift darkMode state up) so Dashboard remains the single
owner of root theme control; specifically delete or disable the useEffect in App
that references document.documentElement.classList.toggle and ensure App uses
the passed-in darkMode and setDarkMode (or propagate them up) so only Dashboard
applies the root 'dark' class.
- Around line 474-475: The current conditional sets selected feature ids to null
when ids.size is 0, which treats an empty selection as "no filter" and shows all
features; change the logic in the setSelectedFeatureIds call (referencing
setSelectedFeatureIds, ids and totalFeatures) so that only a full selection
(ids.size === totalFeatures) becomes null while an empty Set is preserved (i.e.,
set to ids), e.g. replace the condition with one that returns null only when
ids.size === totalFeatures, otherwise setSelectedFeatureIds(ids).
- Around line 165-174: The parquetUrl coming from query params (via
dataPath/parquetUrl) is directly interpolated into the SQL sent to
vg.coordinator().exec() for read_parquet, which allows a single quote in the URL
to break or inject SQL; fix by escaping the URL before embedding (e.g., replace
any single quote with two single quotes or use a SQL-quoting helper) or use a
parameterized API if available, and pass the escaped/quoted value into the SQL
string used in vg.coordinator().exec(`... read_parquet('${...}')`) so
read_parquet receives a safe literal.
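A minimal sketch of the quote-escaping option from the read_parquet item above, assuming standard SQL string-literal rules (a single quote inside a literal is written as two single quotes). `sqlStringLiteral` and `buildReadParquetSql` are hypothetical helper names, and the surrounding `SELECT` is illustrative rather than the query App.jsx actually issues.

```javascript
// Quote an arbitrary value as a SQL string literal by doubling single quotes.
function sqlStringLiteral(value) {
  return "'" + String(value).replace(/'/g, "''") + "'"
}

// Build the statement text with the URL embedded as a safely quoted literal.
// The statement shape here is an assumption made for illustration only.
function buildReadParquetSql(parquetUrl) {
  return `SELECT * FROM read_parquet(${sqlStringLiteral(parquetUrl)})`
}
```

Escaping keeps the URL inside the literal, but it does not restrict which URLs may be loaded. If that matters, an allow-list check on the query parameter would be a separate step.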
In
`@bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/EmbeddingView.jsx`:
- Around line 266-307: The effect starting with useEffect(...) updates category
colors but omits darkMode and features from its dependency array, so changes to
theme or the features array won't recompute hidden-category color mapping;
update the effect dependencies to include darkMode and features (in addition to
categoryColumn, categoryColumns, hiddenCategories) and ensure the logic inside
(references to categoryColumn, categoryColumns, hiddenCategories, features,
SEQUENTIAL_COLORS, CATEGORY_COLORS, HIDDEN_COLOR and the call
viewRef.current.update({...})) will re-run when those values change so colors
remain consistent.
- Around line 53-59: The current EmbeddingView.jsx sets this.inner.innerHTML
with interpolated label and colorField (and numeric fields), which enables DOM
XSS; instead, refactor the render to build DOM nodes programmatically (e.g.,
createElement for container divs and text nodes) and assign user-provided
strings to element.textContent (not innerHTML), then append numeric values (use
logFreq.toFixed and maxAct.toFixed) into textContent or separate text nodes;
replace the template usage of this.inner.innerHTML with
this.inner.appendChild(...) logic so label and colorField are never inserted as
raw HTML.
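The `textContent`-based rendering suggested above might look like the following sketch. `buildTooltip` is a hypothetical helper, the line captions ("log freq", "max act") are assumed rather than taken from EmbeddingView.jsx, and the `doc` parameter stands in for `document` so the function can be exercised outside a browser.

```javascript
// Build tooltip content from DOM nodes so user-provided strings are never
// parsed as markup. `doc` is a Document (pass `document` in the browser).
function buildTooltip(doc, { label, colorField, logFreq, maxAct }) {
  const root = doc.createElement('div')
  const addLine = (text) => {
    const line = doc.createElement('div')
    line.textContent = text // assigned as plain text, not HTML
    root.appendChild(line)
  }
  addLine(String(label))
  if (colorField != null) addLine(String(colorField))
  addLine(`log freq: ${Number(logFreq).toFixed(2)}`)
  addLine(`max act: ${Number(maxAct).toFixed(2)}`)
  return root
}
```

A caller could then swap the container's children with something like `this.inner.replaceChildren(buildTooltip(document, fields))` instead of assigning to `innerHTML`.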
In
`@bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/FeatureCard.jsx`:
- Around line 315-316: The card expand and title-edit handlers are bound to
non-semantic elements (e.g. the div with styles.header using handleClick and
similar spans around lines 368-381), which prevents keyboard users from
activating them; update those elements in the FeatureCard component to be
keyboard-accessible by replacing purely clickable divs/spans with semantic
interactive elements (preferably <button>) or by adding role="button",
tabIndex="0", and onKeyDown handlers that call the same handler on Enter/Space;
also add appropriate ARIA attributes (e.g. aria-expanded on the card toggle and
aria-label or aria-labelledby on the title-edit trigger) and ensure focus styles
are preserved so keyboard users can see focus.
- Around line 275-311: The CSV export is vulnerable to formula injection and
broken quoting; add a sanitizeCell helper used by exportToCSV to (1) stringify
null/undefined, (2) escape quotes by doubling them, (3) wrap fields containing
commas/newlines/quotes in double quotes, and (4) if the resulting field starts
with =, +, -, or @, prefix it with a single quote to neutralize spreadsheets.
Update all places that push data into lines (metadata rows using
feature.feature_id, displayTitle, userTitle, freq, maxAct and example rows using
getRegionLabel(ex), ex.max_activation, ex.sequence) to pass values through
sanitizeCell before concatenation; also sanitize the filename components
(displayTitle) to avoid surprises.
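The four sanitizing steps listed for the CSV export can be sketched as a small helper. This is an illustrative version, not the PR's code; `toCsvLine` is a hypothetical companion function.

```javascript
// Make one value safe to place in a CSV field.
function sanitizeCell(value) {
  // 1. null/undefined become an empty string; everything else is stringified.
  let text = value === null || value === undefined ? '' : String(value)
  // 4. Neutralize spreadsheet formulas: prefix a leading =, +, - or @.
  //    Note this also prefixes negative numbers such as -1.5.
  if (/^[=+\-@]/.test(text)) text = "'" + text
  // 2 + 3. Double embedded quotes and wrap fields that contain a comma,
  //        a quote, or a line break.
  if (/[",\r\n]/.test(text)) text = '"' + text.replace(/"/g, '""') + '"'
  return text
}

// Join already-ordered cell values into one CSV line.
function toCsvLine(cells) {
  return cells.map(sanitizeCell).join(',')
}
```

Applying the formula prefix before quoting means a field such as `=1,2` ends up both prefixed and wrapped. Whether numeric columns like `max_activation` should bypass the prefix step is a judgment call this sketch leaves open.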
In
`@bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/GenerativeSteering.jsx`:
- Around line 38-58: In the generate function, validate and coerce numeric
inputs before building the request body: ensure Number(nTokens) and
Number(temperature) are parsed and checked (e.g., parseInt/parseFloat and verify
not NaN and n_tokens >= 1) and show setError or bail if invalid; similarly map
through clamps in the same function to parse each c.strength, filter out or
reject empty/NaN strengths (or enforce a minimum) before constructing features
(feature_id and numeric strength) so you never send 0 from empty strings —
update generate to perform these checks and either set a client-side error or
replace invalid values with safe defaults prior to calling postJSON.
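One way the validation described for `generate` could be factored is sketched below. `toNumber` and `buildGenerateBody` are hypothetical names, and the request field names (`n_tokens`, `temperature`, `features`, `feature_id`, `strength`) are inferred from the review text rather than confirmed against the backend.

```javascript
// Parse user-typed text into a number. Empty or whitespace-only input yields
// NaN, whereas plain Number('') would silently yield 0.
function toNumber(raw) {
  const text = String(raw ?? '').trim()
  return text === '' ? NaN : Number(text)
}

// Validate the form inputs and return either { body } or { error }.
// Assumption: each clamp object carries `feature_id` and `strength`.
function buildGenerateBody({ nTokens, temperature, clamps }) {
  const n = toNumber(nTokens)
  if (!Number.isInteger(n) || n < 1) {
    return { error: 'Number of tokens must be a whole number of at least 1' }
  }
  const t = toNumber(temperature)
  if (!Number.isFinite(t)) return { error: 'Temperature must be a number' }
  const features = []
  for (const clamp of clamps) {
    const strength = toNumber(clamp.strength)
    if (!Number.isFinite(strength)) {
      return { error: `Feature ${clamp.feature_id}: strength must be a number` }
    }
    features.push({ feature_id: clamp.feature_id, strength })
  }
  return { body: { n_tokens: n, temperature: t, features } }
}
```

The component would call `setError(result.error)` and return early when `error` is present, and only pass `result.body` to `postJSON` otherwise.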
In
`@bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/InfoButton.jsx`:
- Around line 33-53: Replace the non-semantic clickable <span> in InfoButton.jsx
with a real <button> element so it is keyboard-focusable and operable; update
the element using the same ref (buttonRef), keep the onClick={() => setOpen(o =>
!o)} handler, add type="button" and an appropriate aria-label (e.g.,
aria-label="More information" or aria-expanded based on open state), and
preserve the existing inline style properties on the button so visual appearance
remains unchanged; ensure any pointer/user-select/CSS properties are applied to
the button and remove any unnecessary role attributes since a native button is
used.
In
`@bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/utils.js`:
- Around line 9-11: The label construction uses sid with
`${sid}:${example.start}-${example.end}`, which yields a leading colon when sid
is missing; update the code that builds the label (the expression using sid,
example.start and example.end in utils.js) to conditionally prepend the colon
only when sid is present — i.e., build a prefix that is either "sid:" or an
empty string, then append "start-end" using example.start and example.end;
ensure the existing null checks for example.start/example.end remain in place.
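The conditional prefix for the region label can be sketched as below. `formatRegionLabel` is a hypothetical name, and the fallback returned when `start` or `end` is missing is an assumption, since the existing null-check behavior in utils.js is not shown in this review.

```javascript
// Build "sid:start-end", omitting the "sid:" prefix when sid is missing.
function formatRegionLabel(example, sid) {
  // Assumed fallback: with no coordinates, show the id alone (or nothing).
  if (example.start == null || example.end == null) return sid || ''
  const prefix = sid ? `${sid}:` : ''
  return `${prefix}${example.start}-${example.end}`
}
```

For instance, `formatRegionLabel({ start: 10, end: 20 }, 'chr1')` gives `chr1:10-20`, while the same call without an id gives `10-20` with no leading colon.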
In
`@bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/scripts/launch_dashboard.py`:
- Around line 93-97: The finally block sends SIGTERM to the Vite subprocess via
proc.terminate() but does not reap it; update the shutdown sequence for proc
(the Popen object) to wait for it and force-kill if it ignores SIGTERM: call
proc.terminate(), then attempt proc.wait(timeout=...) and if it times out call
proc.kill() and then proc.wait() again to ensure the process is reaped; also
handle and ignore exceptions from wait/kill so the script exits cleanly.
In
`@bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/src/evo2_sae/server.py`:
- Around line 251-258: The response object currently returns keys "nf", "ng",
"meta", and "stats" alongside "G_b64" and "Gmax_b64"; rename these keys to match
the frontend contract used by SequenceUMAPView.jsx: replace "nf" ->
"n_features", "ng" -> "n_genes", "meta" -> "genes", and "stats" ->
"feature_stats" in the returned dict (keep the same values/types, e.g.,
int(gmean.shape[1]) etc.), so callers consuming "G_b64"/"Gmax_b64" and the UMAP
consumer get the expected fields.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: fd177528-4a24-4837-a844-cbfb935a4481
📒 Files selected for processing (27)
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/.gitignore
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/README.md
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/index.html
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/package.json
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/public/sequence_library.json
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/App.jsx
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/Dashboard.jsx
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/EmbeddingView.jsx
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/FeatureCard.jsx
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/FeatureDetailPage.jsx
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/FeatureList.jsx
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/GenerativeSteering.jsx
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/Histogram.jsx
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/InfoButton.jsx
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/RegionDetailModal.jsx
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/SequenceInspector.jsx
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/SequenceUMAPView.jsx
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/SequenceView.jsx
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/backend.js
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/index.jsx
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/styles.js
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/utils.js
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/vite.config.js
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/scripts/launch_dashboard.py
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/src/evo2_sae/server.py
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/tests/test_launch_dashboard.py
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/tests/test_server.py
const [darkMode, setDarkMode] = useState(true)

// Toggle dark class on document root
useEffect(() => {
  document.documentElement.classList.toggle('dark', darkMode)
}, [darkMode])
Unify theme ownership to a single source of truth.
App mutates the root dark class locally, while Dashboard also controls it. This creates cross-tab/remount theme drift (e.g., atlas remount can reset theme unexpectedly). Keep root theme control in one component only and pass state/handlers via props.
setSelectedFeatureIds(ids.size > 0 && ids.size < totalFeatures ? ids : null)
} catch (err) {
Preserve empty selections instead of collapsing them to “all.”
At Line 474, ids.size === 0 currently falls back to null, which is treated as “no filter.” That incorrectly shows all features when a filter yields zero rows.
Suggested fix
- setSelectedFeatureIds(ids.size > 0 && ids.size < totalFeatures ? ids : null)
+ setSelectedFeatureIds(ids.size === totalFeatures ? null : ids)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/App.jsx`
around lines 474 - 475, The current conditional sets selected feature ids to
null when ids.size is 0, which treats an empty selection as "no filter" and
shows all features; change the logic in the setSelectedFeatureIds call
(referencing setSelectedFeatureIds, ids and totalFeatures) so that only a full
selection (ids.size === totalFeatures) becomes null while an empty Set is
preserved (i.e., set to ids), e.g. replace the condition with one that returns
null only when ids.size === totalFeatures, otherwise setSelectedFeatureIds(ids).
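The fix depends on a three-way distinction: `null` means "no filter", an empty set means "the filter matched nothing", and a partial set means "show these". The component itself is JSX; the Python sketch below only illustrates those semantics. The helper names `normalize_selection` and `visible_features` are hypothetical and do not exist in App.jsx.

```python
from typing import Optional


def normalize_selection(ids: set[int], total_features: int) -> Optional[set[int]]:
    """Collapse only a full selection to None ("no filter"); keep an empty set as-is."""
    return None if len(ids) == total_features else ids


def visible_features(all_ids: list[int], selected: Optional[set[int]]) -> list[int]:
    """None shows everything; any set, including an empty one, acts as a filter."""
    if selected is None:
        return list(all_ids)
    return [i for i in all_ids if i in selected]
```

With this shape, a filter that yields zero rows produces an empty view rather than silently falling back to every feature.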
| <span | ||
| ref={buttonRef} | ||
| onClick={() => setOpen(o => !o)} | ||
| style={{ | ||
| display: 'inline-flex', | ||
| alignItems: 'center', | ||
| justifyContent: 'center', | ||
| width: '15px', | ||
| height: '15px', | ||
| borderRadius: '50%', | ||
| border: '1px solid var(--border-input)', | ||
| fontSize: '10px', | ||
| fontWeight: '600', | ||
| color: 'var(--text-tertiary)', | ||
| cursor: 'pointer', | ||
| userSelect: 'none', | ||
| lineHeight: 1, | ||
| }} | ||
| > | ||
| i | ||
| </span> |
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/InfoButton.jsx"
# Show file header + relevant section with line numbers
nl -ba "$FILE" | sed -n '1,120p'
# Quick search for open state / tooltip usage
rg -n "open|setOpen|tooltip|aria-|buttonRef|onClick|span|button" "$FILE" || true
Repository: NVIDIA-BioNeMo/bionemo-framework
Length of output: 119
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/InfoButton.jsx"
# Show the whole file if it's small; otherwise show the referenced region.
lines=$(wc -l < "$FILE")
echo "FILE_LINES=$lines"
if [ "$lines" -le 160 ]; then
cat -n "$FILE"
else
# Lines 25-90 around the reported snippet
sed -n '25,90p' <(cat -n "$FILE")
fi
echo "---- rg hits ----"
rg -n "open|setOpen|tooltip|aria-|buttonRef|onClick|role=|type=" "$FILE" || true
Repository: NVIDIA-BioNeMo/bionemo-framework
Length of output: 3222
Replace the tooltip-triggering “i” span with a semantic button for keyboard/a11y support.
InfoButton.jsx uses a clickable <span> (onClick only), so it’s not reliably focusable/activatable via keyboard. Switch it to a <button> and add appropriate ARIA.
Proposed fix
- <span
+ <button
+ type="button"
ref={buttonRef}
- onClick={() => setOpen(o => !o)}
+ onClick={() => setOpen((o) => !o)}
+ aria-label="More information"
+ aria-expanded={open}
style={{
display: 'inline-flex',
alignItems: 'center',
justifyContent: 'center',
width: '15px',
height: '15px',
+ background: 'transparent',
+ padding: 0,
borderRadius: '50%',
border: '1px solid var(--border-input)',
fontSize: '10px',
fontWeight: '600',
color: 'var(--text-tertiary)',
cursor: 'pointer',
userSelect: 'none',
lineHeight: 1,
}}
>
i
- </span>
+ </button>
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/InfoButton.jsx`
around lines 33 - 53, Replace the non-semantic clickable <span> in
InfoButton.jsx with a real <button> element so it is keyboard-focusable and
operable; update the element using the same ref (buttonRef), keep the
onClick={() => setOpen(o => !o)} handler, add type="button", an aria-label
(e.g., aria-label="More information"), and aria-expanded bound to the open
state, and preserve the existing inline style properties on the button so
visual appearance remains unchanged; ensure any pointer/user-select/CSS
properties are applied to the button and remove any unnecessary role attributes
since a native button is used.
|
@coderabbitai review |
✅ Action performed
Review finished.
|
Actionable comments posted: 1
♻️ Duplicate comments (1)
bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/GenerativeSteering.jsx (1)
39-59: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
The nTokens validation concern from the previous review is still present.
Line 92 stores the input value as a string (setNTokens(e.target.value)), and line 48 converts it without validation. If the user clears the field, Number('') evaluates to 0, violating the min={1} constraint. HTML5 validation doesn't prevent the onClick handler from executing.
However, the previous comment overstated the scope:
- Temperature is safe: Lines 84-86 use a range input with controlled parseFloat, which guarantees a valid numeric value.
- Strength validation is unnecessary: 0 is a valid strength value (the UI help text notes "0 = suppress"), and the FeaturePicker UI already clamps values.
Only nTokens requires client-side validation before submission.
🛡️ Proposed validation fix (nTokens only)
const generate = async () => {
  setBusy(true)
  setError(null)
  try {
+   const tokens = Number(nTokens)
+   if (!Number.isInteger(tokens) || tokens < 1) {
+     throw new Error('Token count must be at least 1')
+   }
    const body = {
      prompt: cleanDNA(prompt),
      organism,
      tag: tag ?? (organismTags?.[organism] ?? ''),
      features: clamps.map((c) => ({ feature_id: c.id, strength: c.strength })),
-     n_tokens: Number(nTokens),
+     n_tokens: tokens,
      temperature: Number(temperature),
      compare_baseline: compareBaseline,
    }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/GenerativeSteering.jsx`
around lines 39 - 59, The generate handler must validate nTokens before sending:
parse nTokens into a numeric integer (e.g., const tokens = Number(nTokens) or
parseInt(nTokens, 10)), verify Number.isInteger(tokens) && tokens >= 1, and if
invalid call setError(...) and early-return (and setBusy(false) if needed) to
prevent submission; then use that validated tokens value for the request body as
n_tokens. Update the generate function (references: generate, nTokens,
setNTokens, setError, setBusy, setResult, postJSON) to perform this check and
avoid sending Number('') -> 0.
🧹 Nitpick comments (2)
bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/scripts/dashboard.py (2)
60-402: ⚖️ Poor tradeoff
Expand docstrings to full Google-style format.
The coding guidelines require Google-style docstrings (pydocstyle convention) with Args, Returns, and Raises sections where applicable. Current docstrings provide brief descriptions but lack structured parameter and return value documentation.
Example:
📝 Google-style docstring example
def _load_sae_only(args):
-    """Load just the SAE (+ labels) by reusing the engine's loaders — no 7B / megatron.
+    """Load just the SAE (+ labels) by reusing the engine's loaders — no 7B / megatron.
-    ``Evo2SAE.__init__`` only records config; ``_load_sae``/``_load_feature_meta`` touch the
-    SAE checkpoint and annotation parquet but never ``bionemo.evo2``, so this stays light.
-    """
+    ``Evo2SAE.__init__`` only records config; ``_load_sae``/``_load_feature_meta`` touch the
+    SAE checkpoint and annotation parquet but never ``bionemo.evo2``, so this stays light.
+
+    Args:
+        args: Parsed command-line arguments containing sae_ckpt_path, layer, device, and
+            feature_annotations paths.
+
+    Returns:
+        A tuple of (sae_module, n_features, label_dict) where sae_module is the loaded SAE
+        model, n_features is the SAE's feature count, and label_dict maps feature IDs to labels.
+    """
As per coding guidelines: "Use Google-style docstrings (pydocstyle convention) in Python code" (applies to **/*.py).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/scripts/dashboard.py`
around lines 60 - 402, Several top-level functions (e.g., _add_sae_args,
parse_args, _load_sae_only, _write_label_columns, _iter_sampled_activations,
_compute_layout, run_atlas, _pass1_max_acts, _pass2_examples, run_examples,
main) have short or non-Google-style docstrings; update each to a full
Google-style docstring including a one-line summary plus Args: (describe
parameters and types), Returns: (describe return values and types) and Raises:
(list raised exceptions) where applicable, preserving existing descriptions and
mentioning side effects (I/O, file writes, device movement) so they comply with
pydocstyle conventions.
Source: Coding guidelines
60-402: 🏗️ Heavy lift
Add type hints to comply with Pyright guideline.
The coding guidelines require "Use Pyright for type checking in Python files following pyproject.toml configuration," but this file contains no type annotations on function parameters or return values. Adding type hints improves IDE support, catches type errors early, and documents expected types.
Example:
📝 Type hint examples for key functions
-def _add_sae_args(p):
+def _add_sae_args(p: argparse.ArgumentParser) -> None:
     """Args common to both modes: the SAE, labels, layer, device, output, UMAP knobs."""
-def parse_args():
+def parse_args() -> argparse.Namespace:
     """Parse the `atlas` / `examples` subcommand and its options."""
-def _load_sae_only(args):
+def _load_sae_only(args: argparse.Namespace) -> tuple[torch.nn.Module, int, dict[int, str]]:
     """Load just the SAE (+ labels) by reusing the engine's loaders — no 7B / megatron.
As per coding guidelines: "Use Pyright for type checking in Python files following pyproject.toml configuration" (applies to **/*.py).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/scripts/dashboard.py`
around lines 60 - 402, Add explicit Pyright-friendly type hints to the functions
in this file: annotate _add_sae_args(p: argparse.ArgumentParser) -> None;
parse_args() -> argparse.Namespace; _load_sae_only(args) ->
Tuple[torch.nn.Module, int, Dict[int, str]]; _write_label_columns(n_features:
int, labels: Mapping[int, str]) -> Tuple[List[int], List[str]];
_iter_sampled_activations(shards: Sequence[Path], sample_tokens: int,
batch_size: int) -> Iterator[torch.Tensor]; _compute_layout(sae, args) ->
Tuple[np.ndarray, np.ndarray, str]; run_atlas(args) -> None;
_pass1_max_acts(eng, seqs: Sequence[str], tag: str, tag_len: int, batch_size:
int) -> torch.Tensor; _pass2_examples(eng, seqs: Sequence[str], ids:
Sequence[str], tag: str, tag_len: int, top_idx: torch.Tensor, peak:
torch.Tensor, labels: Mapping[int, str], max_example_bp: int, batch_size: int)
-> List[Dict[str, Any]]; run_examples(args) -> None; main() -> None. Import
required typing names (Iterator, Tuple, List, Dict, Mapping, Sequence, Any) and
consider adding "from __future__ import annotations" to keep runtime cost low;
update signatures only (no logic changes).
Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/scripts/dashboard.py`:
- Around line 162-167: The try/except around importing umap currently catches
all exceptions and masks real errors; change the broad except Exception to
except ImportError so only a missing dependency switches method to "pca" while
other import errors still surface; update the try block that imports umap and
sets method = "umap" to only fall back to method = "pca" on ImportError.
---
Duplicate comments:
In
`@bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/GenerativeSteering.jsx`:
- Around line 39-59: The generate handler must validate nTokens before sending:
parse nTokens into a numeric integer (e.g., const tokens = Number(nTokens) or
parseInt(nTokens, 10)), verify Number.isInteger(tokens) && tokens >= 1, and if
invalid call setError(...) and early-return (and setBusy(false) if needed) to
prevent submission; then use that validated tokens value for the request body as
n_tokens. Update the generate function (references: generate, nTokens,
setNTokens, setError, setBusy, setResult, postJSON) to perform this check and
avoid sending Number('') -> 0.
---
Nitpick comments:
In
`@bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/scripts/dashboard.py`:
- Around line 60-402: Several top-level functions (e.g., _add_sae_args,
parse_args, _load_sae_only, _write_label_columns, _iter_sampled_activations,
_compute_layout, run_atlas, _pass1_max_acts, _pass2_examples, run_examples,
main) have short or non-Google-style docstrings; update each to a full
Google-style docstring including a one-line summary plus Args: (describe
parameters and types), Returns: (describe return values and types) and Raises:
(list raised exceptions) where applicable, preserving existing descriptions and
mentioning side effects (I/O, file writes, device movement) so they comply with
pydocstyle conventions.
- Around line 60-402: Add explicit Pyright-friendly type hints to the functions
in this file: annotate _add_sae_args(p: argparse.ArgumentParser) -> None;
parse_args() -> argparse.Namespace; _load_sae_only(args) ->
Tuple[torch.nn.Module, int, Dict[int, str]]; _write_label_columns(n_features:
int, labels: Mapping[int, str]) -> Tuple[List[int], List[str]];
_iter_sampled_activations(shards: Sequence[Path], sample_tokens: int,
batch_size: int) -> Iterator[torch.Tensor]; _compute_layout(sae, args) ->
Tuple[np.ndarray, np.ndarray, str]; run_atlas(args) -> None;
_pass1_max_acts(eng, seqs: Sequence[str], tag: str, tag_len: int, batch_size:
int) -> torch.Tensor; _pass2_examples(eng, seqs: Sequence[str], ids:
Sequence[str], tag: str, tag_len: int, top_idx: torch.Tensor, peak:
torch.Tensor, labels: Mapping[int, str], max_example_bp: int, batch_size: int)
-> List[Dict[str, Any]]; run_examples(args) -> None; main() -> None. Import
required typing names (Iterator, Tuple, List, Dict, Mapping, Sequence, Any) and
consider adding "from __future__ import annotations" to keep runtime cost low;
update signatures only (no logic changes).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: c9062082-12bd-4acb-872a-6b64eb650205
📒 Files selected for processing (12)
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/index.html
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/package.json
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/App.jsx
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/EmbeddingView.jsx
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/GenerativeSteering.jsx
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/SequenceInspector.jsx
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/utils.js
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/pyproject.toml
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/scripts/dashboard.py
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/scripts/launch_dashboard.py
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/src/evo2_sae/server.py
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/tests/test_server.py
🚧 Files skipped from review as they are similar to previous changes (8)
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/utils.js
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/package.json
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/src/evo2_sae/server.py
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/tests/test_server.py
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/scripts/launch_dashboard.py
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/SequenceInspector.jsx
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/EmbeddingView.jsx
- bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/feature_explorer/src/App.jsx
| try: | ||
| import umap # noqa: F401 | ||
| method = "umap" | ||
| except Exception: | ||
| method = "pca" |
Narrow the exception handler to ImportError.
The broad except Exception: on line 166 will catch all exceptions during the umap import attempt, potentially masking real errors (e.g., syntax errors, attribute errors, or compatibility issues in the umap module itself) rather than just a missing dependency. This can hide failures that should surface during development or deployment.
🔒 Proposed fix
if method == "auto":
try:
import umap # noqa: F401
method = "umap"
- except Exception:
+ except ImportError:
method = "pca"
print("[atlas] umap-learn unavailable (NumPy/numba) — falling back to --layout pca")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/scripts/dashboard.py`
around lines 162 - 167, The try/except around importing umap currently catches
all exceptions and masks real errors; change the broad except Exception to
except ImportError so only a missing dependency switches method to "pca" while
other import errors still surface; update the try block that imports umap and
sets method = "umap" to only fall back to method = "pca" on ImportError.
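The narrowed handler can be sketched in isolation. This is a minimal illustration, assuming a hypothetical helper `resolve_layout` and a `probe_module` parameter added only so the fallback can be exercised without `umap-learn` installed; `dashboard.py` itself imports `umap` inline.

```python
import importlib


def resolve_layout(method: str = "auto", probe_module: str = "umap") -> str:
    """Pick a layout method, falling back to PCA only when the import is missing.

    Only ImportError (which includes ModuleNotFoundError) triggers the fallback;
    any other exception raised while importing propagates to the caller.
    """
    if method != "auto":
        return method
    try:
        importlib.import_module(probe_module)
    except ImportError:
        return "pca"
    return "umap"
```

An explicit `--layout` choice passes through untouched, so the fallback applies only to the `auto` case.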
|
This used to be static, right? Why is it on a live backend now? Is that optional? We mentioned rolling the Atlas out into its own separate library; would this support that, or require rewriting the setup of the other Atlas? (Also, can you add some screenshots? :) ) |
|
@jwilber There is a visualization demo here: https://5176-shzs9ka8n.brevlab.com/ |
Force-pushed from a47088c to 8325741.
Force-pushed from a819d98 to dc46ad5.
#1637 was re-landed on the migrated #1622 (new top-level layout); #1623 is stacked on it, so it's layered onto the new #1637 by hand (the old dashboard branch predates the #1637 hardening, so a whole-file diff would revert it).
- Clean adds: feature_explorer/ (the React dashboard, incl. the gene_embed firing-column reduction, request timeouts, per-pane descriptions, shared components.jsx, crash-safe localStorage), scripts/{dashboard,launch_dashboard}.py, tests/test_launch_dashboard.py.
- server.py: add GeneEmbedRequest + /gene_embed (uses core.clean_dna + _require_ready).
- core.py: add Evo2SAE.embed_bundle (ships only firing columns + feature_ids, not the full 65536-wide matrix).
- tests: gene_embed contract test in test_server.py; bind embed_bundle on the conftest FakeEngine.
- pyproject: add scikit-learn (scripts/dashboard.py atlas PCA/TSNE).
Validated in the evo2_megatron venv: CPU 44 passed; frontend `vite build` clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Polina Binder <pbinder@nvidia.com>
Force-pushed from 8325741 to b7843e3.
Move the API routes under an /api prefix (one APIRouter + include_router) and, when a built frontend is configured (build_app(static_dir=...) or DASHBOARD_DIST env), mount it at / via StaticFiles(html=True). This lets a single container serve both the dashboard and the API on one origin: the frontend always calls /api/* (in dev via the Vite proxy, in prod from the same server).
The static mount is generic: it serves whatever dir it's pointed at and knows nothing about the dashboard; the dashboard recipe (#1623) supplies the dir + the Docker build. With no frontend configured the server is API-only and / 404s (never crashes). Startup already tolerates a load failure (stays not-ready -> 503), so a frontend+API smoke needs no GPU/checkpoints.
Tests: re-point existing contract tests to /api/*, add SPA-index/asset served, API-reachable-under-prefix, unknown-/api-is-404-not-SPA, and API-only-when-no-frontend.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Polina Binder <pbinder@nvidia.com>
Force-pushed from b7843e3 to dbd1792.
no bionemo-recipes/ prefix) and squash-replayed, so #1637 is layered on by hand rather than rebased.
- Clean adds: src/evo2_sae/{server,cli}.py, scripts/launch_inference.sh, tests/test_{cli,server}.py.
- pyproject: add pandas/fastapi/uvicorn/anyio.
- tests/conftest.py: keep #1622's 1B GPU fixtures + bionemo.common loader; append the serve-layer FakeEngine + fake_engine fixture.
- core.py (semantic merge): keep #1622's _sanitize_steering (all CPU sanitize tests) and fold in the explicit non-finite-strength guard (no min/max arg-order reliance); add the shared annotate() + parse_clamp_spec() (CLI strings ⇄ API dicts) and feed parse_clamp_spec in front of _sanitize_steering; add _is_unrecoverable_cuda + flip the engine not-ready on an unrecoverable CUDA fault in generate(). (Kept _sanitize_steering rather than swapping in _normalize_clamps: non-redundant and preserves #1622's sampler hardening + tests.)
- test_steering.py: keep #1622's sanitize + GPU tests; add the _is_unrecoverable_cuda test.
Preserved from #1622: clamp_hook canonical encode/decode, TopKSAE-only _load_sae (no ReLU), bionemo.common(/core fallback) loader.
Validated in the evo2_megatron venv: CPU 38 passed, GPU test_steering 13 passed on the 1B (ran, not skipped).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Polina Binder <pbinder@nvidia.com>
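The commit above adds `parse_clamp_spec()` to convert CLI strings into API dicts, together with a non-finite-strength guard. The following is a minimal sketch of that idea, assuming the `ID[:STRENGTH]` string form and the `{feature_id, strength}` dict shape that appear elsewhere in this thread. The function name `parse_clamp_string` and the default strength of 1.0 are hypothetical and not taken from `core.py`.

```python
import math


def parse_clamp_string(spec: str, default_strength: float = 1.0) -> dict:
    """Parse "ID[:STRENGTH]" into {"feature_id": int, "strength": float}.

    The default strength is an assumption for this sketch. NaN and infinite
    strengths are rejected explicitly instead of relying on min/max clamping.
    """
    feature_part, sep, strength_part = spec.partition(":")
    feature_id = int(feature_part)  # raises ValueError on a non-integer ID
    strength = float(strength_part) if sep else default_strength
    if not math.isfinite(strength):
        raise ValueError(f"non-finite clamp strength in {spec!r}")
    return {"feature_id": feature_id, "strength": strength}
```

Emitting the same dict shape from the CLI path means a single downstream sanitizer can validate both surfaces.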
… int port, fake shape
1. /annotate pick mode now range-checks user-supplied feature_ids -> 400 (was: out-of-range
IndexError -> 500, negative id silently indexed the wrong feature via torch negative-index).
+ test_annotate_pick_rejects_out_of_range_id.
2. core.generate rejects an over-context prompt ("too long" -> server 413), instead of letting
tokenize() silently truncate it — makes the /generate 413 branch live and matches /annotate.
+ test_generate_rejects_overlong_prompt.
3. cli.py: int() the env-var defaults (PORT/EMBEDDING_LAYER/MAX_SEQ_LEN) — argparse type= only
coerces command-line values, so `serve` was handing uvicorn a str port.
4. conftest FakeEngine.generate now returns features keyed {id, label, strength} (the real
feat_meta shape the dashboard consumes), not {feature_id, strength}; test_cli updated so the
contract test pins the real API shape.
5. Note body-size limit is advisory (Content-Length only; chunked/lying bypasses).
6. Note the CUDA-wedge guard depends on a readiness-based recycler (else 503 until manual restart).
Validated in the evo2_megatron venv: CPU 40 passed (was 38), GPU unaffected.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Polina Binder <pbinder@nvidia.com>
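Item 1 above (range-checking user-supplied feature IDs) can be sketched as a small pre-check. This is an illustration only: the helper name `validate_feature_ids` is hypothetical, and mapping the `ValueError` to an HTTP 400 is assumed to happen in the server layer.

```python
def validate_feature_ids(feature_ids: list[int], n_features: int) -> list[int]:
    """Reject IDs outside [0, n_features) before they are used as tensor indices.

    Negative IDs are rejected explicitly: tensor indexing would otherwise accept
    them and silently select a different feature from the end of the axis.
    """
    bad = [fid for fid in feature_ids if not 0 <= fid < n_features]
    if bad:
        raise ValueError(f"feature_ids out of range [0, {n_features}): {bad}")
    return list(feature_ids)
```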
…art loop)
A device-side assert poisons the process's CUDA context (unrecoverable in-process), so ready=False alone only recovers under a readiness-based recycler. Add restart-on-exit recovery, which almost every host provides:
- core.generate: on an unrecoverable CUDA fault, if EXIT_ON_CUDA_WEDGE=1, os._exit(1) the worker (after ready=False). Default unset -> just fail-closed at 503 (safe for library/CLI/test use).
- launch_inference.sh: for `serve`, export EXIT_ON_CUDA_WEDGE=1 and wrap in a restart loop (respawn on crash/wedge exit; stop on clean exit / Ctrl-C 130 / SIGTERM 143).
Recovery now works with no external orchestrator (and composes with docker --restart / systemd / k8s).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Polina Binder <pbinder@nvidia.com>
… 413 test
- launch_inference.sh: stop managing the venv; assume it's already active (Docker: on PATH; bare metal: source the evo2_megatron .venv first, like the tests). Drops the messy VENV= passing; adds a clear "bionemo.evo2 not importable" preflight.
- Restart loop signal fix (was a graceful-shutdown regression): run the worker in the background and `wait`, with a trap that forwards SIGTERM/SIGINT to it (uvicorn graceful shutdown) and stops the loop, so `docker stop`/k8s on PID 1 no longer orphans the worker. Adds a 10-restart cap + backoff so a persistent crash (e.g. port already bound) doesn't loop forever. Smoke-tested: SIGTERM stops in ~1s, not the worker's full lifetime.
- /generate 413 now pinned at the server layer: FakeEngine raises "too long" past max_seq_len and test_generate_rejects_too_long drives POST /generate -> 413 (was only covered via test_core).
- Reframe the CUDA-wedge comment: it's PURELY DEFENSIVE. _sanitize_steering neutralizes every client-reachable assert trigger, so a wedge implies a hardware/driver fault, not a crafted request (exit+restart is not a remote DoS). New triggers must extend _sanitize_steering.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Polina Binder <pbinder@nvidia.com>
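The restart policy described in the two commits above lives in a shell script. The decision logic can be sketched in Python for illustration. The stop codes (0, 130, 143) and the 10-restart cap come from the commit messages; the exponential backoff shape and its 1 s base / 30 s ceiling are assumptions, since the messages say only "backoff".

```python
STOP_CODES = frozenset({0, 130, 143})  # clean exit, Ctrl-C (128+2), SIGTERM (128+15)
MAX_RESTARTS = 10


def should_restart(exit_code: int, restarts_so_far: int,
                   max_restarts: int = MAX_RESTARTS) -> bool:
    """Respawn after a crash or wedge exit, unless the cap has been reached."""
    if exit_code in STOP_CODES:
        return False
    return restarts_so_far < max_restarts


def backoff_seconds(restarts_so_far: int, base: float = 1.0,
                    ceiling: float = 30.0) -> float:
    """Hypothetical exponential backoff; the real script's schedule is not stated."""
    return min(ceiling, base * 2 ** restarts_so_far)
```

The cap is what keeps a persistent failure, such as a port that is already bound, from looping forever.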
Force-pushed from cdda2f7 to 28e49be.
…o image)
Make the dashboard + API deployable as ONE container so a public-repo user (or coworker) can
launch it with a single 'docker run', no Node and no second process:
* Dockerfile: add a Node build stage that compiles feature_explorer -> static dist/, COPY it
into the runtime image, and set DASHBOARD_DIST so server.py serves it at / (Node lives only
in the build stage, never in the runtime GPU image).
* vite.config.js: drop the /api rewrite — proxy /api straight through to :8001/api. Dev now hits
the same paths as production (the server serves the API under /api), so there's no dev/prod drift.
* move the /gene_embed endpoint onto the /api router (rebased onto the new server.py); frontend
already calls ${BACKEND}/gene_embed (= /api/gene_embed), so no frontend code change.
* re-point the gene_embed contract test to /api/gene_embed.
Dev mode (vite + launch_dashboard.py) is unchanged for UI iteration; the baked-in static build is
the deploy path. Frontend compile is exercised by the image build.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Polina Binder <pbinder@nvidia.com>
…ting
Two related transparency fixes:
* Each tab now shows a collapsible 'Limitations' disclosure under its description, with caveats grounded in real behavior (atlas is a static snapshot + 2-D UMAP; steering clamp capped ±300 and only affects the continuation; inspector reads one layer and rejects over-length input; sequence UMAP is stochastic 2-D, mean/max pooled, min_firing-filtered, 1000-seq cap).
* /gene_embed no longer silently truncates/drops: sequences past the 1000 cap, shorter than 3 nt, or longer than the context limit are skipped (not truncated into a misleading vector) and the counts are returned (n_received/n_skipped_short/n_skipped_too_long/n_dropped_over_cap/max_seq_len/max_genes). The Sequence-UMAP tab surfaces a warning banner; an all-invalid request 400s with an accounting message. Matches /annotate's reject-don't-truncate behavior.
Tests: assert the new accounting fields, that too-short/over-length sequences are reported (not embedded), and that an all-invalid request 400s.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Polina Binder <pbinder@nvidia.com>
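The skip-and-count behavior described for /gene_embed can be sketched as a pure function. The field names follow the commit message. The helper name `triage_sequences`, applying the cap before the length checks, and raising `ValueError` for the all-invalid case (which the server would turn into a 400) are assumptions made for illustration.

```python
def triage_sequences(seqs: list[str], max_seq_len: int,
                     max_genes: int = 1000, min_len: int = 3):
    """Split input into sequences to embed plus an accounting report.

    Nothing is truncated: sequences are either kept whole or counted as skipped.
    """
    capped = seqs[:max_genes]  # assumed order: apply the cap first
    kept, n_short, n_long = [], 0, 0
    for seq in capped:
        if len(seq) < min_len:
            n_short += 1
        elif len(seq) > max_seq_len:
            n_long += 1
        else:
            kept.append(seq)
    report = {
        "n_received": len(seqs),
        "n_skipped_short": n_short,
        "n_skipped_too_long": n_long,
        "n_dropped_over_cap": len(seqs) - len(capped),
        "max_seq_len": max_seq_len,
        "max_genes": max_genes,
    }
    if not kept:
        raise ValueError(f"no valid sequences to embed: {report}")
    return kept, report
```

Returning the limits alongside the counts lets a client build its warning message from the response alone.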
… modes
Document the actual ways to run now that the API is under /api and the frontend can be baked into the image:
(1) single container (docker run, UI + /api on one port, no Node at runtime) for sharing;
(2) local dev (Vite + backend, Node >=18, tsh tunnel of :5176, /api proxied straight through with no rewrite);
(3) offline/static (precompute artifacts, atlas always + sequence-UMAP from a bundle).
Adds a tab summary and notes the in-app per-tab Limitations. Touches only this PR's README.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Polina Binder <pbinder@nvidia.com>
…red helper
App.jsx read the featureTitle_<id> localStorage key directly in 5 places; route them through components.userLabel (single source of truth for the in-UI rename convention, and crash-safe via its try/catch). No behavior change. Build verified (npm run build green).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Polina Binder <pbinder@nvidia.com>
…P banner reads
The Sequence-UMAP warning banner interpolates max_genes and max_seq_len from the response, but the contract test only asserted the four counts. Assert those two fields too so the UI message can't silently break on a field rename.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Polina Binder <pbinder@nvidia.com>
Add tests for the error branches that lacked coverage: /api/annotate and /api/gene_embed reject an unknown organism with 400 (not a 500 from a None tag), and build_app with a bogus static_dir degrades to API-only (/ 404s, /api still serves) instead of crashing at mount time.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Polina Binder <pbinder@nvidia.com>
…ngine-only
The multi-stage Dockerfile previously always ran the Node front-end build, coupling the SAE engine
image to the dashboard toolchain (an SAE-only user paid the npm build and a broken/networkless
front-end blocked the image build). Restructure into engine / engine-with-dashboard stages selected
by --build-arg WITH_DASHBOARD (default 0): the default build is engine + server only and never pulls
Node or builds the front-end; WITH_DASHBOARD=1 also bakes the static dashboard in and sets
DASHBOARD_DIST so one container serves UI + API. No runtime change either way. README updated.
Note: not build-validated here (no docker in this env); the FROM final-${WITH_DASHBOARD} stage
selection is a standard BuildKit idiom.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Polina Binder <pbinder@nvidia.com>
The atlas tab loads its three parquets from the served dir; they aren't baked in (per-SAE generated data). Document that they just need to land in $DASHBOARD_DIST — generate them in-container via dashboard.py --output-dir $DASHBOARD_DIST, or cp a pre-made set in before serve. No code/staging mechanism needed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Polina Binder <pbinder@nvidia.com>
ba7434b to
6919590
## Summary

FastAPI **server + CLI** over the Evo2SAE engine (#1622). Thin wrappers (all model work lives in `core.py`) plus the input validation, resource governance, and recovery needed for a shared backend (runs behind NVIDIA SSO on Brev, reachable by many users). API routes live under **`/api`**, and the server can mount a prebuilt front-end at `/`, so the dashboard (#1623) and the API can be served from **one origin / one container**.

**Rebased onto the single-engine #1622** (one inference engine serves both encode and generate; new top-level layout `interpretability/sparse_autoencoders/…`).

## Contents (new layout)

- `…/src/evo2_sae/server.py`: `/api/health`, `/api/features`, `/api/annotate`, `/api/generate` (+ optional static-frontend mount at `/`)
- `…/src/evo2_sae/cli.py`: `serve` / `encode` / `batch` / `generate`
- `…/scripts/launch_inference.sh`; CPU contract tests `tests/test_cli.py`, `tests/test_server.py` + the shared `FakeEngine` appended to #1622's `tests/conftest.py`

## Shared logic (CLI ⇄ server live in `core`)

- **`core.annotate(engine, …)`**: clean → resolve-tag → encode → tag-len, behind both CLI `encode` and server `/api/annotate`.
- **`core.parse_clamp_spec(spec)`**: one parser for clamps as CLI `"ID[:STRENGTH]"` strings or server `FeatureClamp` JSON; fed in front of #1622's `_sanitize_steering` so both surfaces validate identically.

## Single-origin serving (`/api` + optional static mount)

- API routes are grouped under **`/api`** (one `APIRouter` + `include_router`).
- `build_app(engine, static_dir=None)` mounts a prebuilt front-end at `/` via `StaticFiles(html=True)` when `static_dir` (or the `DASHBOARD_DIST` env) points at a real directory; otherwise the server is **API-only** and `/` 404s (never crashes). The mount is generic: it serves whatever dir it's pointed at and knows nothing about the dashboard; #1623 supplies the dir + the Docker build that produces it.
- This is what lets a single container serve UI + API on one port. Dev hits the same `/api/*` paths (the Vite proxy forwards `/api` without rewriting), so there's no dev/prod path drift.

## Reliability & governance

- **`/api/health` 503 until ready** so readiness probes don't route to a still-loading pod; a startup load failure is caught and leaves the engine not-ready (503) rather than crashing.
- **Length limits**: `/api/annotate` and `/api/generate` reject input longer than `max_seq_len` (**413**) instead of silently truncating (which would misalign the per-base `activations`/`bases` the viz plots). Generation length is otherwise auto-capped to the remaining context (no fixed token cap).
- **Pick-id validation**: `/api/annotate` `mode=pick` range-checks user-supplied `feature_ids` → **400** (an out-of-range id would otherwise 500 on `IndexError`, a negative one would silently return the wrong feature).
- **Steering sanitation**: out-of-range ids, extreme/non-finite strengths, `temperature<=0`, negative `top_k` are all rejected/coerced before the GPU (`_sanitize_steering`).
- **CUDA-wedge recovery**: a device-side assert poisons the process's CUDA context (unrecoverable in-process). Not client-inducible (sanitation covers the reachable triggers; this is purely defensive), but if it happens `generate()` flips the engine not-ready (→ 503) and, when `EXIT_ON_CUDA_WEDGE=1` (set by `serve`), exits the worker so any restart-on-exit supervisor respawns it (host-independent recovery).
- **Signal-safe serve**: `launch_inference.sh serve` runs the worker in the background, forwards `SIGTERM`/`SIGINT` (uvicorn graceful shutdown) before respawning, with a retry cap + backoff, so `docker stop`/k8s shuts down cleanly instead of orphaning the worker.
- **Request body-size limit** (`MAX_BODY_BYTES`, default 16 MiB) → 413; advisory (trusts `Content-Length`).
- **Bounded concurrency**: Starlette's sync-endpoint threadpool capped (`MAX_CONCURRENCY`, default 8); the engine lock already serializes the single GPU.
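As an illustration of the steering-sanitation bullet, the sketch below parses a CLI-style `"ID[:STRENGTH]"` clamp and validates it before anything would reach the GPU. It is not `core.parse_clamp_spec` or `_sanitize_steering`: the function names, the default strength of 1.0, and the choice to coerce over-large finite strengths to the ±300 cap (rather than reject them) are all assumptions made for the example.

```python
import math

MAX_STRENGTH = 300.0  # the +/-300 steering cap quoted in the dashboard's Limitations
N_FEATURES = 65_536   # SAE width quoted elsewhere in this PR


def parse_clamp(spec: str, default_strength: float = 1.0):
    """Parse "ID[:STRENGTH]" into (feature_id, strength).

    Hypothetical stand-in; the default strength is an assumption.
    """
    head, sep, tail = spec.strip().partition(":")
    try:
        feature_id = int(head)
        strength = float(tail) if sep else default_strength
    except ValueError:
        raise ValueError(f"bad clamp spec {spec!r}; expected ID[:STRENGTH]") from None
    return feature_id, strength


def sanitize_clamp(feature_id: int, strength: float, n_features: int = N_FEATURES):
    """Reject ids and strengths that must never reach the GPU; coerce the rest."""
    if not 0 <= feature_id < n_features:
        raise ValueError(f"feature id {feature_id} outside [0, {n_features})")
    if not math.isfinite(strength):
        raise ValueError("strength must be finite")
    # Assumption: over-large finite strengths are coerced to the cap.
    return feature_id, max(-MAX_STRENGTH, min(MAX_STRENGTH, strength))
```

For example, `sanitize_clamp(*parse_clamp("29244:300"))` passes through unchanged, while a negative id or a `nan` strength raises before any model call. Running one parser in front of one sanitizer is what keeps the CLI and the HTTP surface from validating differently.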
## Architectural decisions

- **Two layers: engine vs. surface.** All model work stays in `core.Evo2SAE` (#1622); `server.py`/`cli.py` are thin and share `core.annotate` + `core.parse_clamp_spec`, so the HTTP API and the CLI can't drift and there's one validated path.
- **FastAPI, not raw/Flask.** We get pydantic structural validation + an async threadpool we can bound (`MAX_CONCURRENCY`) for almost no code; the domain validation that matters (`_sanitize_steering`, pick-id range) is manual either way. Raw Python would hand-roll routing/validation/concurrency; Flask would add the threadpool governance by hand.
- **No app-level auth.** Deployed behind NVIDIA SSO on Brev; auth is the proxy's job, not duplicated here (CORS removed too, since calls are same-origin).
- **Single GPU, serialized.** The engine lock + bounded threadpool match one GPU; data-parallel replicas behind a balancer are a deferred follow-up (touches no engine code).
- **`/api` prefix + generic static mount** (above) so one origin/container can serve both UI and API.

## How to run

Run **inside the evo2_megatron venv** (provides `bionemo.evo2` + megatron); in the Docker image it's already active. Full dashboard run modes are in #1623's `feature_explorer/README.md`.

```bash
export EVO2_CKPT_DIR=<mbridge> SAE_CKPT_PATH=<sae.pt>
export FEATURE_ANNOTATIONS=<feature_metadata.parquet> EMBEDDING_LAYER=26
scripts/launch_inference.sh serve                                         # API on :8001 (+ UI at / if DASHBOARD_DIST set)
scripts/launch_inference.sh encode --sequence ATGC...                     # one sequence -> top features (JSON)
scripts/launch_inference.sh batch --fasta in.fa --out out.parquet         # many -> parquet
scripts/launch_inference.sh generate --prompt ATGC... --clamp 29244:300   # steered generation
```

Tunables (env): `MAX_BODY_BYTES`, `MAX_CONCURRENCY`, `MAX_SEQ_LEN`, `PORT`, `EXIT_ON_CUDA_WEDGE`, `DASHBOARD_DIST`.

## Tests

No dedicated CI lane (deferred; see #1622). Run them via the recipe's build script:

```bash
cd interpretability/sparse_autoencoders/recipes/evo2
bash .ci_build.sh && source .ci_test_env.sh
pytest tests/
```

- **CPU (no model):** `test_cli.py` + `test_server.py` (FastAPI `TestClient` + `FakeEngine`: response shapes, 400/413/503, pick out-of-range → 400, `/api/generate` too-long → 413, body-size, k-bounds, clamp validation, **static-frontend mount: SPA at `/`, asset served, API reachable under `/api`, unknown `/api/*` → 404, API-only when no frontend**), plus #1622's `test_core.py` + `test_steering.py` sanitize guards.
- **GPU:** `test_steering.py`: encode, in-distribution generation, steering changes the continuation, batched/empty encode, max-clamp finite, **highlight↔steer interleaving** (single-engine state-bleed check). Gated by `@pytest.mark.skipif(not torch.cuda.is_available())`; runs on a GPU box, skips otherwise.

Validated on the 1B; the single-engine backend also serves the **7B at layer 26** live.

## Deferred follow-up

Multi-GPU **data-parallel replicas** (one worker per GPU behind a `least_conn` balancer) for concurrent throughput (touches no engine code); left until concurrency is an observed need.

**Stacked on #1622.** The dashboard (#1623) builds on this.

---------

Signed-off-by: Polina Binder <pbinder@nvidia.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Re-add .github/workflows/unit-tests-interpretability-recipes.yaml (removed in c31d4e9 "defer CI for now"), recovered verbatim from 9bedf2b. Path-gated GPU lane (L4 + megatron squashed image) that builds the evo2 SAE recipe via .ci_build.sh and runs pytest tests/ — including the GPU steering/encode tests. Rooted on this PR (#1622), so the stacked SAE PRs (#1623/#1629/#1635) inherit it. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
vite.config.js binds 0.0.0.0 for remote access but omitted server.allowedHosts,
so reaching the dev server via a public hostname (brev/ngrok/Codespaces) instead
of an ssh -L localhost tunnel hit Vite's Host check ('host not allowed'). Pre-allow
the common tunnel domains and document it in the README.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: polinabinder1 <pbinder@nvidia.com>

# Conflicts:
#	interpretability/sparse_autoencoders/recipes/evo2/pyproject.toml
#	interpretability/sparse_autoencoders/recipes/evo2/src/evo2_sae/server.py
#	interpretability/sparse_autoencoders/recipes/evo2/tests/conftest.py
#	interpretability/sparse_autoencoders/recipes/evo2/tests/test_server.py
Signed-off-by: polinabinder1 <pbinder@nvidia.com>
## Summary

React feature-explorer dashboard (4 tabs) over the Evo2SAE backend (#1622), plus the packaging to ship it: the front-end can be built to static files baked into the recipe image, so one container serves the dashboard and the API on a single port. Runnable end-to-end.

Tabs: Feature atlas · Sequence inspector (`/api/annotate`) · Generative steering (`/api/generate`) · Sequence UMAP (`/api/gene_embed`).

## How to run

Full instructions (three modes: single container, local dev, offline/static) live in `feature_explorer/README.md`.

## What this PR adds
- Dashboard (`feature_explorer/`, React + Vite): 4 tabs, dark/light, search, CSV export, KaTeX steering equation, crash-safe localStorage, shared widgets in `components.jsx`.
- `--build-arg WITH_DASHBOARD=1` adds a multi-stage Node build that compiles the front-end to static `dist/` (Node only in the build stage) and bakes it in; `server.py` mounts it at `/` via `DASHBOARD_DIST`. One `docker run`, no Node at runtime, no second process. The default build is engine + server only: it never pulls Node or builds the front-end, so an SAE-only image isn't coupled to the dashboard toolchain.
- `/gene_embed` endpoint + `Evo2SAE.embed_bundle`: pool sequences into per-feature vectors for the Sequence-UMAP tab; shared with the offline precompute so the static bundle matches the live response.
- Per-tab Limitations disclosure (atlas is a static snapshot + 2-D UMAP; ±300 steering clamp, continuation-only; single-layer encode; stochastic 2-D UMAP, pooled, `min_firing`-filtered, 1000-seq cap). `/gene_embed` skips (not truncates) over-length/too-short sequences and anything past the 1000 cap, reports the counts (`n_received`/`n_skipped_short`/`n_skipped_too_long`/`n_dropped_over_cap`/`max_seq_len`/`max_genes`), and the tab shows a warning banner; all-invalid → 400 with an accounting message. Matches `/annotate`'s reject-don't-truncate.
- … `/api/health`; with no backend, the Feature atlas (static parquets) and Sequence UMAP (precomputed `sequmap_embeddings.json`) work, steering/inspector hide. Precompute via `scripts/dashboard.py atlas|examples|embeddings`; `npm run build` → a fully static site.

## Architectural decisions
- … `evo2_sae` backend (#1622). Keeps one validated inference path; the UI can't drift from the engine.
- API under `/api`, frontend on `/`, so a single origin serves both. The front-end always calls `/api/*`; in dev Vite proxies it through without a path rewrite, so dev and the single-container build hit identical paths (no dev/prod drift). (`/api` prefix + static mount are in the base, #1622.)
- … dependency, so gating it behind `WITH_DASHBOARD` keeps the SAE engine image buildable without npm network access or a working front-end, and an SAE-only user pays nothing for the viz. Node never reaches the runtime image regardless.
- `embed_bundle` is a method on `Evo2SAE` (in `core.py`): it needs `encode_batch` + the SAE, and living in one place keeps the live `/gene_embed` response and the offline precompute identical. It's dashboard-only (nothing else in #1622 uses it), so it ships with this PR.
- … partial layout, consistent with the backend's reject-don't-truncate stance.
- `/gene_embed` wire format: one encode per sequence yields both mean- and max-pooled vectors (client toggles pooling without re-running the model), and the response ships only features firing in ≥ `min_firing` sequences (65,536 → `n_firing`, with a remap to real feature ids) as a base64 float32 `[n_genes, n_firing]` matrix. Shipping only the firing set is sound because TopK/ReLU codes are ≥ 0, so the firing set is pooling-invariant (mean and max agree on what's nonzero).
- `dashboard.py atlas` builds firing-rates from the cached activation store and the 2-D layout from the SAE decoder geometry, so most dashboard data is generatable with no big model loaded; `examples`/`embeddings` load the 7B but only over a small FASTA. (Splits cheap, model-free precompute from the expensive model-touching kind.)
- `--layout auto` (UMAP → PCA fallback): UMAP gives the best clusters but needs numba (NumPy ≤ 2.3); `auto` falls back to PCA/t-SNE (numba-free) so the atlas runs single-env without forcing a fragile numba/NumPy pin on the whole recipe.
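To make the `/gene_embed` wire-format decision above concrete, here is a stdlib-only sketch of packing a pooled matrix as base64 float32 and of the claim that the firing set does not depend on the pooling. None of these functions exist in the PR; the names are hypothetical, and the byte layout (row-major, little-endian) is an assumption, since the description only says "base64 float32".

```python
import base64
import struct


def encode_matrix(rows):
    """Pack equal-length float rows as base64 float32.

    Assumed layout: row-major, little-endian.
    """
    flat = [v for row in rows for v in row]
    return base64.b64encode(struct.pack(f"<{len(flat)}f", *flat)).decode("ascii")


def decode_matrix(b64, n_genes, n_firing):
    """Inverse of encode_matrix: base64 -> [n_genes][n_firing] floats."""
    raw = base64.b64decode(b64)
    if len(raw) != 4 * n_genes * n_firing:
        raise ValueError("payload size does not match [n_genes, n_firing]")
    flat = struct.unpack(f"<{n_genes * n_firing}f", raw)
    return [list(flat[r * n_firing:(r + 1) * n_firing]) for r in range(n_genes)]


def pool(per_base):
    """[n_positions][n_features] non-negative codes -> (mean, max) vectors."""
    n = len(per_base)
    cols = list(zip(*per_base))
    return [sum(c) / n for c in cols], [max(c) for c in cols]


def firing_ids(pooled, min_firing=1):
    """Feature ids that are nonzero in at least min_firing sequences."""
    n_features = len(pooled[0])
    return [
        j for j in range(n_features)
        if sum(1 for vec in pooled if vec[j] > 0) >= min_firing
    ]


# Two toy sequences, four features; codes are >= 0, as with TopK/ReLU.
seqs = [
    [[0.0, 0.5, 0.0, 2.0], [0.0, 1.5, 0.0, 0.0]],
    [[0.0, 0.0, 0.0, 1.0], [0.0, 0.0, 0.0, 3.0]],
]
means, maxes = zip(*(pool(s) for s in seqs))
ids = firing_ids(means)  # remap: column k of the matrix is feature ids[k]
payload = encode_matrix([[vec[j] for j in ids] for vec in means])
matrix = decode_matrix(payload, n_genes=len(seqs), n_firing=len(ids))
```

With non-negative codes, a feature's mean over positions is positive exactly when its max is, so `firing_ids(means)` and `firing_ids(maxes)` agree. That is why a single column remap can serve both pooled matrices.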
## Running the tests

No dedicated CI lane (deferred; see #1622). Run them via the recipe's build script (`.ci_build.sh`, then `pytest tests/`):

- CPU (no model): `test_server.py` (incl. `/api/gene_embed` decodable matrix + the skip/over-cap accounting + unknown-organism 400) and `test_cli.py` via `FakeEngine`; plus #1622's contract tests. The dashboard front-end is build-validated (`npm run build`), no JS unit suite.
- GPU: `test_steering.py`, gated by `@pytest.mark.skipif(not torch.cuda.is_available())`; runs on a GPU box, skips otherwise.

Stacked on #1622 (which now includes the server + CLI, formerly the separate #1637).