
RFC - UI Architecture Proposal #4032

@gcgoncalves


1 Executive Summary

This RFC proposes rewriting the Context Forge UI with a modern UI library that is better suited to an application of this complexity.

2 Motivation

The Context Forge UI was initially written with HTMX + Alpine.js, a great stack for quick prototyping and for a lightweight UI. Both application interaction and security have been handled by custom JavaScript code. The application has since grown in complexity, and maintaining the custom JS has become harder over time.

This has led to poor security, low AI support, and complex maintenance. Each is broken down in the sections below.

2.1 Poor Security

Handling security with custom code makes the application more prone to vulnerabilities. It might dodge supply-chain risks, but it misses out on the benefits of field-tested, well-audited, community-watched code.

2.2 Low AI support

Having agents work on a file of ~34,000 lines (1.2 MB) wastes tokens and lengthens processing times. The complexity of the file often leads agents to errors (e.g., "Update plugin configuration and defaults" by crivetimihai, Pull Request #3869 in IBM/mcp-context-forge) and to repeated code.

2.3 Complex maintenance

As with agents, it is hard for humans to read, understand, and become familiar with such a large file. Efforts have been made to reduce this complexity, but they have not eliminated the problems described above.

We believe modern libraries such as React, Vue, and Svelte are better equipped to deal with complex apps, allowing for utilities like security middleware, type safety (namely TypeScript), and input schemas.

3 Proposed Implementation

The ultimate goals of our UI should be, in order of relevance:

  1. Security
  2. Speed
  3. Minimal backend changes

Those were our north stars while defining this architecture.

Security

Security is the primary constraint on every decision in this RFC. Two concerns dominate: keeping the dependency tree small enough to audit, and ensuring the running application cannot be used as an attack surface.

Supply chain

Every npm package added to the project is a potential vector for a compromised or malicious transitive dependency. The strategy here is to keep the runtime dependency list short enough that each entry can be individually justified.

The build toolchain — Vite, Tailwind, TypeScript, the test runner, all linters — never reaches the browser. The Node image used to compile assets is discarded in the multi-stage Containerfile; only the compiled static files are copied into the Python image. A compromised build tool cannot inject code into a deployed container without also compromising the CI pipeline itself.

Notably absent from the runtime list: a data-fetching library, a state manager, a date library, an HTTP client, and a packaged UI component library. Each of these would bring its own transitive graph. Native fetch, React context, and Intl cover the same ground with zero additional exposure.

shadcn/ui components are an explicit exception to the "no runtime library" principle. They are vendored: the component source is copied into client/components/ui/ and owned by this repository. Once copied, the component source itself has no runtime npm package and no transitive update path; the only runtime packages it pulls in are the Radix primitives it imports. The trade-off is manual updates in exchange for a frozen, auditable copy.

For more on this decision see Component library below.

| Control | Mechanism |
| --- | --- |
| Known-good lockfile | package-lock.json pins every transitive dep to an exact version |
| Dependency audit in CI | retire scans the installed graph for published CVEs on every build |
| No CDN imports | All scripts and styles are bundled locally; no `<script src="https://…">` |
| Content-hashed filenames | Vite appends a build hash to every output filename; browsers cache by hash, not path |
| Toolchain isolation | Node stage discarded after build; no node_modules in the final image |
| Vendored UI components | shadcn source lives in-repo: no runtime npm dep, no surprise transitive updates |

App hardening

The SPA inherits the backend's existing security posture (CSP, HSTS, SameSite cookies, X-Frame-Options: DENY, and X-Content-Type-Options: nosniff are all set by SecurityHeadersMiddleware). The frontend adds its own layer on top.

Authentication boundary

The JWT is stored in a SameSite=Strict; HttpOnly cookie. It is never placed in localStorage, never appended to a URL as a query parameter, and never exposed to JavaScript running in a third-party frame. The HttpOnly flag means that even if an XSS payload runs in the page, it cannot read the token.

Routing and URL safety

Every URL the app reads or writes passes through validation before use:

| Threat | Control |
| --- | --- |
| Open redirect | ?redirect= validated against a compile-time AUTHENTICATED_PATHS allowlist; any other value is stripped |
| Unknown path | Routes not in the router tree render NotFound; no fallthrough to arbitrary handlers |
| Malicious query params | Per-route Zod schemas declare exactly which params are accepted and in what shape; unknown params are silently dropped by the layout-level empty schema |
| Parameter injection | safeParamValue regex (`/^[a-zA-Z0-9\-_.~%:/]*$/`) applied to all free-text param values; excludes raw URL special characters |
| Path traversal | Paths are matched against the router's static route tree, not interpolated into file paths or API calls |
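
The open-redirect and parameter controls above can be sketched without any library. This is an illustration only: the entries in AUTHENTICATED_PATHS are hypothetical (the RFC names the allowlist but does not list it), and parseSearch is a hand-rolled stand-in for the per-route Zod schemas, not the proposed implementation.

```typescript
// Hypothetical allowlist entries; the RFC does not enumerate the real paths.
const AUTHENTICATED_PATHS = ["/app/gateways", "/app/tools", "/app/users"] as const;
type AuthenticatedPath = (typeof AUTHENTICATED_PATHS)[number];

// Same character set as the safeParamValue regex in the table (slash escaped).
const SAFE_PARAM_VALUE = /^[a-zA-Z0-9\-_.~%:\/]*$/;

/** Returns the redirect target only if it is on the allowlist; otherwise it is stripped. */
function safeRedirect(raw: string | null | undefined): AuthenticatedPath | undefined {
  if (raw == null) return undefined;
  const index = (AUTHENTICATED_PATHS as readonly string[]).indexOf(raw);
  return index === -1 ? undefined : AUTHENTICATED_PATHS[index];
}

/** Returns the value unchanged if every character is in the safe set. */
function safeParamValue(raw: string): string | undefined {
  return SAFE_PARAM_VALUE.test(raw) ? raw : undefined;
}

/** Keeps only declared parameters whose values are safe; everything else is dropped. */
function parseSearch(
  query: Record<string, string | undefined>,
  allowedKeys: readonly string[],
): Record<string, string> {
  const result: Record<string, string> = {};
  for (const key of allowedKeys) {
    const value = query[key];
    if (value !== undefined && safeParamValue(value) !== undefined) {
      result[key] = value;
    }
  }
  return result;
}
```

Because safeRedirect compares against exact allowlist entries, absolute URLs and protocol-relative values such as //evil.example never survive, regardless of how they are encoded.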

XSS

React escapes all interpolated values by default. dangerouslySetInnerHTML is not used anywhere in the codebase outside the DOMPurify-wrapped pattern described under DOMPurify below. User-supplied strings from the API (gateway names, tags, descriptions) are rendered as text nodes, not HTML.

Content-Security-Policy is set by SecurityHeadersMiddleware on the FastAPI side. Because the SPA is same-origin and all scripts are bundled (no eval, no dynamic <script> injection, no CDN), script-src 'self' is the only directive needed for scripts.

DOMPurify

Although React's default escaping covers the vast majority of XSS vectors, there are specific cases where HTML must be rendered rather than escaped — for example, rich-text descriptions returned by the API, Markdown-to-HTML output, or content sourced from an upstream MCP server that is not fully trusted. In these cases, rendering raw HTML directly through dangerouslySetInnerHTML is never acceptable.

DOMPurify is the designated sanitiser for any code path that must render HTML. It strips all elements, attributes, and URI schemes that can execute script before the string is handed to the DOM.

Usage contract:

```tsx
import DOMPurify from 'dompurify';

// Only this pattern is permitted when rendering HTML
<div dangerouslySetInnerHTML={{ __html: DOMPurify.sanitize(richText) }} />
```

All other use of dangerouslySetInnerHTML — without DOMPurify.sanitize wrapping the value — is a lint error and must not be merged.

Supply chain justification: DOMPurify has no transitive runtime dependencies, is independently security-audited (Cure53), and is among the most widely used browser sanitisation libraries. Its inclusion is narrowly scoped: it is called only at the explicit dangerouslySetInnerHTML boundary, not in general rendering. Adding it is explicitly justified in the same way Radix primitives are — a narrow, stable, auditable surface covering a security control that cannot be implemented safely with native browser APIs alone.

Configuration: DOMPurify is called with the default configuration. FORCE_BODY, ALLOWED_TAGS, and ALLOWED_ATTR overrides must be explicitly reviewed and documented at each call site. Calls that relax the default policy (e.g., ADD_TAGS, ADD_ATTR) require a code-review sign-off and a comment explaining why the exception is necessary.

| Threat | Control |
| --- | --- |
| Script injection via innerHTML | DOMPurify.sanitize() strips all script-executing elements and event attributes before DOM insertion |
| javascript: URI in href/src | DOMPurify removes javascript: URIs by default; data: URIs are removed except on the small set of media elements (such as img) where the defaults permit them |
| Mutation XSS (mXSS) | DOMPurify serialises through the browser's own parser to close mutation-based bypass paths |
| Overly permissive call sites | Lint rule forbids dangerouslySetInnerHTML without DOMPurify.sanitize; policy relaxations require explicit review |

CSRF

All API calls are same-origin (the SPA is served from the same host and port as the API). The JWT cookie is SameSite=Strict, so the browser does not attach it to requests initiated from other sites, which prevents cross-site form submissions and fetch calls from triggering authenticated requests. No separate CSRF token is required.

TypeScript as a security layer

TypeScript enforces that every <Link to="…"> and useNavigate() call references a route that actually exists in the router tree. Routing to an unregistered path is a compile error, not a runtime 404. Route search params are typed from their Zod schemas, so accessing an undeclared param returns undefined rather than an unsanitised string from window.location.search.
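
The compile-time guarantee rests on TypeScript literal union types. The sketch below shows that mechanism in isolation; it is not the API of TanStack Router or of any other router, and ROUTES, RoutePath, and navigateTo are hypothetical names with made-up paths.

```typescript
// Hypothetical route table; the real route tree is not listed in the RFC.
const ROUTES = {
  gateways: "/app/gateways",
  tools: "/app/tools",
  users: "/app/users",
} as const;

// Union of the registered path literals: "/app/gateways" | "/app/tools" | "/app/users".
type RoutePath = (typeof ROUTES)[keyof typeof ROUTES];

/** Accepts only registered paths; any other string literal fails type-checking. */
function navigateTo(path: RoutePath): string {
  return path;
}

const target = navigateTo("/app/tools");

// @ts-expect-error "/app/unknown" is not part of RoutePath, so this call is rejected by the compiler.
navigateTo("/app/unknown");
```

A typed router builds the equivalent of RoutePath from its route definitions, which is what turns a mistyped link into a build failure instead of a runtime 404.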

Bundle

Build pipeline

Vite compiles the client/ source tree into a set of static files and writes them to mcpgateway/static/app/. FastAPI mounts that directory at /app using its built-in StaticFiles. The Python process that was already running now also serves the UI.

In production the build runs as a Node stage inside the existing multi-stage Containerfile. The final image is the same Python image it always was; the Node toolchain is discarded after the build step.

No extra containers

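| Concern | How it is handled |
| --- | --- |
| Serving static files | FastAPI StaticFiles, already a project dependency |
| SPA fallback (index.html for all /app/* paths) | html=True flag on the mount serves index.html at /app/; deep links under /app/* additionally need a catch-all fallback to index.html, because StaticFiles on its own answers unknown paths with 404 |
| API calls from the browser | Same origin: no CORS, no proxy sidecar |
| Dev-time API proxying | Vite server.proxy forwards /api to the running FastAPI process |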

Small bundle

Vite tree-shakes the entire dependency graph at build time; only imported symbols are emitted. Tailwind v4 (a Vite plugin) scans the source and emits only the utility classes that appear in the code — the CSS output for the current pages is under 10 kB before compression.

shadcn/ui components are vendored source files inside client/components/ui/, not a runtime library. Vite only bundles the components that are imported.

Vite splits the output automatically into:

| Chunk | Contents |
| --- | --- |
| index-[hash].js | App shell, router, context providers |
| vendor-[hash].js | React, React DOM |
| Per-route chunks | Each page, loaded on demand |

Minimal runtime dependencies

The production dependency list is intentionally short:

| Package | Purpose | Alternatives not chosen |
| --- | --- | --- |
| react + react-dom | UI framework | |
| zod | Search param and form validation | yup, hand-rolled |

Notably absent: a global state manager (Redux, Zustand), a data-fetching library (React Query, SWR), a date library, and an HTTP client (Axios). React context covers the state surface of an admin UI; native fetch is sufficient for REST calls against a same-origin API.
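
To illustrate how native fetch can stand in for an HTTP client, here is a minimal sketch of a same-origin JSON helper. The names apiRequest and FetchLike are hypothetical, as is the /api/gateways path in the usage note. The fetch implementation is passed in so that the sketch does not depend on DOM typings and can be exercised with a fake.

```typescript
// Minimal structural types so the sketch needs neither DOM nor Node typings.
interface JsonResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

interface RequestOptions {
  method: string;
  credentials: "same-origin";
  headers: Record<string, string>;
  body: string | undefined;
}

type FetchLike = (url: string, init: RequestOptions) => Promise<JsonResponse>;

/** Sends a JSON request to a same-origin path and returns the parsed body. */
async function apiRequest<T>(
  fetchImpl: FetchLike,
  path: string,
  method: string = "GET",
  body?: unknown,
): Promise<T> {
  const response = await fetchImpl(path, {
    method,
    // The browser attaches the HttpOnly session cookie; the code never sees the token.
    credentials: "same-origin",
    headers: { Accept: "application/json", "Content-Type": "application/json" },
    body: body === undefined ? undefined : JSON.stringify(body),
  });
  if (!response.ok) {
    throw new Error("Request to " + path + " failed with status " + response.status);
  }
  // The cast is unchecked; a schema validator such as zod could verify the shape here.
  return (await response.json()) as T;
}
```

In the browser, the global fetch should be usable as fetchImpl, for example apiRequest(fetch, "/api/gateways"). Retries, caching, and request deduplication, which data-fetching libraries provide, are deliberately out of scope here.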

Component library

We use shadcn/ui — but not as a library. shadcn is a CLI that copies component source code directly into the repository. There is no shadcn package imported at runtime.

Once copied, a component is just our code. The CLI and the upstream registry are entirely out of the picture at runtime and at build time.

Why shadcn over a traditional component library:

| Concern | Traditional library (MUI, Carbon, …) | shadcn |
| --- | --- | --- |
| Runtime npm dependency | Yes: versioned package in node_modules | None for the component source (only the Radix primitives it imports) |
| Transitive dep graph | Dozens of packages, grows with upgrades | Radix primitives only, per component |
| CVE exposure | Any dep in the graph is a vector | Only what is copied and imported |
| Customisation | Theme overrides, often limited | Direct source edit, full control |
| Bundle size | Entire library unless tree-shaken | Only copied + imported components |
| Upgrade path | npm update, which can introduce breaking changes | Explicit, per-component, opt-in |

The supply chain argument in full: a traditional UI library is a single package.json entry that expands into dozens of transitive packages, each of which can be compromised between your last audit and your next deploy. A vendored component has no update path because there is no package to update — the attack surface is frozen at copy time. The cost is that upstream bug fixes require a deliberate, reviewed source edit rather than a lockfile bump. For a security-first admin UI that is an acceptable trade-off.

Radix under the hood: shadcn components delegate interactive behaviour (focus trapping, ARIA roles, keyboard navigation) to Radix UI primitives. Each Radix package (@radix-ui/react-dialog, @radix-ui/react-select, …) is a real runtime dependency, but it is scoped to the specific primitives used, has only a small transitive graph of its own (mostly other @radix-ui packages), and is extremely stable. The Radix surface is narrow, auditable, and justified.

Fast application

| Property | Mechanism |
| --- | --- |
| Instant route transitions | Client-side routing, with no full page reload |
| No render blocking | JS and CSS chunks are loaded in parallel; route chunks are lazy |
| Auth state available immediately | JWT travels in a SameSite=Strict; HttpOnly cookie that the browser attaches automatically; since JavaScript cannot read an HttpOnly cookie, AuthProvider resolves the session state on mount rather than reading the token |
| No SSR cold start | Pure SPA: first byte is always the cached index.html |
| Dev iteration speed | Vite HMR replaces changed modules in < 50 ms without losing component state |

Speed

Navigation is nearly free. Client-side routing means moving between pages never requires a full page load from the server. The router matches the new URL against its static route tree in memory, unmounts the previous page component, and mounts the next one, with no server render cycle on the critical path. Once a route's chunk has been fetched on first visit, the transition itself needs no HTTP request and is bounded by how long the incoming page takes to paint; only the page's own data requests touch the network.

The initial download is lean and precisely cached. Vite splits the output into content-hashed chunks: a small app shell, a shared vendor chunk (React + React DOM), and per-route chunks that are fetched only when the user first navigates to that route. Because filenames are keyed by content hash, a chunk that has not changed between deploys can be served directly from the browser cache with no revalidation request, not even a conditional If-None-Match, provided the server marks hashed assets as long-lived in Cache-Control. Only the chunks that actually changed in a given deploy are re-downloaded. Tailwind v4 scans the source tree at build time and emits only the utility classes that appear in the code; the CSS for the current pages is under 10 kB before compression.
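
Skipping revalidation depends on the response headers, which the RFC does not specify. The sketch below shows one common convention as an assumption: hashed assets are marked immutable, while index.html is always revalidated so that a new deploy is picked up. The function name cacheControlFor and the hash pattern are hypothetical, and in this project the headers would be set on the FastAPI side rather than in TypeScript.

```typescript
// Matches filenames such as index-a1b2c3d4.js; the exact hash format is an assumption.
const HASHED_ASSET = /-[A-Za-z0-9_-]{8,}\.(?:js|css)$/;

/** Picks a Cache-Control value for a static file path. */
function cacheControlFor(path: string): string {
  if (HASHED_ASSET.test(path)) {
    // The content hash changes whenever the file changes, so the URL can be cached indefinitely.
    return "public, max-age=31536000, immutable";
  }
  // index.html keeps the same name across deploys, so it must be revalidated on each load.
  return "no-cache";
}
```

With this split, a deploy that changes one page invalidates only index.html and the chunks whose hashes changed.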

API calls return data, not markup. Every backend call returns JSON. The browser receives structured field values and renders them into the component tree; it does not parse HTML, walk a fragment for insertion points, or evaluate inline event attributes. For large resource lists — gateways, tools, users — the JSON representation of the same data is materially smaller than an equivalent HTML table fragment, reducing both transfer time and parse cost. React's reconciler then performs a minimal diff against the existing DOM, touching only the nodes that actually changed.

Writes can be optimistic. React state can be updated immediately on user action, before the API call completes. The request runs in the background; the UI reconciles with the server response on completion and rolls back only on error. Perceived latency for create, update, and delete operations is effectively zero: the user sees the result of their action instantly rather than waiting for a round trip before the UI responds.
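
The apply, reconcile, and roll-back sequence can be sketched independently of React. Store, createStore, and optimisticUpdate are hypothetical names; the store stands in for React state, and the request callback stands in for an API call.

```typescript
/** Minimal state holder standing in for React state (useState or context). */
interface Store<T> {
  get(): T;
  set(next: T): void;
}

function createStore<T>(initial: T): Store<T> {
  let value = initial;
  return {
    get: () => value,
    set: (next: T) => {
      value = next;
    },
  };
}

/**
 * Applies `change` immediately, then runs `request`.
 * On success the server's answer replaces the optimistic value;
 * on failure the previous state is restored. Resolves to true on success.
 */
async function optimisticUpdate<T>(
  store: Store<T>,
  change: (current: T) => T,
  request: () => Promise<T>,
): Promise<boolean> {
  const previous = store.get();
  store.set(change(previous)); // the UI reflects the change before the round trip
  try {
    store.set(await request()); // reconcile with the server response
    return true;
  } catch (_error) {
    store.set(previous); // roll back on error
    return false;
  }
}
```

This sketch assumes one write in flight at a time. With overlapping writes, restoring a captured snapshot can discard a later change, so a real implementation would need to track pending operations individually.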

Everything is served from the same origin. No scripts, styles, or fonts are loaded from external CDNs. There are no extra DNS lookups, no additional TLS handshakes, and no third-party availability dependency on the critical render path. A network partition between the user and any external CDN cannot degrade or break the UI.

Open decisions

The following choices are still being evaluated:

| Concern | Options |
| --- | --- |
| Client-side routing | Custom useRoute hook · TanStack Router |
| Internationalisation | FormatJS (react-intl) · react-i18next |

4 Drawbacks

Build pipeline complexity. The project now requires a Node stage in CI and in the multi-stage Containerfile. That is an additional tool version to pin, an additional cache layer to manage, and an additional failure mode in the build. The Node stage is discarded after the build, so there is no runtime cost — but the operational surface of the CI pipeline grows.

Vendored components require manual upkeep. shadcn components are copied into the repository and frozen. Upstream fixes to the component source, including accessibility fixes, do not flow in automatically. They require a deliberate, reviewed source edit per component. Security patches to the Radix primitives arrive as npm updates rather than source edits, but they still need a deliberate lockfile bump. The team must track upstream changes itself or accept that vendored components will drift from the originals over time.

5 Alternatives

Svelte. Svelte compiles away the framework at build time, producing smaller and faster output than React. It has no virtual DOM and achieves finer-grained reactivity with near-zero runtime overhead. That makes it a defensible choice for a performance-first admin UI. The trade-off is a smaller ecosystem, fewer available developers, and less mature tooling for type-safe routing and accessible component primitives.

Preact. Preact is an API-compatible drop-in for React at a small fraction of the bundle size (~4 kB vs ~130 kB). Most of the React ecosystem, including TanStack Router and shadcn components, is compatible via the preact/compat alias. This would yield the same developer experience and the same supply chain story at a meaningfully smaller runtime cost. It is the closest alternative to the proposed approach.

No change. Accepting the status quo and investing engineering time elsewhere. The HTMX UI works; it covers the existing feature set; it is understood by the team. The cost is ongoing maintenance of a large, untyped, unstructured JavaScript codebase with no component model and little test surface.

6 Conclusion

The existing admin UI has accumulated significant complexity in a single unstructured JavaScript file. It is difficult to test, difficult to extend, and provides no type safety between the API contract and the rendered interface. These are not cosmetic concerns — they are constraints that slow down every feature addition and make refactoring risky.

The React SPA resolves the root causes rather than working around them. TypeScript with a typed router (TanStack Router is one of the options under evaluation) makes the routing contract a compile-time guarantee. The component model makes UI behaviour composable and independently testable. The intentionally short runtime dependency list keeps the supply chain auditable. And crucially, none of this requires a new service, a new runtime, or a change to the deployment model: the compiled assets drop into the existing FastAPI container.

The drawbacks are real. The build pipeline grows slightly more complex, vendored components require active maintenance, and the frontend knowledge surface expands beyond what Python contributors may be familiar with today. These costs are manageable and bounded; the costs of continuing on the current trajectory are open-ended.

The current UI has grown complex enough that maintaining it carries a compounding cost: bugs are difficult to isolate, fixes introduce regressions, and the lack of structure makes every change harder than it should be. Given that the feature set is stable, that cost is no longer buying anything. A three-month migration to a typed, component-based architecture is a bounded investment with a clear return — lower defect rate, faster iteration, and a codebase that does not get harder to work with over time.

Labels: design, design-high, frontend, typescript, ui, ui-rewrite
