
docs(design): multi-port routes and streaming for NebariApp#120

Draft
viniciusdc wants to merge 1 commit into main from docs/multi-backend-routes-design

viniciusdc commented May 19, 2026

Collaborator

Summary

Design document (no code in this PR — multi-port implementation is the stacked PR #121; streaming implementation TBD). Proposes two related routing-contract extensions:

  1. Multi-port routes. Optional RouteMatch.port override so a single NebariApp can route different path prefixes to different ports on the same backend Service under one hostname.
  2. Streaming timeouts. Opt-in routing.streaming: true flag that makes the operator emit an Envoy Gateway BackendTrafficPolicy disabling the default 15s request timeout and setting a 5-minute idle timeout. Covers SSE, long-poll, gRPC streaming.
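As a rough illustration of the two extensions, the types could look something like the sketch below. This is not the code from the stacked PR: the PathPrefix field name, the JSON tags, and the resolvePort helper are assumptions made for the example. Only the optional *int32 port that falls back to spec.service.port, the boolean streaming flag, and the namespace-less ServiceReference come from the proposal.

```go
package main

import "fmt"

// ServiceReference is the slimmed-down backend reference (no Namespace field).
type ServiceReference struct {
	Name string `json:"name"`
	Port int32  `json:"port"`
}

// RouteMatch sketches the proposed shape. PathPrefix is an assumed field name.
type RouteMatch struct {
	PathPrefix string `json:"pathPrefix"`
	// Port optionally overrides spec.service.port for this route only.
	Port *int32 `json:"port,omitempty"`
}

// RoutingConfig carries the routes plus the opt-in streaming flag.
type RoutingConfig struct {
	Routes    []RouteMatch `json:"routes,omitempty"`
	Streaming bool         `json:"streaming,omitempty"`
}

// resolvePort (hypothetical helper) returns the route's own port when set,
// otherwise the Service-level default.
func resolvePort(svc ServiceReference, r RouteMatch) int32 {
	if r.Port != nil {
		return *r.Port
	}
	return svc.Port
}

func main() {
	admin := int32(9000)
	svc := ServiceReference{Name: "my-app", Port: 8080}
	cfg := RoutingConfig{Routes: []RouteMatch{
		{PathPrefix: "/"},
		{PathPrefix: "/admin", Port: &admin},
	}}
	for _, r := range cfg.Routes {
		fmt.Printf("%s -> %s:%d\n", r.PathPrefix, svc.Name, resolvePort(svc, r))
	}
}
```

Using a pointer for the override keeps "unset" distinguishable from any concrete port value, which is what makes the fallback unambiguous.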

Plus two contract-tightening items that ride along:

  • Removal of ServiceReference.Namespace. The field is a half-feature on the operator side: a cross-namespace BackendObjectReference needs a ReferenceGrant, which the operator never creates. nebari-landing still reads it for health-probe DNS construction, but that dependency is now dormant and falls back gracefully. See the doc's Downstream consumer section.
  • Explicit "one NebariApp = one hostname = one backend Service" constraint codified in the schema docstrings.

Iteration note

An earlier draft proposed per-route backend: {name, port} overrides (multi-Service). That was narrowed: a NebariApp targets exactly one Service. Streaming was originally listed as a follow-up; folded into this design after iteration. The doc keeps the original filename for URL stability and explains the rename inline.

Why now

  • Multi-port: there are real cases where a single Service exposes multiple ports (UI + admin, app + metrics, app + SSE on a separate port). The proposed routing.routes[].port knob covers them with a one-line addition to RouteMatch.
  • Streaming: Envoy Gateway's 15s default requestTimeout cuts off SSE, long-poll, and gRPC streams. The downstream PR openteams-ai/nebari.openteams.ai#12 hand-rolls a BackendTrafficPolicy that targets the operator's HTTPRoute by name. That contract is fragile, and pack authors shouldn't need to learn the Envoy schema to get streaming working. The proposed routing.streaming: true flag makes the operator emit a single owner-referenced policy with canned timeouts (requestTimeout: 0s, connectionIdleTimeout: 300s).
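For reviewers who want to picture the emitted resource, here is a minimal sketch that builds the policy as a plain map and prints it as JSON. The apiVersion and the spec.targetRefs / spec.timeout.http field paths are my reading of the Envoy Gateway API and should be checked against the CRD version installed in the cluster. The streamingPolicy helper, the resource names, and the omission of the owner reference are assumptions made for the example, not the operator's actual reconciler.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// streamingPolicy (hypothetical helper) builds a BackendTrafficPolicy-shaped
// object. Field paths are assumptions to verify against the installed CRDs.
func streamingPolicy(name, namespace string, routeNames []string) map[string]any {
	// One targetRef per HTTPRoute the policy should attach to.
	targetRefs := make([]any, 0, len(routeNames))
	for _, rn := range routeNames {
		targetRefs = append(targetRefs, map[string]any{
			"group": "gateway.networking.k8s.io",
			"kind":  "HTTPRoute",
			"name":  rn,
		})
	}
	return map[string]any{
		"apiVersion": "gateway.envoyproxy.io/v1alpha1",
		"kind":       "BackendTrafficPolicy",
		"metadata":   map[string]any{"name": name, "namespace": namespace},
		"spec": map[string]any{
			"targetRefs": targetRefs,
			"timeout": map[string]any{
				"http": map[string]any{
					// Canned values from the proposal.
					"requestTimeout":        "0s",
					"connectionIdleTimeout": "300s",
				},
			},
		},
	}
}

func main() {
	// Route names are placeholders for the main and public HTTPRoutes.
	p := streamingPolicy("my-app-streaming", "apps", []string{"my-app", "my-app-public"})
	out, err := json.MarshalIndent(p, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Passing the route names as a slice keeps the "main only vs. main and public" question a one-argument change at the call site.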

nebari-landing dependency

spec.service.namespace is read by nebari-landing to build in-cluster health-probe URLs that deliberately bypass the gateway. Two reasons removal is non-breaking:

  • The watcher uses unstructured.NestedString with a fallback to u.GetNamespace() when the field is absent.
  • The original consumers — Keycloak and ArgoCD NebariApps in the kind dev cluster — have moved to NIC's foundational Argo-apps layer and are no longer modelled as NebariApps. No current production NebariApp sets the field.
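The fallback behaviour described above can be sketched as follows. To keep the example self-contained, nestedString is a stdlib-only stand-in for unstructured.NestedString (the real function also returns an error), and the probeHost helper and the cluster.local domain suffix are assumptions for illustration rather than nebari-landing's actual code.

```go
package main

import "fmt"

// nestedString is a simplified stand-in for unstructured.NestedString:
// it walks nested maps and reports whether a string was found at the path.
func nestedString(obj map[string]any, fields ...string) (string, bool) {
	var cur any = obj
	for _, f := range fields {
		m, ok := cur.(map[string]any)
		if !ok {
			return "", false
		}
		cur, ok = m[f]
		if !ok {
			return "", false
		}
	}
	s, ok := cur.(string)
	return s, ok
}

// serviceNamespace prefers spec.service.namespace and falls back to the
// NebariApp's own namespace when the field is absent or empty.
func serviceNamespace(app map[string]any) string {
	if ns, ok := nestedString(app, "spec", "service", "namespace"); ok && ns != "" {
		return ns
	}
	ns, _ := nestedString(app, "metadata", "namespace")
	return ns
}

// probeHost (hypothetical) builds the in-cluster DNS name used for health
// probes. The cluster.local suffix is an assumed default.
func probeHost(app map[string]any) string {
	name, _ := nestedString(app, "spec", "service", "name")
	return fmt.Sprintf("%s.%s.svc.cluster.local", name, serviceNamespace(app))
}

func main() {
	app := map[string]any{
		"metadata": map[string]any{"name": "my-app", "namespace": "apps"},
		"spec": map[string]any{
			"service": map[string]any{"name": "my-app-svc"},
		},
	}
	fmt.Println(probeHost(app))
}
```

With the field removed from the schema, only the fallback branch is ever taken, which is why the removal is non-breaking for this consumer.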

A follow-up nebari-landing PR (separate, after this lands) will remove the inert App.ServiceNamespace field and the cross-namespace fallback branch.

What the doc covers

  • Goals / non-goals (no per-route Services, no multi-hostname, no cross-namespace backends, no Envoy-typed timeout passthrough).
  • Design principles tied to the existing NIC architectural rules.
  • Proposed RouteMatch.port, RoutingConfig.streaming, and ServiceReference slim-down.
  • BackendTrafficPolicy resource shape, lifecycle, and target-refs (covers both main and public HTTPRoutes).
  • Downstream-consumer analysis (nebari-landing).
  • File-by-file operator impact (api/v1/, internal/controller/reconcilers/), including a new streaming.go reconciler and an RBAC bump for backendtrafficpolicies.
  • Validation: route port must be exposed by spec.service; failure surfaces on the NebariApp's status.
  • Backwards compatibility (spec.service.namespace removal in v1; project README still flags the API as unstable, so this fits within the stated contract).
  • Open questions left explicit: publicRoutes symmetry, status surface for resolved ports, whether streaming should apply to the public HTTPRoute, and whether to expose individual timeout knobs vs. a boolean.
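The validation rule in the list above (a route port must be exposed by the backend Service) could be checked along these lines. This is a sketch under assumptions: the route type, the validateRoutePorts name, and the error wording are invented for the example, and how the error is mapped onto the NebariApp's status is left out because the PR does not spell it out here.

```go
package main

import "fmt"

// route is a trimmed-down stand-in for RouteMatch (assumed field names).
type route struct {
	PathPrefix string
	Port       *int32
}

// validateRoutePorts (hypothetical) rejects any per-route port override that
// the backend Service does not expose. Routes without an override are skipped
// because they inherit spec.service.port.
func validateRoutePorts(servicePorts []int32, routes []route) error {
	exposed := make(map[int32]struct{}, len(servicePorts))
	for _, p := range servicePorts {
		exposed[p] = struct{}{}
	}
	for _, r := range routes {
		if r.Port == nil {
			continue
		}
		if _, ok := exposed[*r.Port]; !ok {
			return fmt.Errorf("route %q: port %d is not exposed by the backend Service",
				r.PathPrefix, *r.Port)
		}
	}
	return nil
}

func main() {
	admin, bogus := int32(9000), int32(7777)
	servicePorts := []int32{8080, 9000}

	ok := []route{{PathPrefix: "/"}, {PathPrefix: "/admin", Port: &admin}}
	fmt.Println("valid routes:", validateRoutePorts(servicePorts, ok))

	bad := []route{{PathPrefix: "/debug", Port: &bogus}}
	fmt.Println("invalid routes:", validateRoutePorts(servicePorts, bad))
}
```

Returning on the first offending route keeps the message short. Collecting all failures before reporting would be an equally reasonable choice for a status condition.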

Test plan

  • Reviewers confirm RouteMatch.port shape (*int32, optional, falls back to spec.service.port)
  • Reviewers confirm RoutingConfig.streaming shape (boolean intent vs. a streamingTimeouts struct — see Open questions)
  • Reviewers confirm the policy targetRefs should cover both main and public HTTPRoutes (vs. main only)
  • Reviewers confirm the ServiceReference.Namespace removal is acceptable given the nebari-landing analysis
  • Reviewers confirm the one-hostname / one-Service per-NebariApp constraint

Follow-up

viniciusdc force-pushed the docs/multi-backend-routes-design branch 2 times, most recently from 7a5ea5b to 71aad60, May 19, 2026 20:10
viniciusdc changed the title from "docs(design): multi-backend routes for NebariApp" to "docs(design): multi-port routes for NebariApp", May 19, 2026
viniciusdc force-pushed the docs/multi-backend-routes-design branch 2 times, most recently from 78e47b6 to bdc887b, May 20, 2026 13:53
viniciusdc changed the title from "docs(design): multi-port routes for NebariApp" to "docs(design): multi-port routes and streaming for NebariApp", May 20, 2026
viniciusdc force-pushed the docs/multi-backend-routes-design branch from bdc887b to 7e1cc69, May 20, 2026 17:30

Adds a design doc proposing an optional per-route port override on
RouteMatch so a single NebariApp can route different path prefixes to
different ports on the same backend Service under one hostname. Tightens
the same-namespace contract by removing ServiceReference.Namespace, and
codifies the "one NebariApp = one hostname = one backend Service"
boundary that has been implicit until now.

nebari-landing reads the namespace field for in-cluster health-probe DNS
construction but has graceful fallback to the NebariApp's own namespace,
and the original consumers (Keycloak/ArgoCD as NebariApps in the kind
dev cluster) have moved to NIC's foundational Argo-apps layer, so the
removal is non-breaking. A follow-up PR on nebari-landing should drop
the now-inert ServiceNamespace plumbing.

The streaming/BackendTrafficPolicy concern (Envoy SSE timeouts) is
covered in a separate design doc (docs/design/streaming-timeouts.md).

Filename kept as multi-backend-routes.md for URL stability; the doc
explains the rename.
