Skip to content

Commit a4dcada

Browse files
jerm-droclaude
andcommitted
Address second round of review feedback on RFC-0060
- Restructure implementation plan into 3 phases: POC (with rate limiting benchmark), backwards compatibility, new functionality - Update rate limiting interaction section: THV-0057 ships as standalone middleware first, Starlark built-in benchmarked later - Clarify current_user() returns same value for session lifetime - Add security section note on identity-based filtering being intentional and complementary to Cedar - Rewrite Open Question 1: note two-hook model as future direction for sessionless MCP, remove ordering heuristic Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 97f78a7 commit a4dcada

File tree

1 file changed

+39
-36
lines changed

1 file changed

+39
-36
lines changed

rfcs/THV-0060-starlark-programmable-middleware.md

Lines changed: 39 additions & 36 deletions
Original file line numberDiff line numberDiff line change
@@ -3,7 +3,7 @@
33
- **Status**: Draft
44
- **Author(s)**: Jeremy Drouillard (@jerm-dro)
55
- **Created**: 2026-03-24
6-
- **Last Updated**: 2026-03-27
6+
- **Last Updated**: 2026-04-07
77
- **Target Repository**: toolhive
88
- **Related Issues**: [stacklok-epics#213](https://github.com/stacklok/stacklok-epics/issues/213)
99
- **Related**: [THV-0051 (Starlark Scripted Tools)](./THV-0051-starlark-scripted-tools.md) — this RFC broadens the scope of Starlark in vMCP from composite tool workflows to a unified session initialization model
@@ -298,9 +298,14 @@ The programming model makes it straightforward to add new capabilities as simple
298298

299299
| Built-in | Signature | Description |
300300
|----------|-----------|-------------|
301-
| `current_user()` | `current_user() → struct(sub, email, groups)` | Returns the authenticated user's identity. The user is known at init time, but this built-in is deferred to a future version. |
302301
| `scrub_pii(text)` | `scrub_pii(text) → string` | Redacts PII patterns (emails, phones, SSNs, credit cards) from text. |
303-
| `check_rate_limit(key, limit, window)` | `check_rate_limit(key, limit, window) → (bool, int)` | Checks a token bucket counter in Redis. Returns `(allowed, retry_after_seconds)`. |
302+
303+
The following built-ins are planned for Phase 2 (backwards compatibility) and Phase 1 (proof of concept) respectively:
304+
305+
| Built-in | Signature | Phase | Description |
306+
|----------|-----------|-------|-------------|
307+
| `current_user()` | `current_user() → struct(sub, email, groups)` | Phase 2 | Returns the authenticated user's identity. Returns the same value for the lifetime of the session — the user who created it. Needed for feature parity with existing identity-aware behavior. |
308+
| `check_rate_limit(key, limit, window)` | `check_rate_limit(key, limit, window) → (bool, int)` | Phase 1 (POC) | Checks a token bucket counter in Redis. Returns `(allowed, retry_after_seconds)`. Used to benchmark Starlark vs. native middleware performance. |
304309

305310
### Presets: Making it Easy for Non-Power-Users
306311

@@ -541,10 +546,14 @@ Both coexist. A request passes through webhooks first (external policy), then re
541546

542547
#### Interaction with rate limiting
543548

544-
THV-0057's Redis-backed token bucket is the *mechanism*. A future `check_rate_limit()` built-in could expose it to scripts. Once available, the *policy* could be:
549+
THV-0057's Redis-backed token bucket ships first as standalone middleware, covering both `MCPServers` and `MCPRemoteProxy` endpoints. This gives us production data on usage patterns and performance before committing to a Starlark-based approach.
550+
551+
Once the middleware is stable, a `check_rate_limit()` built-in can expose the same Redis token bucket to Starlark scripts. The plan is to benchmark both implementations — middleware vs. Starlark built-in with the same backing mechanism — to compare implementation ease and performance. The results will inform how heavily we lean on Starlark for future capabilities.
545552

546-
1. **Config-driven**: The `default` preset reads `rateLimiting` from config and applies limits internally
547-
2. **Script-driven**: Custom scripts implement context-aware rate limiting using the built-in
553+
Once the built-in is available, the *policy* could be:
554+
555+
1. **Config-driven**: The `default` preset reads `rateLimiting` from config and applies limits internally (same behavior as the standalone middleware)
556+
2. **Script-driven**: Custom scripts implement context-aware rate limiting using the built-in (e.g., role-based limits, per-backend policies)
548557

549558
### API Changes
550559

@@ -592,6 +601,8 @@ type SessionInitConfig struct {
592601

593602
**Trust model**: Session initialization scripts are written by administrators, not end users. An administrator who can write a Starlark script already has the authority to configure vMCP.
594603

604+
**Identity-based filtering**: With `current_user()`, scripts can filter or shape tools based on user identity (e.g., `if "admin" in current_user().groups`). This is intentional and complementary to Cedar — there is no one-size-fits-all approach. Administrators who prefer declarative access control can continue using Cedar policies. Those who need something simpler or prefer an imperative approach can express it in the script. Both are valid and coexist.
605+
595606
### Data Security
596607

597608
- Scripts cannot access filesystem, network, or environment variables (Starlark sandbox)
@@ -716,47 +727,39 @@ New built-in functions can be added without breaking existing scripts. New prese
716727

717728
### Phase 1: Proof of concept
718729

719-
A fast, rough POC to validate the high-level design. The goal is to prove the programming model works end-to-end and surface any surprises before committing to a production implementation.
730+
A fast, rough POC to answer two questions: *does the programming model work end-to-end?* and *is Starlark execution fast enough?*
720731

721732
- Implement `backends()`, `publish()`, `metadata()` built-ins in the Starlark engine
722733
- Session factory runs the script and constructs `MultiSession` from `publish()` results
723-
- Implement the `default` preset that reads existing config knobs (`aggregation`, `optimizer`, etc.)
724-
- Run the Starlark engine alongside existing decorators for comparison testing
725-
- All existing tests must pass **except** those that test legacy composite tools (`compositeTools`, `compositeToolRefs`)
726-
- Update this RFC with any findings — design changes, missing built-ins, edge cases discovered
734+
- Implement `check_rate_limit()` built-in backed by the same Redis token bucket as THV-0057
735+
- Benchmark native Go middleware vs. Starlark built-in for rate limiting — same mechanism, same config surface, comparing implementation ease and performance overhead
736+
- Update this RFC with findings — design changes, missing built-ins, performance data
727737

728-
### Phase 2: Safe capabilities
738+
### Phase 2: Backwards compatibility
729739

730-
Ship the capabilities that don't interact with the authz boundary. These are the "safe" features that can be validated independently.
740+
Full feature parity with the existing config-driven system. This phase is substantial and will ship incrementally.
731741

732742
- Production-quality `backends()`, `publish()`, `metadata()` built-ins
733-
- Name resolution, filtering, and overrides via the `default` preset
734-
- Rate limiting integration
735-
- `thv vmcp show-preset` command to inspect built-in presets
736-
- Config model: `sessionInit.preset`, `sessionInit.script`, `sessionInit.scriptFile`
737-
- Preset equivalence tests for the capabilities in scope
738-
- Remove the decorator code for features replaced in this phase
739-
740-
### Phase 3: Optimizer + authz integration
741-
742-
Ship the optimizer and authz capabilities together so the relationship between them is explicit. Today, the optimizer bypasses Cedar because handlers dispatch directly to backends ([#4374](https://github.com/stacklok/toolhive/issues/4374), interim fix in [PR #4385](https://github.com/stacklok/toolhive/pull/4385)). By shipping them together, the script can enforce Cedar policies on the tool set before the optimizer builds its dispatch table (e.g. `enforce_cedar_policies(all_published)`).
743-
743+
- `current_user()` built-in for identity-aware policies
744+
- Name resolution, filtering, overrides, conflict resolution via the `default` preset
744745
- `search_index()` built-in ported from current optimizer implementation
745-
- Authz built-in (e.g. `enforce_cedar_policies()`) that filters the `(metadata, handler)` list
746+
- Authz built-in (e.g. `enforce_cedar_policies()`) shipped together with optimizer to avoid the [#4374](https://github.com/stacklok/toolhive/issues/4374) bypass (interim fix in [PR #4385](https://github.com/stacklok/toolhive/pull/4385))
746747
- Updated `default` preset with optimizer + authz integration
747-
- Preset equivalence tests for optimizer behavior
748-
- Remove remaining decorator code
749-
- The exact design of the authz built-ins will be detailed in a follow-up RFC
748+
- `thv vmcp list-presets` / `show-preset` commands
749+
- Config model: `sessionInit.preset`, `sessionInit.script`, `sessionInit.scriptFile`
750+
- Preset equivalence tests — all existing behavior reproduced identically
751+
- Mark `compositeTools` and `compositeToolRefs` as deprecated; log warnings; document migration path
752+
- Remove decorator code for replaced features
753+
- Documentation: user guide, built-in reference, migration guide
754+
- E2E tests for custom scripts in K8s via ConfigMap
750755

751-
### Phase 4: Deprecate composite tools, ship and document
756+
### Phase 3: New functionality
752757

753-
- Mark `compositeTools` and `compositeToolRefs` as deprecated
754-
- Log deprecation warnings when these fields are used
755-
- Document migration path from declarative composite tools to Starlark scripts
756-
- E2E tests for custom scripts in K8s via ConfigMap
757-
- Documentation: user guide, built-in reference, migration guide, advanced use cases
758+
Capabilities enabled by the programming model that don't exist today.
758759

759-
New built-in functions like `scrub_pii()` and `check_rate_limit()` are out of scope for this RFC. The programming model makes them straightforward to add as follow-up work.
760+
- `scrub_pii()` built-in for response redaction
761+
- Code mode: tools and skills enabling LLMs to build Starlark workflows predictably within the sandbox
762+
- Additional built-ins as use cases emerge
760763

761764

762765
## Testing Strategy
@@ -776,7 +779,7 @@ New built-in functions like `scrub_pii()` and `check_rate_limit()` are out of sc
776779

777780
## Open Questions
778781

779-
1. **Sessionless MCP requests**: What happens when MCP supports requests without sessions? Do we have to run this heavy script on every request? We could actually run the script once at startup, since it does not depend on request-time information. However, if we fold in authz concerns from above, then `current_user()` will be request-time information. We could cheat around this by recommending all logic which depends on `current_user()` be placed at the end of the script. When that's encountered during startup, we block and restore the state on each request. Alternatively, we could support two different scripts. One for initialization and one per-request.
782+
1. **Sessionless MCP requests**: What happens when MCP supports requests without sessions? The current single-script model works well for session-scoped execution, but sessionless requests would require running the script per-request or once at startup. A promising direction is a two-hook model: `on_session_init()` for tool shape and presentation (runs once), `on_request()` for per-request concerns like rate limiting and user-specific filtering. This cleanly maps to the distinction the RFC already makes and would survive the sessionless transition without script-ordering footguns. We'll design the right model when concrete requirements emerge.
780783

781784
## References
782785

0 commit comments

Comments
 (0)