You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Address second round of review feedback on RFC-0060
- Restructure implementation plan into 3 phases: POC (with rate
limiting benchmark), backwards compatibility, new functionality
- Update rate limiting interaction section: THV-0057 ships as
standalone middleware first, Starlark built-in benchmarked later
- Clarify current_user() returns same value for session lifetime
- Add security section note on identity-based filtering being
intentional and complementary to Cedar
- Rewrite Open Question 1: note two-hook model as future direction
for sessionless MCP, remove ordering heuristic
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
-**Related**: [THV-0051 (Starlark Scripted Tools)](./THV-0051-starlark-scripted-tools.md) — this RFC broadens the scope of Starlark in vMCP from composite tool workflows to a unified session initialization model
@@ -298,9 +298,14 @@ The programming model makes it straightforward to add new capabilities as simple
298
298
299
299
| Built-in | Signature | Description |
300
300
|----------|-----------|-------------|
301
-
|`current_user()`|`current_user() → struct(sub, email, groups)`| Returns the authenticated user's identity. The user is known at init time, but this built-in is deferred to a future version. |
|`check_rate_limit(key, limit, window)`|`check_rate_limit(key, limit, window) → (bool, int)`| Checks a token bucket counter in Redis. Returns `(allowed, retry_after_seconds)`. |
302
+
303
+
The following built-ins are planned for Phase 2 (backwards compatibility) and Phase 1 (proof of concept) respectively:
304
+
305
+
| Built-in | Signature | Phase | Description |
306
+
|----------|-----------|-------|-------------|
307
+
|`current_user()`|`current_user() → struct(sub, email, groups)`| Phase 2 | Returns the authenticated user's identity. Returns the same value for the lifetime of the session — the user who created it. Needed for feature parity with existing identity-aware behavior. |
308
+
|`check_rate_limit(key, limit, window)`|`check_rate_limit(key, limit, window) → (bool, int)`| Phase 1 (POC) | Checks a token bucket counter in Redis. Returns `(allowed, retry_after_seconds)`. Used to benchmark Starlark vs. native middleware performance. |
304
309
305
310
### Presets: Making it Easy for Non-Power-Users
306
311
@@ -541,10 +546,14 @@ Both coexist. A request passes through webhooks first (external policy), then re
541
546
542
547
#### Interaction with rate limiting
543
548
544
-
THV-0057's Redis-backed token bucket is the *mechanism*. A future `check_rate_limit()` built-in could expose it to scripts. Once available, the *policy* could be:
549
+
THV-0057's Redis-backed token bucket ships first as standalone middleware, covering both `MCPServers` and `MCPRemoteProxy` endpoints. This gives us production data on usage patterns and performance before committing to a Starlark-based approach.
550
+
551
+
Once the middleware is stable, a `check_rate_limit()` built-in can expose the same Redis token bucket to Starlark scripts. The plan is to benchmark both implementations — middleware vs. Starlark built-in with the same backing mechanism — to compare implementation ease and performance. The results will inform how heavily we lean on Starlark for future capabilities.
545
552
546
-
1.**Config-driven**: The `default` preset reads `rateLimiting` from config and applies limits internally
547
-
2.**Script-driven**: Custom scripts implement context-aware rate limiting using the built-in
553
+
Once the built-in is available, the *policy* could be:
554
+
555
+
1.**Config-driven**: The `default` preset reads `rateLimiting` from config and applies limits internally (same behavior as the standalone middleware)
556
+
2.**Script-driven**: Custom scripts implement context-aware rate limiting using the built-in (e.g., role-based limits, per-backend policies)
548
557
549
558
### API Changes
550
559
@@ -592,6 +601,8 @@ type SessionInitConfig struct {
592
601
593
602
**Trust model**: Session initialization scripts are written by administrators, not end users. An administrator who can write a Starlark script already has the authority to configure vMCP.
594
603
604
+
**Identity-based filtering**: With `current_user()`, scripts can filter or shape tools based on user identity (e.g., `if "admin" in current_user().groups`). This is intentional and complementary to Cedar — there is no one-size-fits-all approach. Administrators who prefer declarative access control can continue using Cedar policies. Those who need something simpler or prefer an imperative approach can express it in the script. Both are valid and coexist.
@@ -716,47 +727,39 @@ New built-in functions can be added without breaking existing scripts. New prese
716
727
717
728
### Phase 1: Proof of concept
718
729
719
-
A fast, rough POC to validate the high-level design. The goal is to prove the programming model works end-to-end and surface any surprises before committing to a production implementation.
730
+
A fast, rough POC to answer two questions: *does the programming model work end-to-end?* and *is Starlark execution fast enough?*
720
731
721
732
- Implement `backends()`, `publish()`, `metadata()` built-ins in the Starlark engine
722
733
- Session factory runs the script and constructs `MultiSession` from `publish()` results
723
-
- Implement the `default` preset that reads existing config knobs (`aggregation`, `optimizer`, etc.)
724
-
- Run the Starlark engine alongside existing decorators for comparison testing
725
-
- All existing tests must pass **except** those that test legacy composite tools (`compositeTools`, `compositeToolRefs`)
726
-
- Update this RFC with any findings — design changes, missing built-ins, edge cases discovered
734
+
- Implement `check_rate_limit()` built-in backed by the same Redis token bucket as THV-0057
735
+
- Benchmark native Go middleware vs. Starlark built-in for rate limiting — same mechanism, same config surface, comparing implementation ease and performance overhead
736
+
- Update this RFC with findings — design changes, missing built-ins, performance data
727
737
728
-
### Phase 2: Safe capabilities
738
+
### Phase 2: Backwards compatibility
729
739
730
-
Ship the capabilities that don't interact with the authz boundary. These are the "safe" features that can be validated independently.
740
+
Full feature parity with the existing config-driven system. This phase is substantial and will ship incrementally.
- Preset equivalence tests for the capabilities in scope
738
-
- Remove the decorator code for features replaced in this phase
739
-
740
-
### Phase 3: Optimizer + authz integration
741
-
742
-
Ship the optimizer and authz capabilities together so the relationship between them is explicit. Today, the optimizer bypasses Cedar because handlers dispatch directly to backends ([#4374](https://github.com/stacklok/toolhive/issues/4374), interim fix in [PR #4385](https://github.com/stacklok/toolhive/pull/4385)). By shipping them together, the script can enforce Cedar policies on the tool set before the optimizer builds its dispatch table (e.g. `enforce_cedar_policies(all_published)`).
743
-
743
+
-`current_user()` built-in for identity-aware policies
744
+
- Name resolution, filtering, overrides, conflict resolution via the `default` preset
744
745
-`search_index()` built-in ported from current optimizer implementation
745
-
- Authz built-in (e.g. `enforce_cedar_policies()`) that filters the `(metadata, handler)` list
746
+
- Authz built-in (e.g. `enforce_cedar_policies()`) shipped together with optimizer to avoid the [#4374](https://github.com/stacklok/toolhive/issues/4374) bypass (interim fix in [PR #4385](https://github.com/stacklok/toolhive/pull/4385))
746
747
- Updated `default` preset with optimizer + authz integration
747
-
- Preset equivalence tests for optimizer behavior
748
-
- Remove remaining decorator code
749
-
- The exact design of the authz built-ins will be detailed in a follow-up RFC
- Preset equivalence tests — all existing behavior reproduced identically
751
+
- Mark `compositeTools` and `compositeToolRefs` as deprecated; log warnings; document migration path
752
+
- Remove decorator code for replaced features
753
+
- Documentation: user guide, built-in reference, migration guide
754
+
- E2E tests for custom scripts in K8s via ConfigMap
750
755
751
-
### Phase 4: Deprecate composite tools, ship and document
756
+
### Phase 3: New functionality
752
757
753
-
- Mark `compositeTools` and `compositeToolRefs` as deprecated
754
-
- Log deprecation warnings when these fields are used
755
-
- Document migration path from declarative composite tools to Starlark scripts
756
-
- E2E tests for custom scripts in K8s via ConfigMap
757
-
- Documentation: user guide, built-in reference, migration guide, advanced use cases
758
+
Capabilities enabled by the programming model that don't exist today.
758
759
759
-
New built-in functions like `scrub_pii()` and `check_rate_limit()` are out of scope for this RFC. The programming model makes them straightforward to add as follow-up work.
760
+
-`scrub_pii()` built-in for response redaction
761
+
- Code mode: tools and skills enabling LLMs to build Starlark workflows predictably within the sandbox
762
+
- Additional built-ins as use cases emerge
760
763
761
764
762
765
## Testing Strategy
@@ -776,7 +779,7 @@ New built-in functions like `scrub_pii()` and `check_rate_limit()` are out of sc
776
779
777
780
## Open Questions
778
781
779
-
1.**Sessionless MCP requests**: What happens when MCP supports requests without sessions? Do we have to run this heavy script on every request? We could actually run the script once at startup, since it does not depend on request-time information. However, if we fold in authz concerns from above, then `current_user()`will be request-time information. We could cheat around this by recommending all logic which depends on `current_user()` be placed at the end of the script. When that's encountered during startup, we block and restore the state on each request. Alternatively, we could support two different scripts. One for initialization and one per-request.
782
+
1.**Sessionless MCP requests**: What happens when MCP supports requests without sessions? The current single-script model works well for session-scoped execution, but sessionless requests would require running the script per-request or once at startup. A promising direction is a two-hook model: `on_session_init()` for tool shape and presentation (runs once), `on_request()`for per-request concerns like rate limiting and user-specific filtering. This cleanly maps to the distinction the RFC already makes and would survive the sessionless transition without script-ordering footguns. We'll design the right model when concrete requirements emerge.
0 commit comments