Commit 51fda65

jerm-dro and claude authored
Add rate limiting CRD types and token-bucket package (#4577)
* Add rate limiting types to MCPServerSpec

  Add RateLimitConfig, RateLimitBucket, and ToolRateLimitConfig CRD types to support configurable token-bucket rate limiting on MCPServer. The rateLimiting field accepts shared and per-tool bucket configs with maxTokens and refillPeriod (metav1.Duration) parameters. "Shared" denotes a bucket shared across all users, distinguishing it from future per-user buckets (#4550).

  CEL validation ensures:
  - At least one of shared or tools is configured
  - Redis session storage is required when rateLimiting is set

  ToolRateLimitConfig.Shared is required since a tool entry without a bucket is meaningless. Tools uses +listType=map with +listMapKey=name to reject duplicate tool names at admission.

  Part of #4551

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Add token-bucket rate limiter package with tests

  Introduce pkg/ratelimit/ with a public Limiter interface and internal token bucket implementation backed by a Redis Lua script.

  Public API surface is minimal:
  - Limiter interface with Allow(ctx, toolName, userID)
  - Decision result type with Allowed and RetryAfter
  - NewLimiter constructor accepting CRD types directly

  Internal (pkg/ratelimit/internal/bucket/):
  - Atomic multi-key Lua script that checks all buckets before consuming from any, preventing rejected per-tool calls from draining the server-level budget
  - Uses Redis TIME for clock consistency and miniredis testability
  - Guards against negative elapsed time from clock drift
  - Key format and derivation fully encapsulated in the internal package

  Tests cover all acceptance criteria for global rate limits:
  - AC3: requests exceeding maxTokens are rejected
  - AC4: Retry-After computed from the bucket refill rate
  - AC5: per-tool limit on one tool does not affect other tools
  - AC6: request must pass both server-level and per-tool limits
  - Atomic multi-bucket: tool rejection does not drain server budget
  - Redis unavailability returns error (fail-open is caller's job)
  - No-op limiter when config is nil

  Part of #4551

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent bd02232 commit 51fda65

10 files changed, +1083 -1 lines changed

cmd/thv-operator/api/v1alpha1/mcpserver_types.go

Lines changed: 54 additions & 0 deletions
@@ -154,6 +154,7 @@ const SessionStorageProviderRedis = "redis"
 //
 // +kubebuilder:validation:XValidation:rule="!(has(self.oidcConfig) && has(self.oidcConfigRef))",message="oidcConfig and oidcConfigRef are mutually exclusive; use oidcConfigRef to reference a shared MCPOIDCConfig"
 // +kubebuilder:validation:XValidation:rule="!(has(self.telemetry) && has(self.telemetryConfigRef))",message="telemetry and telemetryConfigRef are mutually exclusive; migrate to telemetryConfigRef"
+// +kubebuilder:validation:XValidation:rule="!has(self.rateLimiting) || (has(self.sessionStorage) && self.sessionStorage.provider == 'redis')",message="rateLimiting requires sessionStorage with provider 'redis'"
 //
 //nolint:lll // CEL validation rules exceed line length limit
 type MCPServerSpec struct {
@@ -331,6 +332,11 @@ type MCPServerSpec struct {
 	// When nil, no session storage is configured.
 	// +optional
 	SessionStorage *SessionStorageConfig `json:"sessionStorage,omitempty"`
+
+	// RateLimiting defines rate limiting configuration for the MCP server.
+	// Requires Redis session storage to be configured for distributed rate limiting.
+	// +optional
+	RateLimiting *RateLimitConfig `json:"rateLimiting,omitempty"`
 }

 // ResourceOverrides defines overrides for annotations and labels on created resources
@@ -481,6 +487,54 @@ type SessionStorageConfig struct {
 	PasswordRef *SecretKeyRef `json:"passwordRef,omitempty"`
 }

+// RateLimitConfig defines rate limiting configuration for an MCP server.
+// At least one of shared or tools must be configured.
+//
+// +kubebuilder:validation:XValidation:rule="has(self.shared) || (has(self.tools) && size(self.tools) > 0)",message="at least one of shared or tools must be configured"
+//
+//nolint:lll // CEL validation rules exceed line length limit
+type RateLimitConfig struct {
+	// Shared defines a token bucket shared across all users for the entire server.
+	// +optional
+	Shared *RateLimitBucket `json:"shared,omitempty"`
+
+	// Tools defines per-tool rate limit overrides.
+	// Each entry applies additional rate limits to calls targeting a specific tool name.
+	// A request must pass both the server-level limit and the per-tool limit.
+	// +listType=map
+	// +listMapKey=name
+	// +optional
+	Tools []ToolRateLimitConfig `json:"tools,omitempty"`
+}
+
+// RateLimitBucket defines a token bucket configuration.
+type RateLimitBucket struct {
+	// MaxTokens is the maximum number of tokens (bucket capacity).
+	// This is also the burst size: the maximum number of requests that can be served
+	// instantaneously before the bucket is depleted.
+	// +kubebuilder:validation:Required
+	// +kubebuilder:validation:Minimum=1
+	MaxTokens int32 `json:"maxTokens"`
+
+	// RefillPeriod is the duration to fully refill the bucket from zero to maxTokens.
+	// The effective refill rate is maxTokens / refillPeriod tokens per second.
+	// Format: Go duration string (e.g., "1m0s", "30s", "1h0m0s").
+	// +kubebuilder:validation:Required
+	RefillPeriod metav1.Duration `json:"refillPeriod"`
+}
+
+// ToolRateLimitConfig defines rate limits for a specific tool.
+type ToolRateLimitConfig struct {
+	// Name is the MCP tool name this limit applies to.
+	// +kubebuilder:validation:Required
+	// +kubebuilder:validation:MinLength=1
+	Name string `json:"name"`
+
+	// Shared defines a token bucket shared across all users for this specific tool.
+	// +kubebuilder:validation:Required
+	Shared *RateLimitBucket `json:"shared"`
+}
+
 // Permission profile types
 const (
 	// PermissionProfileTypeBuiltin is the type for built-in permission profiles

cmd/thv-operator/api/v1alpha1/mcpserver_types_test.go

Lines changed: 62 additions & 1 deletion
@@ -6,9 +6,11 @@ package v1alpha1
 import (
 	"encoding/json"
 	"testing"
+	"time"

 	"github.com/stretchr/testify/assert"
 	"github.com/stretchr/testify/require"
+	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
 )

 func TestSessionStorageConfigJSONRoundtrip(t *testing.T) {
@@ -65,6 +67,55 @@ func TestSessionStorageConfigJSONRoundtrip(t *testing.T) {
 	}
 }

+func TestRateLimitConfigJSONRoundtrip(t *testing.T) {
+	t.Parallel()
+
+	tests := []struct {
+		name     string
+		input    RateLimitConfig
+		wantJSON string
+	}{
+		{
+			name: "shared only",
+			input: RateLimitConfig{
+				Shared: &RateLimitBucket{MaxTokens: 100, RefillPeriod: metav1.Duration{Duration: time.Minute}},
+			},
+			wantJSON: `{"shared":{"maxTokens":100,"refillPeriod":"1m0s"}}`,
+		},
+		{
+			name: "tools only",
+			input: RateLimitConfig{
+				Tools: []ToolRateLimitConfig{
+					{Name: "search", Shared: &RateLimitBucket{MaxTokens: 5, RefillPeriod: metav1.Duration{Duration: 10 * time.Second}}},
+				},
+			},
+			wantJSON: `{"tools":[{"name":"search","shared":{"maxTokens":5,"refillPeriod":"10s"}}]}`,
+		},
+		{
+			name: "shared with tools",
+			input: RateLimitConfig{
+				Shared: &RateLimitBucket{MaxTokens: 100, RefillPeriod: metav1.Duration{Duration: time.Minute}},
+				Tools: []ToolRateLimitConfig{
+					{
+						Name:   "search",
+						Shared: &RateLimitBucket{MaxTokens: 5, RefillPeriod: metav1.Duration{Duration: 10 * time.Second}},
+					},
+				},
+			},
+			wantJSON: `{"shared":{"maxTokens":100,"refillPeriod":"1m0s"},"tools":[{"name":"search","shared":{"maxTokens":5,"refillPeriod":"10s"}}]}`,
+		},
+	}
+
+	for _, tc := range tests {
+		t.Run(tc.name, func(t *testing.T) {
+			t.Parallel()
+			b, err := json.Marshal(tc.input)
+			require.NoError(t, err)
+			assert.JSONEq(t, tc.wantJSON, string(b))
+		})
+	}
+}
+
 func TestMCPServerSpecScalingFieldsJSONRoundtrip(t *testing.T) {
 	t.Parallel()

@@ -80,7 +131,7 @@ func TestMCPServerSpecScalingFieldsJSONRoundtrip(t *testing.T) {
 		{
 			name:       "nil replicas are omitted",
 			spec:       MCPServerSpec{Image: "example/mcp:latest"},
-			wantAbsent: []string{`"replicas"`, `"backendReplicas"`, `"sessionStorage"`},
+			wantAbsent: []string{`"replicas"`, `"backendReplicas"`, `"sessionStorage"`, `"rateLimiting"`},
 		},
 		{
 			name: "set replicas are serialized",
@@ -102,6 +153,16 @@ func TestMCPServerSpecScalingFieldsJSONRoundtrip(t *testing.T) {
 			},
 			wantKeys: []string{`"sessionStorage"`, `"provider":"redis"`},
 		},
+		{
+			name: "rateLimiting is serialized when set",
+			spec: MCPServerSpec{
+				Image: "example/mcp:latest",
+				RateLimiting: &RateLimitConfig{
+					Shared: &RateLimitBucket{MaxTokens: 100, RefillPeriod: metav1.Duration{Duration: time.Minute}},
+				},
+			},
+			wantKeys: []string{`"rateLimiting"`, `"maxTokens":100`, `"refillPeriod":"1m0s"`},
+		},
 	}

 	for _, tc := range tests {
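
The tests above only marshal the types. The reverse direction can be sketched as follows, to show how a duration written as "1m" comes back as "1m0s" after a round trip. To stay self-contained, the sketch uses local stand-in types instead of the operator's v1alpha1 package. The duration type is a hypothetical imitation of metav1.Duration's string encoding, and roundTrip is an illustrative helper, not part of the commit.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// duration is a local stand-in that, like metav1.Duration, encodes as a
// Go duration string rather than as a number of nanoseconds.
type duration struct{ time.Duration }

func (d duration) MarshalJSON() ([]byte, error) { return json.Marshal(d.String()) }

func (d *duration) UnmarshalJSON(b []byte) error {
	var s string
	if err := json.Unmarshal(b, &s); err != nil {
		return err
	}
	v, err := time.ParseDuration(s)
	if err != nil {
		return err
	}
	d.Duration = v
	return nil
}

// The structs below copy only the JSON tags shown in the diff.
type rateLimitBucket struct {
	MaxTokens    int32    `json:"maxTokens"`
	RefillPeriod duration `json:"refillPeriod"`
}

type toolRateLimitConfig struct {
	Name   string           `json:"name"`
	Shared *rateLimitBucket `json:"shared"`
}

type rateLimitConfig struct {
	Shared *rateLimitBucket      `json:"shared,omitempty"`
	Tools  []toolRateLimitConfig `json:"tools,omitempty"`
}

// roundTrip decodes a rateLimiting document and encodes it again.
func roundTrip(in string) (string, error) {
	var cfg rateLimitConfig
	if err := json.Unmarshal([]byte(in), &cfg); err != nil {
		return "", err
	}
	out, err := json.Marshal(cfg)
	return string(out), err
}

func main() {
	out, err := roundTrip(`{"shared":{"maxTokens":100,"refillPeriod":"1m"}}`)
	fmt.Println(out, err) // the duration is normalized to "1m0s"
}
```

This normalization is why the expected JSON in the tests spells one minute as "1m0s": the value is printed with time.Duration's String form regardless of how it was written.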

cmd/thv-operator/api/v1alpha1/zz_generated.deepcopy.go

Lines changed: 68 additions & 0 deletions
Some generated files are not rendered by default.

deploy/charts/operator-crds/files/crds/toolhive.stacklok.dev_mcpservers.yaml

Lines changed: 79 additions & 0 deletions
@@ -497,6 +497,82 @@ spec:
                 maximum: 65535
                 minimum: 1
                 type: integer
+              rateLimiting:
+                description: |-
+                  RateLimiting defines rate limiting configuration for the MCP server.
+                  Requires Redis session storage to be configured for distributed rate limiting.
+                properties:
+                  shared:
+                    description: Shared defines a token bucket shared across all users
+                      for the entire server.
+                    properties:
+                      maxTokens:
+                        description: |-
+                          MaxTokens is the maximum number of tokens (bucket capacity).
+                          This is also the burst size: the maximum number of requests that can be served
+                          instantaneously before the bucket is depleted.
+                        format: int32
+                        minimum: 1
+                        type: integer
+                      refillPeriod:
+                        description: |-
+                          RefillPeriod is the duration to fully refill the bucket from zero to maxTokens.
+                          The effective refill rate is maxTokens / refillPeriod tokens per second.
+                          Format: Go duration string (e.g., "1m0s", "30s", "1h0m0s").
+                        type: string
+                    required:
+                    - maxTokens
+                    - refillPeriod
+                    type: object
+                  tools:
+                    description: |-
+                      Tools defines per-tool rate limit overrides.
+                      Each entry applies additional rate limits to calls targeting a specific tool name.
+                      A request must pass both the server-level limit and the per-tool limit.
+                    items:
+                      description: ToolRateLimitConfig defines rate limits for a specific
+                        tool.
+                      properties:
+                        name:
+                          description: Name is the MCP tool name this limit applies
+                            to.
+                          minLength: 1
+                          type: string
+                        shared:
+                          description: Shared defines a token bucket shared across
+                            all users for this specific tool.
+                          properties:
+                            maxTokens:
+                              description: |-
+                                MaxTokens is the maximum number of tokens (bucket capacity).
+                                This is also the burst size: the maximum number of requests that can be served
+                                instantaneously before the bucket is depleted.
+                              format: int32
+                              minimum: 1
+                              type: integer
+                            refillPeriod:
+                              description: |-
+                                RefillPeriod is the duration to fully refill the bucket from zero to maxTokens.
+                                The effective refill rate is maxTokens / refillPeriod tokens per second.
+                                Format: Go duration string (e.g., "1m0s", "30s", "1h0m0s").
+                              type: string
+                          required:
+                          - maxTokens
+                          - refillPeriod
+                          type: object
+                      required:
+                      - name
+                      - shared
+                      type: object
+                    type: array
+                    x-kubernetes-list-map-keys:
+                    - name
+                    x-kubernetes-list-type: map
+                type: object
+                x-kubernetes-validations:
+                - message: at least one of shared or tools must be configured
+                  rule: has(self.shared) || (has(self.tools) && size(self.tools) >
+                    0)
               replicas:
                 description: |-
                   Replicas is the desired number of proxy runner (thv run) pod replicas.
@@ -886,6 +962,9 @@ spec:
             - message: telemetry and telemetryConfigRef are mutually exclusive; migrate
                 to telemetryConfigRef
               rule: '!(has(self.telemetry) && has(self.telemetryConfigRef))'
+            - message: rateLimiting requires sessionStorage with provider 'redis'
+              rule: '!has(self.rateLimiting) || (has(self.sessionStorage) && self.sessionStorage.provider
+                == ''redis'')'
           status:
             description: MCPServerStatus defines the observed state of MCPServer
             properties:
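
The admission rules added in this commit (the two CEL expressions plus the list-map key on tools) can be read as plain Go checks. The sketch below mirrors them under stated assumptions: the types are trimmed local stand-ins, validate is a hypothetical helper, and the order in which errors are reported here is arbitrary. In the cluster these rules are evaluated by the API server from the CRD schema, not by code like this.

```go
package main

import (
	"errors"
	"fmt"
)

// Trimmed stand-ins for the CRD types; only the fields the rules inspect.
type bucket struct{ MaxTokens int32 }

type toolLimit struct {
	Name   string
	Shared *bucket
}

type rateLimiting struct {
	Shared *bucket
	Tools  []toolLimit
}

type sessionStorage struct{ Provider string }

type spec struct {
	SessionStorage *sessionStorage
	RateLimiting   *rateLimiting
}

// validate mirrors the rules shown in the diff for a spec's rateLimiting field.
func validate(s spec) error {
	rl := s.RateLimiting
	if rl == nil {
		return nil // rateLimiting is optional
	}
	// !has(self.rateLimiting) || (has(self.sessionStorage) && provider == 'redis')
	if s.SessionStorage == nil || s.SessionStorage.Provider != "redis" {
		return errors.New("rateLimiting requires sessionStorage with provider 'redis'")
	}
	// has(self.shared) || (has(self.tools) && size(self.tools) > 0)
	if rl.Shared == nil && len(rl.Tools) == 0 {
		return errors.New("at least one of shared or tools must be configured")
	}
	// listType=map with listMapKey=name: tool names must be unique.
	seen := make(map[string]bool, len(rl.Tools))
	for _, t := range rl.Tools {
		if seen[t.Name] {
			return fmt.Errorf("duplicate tool name %q", t.Name)
		}
		seen[t.Name] = true
	}
	return nil
}

func main() {
	noRedis := spec{RateLimiting: &rateLimiting{Shared: &bucket{MaxTokens: 100}}}
	fmt.Println(validate(noRedis))

	ok := spec{
		SessionStorage: &sessionStorage{Provider: "redis"},
		RateLimiting:   &rateLimiting{Shared: &bucket{MaxTokens: 100}},
	}
	fmt.Println(validate(ok))
}
```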
