🧭 Type of Feature
🧭 Epic
Title: Clear rate-limiter persistent counter state when its mode is toggled to disabled.
Goal: When an operator flips the rate-limiter plugin to disabled at runtime, the counter state it has accumulated in Redis should be released, not left behind to silently reapply when the plugin is re-enabled.
Why now: This is the concrete follow-up to design question #3 in the parent issue (#4514): "Should disabled mode actually reset Redis state? Today it stops checking counters but doesn't clear them." The team has converged on "yes, clear them" as the right behavioural contract for the runtime toggle. This issue tracks delivering that contract.
The user-visible problem today: operators using the runtime toggle as an incident-response release valve (or part of a rollout) expect disabled to mean "the limiter's effect is gone." In practice, because the counters persist, the moment the operator flips the plugin back to enforce, traffic that was perfectly legitimate during the disabled window can be blocked instantly by counter values inherited from before the toggle. The behaviour is surprising and works against the use case the toggle was added for.
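To make the contract concrete, here is a minimal sketch of the intended behaviour. It is not the cpex-plugins implementation: the class name, the set_mode hook, and the ratelimit: key prefix are assumptions for illustration, and the real plugin's names and prefix may differ.

```python
# Sketch only: illustrates the intended contract, not the actual cpex-plugins code.
# The class/method names and the "ratelimit:" prefix are assumptions for illustration.
import redis


class RateLimiterPlugin:
    KEY_PREFIX = "ratelimit:"  # assumed prefix shared by all limiter counter keys

    def __init__(self, client: redis.Redis, mode: str = "enforce"):
        self.client = client
        self.mode = mode

    def set_mode(self, new_mode: str) -> None:
        """Runtime mode toggle; counters are wiped on the transition *into* disabled."""
        if new_mode == "disabled" and self.mode != "disabled":
            self._clear_counters()
        self.mode = new_mode

    def _clear_counters(self) -> None:
        # Prefix-scoped, batched, non-blocking delete (UNLINK) of the limiter's keys only.
        batch = []
        for key in self.client.scan_iter(match=self.KEY_PREFIX + "*", count=500):
            batch.append(key)
            if len(batch) >= 500:
                self.client.unlink(*batch)
                batch = []
        if batch:
            self.client.unlink(*batch)
```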
🧑🏻‍💻 User Story 1
As an operator using the runtime mode toggle to disable rate-limiting during an incident or rollout,
I want the rate-limiter's persistent counter state to be cleared when I flip the plugin to disabled,
So that when I subsequently re-enable enforcement, traffic is judged against a fresh window and is not immediately blocked by counters that accumulated before the toggle.
✅ Acceptance Criteria
Scenario: counters cleared on enforce → disabled
Given the rate-limiter is in `enforce` mode
And there are non-empty counter keys for the limiter in Redis
When an operator toggles the plugin's mode to `disabled`
Then the limiter's counter keys are no longer present in Redis once the toggle has propagated
Scenario: re-enable starts a fresh window
Given the limiter has just been toggled to `disabled` (and its counters cleared)
When the limiter is toggled back to `enforce`
Then a freshly-arriving request under the configured rate is allowed
And no request is blocked solely on the basis of counter state that existed before the disable
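A test sketch of the two scenarios above, assuming fakeredis and the illustrative RateLimiterPlugin from the sketch in the problem statement (none of these fixtures or names come from the actual test suite):

```python
# Illustrative test only; assumes fakeredis and the hypothetical RateLimiterPlugin sketch above.
import fakeredis


def test_disable_clears_counters_and_reenable_starts_fresh():
    client = fakeredis.FakeRedis()
    limiter = RateLimiterPlugin(client, mode="enforce")

    # Counter state accumulated while enforcing.
    client.incr("ratelimit:tenant-a:api:window")
    client.incr("ratelimit:tenant-a:api:window")

    limiter.set_mode("disabled")
    assert not list(client.scan_iter(match="ratelimit:*"))  # counter keys are gone

    limiter.set_mode("enforce")
    # A request arriving now is judged against a fresh window: no pre-toggle counters remain.
    assert client.get("ratelimit:tenant-a:api:window") is None
```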
🧑🏻‍💻 User Story 2
As a tenant served by a multi-tenant gateway,
I want the cleanup that happens on disabled to respect tenant scoping and the rate-limiter's key prefix,
So that unrelated keys in the same Redis instance (other plugins, shared infra, application data) are unaffected by the toggle.
✅ Acceptance Criteria
Scenario: blast radius is confined to the rate-limiter's keyspace
Given the rate-limiter has counter keys across multiple tenants
And there are unrelated keys in the same Redis database that do not belong to the limiter
When the limiter is toggled to `disabled`
Then all of the limiter's counter keys (across every tenant) are cleared
And every unrelated key is left untouched
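An illustrative blast-radius check for this scenario, under the same assumptions as the earlier sketches (fakeredis plus the hypothetical RateLimiterPlugin):

```python
# Illustrative test only; the key names below are invented for the example.
import fakeredis


def test_disable_only_clears_limiter_keys_across_tenants():
    client = fakeredis.FakeRedis()
    limiter = RateLimiterPlugin(client, mode="enforce")

    # Limiter counters for two tenants, plus unrelated keys sharing the same database.
    client.incr("ratelimit:tenant-a:api:window")
    client.incr("ratelimit:tenant-b:api:window")
    client.set("session:abc123", "other-plugin-data")
    client.set("feature-flags:gateway", "on")

    limiter.set_mode("disabled")

    assert not list(client.scan_iter(match="ratelimit:*"))       # every tenant's counters cleared
    assert client.get("session:abc123") == b"other-plugin-data"  # unrelated keys untouched
    assert client.get("feature-flags:gateway") == b"on"
```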
📐 Design Sketch (optional)
Out of scope here. The agreed direction is "clear on disable"; the approach to clearing — single-flight, batched, prefix-scoped — is being worked through on the investigation branch linked below.
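For discussion only, since the real design lives on the investigation branch: one rough shape a single-flight, batched, prefix-scoped clear could take with redis-py. The lock key name and TTL are invented for illustration, not taken from the branch.

```python
# Rough illustration of "single-flight" clearing: with many gateway replicas racing on
# the same toggle, only the replica that wins the lock performs the wipe.
import redis


def clear_counters_single_flight(client: redis.Redis, prefix: str = "ratelimit:") -> bool:
    lock_key = prefix + "__wipe_lock"  # assumed lock key, scoped under the limiter's prefix
    # SET NX acts as a short-lived advisory lock; replicas that lose the race skip the wipe.
    if not client.set(lock_key, "1", nx=True, ex=30):
        return False
    batch = []
    for key in client.scan_iter(match=prefix + "*", count=500):
        if key == lock_key.encode():
            continue  # leave the lock alone until it expires
        batch.append(key)
        if len(batch) >= 500:
            client.unlink(*batch)  # UNLINK frees memory off the main thread, unlike DEL
            batch = []
    if batch:
        client.unlink(*batch)
    return True
```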
🔗 MCP Standards Check
Change adheres to current MCP specifications
No breaking changes to existing MCP-compliant integrations
If deviations exist, please describe them below
The runtime mode toggle is an internal gateway concern; the MCP-visible surface (tool calls, prompts) is unchanged.
🔄 Alternatives Considered
Keep today's behaviour (do nothing). Operators who want a clean slate must manually clear state (`FLUSHDB`, or `DEL` on the appropriate prefix) before re-enable behaves as expected. This is the source of the surprise reported in #4514 design question #3.
Clear only on the disabled → enforce transition. Same end-state on re-enable, but the counters continue consuming Redis memory during the entire disabled window with no functional purpose. Rejected as wasteful and as creating an opportunity for stale state to drift.
Operator-driven manual clear (e.g. an admin endpoint). Useful as a follow-up affordance but doesn't solve the surprise-on-re-enable problem unless operators remember to call it on every toggle.
📓 Additional Context
Parent issue: #4514 (design question #3: "Should disabled mode actually reset Redis state?"). This issue is the implementation half of that question once the team converged on "yes."
Initial investigation (cpex-plugins): branch feat/rate-limiter-wipe-on-disable-only — three commits, including a round of review feedback on single-flight, UNLINK vs. DEL, and prefix scoping. Treat as a starting point for discussion, not a commitment to ship as-is.