Skip to content

feat(security): add API rate limiting with slowapi#327

Merged
frankbria merged 11 commits into
mainfrom
feature/api-rate-limiting
Feb 3, 2026
Merged

feat(security): add API rate limiting with slowapi#327
frankbria merged 11 commits into
mainfrom
feature/api-rate-limiting

Conversation

@frankbria

@frankbria frankbria commented Feb 2, 2026

Copy link
Copy Markdown
Owner

Summary

Add comprehensive API rate limiting to protect endpoints from abuse using the slowapi library.

  • Rate limit categories: auth (10/min), standard (100/min), AI (20/min), websocket (30/min)
  • Key extraction: User ID for authenticated requests, IP address for anonymous
  • Storage options: In-memory (default) or Redis for distributed deployments
  • Proper 429 responses: Includes Retry-After header and JSON error body

Changes

New Files

  • codeframe/config/rate_limits.py - Configuration with environment variable support
  • codeframe/lib/rate_limiter.py - Middleware with slowapi integration
  • tests/config/test_rate_limits.py - 8 configuration tests
  • tests/lib/test_rate_limiter.py - 15 middleware tests

Modified Files

  • Applied rate limits to chat, projects, tasks, and agents routers
  • Added audit logging events for rate limit exceeded
  • Updated .env.example with rate limiting configuration

Configuration

RATE_LIMIT_ENABLED=true
RATE_LIMIT_AUTH=10/minute
RATE_LIMIT_STANDARD=100/minute
RATE_LIMIT_AI=20/minute
RATE_LIMIT_STORAGE=memory  # or "redis"

Test plan

  • 23 new rate limiting tests pass
  • 289 existing API tests pass (no regressions)
  • Ruff linting passes
  • Manual verification of 429 responses when limits exceeded
  • Verify rate limit headers in responses

Summary by CodeRabbit

  • New Features

    • Global, configurable API rate limiting with per-category limits (auth, standard, AI, WebSocket), middleware, consistent 429 + Retry-After responses; many endpoints updated to accept request context and body payloads for rate-limited handling.
  • Documentation

    • Added RATE_LIMIT_* and REDIS_URL entries to the environment example.
  • Audit

    • New audit events for rate-limit warnings and exceeded limits.
  • Tests

    • Extensive unit and integration tests covering configuration, limiter behavior, key extraction, decorators, and disabled mode.

Add comprehensive rate limiting to protect API endpoints from abuse:

- Add slowapi dependency for FastAPI-compatible rate limiting
- Create RateLimitConfig with environment variable support:
  - RATE_LIMIT_ENABLED: Enable/disable rate limiting (default: true)
  - RATE_LIMIT_AUTH: Auth endpoints limit (default: 10/minute)
  - RATE_LIMIT_STANDARD: Standard API limit (default: 100/minute)
  - RATE_LIMIT_AI: AI operations limit (default: 20/minute)
  - RATE_LIMIT_STORAGE: memory or redis (default: memory)
- Create rate limiter middleware with:
  - Key extraction from user ID (authenticated) or IP (anonymous)
  - X-Forwarded-For/X-Real-IP header support for proxied requests
  - Custom 429 response with Retry-After header
- Apply rate limits to critical endpoints:
  - AI rate limits on chat and agent start/resume endpoints
  - Standard rate limits on CRUD operations
- Add rate limit audit logging events
- Add 23 TDD tests for rate limiting functionality

Rate limit categories:
- auth: 10/minute (authentication endpoints)
- standard: 100/minute (CRUD operations)
- ai: 20/minute (LLM-powered operations)
- websocket: 30/minute (connection rate limiting)
@frankbria frankbria linked an issue Feb 2, 2026 that may be closed by this pull request
5 tasks
@coderabbitai

coderabbitai Bot commented Feb 2, 2026

Copy link
Copy Markdown
Contributor

Walkthrough

Adds environment-driven rate limiting: new RateLimitConfig and GlobalConfig fields, a slowapi-based Limiter (memory or Redis), per-category decorators applied across many routes, FastAPI middleware/handler wiring, audit events for rate limits, and tests plus test updates. (49 words)

Changes

Cohort / File(s) Summary
Env & Config
\.env.example, codeframe/config/rate_limits.py, codeframe/core/config.py
Added RATE_LIMIT_* env vars and REDIS_URL example; new RateLimitConfig dataclass, trusted-proxy parsing, from_global_config() + cached get_rate_limit_config() and reset helper; added GlobalConfig rate-limit fields, storage validator, and global config caching utilities.
Rate Limiter Core
codeframe/lib/rate_limiter.py
New slowapi-based limiter: trusted-proxy-aware client IP extraction, user/IP key generation, memory or Redis backend, lazy Limiter init, per-category decorators (rate_limit_auth/standard/ai/websocket), 429 handler with Retry-After and audit hooks, plus reset helper for tests.
Audit Logging
codeframe/lib/audit_logger.py
Added RATE_LIMIT_EXCEEDED and RATE_LIMIT_WARNING event types and AuditLogger.log_rate_limit_event() to emit rate-limit events with endpoint and category metadata.
Server Integration
codeframe/ui/server.py
Wired limiter into FastAPI app lifecycle: registers rate-limit exception handler, conditionally applies SlowAPIMiddleware when enabled, initializes app.state limiter reference, and logs limiter status.
Route Integrations (v1 & v2)
codeframe/ui/routers/...
codeframe/ui/routers/*, many files
Injected request: Request into numerous endpoints, applied rate-limit decorators across many routers, converted several handlers to explicit body payloads, and updated logging/field access to use body.*. Review signature changes, added imports, and decorator placements.
Dependencies
pyproject.toml
Added runtime dependency slowapi>=0.1.9.
Tests & Fixtures
tests/config/test_rate_limits.py, tests/lib/test_rate_limiter.py, tests/ui/*, tests/config/__init__.py
Added unit and integration tests for RateLimitConfig and limiter behavior (key extraction, decorators, 429 flows, disabled mode); introduced/updated mock_request fixtures and updated UI tests to pass explicit request and body args.
Test Adjustments
tests/ui/test_assign_pending_tasks.py, tests/ui/test_task_approval*.py
Updated tests to accept mock_request fixture and call endpoints with request=mock_request and separate body parameters.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant App as "FastAPI App"
    participant Middleware as "SlowAPI Middleware"
    participant Limiter as "Limiter (memory/Redis)"
    participant Route as "Route Handler"
    participant Audit as "Audit Logger"

    Client->>App: HTTP request + headers
    App->>Middleware: forward request
    Middleware->>Limiter: check/increment key (IP or user)
    Limiter-->>Middleware: allowed? / Retry-After

    alt allowed
        Middleware-->>App: continue
        App->>Route: execute handler
        Route-->>App: response
        App-->>Client: 2xx response
    else exceeded
        Middleware-->>App: raise RateLimitExceeded
        App->>Audit: log_rate_limit_event(endpoint, category)
        Audit-->>App: logged
        App-->>Client: 429 JSON + Retry-After
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

Poem

🐰 I hopped through envs and keys with glee,
counted requests in Redis and memory,
decorators guard chat, tasks, and more,
logs a nibble, then a gentle roar,
a rabbit cheers — limits set, and free!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat(security): add API rate limiting with slowapi' directly and clearly summarizes the main change—adding API rate limiting functionality using the slowapi library. It accurately reflects the primary objective of the PR.
Docstring Coverage ✅ Passed Docstring coverage is 97.66% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feature/api-rate-limiting

Comment @coderabbitai help to get the list of available commands and usage tips.

@macroscopeapp

macroscopeapp Bot commented Feb 2, 2026

Copy link
Copy Markdown
Contributor

Add API rate limiting with slowapi across UI routers and initialize limiter/middleware in codeframe/ui/server.py

Introduce rate_limit_standard and rate_limit_ai decorators on v1/v2 routers, add Request parameters to handlers, wire up limiter middleware and a 429 handler, and add configuration, utilities, and tests for limiter keys, trusted proxies, and env-driven limits. Core additions are in server.py, rate_limiter.py, and rate_limits.py.

📍Where to Start

Start with the limiter setup and exception handler in app initialization in server.py.


Macroscope summarized f6144c5.

@claude

claude Bot commented Feb 2, 2026

Copy link
Copy Markdown

Code Review - API Rate Limiting Implementation

This is a well-structured implementation of rate limiting for the FastAPI server. The code is clean, well-tested, and follows good practices.

Strengths

Architecture & Design

  • Clean separation of concerns: Configuration (rate_limits.py) is separate from middleware logic (rate_limiter.py)
  • Flexible configuration with environment variable support
  • Category-based limiting: Different rate limits for auth/standard/AI/websocket operations
  • Smart key generation: uses user ID for authenticated requests, IP for anonymous

Code Quality

  • Comprehensive tests: 23 tests covering configuration, key extraction, integration, and edge cases
  • Good error handling: Proper 429 responses with Retry-After headers and JSON error bodies
  • Audit logging integration: Rate limit exceeded events are logged
  • Documentation: Clear docstrings, inline comments, and .env.example documentation

Testing

  • TDD approach with tests written first
  • Integration tests with TestClient
  • Edge cases covered (missing client, X-Forwarded-For parsing, disabled rate limiting)

Issues & Concerns

1. Decorator Implementation Pattern (Medium Priority)

The _create_rate_limit_decorator at lines 165-201 in codeframe/lib/rate_limiter.py has an unused wrapped function that is defined but never called when rate limiting is enabled (line 180-181). This is dead code that should be removed for clarity.

The configuration check happens at decorator application time (module load), not at request time. This means toggling RATE_LIMIT_ENABLED requires a restart. This is acceptable but worth documenting.

2. Architecture Concern: v1 vs v2 Alignment (High Priority)

According to CLAUDE.md, this codebase is in a v2 refactor phase where the goal is CLI-first, headless functionality. This PR adds rate limiting to the v1 FastAPI server, which is marked as optional adapter and legacy.

Questions:

  1. Does the Golden Path CLI need rate limiting? (Probably not - it is local)
  2. Is this enhancement aligned with v2 Phase 1 priorities? (CLI completion, cf prd generate, cf work follow)
  3. Should rate limiting be deferred to Phase 2 (Server Layer) or Phase 3 (Web UI Rebuild)?

Recommendation: Get explicit sign-off that rate limiting is an acceptable v1 enhancement during the v2 refactor.

(continued in next comment...)

@claude

claude Bot commented Feb 2, 2026

Copy link
Copy Markdown

3. Configuration Duplication

Rate limit configuration exists in TWO places:

  • codeframe/config/rate_limits.py - New module with RateLimitConfig (7 fields)
  • codeframe/core/config.py - Added rate_limit_enabled, rate_limit_storage, redis_url to GlobalConfig (3 fields)

Recommendation: Either unify into GlobalConfig (add all 7 fields) OR remove from GlobalConfig entirely and only use RateLimitConfig. Document why both are needed if there is a good reason.

4. Missing Redis Dependency

pyproject.toml adds slowapi but not redis. If RATE_LIMIT_STORAGE=redis, the code will fail with ImportError. While the code handles this gracefully (falls back to memory at line 112), it would be better to:

  1. Add redis as an optional dependency
  2. Update documentation to mention how to install it
  3. Or clearly document that Redis support requires manual installation

5. X-Forwarded-For Security Risk (High Priority)

Lines 49-52 in codeframe/lib/rate_limiter.py trust X-Forwarded-For unconditionally. This is a security risk.

Security concern: X-Forwarded-For can be spoofed by clients. An attacker could set this header to bypass rate limits by using different IPs for each request.

Recommendations:

  1. Only trust X-Forwarded-For when behind a known reverse proxy
  2. Add configuration option TRUST_PROXY_HEADERS (default: false)
  3. In hosted mode, trust proxy headers; in self-hosted mode, default to request.client.host
  4. Document the security implications in .env.example

This is especially important given the deployment mode system already in place (is_hosted_mode()).

Minor Issues

6. Test Assertion Could Be Stronger
tests/lib/test_rate_limiter.py:256 has: assert limiter is None or limiter is not None
This always passes. Should be: assert limiter is None

7. Missing Rate Limit Headers on Success
The test test_rate_limit_headers_in_response (line 134) has a weak assertion. Consider documenting that slowapi does not add headers on successful requests by default, or add custom middleware to include X-RateLimit-Remaining.

8. No Distributed Rate Limiting Test
Tests cover memory storage but not Redis. Consider adding mocked Redis tests or documentation on manual testing.

9. Missing Manual Testing Checklist
PR description has unchecked items for manual verification. Provide a testing guide in the PR description.

@claude

claude Bot commented Feb 2, 2026

Copy link
Copy Markdown

Code Review - API Rate Limiting Implementation

I've reviewed this PR and overall the implementation is solid and well-structured. The rate limiting functionality is comprehensive, properly tested, and follows good security practices. However, there is one critical architectural concern that needs to be addressed based on the v2 refactor guidelines.

Critical: v2 Architecture Alignment

According to CLAUDE.md, CodeFRAME is undergoing an in-place v2 refactor with a headless, CLI-first architecture. The current implementation adds rate limiting exclusively to the FastAPI server layer (codeframe/ui/routers/* and server.py), which is described as an optional adapter in the v2 architecture.

The issue: Rate limiting is a security feature that should protect the entire system, not just the legacy API layer. If the v2 CLI or future adapters need rate limiting, this implementation won't help.

Recommendation

Since this is a v1/legacy feature (FastAPI server), I suggest accepting it as v1-only infrastructure. Acknowledge this is legacy code and may need refactoring in future phases. Add a TODO comment noting this should be reconsidered when Phase 2 (Server Layer) or Phase 3 (Web UI Rebuild) begins.

For now, recommend option 1 with a clear comment in the code, since the v2 roadmap shows Phase 1 (CLI Completion) is still in progress.

Strengths

1. Excellent Test Coverage

  • 23 new tests with clear TDD approach
  • Covers configuration parsing, key extraction, integration scenarios
  • Tests for disabled state and edge cases (no client, X-Forwarded-For chains)

2. Security Best Practices

  • User-based limiting for authenticated requests (prevents IP spoofing bypass)
  • Proper header handling (X-Forwarded-For, X-Real-IP) for proxied deployments
  • Proper 429 responses with Retry-After header
  • Audit logging for rate limit events

3. Clean Abstractions

  • Configuration is environment-driven and testable
  • Decorators provide clean API
  • Singleton pattern for limiter initialization

4. Deployment Flexibility

  • In-memory storage for development/self-hosted
  • Redis support for distributed/hosted deployments
  • Graceful degradation when rate limiting is disabled

Issues and Suggestions

1. Decorator Implementation Quirk (Minor)

In rate_limiter.py:175-200, the decorator factory has an unnecessary nesting level. The function is called with an extra () which works but feels awkward. Impact: Low - it works fine, just unusual API design.

2. Missing Import Validation (Minor)

rate_limiter.py:111 catches ImportError for Redis, but slowapi itself might not handle missing Redis gracefully. Consider checking if redis is installed before attempting to initialize with storage_uri. Impact: Low - but prevents potential runtime errors in production.

3. Duplicate Limiter Initialization (Code Smell)

In server.py, the limiter is initialized in two places: inside lifespan() startup (lines 212-224) and at module level before lifespan runs (lines 263-268). Verify whether the duplicate initialization is actually needed.

4. Test Isolation Concerns (Medium)

Several tests use reset functions to clean up global state. Not all tests reset state after running, and global state can leak between tests in parallel execution. Use pytest fixtures to ensure cleanup with an autouse fixture.

5. Missing Rate Limit Headers on Success (Info)

Per RFC 6585 and common API practices, rate limit headers should be included on all responses, not just 429s (X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset). Consider adding middleware to inject these headers. Impact: Low - not critical for MVP, but improves API developer experience.

Testing Suggestions

  1. Missing Integration Test - No tests verify the audit logging integration. Add a test that exhausts the limit, triggers a 429, and verifies the audit log entry was created.

  2. Test Concurrent Requests - Add a test that fires multiple concurrent requests to verify the limiter handles race conditions correctly.

  3. Test Different Time Windows - Tests only use /minute limits. Add tests for /second and /hour to verify the library handles different time windows correctly.

Summary

Aspect Rating Notes
Architecture Needs Discussion v2 alignment issue - needs decision on v1-only vs future-proofing
Security Excellent Proper user-based limiting, header handling, audit logging
Code Quality Good Clean, well-structured, minor decorator quirk
Test Coverage Good 23 tests, could add audit logging + concurrency tests
Documentation Good Clear docstrings and .env comments

Approval Status

Conditionally approve pending:

  1. Decision on v2 architecture alignment (add TODO comment or refactor)
  2. Fix duplicate limiter initialization (if possible)
  3. Add test isolation fixture (autouse fixture for cleanup)

The implementation itself is solid. The main question is whether this fits the v2 roadmap direction or should be tagged as legacy infrastructure.

Great work on the TDD approach and comprehensive coverage!

@claude

claude Bot commented Feb 2, 2026

Copy link
Copy Markdown

Recommendations Summary

Must Address Before Merge:

  1. Get explicit approval for v1 enhancement during v2 refactor (Issue Brainstorming: Integrate remaining general concepts into specification #2) - strategic alignment
  2. Address X-Forwarded-For spoofing risk (Issue feat: Project schema refactoring with API endpoint integration #5) - security concern

Should Fix Before Merge:
3. Resolve configuration duplication (Issue #3) - code cleanliness
4. Document Redis dependency approach (Issue #4) - user experience
5. Fix weak test assertion (Issue #6) - test quality
6. Clean up unused wrapped function (Issue #1) - code clarity

Nice to Have:
7. Add rate limit headers on success - API usability
8. Add Redis testing/documentation - completeness
9. Provide manual testing guide - verification

Overall Assessment

This is a solid, well-tested implementation with good architecture and comprehensive test coverage (23 tests). The code demonstrates good engineering practices with TDD, clear separation of concerns, and flexible configuration.

Main concerns:

  1. Strategic alignment with v2 refactor (needs confirmation this is acceptable v1 work)
  2. Security consideration around proxy header spoofing
  3. Configuration duplication that could be streamlined

Test Coverage: Excellent - 23 tests covering unit, integration, and edge cases
Security: Generally good, but needs proxy header trust mechanism
Documentation: Good, but could be enhanced with manual testing guide

Recommendation: Request clarification on v2 alignment (Issue #2) and fixes for security issue (Issue #5), then approve after addressing feedback.

Great work overall! The implementation is clean and well-thought-out. Once the strategic alignment is confirmed and the security concern is addressed, this will be a valuable addition to the codebase.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
codeframe/ui/routers/tasks.py (1)

294-301: ⚠️ Potential issue | 🔴 Critical

Bug: Wrong variable used - request.approved should be body.approved.

Line 294 accesses request.approved, but request is the FastAPI Request object, not the Pydantic body model. This will cause an AttributeError at runtime since Request has no approved attribute.

🐛 Fix the variable reference
     # Check if user is rejecting
-    if not request.approved:
+    if not body.approved:
         return TaskApprovalResponse(
             success=False,
             phase=project.get("phase", "planning"),
🤖 Fix all issues with AI agents
In `@codeframe/config/rate_limits.py`:
- Around line 118-123: The current logger.info in the rate limit initialization
(referencing _rate_limit_config and logger.info) must not emit sensitive
credentials; ensure you do not log _rate_limit_config.redis_url or, if you need
to log the host, mask credentials first (e.g., strip or replace user:password@
from the URL) before including it in logger.info; audit any future additions
near the RateLimitConfig/_rate_limit_config initialization to either omit
redis_url entirely or use a sanitized value when constructing the log message.

In `@codeframe/lib/rate_limiter.py`:
- Around line 48-52: The X-Forwarded-For handling in the get_client_ip (or
equivalent) function currently trusts request.headers["X-Forwarded-For"]
unconditionally (variable forwarded_for), enabling header spoofing; change this
to respect a configuration flag (e.g., TRUST_X_FORWARDED_FOR or
trust_x_forwarded_for) so the code only parses forwarded_for when that flag is
true and the app is deployed behind a trusted proxy, otherwise ignore the header
and use the actual remote address (e.g., request.client.host or equivalent);
also add a short docstring or comment on the function noting the assumption and
how to disable header trust for direct deployments.
- Around line 175-197: The decorator currently reads get_rate_limiter() and
get_rate_limit_config() at decoration time inside decorator()/wrapper(), causing
config to be frozen at import; fix this by moving retrieval of limiter, config,
and limit_value (get_rate_limiter(), get_rate_limit_config(), and
getattr(config, limit_key, ...)) into the inner async wrapped(*args, **kwargs)
so the limit is evaluated at request time and then apply limiter.limit(...)
there (or if you prefer keeping decoration-time semantics, update tests to call
reset_rate_limiter() to clear the cached _limiter before re-importing);
reference decorator(), wrapped(), get_rate_limiter(), get_rate_limit_config(),
limit_key, _limiter, and reset_rate_limiter().
- Around line 143-146: The Retry-After header is currently set to exc.detail (a
human-readable string) which violates RFC 7231; update the headers construction
in the rate limit handler (where headers is built, and exc is the
RateLimitExceeded exception) to use a numeric delay-seconds value instead—either
a fixed string like "60" or, if the exception exposes a retry delay (e.g.,
exc.retry_after or exc.reset_time), convert that to an integer string and use it
for "Retry-After" so the header contains a numeric value.

In `@tests/lib/test_rate_limiter.py`:
- Around line 249-258: The test currently contains a tautological assertion;
replace it with a meaningful check: after calling _reset_rate_limit_config() and
get_rate_limiter(), assert that either limiter is None OR (if limiter is not
None) that its no-op behavior is verifiable by exercising its public
rate-checking API (e.g., call limiter.allow("test") or limiter.acquire("test")
as appropriate) and assert the call returns True / does not block / does not
raise and does not decrement any real quota; keep references to get_rate_limiter
and _reset_rate_limit_config so the test verifies the disabled case
deterministically.
🧹 Nitpick comments (4)
tests/config/test_rate_limits.py (1)

102-111: Consider adding a test for invalid storage values.

The test validates that "memory" and "redis" are accepted, but doesn't verify behavior when an invalid value is provided. Based on the relevant code snippet, RateLimitConfig.from_environment() logs a warning and falls back to "memory" for invalid storage values.

🧪 Suggested test for invalid storage handling
def test_storage_validation_invalid_fallback(self):
    """Invalid storage should fall back to memory with warning."""
    from codeframe.config.rate_limits import RateLimitConfig

    with patch.dict(os.environ, {"RATE_LIMIT_STORAGE": "invalid"}, clear=True):
        config = RateLimitConfig.from_environment()
        # Should fall back to memory
        assert config.storage == "memory"
codeframe/ui/server.py (1)

212-224: Duplicate rate limiter initialization.

The rate limiter is initialized twice:

  1. At module level (lines 263-268) - runs at import time
  2. In lifespan (lines 212-222) - runs when app starts

The module-level initialization is necessary for middleware setup, but the lifespan block is redundant. This causes duplicate log messages and unnecessary work.

♻️ Suggested cleanup

Remove the redundant initialization in lifespan since the module-level code already handles it:

     # Log that authentication is now always required
     logger.info("🔒 Authentication: ENABLED (always required)")

-    # Initialize rate limiting
-    rate_limit_config = get_rate_limit_config()
-    if rate_limit_config.enabled:
-        limiter = get_rate_limiter()
-        if limiter:
-            app.state.limiter = limiter
-            logger.info(
-                f"🚦 Rate limiting: ENABLED "
-                f"(storage={rate_limit_config.storage}, "
-                f"standard={rate_limit_config.standard_limit})"
-            )
-    else:
-        logger.info("🚦 Rate limiting: DISABLED")
+    # Log rate limiting status (already initialized at module level)
+    rate_limit_config = get_rate_limit_config()
+    if rate_limit_config.enabled:
+        logger.info(
+            f"🚦 Rate limiting: ENABLED "
+            f"(storage={rate_limit_config.storage}, "
+            f"standard={rate_limit_config.standard_limit})"
+        )
+    else:
+        logger.info("🚦 Rate limiting: DISABLED")

     # Start background session cleanup task

Also applies to: 263-268

codeframe/config/rate_limits.py (1)

69-72: Consider validating rate limit format strings.

The rate limit strings (e.g., "10/minute") are passed directly from environment variables without format validation. Invalid formats like "abc/xyz" will cause runtime errors in slowapi.

This is optional since slowapi provides clear error messages, but early validation would fail fast with a helpful message:

import re

def _validate_rate_limit(value: str, name: str) -> str:
    """Validate rate limit format (e.g., '10/minute')."""
    if not re.match(r'^\d+/(second|minute|hour|day)$', value):
        logger.warning(f"Invalid {name}: {value}. Using default.")
        return None  # Return None to use dataclass default
    return value
tests/lib/test_rate_limiter.py (1)

144-171: Test fixture should reset both config and limiter.

The app_with_low_limit fixture resets the config but doesn't reset the global _limiter. This could cause test interference since the limiter is a global singleton.

♻️ Add limiter reset to fixture
     `@pytest.fixture`
     def app_with_low_limit(self):
         """Create a test app with very low rate limit for testing."""
         from codeframe.lib.rate_limiter import (
             rate_limit_exceeded_handler,
+            reset_rate_limiter,
         )
         from codeframe.config.rate_limits import _reset_rate_limit_config
         from slowapi import Limiter
         from slowapi.util import get_remote_address
         from slowapi.errors import RateLimitExceeded

         # Reset config to apply new limits
         _reset_rate_limit_config()
+        reset_rate_limiter()

Comment thread codeframe/config/rate_limits.py Outdated
Comment thread codeframe/lib/rate_limiter.py Outdated
Comment on lines +143 to +146
# Build response headers
headers = {
"Retry-After": str(exc.detail) if hasattr(exc, "detail") else "60",
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

slowapi RateLimitExceeded exception detail attribute format

💡 Result:

In slowapi, RateLimitExceeded.detail is a plain string (not a dict / Pydantic model).

  • The built-in _rate_limit_exceeded_handler formats its response using an f-string like f"Rate limit exceeded: {exc.detail}", so exc.detail is expected to be string-like. [1]
  • That string ultimately comes from the limit’s configured error_message (which can be a static string or a callable that returns a string). [2]

So in practice you should treat exc.detail as “the human-readable message for the limit that was hit” (whatever you configured via error_message, or the library’s default message). [1], [2]

Sources: [1] [2]


🏁 Script executed:

# Search for the rate_limiter.py file and its context
fd "rate_limiter.py" -type f

Repository: frankbria/codeframe

Length of output: 233


🏁 Script executed:

# Look for slowapi imports and RateLimitExceeded usage in the codebase
rg "RateLimitExceeded|from slowapi" -A 3 -B 1

Repository: frankbria/codeframe

Length of output: 4056


🏁 Script executed:

# Check the specific file to understand the full context around lines 143-146
fd "rate_limiter.py" -type f -exec wc -l {} \;

Repository: frankbria/codeframe

Length of output: 233


🏁 Script executed:

# Get the rate_limiter.py file and view the full context around lines 143-146
cat -n codeframe/lib/rate_limiter.py | sed -n '130,160p'

Repository: frankbria/codeframe

Length of output: 1247


🏁 Script executed:

# Check if there are any other references to how exc.detail is used or what it contains
rg "exc\.detail" -B 2 -A 2

Repository: frankbria/codeframe

Length of output: 340


Use numeric value for Retry-After header instead of exc.detail.

The Retry-After header at line 145 incorrectly uses exc.detail, which slowapi populates with a human-readable rate limit message (e.g., "10 per 1 minute"), not a numeric retry delay. RFC 7231 requires Retry-After to be either delay-seconds (integer) or an HTTP date. The fallback "60" is unreachable since RateLimitExceeded always has a detail attribute.

Replace with a fixed numeric value like "60" or extract the actual retry delay from the exception if available.

🤖 Prompt for AI Agents
In `@codeframe/lib/rate_limiter.py` around lines 143 - 146, The Retry-After header
is currently set to exc.detail (a human-readable string) which violates RFC
7231; update the headers construction in the rate limit handler (where headers
is built, and exc is the RateLimitExceeded exception) to use a numeric
delay-seconds value instead—either a fixed string like "60" or, if the exception
exposes a retry delay (e.g., exc.retry_after or exc.reset_time), convert that to
an integer string and use it for "Retry-After" so the header contains a numeric
value.

Comment thread codeframe/lib/rate_limiter.py Outdated
Comment thread tests/lib/test_rate_limiter.py Outdated
The rate limiting decorators require a `request: Request` parameter,
which changed function signatures. This commit:

- Fixes tasks.py to use `body.approved` instead of `request.approved`
- Adds mock_request fixture to test files that call endpoints directly
- Updates all direct endpoint calls to pass mock_request as first arg
- Updates TaskApprovalRequest parameter from `request=` to `body=`
- Updates function signature tests to expect `request` parameter

This resolves the 30 failing tests caused by the rate limiting changes.
@claude

claude Bot commented Feb 2, 2026

Copy link
Copy Markdown

Code Review - PR #327: API Rate Limiting

Thank you for this comprehensive rate limiting implementation! The overall architecture is solid and follows good patterns. Here's my detailed analysis:


🟢 Strengths

  1. Clean Architecture: Config-driven design with proper separation of concerns
  2. Flexible Configuration: Environment variable support with sensible defaults
  3. Robust IP Extraction: Handles proxy chains correctly (X-Forwarded-For, X-Real-IP)
  4. Smart Key Strategy: User ID for authenticated, IP for anonymous
  5. Good Test Coverage: 23 tests covering main scenarios
  6. Proper HTTP Standards: 429 responses with Retry-After headers

🔴 Critical Issues (Must Fix Before Merge)

1. Dead Code in Rate Limiter Decorator

File: codeframe/lib/rate_limiter.py:179-181

The wrapped() function is defined but never used:

def wrapper(func: Callable) -> Callable:
    @wraps(func)
    async def wrapped(*args, **kwargs):  # ← Dead code
        return await func(*args, **kwargs)
    
    limiter = get_rate_limiter()
    # ... code that returns limited_func, not wrapped
    return limited_func  # ← wrapped() is never called

Fix: Simplify the decorator to remove unused code:

def wrapper(func: Callable) -> Callable:
    limiter = get_rate_limiter()
    config = get_rate_limit_config()
    
    if limiter is None or not config.enabled:
        return func
    
    limit_value = getattr(config, limit_key, "100/minute")
    return limiter.limit(limit_value)(func)

2. Tautological Test Assertion

File: tests/config/test_rate_limits.py:255

assert limiter is None or limiter is not None  # Always True!

This test provides no value. Fix: Actually test that disabled rate limiting allows unlimited requests:

# Test that limiter is None when disabled
config = get_rate_limit_config()
assert config.enabled is False
limiter = get_rate_limiter()
assert limiter is None

🟡 High Priority Issues (Should Fix Soon)

3. Thread-Safety in Config Singleton

File: codeframe/config/rate_limits.py:116

Race condition possible during initialization:

_rate_limit_config: Optional[RateLimitConfig] = None

def get_rate_limit_config() -> RateLimitConfig:
    global _rate_limit_config
    if _rate_limit_config is None:  # ← Not thread-safe
        _rate_limit_config = RateLimitConfig.from_environment()

Fix: Use functools.lru_cache for thread-safety:

from functools import lru_cache

@lru_cache(maxsize=1)
def get_rate_limit_config() -> RateLimitConfig:
    return RateLimitConfig.from_environment()

4. User ID Null Handling

File: codeframe/lib/rate_limiter.py:80

If user.id is None, returns "user:None" as key:

if user and hasattr(user, "id"):
    return f"user:{user.id}"  # ← What if user.id is None?

Fix: Add null check:

if user and hasattr(user, "id") and user.id:
    return f"user:{user.id}"

5. Redis Import Error Silently Caught

File: codeframe/lib/rate_limiter.py:111

ImportError for Redis is caught but only logged as info:

except ImportError:
    logger.info("Redis not available")  # ← Should be ERROR level

Fix: Log at ERROR level so users know Redis storage is unavailable:

except ImportError as e:
    logger.error(f"Redis storage requested but redis module not available: {e}")

🔵 Test Coverage Gaps

Missing Test Scenarios:

  1. Actual rate limit enforcement: Test that multiple rapid requests get blocked
  2. Authenticated user rate limiting: All tests use IP-based limits
  3. Disabled rate limiting behavior: Verify unlimited requests when disabled
  4. Redis backend: Only in-memory storage tested
  5. Concurrent requests: Thread-safety not validated
  6. User ID edge cases: Empty string, None values

Example test to add:

async def test_authenticated_user_rate_limiting():
    """Test that authenticated users are rate limited by user ID."""
    limiter = RateLimiter()
    
    # Mock authenticated request
    request = MagicMock()
    request.state.user.id = "user123"
    
    key = limiter._get_rate_limit_key(request)
    assert key == "user:user123"
    
    # Test that multiple requests from same user are blocked
    # ... implementation

📊 Architecture Assessment

Following CLAUDE.md Guidelines?

YES - This is v1 legacy code (FastAPI server), not v2 core
✅ Proper placement in codeframe/lib/ and codeframe/config/
✅ No changes to codeframe/core/ (respects v2 refactor boundaries)
✅ Tests follow existing patterns

Note: Per CLAUDE.md, this PR enhances the optional FastAPI adapter layer, which is appropriate for v1 legacy maintenance.


🎯 Overall Verdict

Rating: 7.5/10
Recommendation: Merge with fixes for critical issues (#1, #2)

The architecture is sound and the implementation is production-ready after addressing the dead code and test assertion. The thread-safety and user ID handling issues are lower priority but should be addressed in a follow-up PR.


📝 Action Items

Before merge:

  • Remove dead code in _create_rate_limit_decorator()
  • Fix tautological assertion in test line 255
  • Add comment documenting request: Request parameter requirement

Post-merge (follow-up PR):

  • Add thread-safety to config singleton
  • Fix user ID null handling
  • Improve Redis error logging
  • Add authenticated user rate limiting tests
  • Add concurrent request tests

Great work on this feature! The rate limiting system will significantly improve API resilience. 🚀

Comment thread codeframe/ui/server.py

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
codeframe/ui/routers/tasks.py (1)

265-269: ⚠️ Potential issue | 🟡 Minor

Docstring mismatch for approve_tasks parameters.
The Args section still describes request as the approval payload; that should be body now.

✏️ Suggested docstring fix
-        request: Approval request with approved flag and optional exclusions
+        request: FastAPI request (rate limiter context)
+        body: Approval request with approved flag and optional exclusions
🧹 Nitpick comments (1)
tests/ui/test_assign_pending_tasks.py (1)

17-26: Consider centralizing the mock Request fixture.
This fixture is duplicated in other UI test modules; a shared conftest.py fixture would reduce drift and simplify updates.

Critical fixes:
- Remove dead code: unused wrapped() function in decorator
- Fix tautological test assertion (always true) with meaningful test

High priority fixes:
- Use lru_cache for thread-safe config singleton
- Add null check for user.id in rate limit key generation
- Change Redis import error log from WARNING to ERROR level
- Remove unused functools.wraps import
@claude

claude Bot commented Feb 2, 2026

Copy link
Copy Markdown

Code Review: API Rate Limiting Implementation

I have reviewed PR #327. Here is my detailed feedback on the rate limiting implementation.


Strengths

1. Solid Architecture

  • Clean separation of concerns: configuration, middleware, and integration
  • Good use of dependency injection patterns with slowapi
  • Proper environment-based configuration with sensible defaults
  • Lazy initialization of the limiter instance prevents startup issues

2. Comprehensive Key Extraction

  • Excellent IP extraction logic respecting proxy headers (X-Forwarded-For, X-Real-IP)
  • Smart fallback from authenticated user ID to IP address
  • Handles edge cases like missing client info gracefully

3. Proper HTTP Semantics

  • Correct 429 status code with Retry-After header
  • Clean JSON error response format
  • Rate limit headers included in response

4. Good Test Coverage

  • 23 new tests covering key functionality
  • Tests for IP extraction, key generation, decorator behavior
  • Integration tests for 429 responses

5. Configuration Flexibility

  • Multiple rate limit categories (auth, standard, AI, websocket)
  • Support for both in-memory and Redis storage backends
  • Easy on/off toggle via RATE_LIMIT_ENABLED

Issues and Concerns

1. CRITICAL: Audit Logging Not Implemented

The PR description claims "Added audit logging events for rate limit exceeded" but this is NOT implemented:

  • rate_limiter.py does not import or use AuditLogger
  • The rate_limit_exceeded_handler() only does basic logging via logger.warning()
  • The new AuditEventType.RATE_LIMIT_EXCEEDED and log_rate_limit_event() are defined but never called

Recommendation: Either implement the audit logging integration in rate_limit_exceeded_handler(), remove the audit logging claim from the PR description, or remove the unused audit logging code from audit_logger.py

2. V2 Architecture Concerns

According to CLAUDE.md, the v2 architecture principle is core-first with server/UI as optional adapters. This PR adds rate limiting only to the FastAPI layer. While this makes sense for HTTP-based rate limiting, consider adding a comment clarifying this is an HTTP-specific concern and does not apply to CLI/core operations.

3. Potential Race Condition in Limiter Initialization

server.py lines 263-268 runs at module import time, but the lifespan function (lines 213-224) also initializes the limiter. This could cause duplicate initialization or the limiter being created before environment variables are loaded.

Recommendation: Move all rate limiting initialization into the lifespan function after load_environment()

4. Decorator Pattern Issues

The decorator factory in _create_rate_limit_decorator() gets the config at module import time when decorators are applied, not at runtime. If config changes (e.g., in tests), the decorators will not see the new values.

Recommendation: Move config fetching to request time, or document that config is frozen at startup.

5. Missing Integration Points

The PR description mentions applying rate limits to chat, projects, tasks, and agents routers, but:

  • No rate limiting on authentication endpoints despite having a dedicated rate_limit_auth() decorator
  • WebSocket endpoints not rate limited despite having rate_limit_websocket()
  • Many v2 routers are not rate limited

Recommendation: Apply rate_limit_auth() to authentication endpoints and document which endpoints are intentionally NOT rate limited and why.

6. Security: IP Spoofing Risk

get_client_ip() trusts X-Forwarded-For and X-Real-IP headers without validation. In a self-hosted deployment without a trusted proxy, clients can spoof these headers to bypass rate limits.

Recommendation:

  • Add configuration to enable/disable proxy header trust (e.g., RATE_LIMIT_TRUST_PROXY)
  • In untrusted environments, only use request.client.host
  • Document this in the security section

Suggestions

1. Documentation Needed

  • Update .env.example explanation comments for each variable
  • Add a troubleshooting section
  • Document the security implications of RATE_LIMIT_STORAGE=redis vs memory

2. Observability Improvements

  • Add metrics/counters for rate limit hits
  • Log rate limit warnings before hitting the limit (e.g., at 80%)
  • Include more context in logs: rate limit key, limit value, current count

3. Testing Gaps

  • No tests for Redis storage backend (only in-memory)
  • No tests verifying the audit logging integration
  • No tests for the decorator behavior with different config values
  • Integration tests do not verify the Retry-After header value accuracy

Summary

This is a solid foundation for rate limiting with good architecture and test coverage. However, there are several gaps between the PR description and implementation:

  1. Audit logging is not actually integrated (claimed but not implemented)
  2. Authentication endpoints not rate limited (despite having a decorator for them)
  3. WebSocket endpoints not rate limited (despite having a decorator for them)
  4. Potential initialization race condition (limiter created before env loading)

Recommended Actions:

  1. Either implement or remove the audit logging claim
  2. Apply rate limits to auth and websocket endpoints
  3. Fix the initialization order issue in server.py
  4. Add documentation about security considerations (IP spoofing, proxy trust)
  5. Consider adding per-category enable/disable flags

Verdict:

The code quality is good, but the PR feels incomplete. The claimed features should either be fully implemented or removed from the description. I recommend addressing the audit logging discrepancy and initialization issues before merging.

Great work on the core implementation! The rate limiting logic itself is solid. Just needs some polish to match the description and handle edge cases.

The PR description claimed audit logging was implemented but it wasn't
actually being called. This commit integrates the audit logger into
the rate_limit_exceeded_handler to log security-relevant events:

- Imports AuditLogger and AuditEventType when database is available
- Logs rate limit exceeded events with user_id, IP, endpoint, and metadata
- Gracefully handles audit logging failures without affecting response
- Consistent with existing audit logging patterns in the codebase
@claude

claude Bot commented Feb 2, 2026

Copy link
Copy Markdown

Code Review - PR #327: API Rate Limiting with slowapi

🚨 Critical Architecture Concern

According to CLAUDE.md and docs/V2_STRATEGIC_ROADMAP.md, CodeFRAME is in Phase 2 of a v2 refactor with a "CLI-first, headless Golden Path" architecture. The current server/UI layers are v1 legacy that should be treated as thin adapters.

This PR invests significant effort in the v1 FastAPI layer which conflicts with the roadmap:

  • New modules: codeframe/lib/rate_limiter.py, codeframe/config/rate_limits.py
  • Modified routers: chat.py, projects.py, tasks.py, agents.py
  • 23 new tests for server-layer functionality

From CLAUDE.md:

"Legacy code is reference material. Do NOT import legacy UI/server modules into core. Do NOT 'fix the UI' during Golden Path work."

⚠️ Recommendation

Consider deferring this PR until Phase 2/3 when the server layer architecture is redesigned. Current priorities according to the roadmap:

  • Phase 2: Server layer as thin adapter over core
  • Phase 3: Web UI rebuild

If rate limiting is critical now, consider:

  1. Implementing at infrastructure level (nginx, API gateway) rather than application code
  2. Waiting for Phase 2 server redesign to ensure rate limiting fits the v2 architecture

📝 Technical Review (If Moving Forward)

Despite the architecture concerns, the implementation quality is solid:

✅ Strengths

  1. Clean configuration design

    • Environment-variable driven (RATE_LIMIT_* variables)
    • Sensible defaults (auth: 10/min, standard: 100/min, AI: 20/min)
    • Memory vs Redis storage options
  2. Proper security practices

    • Audit logging for rate limit events (AuditEventType.RATE_LIMIT_EXCEEDED)
    • User ID-based keying for authenticated requests
    • IP-based keying with proper header handling (X-Forwarded-For, X-Real-IP)
  3. Good HTTP compliance

    • 429 status with Retry-After header
    • Descriptive JSON error bodies
  4. Comprehensive test coverage

    • 23 new tests covering key extraction, decorators, and integration
    • Good edge case handling (None client, missing headers)

🐛 Potential Issues

  1. Rate limiter state management (rate_limiter.py:87-117)

    • Global _limiter variable could be problematic in testing/multi-instance scenarios
    • Consider dependency injection pattern instead
  2. Error handling in audit logging (rate_limiter.py:146-166)

    • Silent failure with only debug logging might hide issues
    • Consider warning-level logging for audit failures
  3. Validation gap (rate_limits.py:77-91)

    • Storage validation happens in from_environment() but not enforced in dataclass
    • Could allow invalid configs if created directly
  4. Redis fallback behavior (rate_limiter.py:110-112)

    • ImportError catch is good, but generic Exception might be safer for connection failures
    • Consider logging at WARNING level rather than ERROR

🔧 Minor Code Quality Notes

  1. Duplicate validation (core/config.py:425-431 + rate_limits.py:78-83)

    • Storage validation exists in both modules
    • Consider consolidating to avoid drift
  2. Type hints (rate_limiter.py:200-220)

    • Decorator wrapper returns Callable but could be more specific: Callable[[Callable], Callable]
  3. Documentation

    • Excellent docstrings throughout
    • Could add examples to main module docstring showing decorator usage

🧪 Testing

  • Tests look comprehensive (IP extraction, key generation, decorators)
  • Consider adding tests for:
    • Redis connection failures
    • Concurrent request handling
    • Rate limit header presence in responses

📊 Summary

Aspect Assessment
Architecture fit ❌ Conflicts with v2 roadmap
Code quality ✅ Clean, well-documented
Security ✅ Proper audit logging, key extraction
Test coverage ✅ Comprehensive
Recommendation Defer to Phase 2/3 or use infrastructure-level solution

🤔 Questions for Author

  1. Is rate limiting a critical security requirement right now, or can it wait for Phase 2?
  2. Have you considered infrastructure-level rate limiting (nginx, Cloudflare, API Gateway)?
  3. If proceeding: How do you plan to integrate this with the upcoming v2 server redesign?

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@codeframe/lib/rate_limiter.py`:
- Around line 97-99: The repeated "Rate limiting is disabled" info log in
get_rate_limiter() causes duplicate startup messages; change behavior so the
message is emitted only once or downgraded to debug. Modify get_rate_limiter
(and use get_rate_limit_config/_limiter as needed) to either call
logger.debug("Rate limiting is disabled") instead of logger.info, or add a
module-level boolean (e.g., _logged_disabled) and set it the first time you log
so subsequent calls skip logging; ensure you mark the variable global in
get_rate_limiter before updating it.
🧹 Nitpick comments (1)
codeframe/lib/rate_limiter.py (1)

93-117: Consider thread-safe initialization for multi-worker deployments.

The global _limiter initialization in get_rate_limiter() has a potential race condition if multiple threads call it simultaneously before _limiter is set. While typical FastAPI async deployments run on a single thread per worker (making this a non-issue), if the app runs with threading or multiple workers access shared state, duplicate Limiter instances could be created.

Since slowapi's Limiter state lives in the storage backend (Redis/memory), this is unlikely to cause data corruption—just unnecessary object creation and duplicate log messages. This is acceptable for current use but worth noting for future consideration.

🔒 Thread-safe initialization pattern (optional)
import threading

_limiter: Optional[Limiter] = None
_limiter_lock = threading.Lock()

def get_rate_limiter() -> Optional[Limiter]:
    global _limiter
    config = get_rate_limit_config()
    
    if not config.enabled:
        return None
    
    if _limiter is None:
        with _limiter_lock:
            if _limiter is None:  # Double-check locking
                # ... initialization code ...
    return _limiter

Comment thread codeframe/lib/rate_limiter.py
Add module-level _logged_disabled flag to track if we've already logged
the disabled message. This prevents duplicate INFO logs on every call
to get_rate_limiter() when rate limiting is disabled.

Also reset the flag in reset_rate_limiter() for test isolation.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@codeframe/lib/rate_limiter.py`:
- Around line 108-120: The current Limiter initialization only catches
ImportError and doesn't handle runtime Redis failures; update the code that
constructs _limiter (the Limiter call using key_func=get_rate_limit_key and
storage_uri=config.redis_url) to enable SlowAPI's built-in fallback by passing
in_memory_fallback_enabled=True and optionally swallow_errors=True, and consider
adding storage_options (e.g., short socket_connect_timeout/socket_timeout) so
Redis connection problems automatically fall back to in-memory and limiter
errors are gracefully suppressed.

Comment on lines +108 to +120
if config.storage == "redis" and config.redis_url:
try:
_limiter = Limiter(
key_func=get_rate_limit_key,
storage_uri=config.redis_url,
)
logger.info("Rate limiter initialized with Redis storage")
except ImportError as e:
logger.error(f"Redis storage requested but redis module not available: {e}. Falling back to memory.")
_limiter = Limiter(key_func=get_rate_limit_key)
else:
_limiter = Limiter(key_func=get_rate_limit_key)
logger.info("Rate limiter initialized with in-memory storage")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the actual file to confirm the context
find . -name "rate_limiter.py" -type f

Repository: frankbria/codeframe

Length of output: 94


🏁 Script executed:

# Let's also search for imports and usage of slowapi/Limiter
rg -i "from slowapi|import.*limiter|slowapi" --type py -A 2 -B 2

Repository: frankbria/codeframe

Length of output: 5149


🌐 Web query:

slowapi Limiter Redis storage connection error handling

💡 Result:

To handle Redis connection problems when using SlowAPI’s Limiter with Redis storage, you generally want to (1) fail open with a controlled fallback (so your API doesn’t 500 when Redis blips) and (2) set sane Redis timeouts/pooling so failures happen fast and recover cleanly.

1) Enable SlowAPI’s built-in in-memory fallback (recommended)

SlowAPI’s Limiter supports automatically falling back to in-memory storage if the primary backend (Redis) is down via in_memory_fallback_enabled=True, and you can optionally define separate fallback limits with in_memory_fallback=[...]. This is explicitly part of the Limiter API. [1]

from slowapi import Limiter
from slowapi.util import get_remote_address

limiter = Limiter(
    key_func=get_remote_address,
    storage_uri="redis://redis:6379/0",
    # fall back to in-memory when Redis is unavailable
    in_memory_fallback_enabled=True,
    # optional: different (usually more permissive) limits while degraded
    in_memory_fallback=["60/minute"],
)

This keeps requests flowing during Redis outages (with the tradeoff that limits won’t be shared across instances while degraded).

2) Decide how you want errors treated: swallow vs fail

SlowAPI exposes swallow_errors on the Limiter. If enabled, storage/ratelimit errors are swallowed (logged) instead of bubbling up as request failures. [1]

limiter = Limiter(
    key_func=get_remote_address,
    storage_uri="redis://redis:6379/0",
    in_memory_fallback_enabled=True,
    swallow_errors=True,  # “fail open” if limiter/storage errors occur
)

3) Set Redis client timeouts/pool limits via storage_options

SlowAPI passes storage_options through to the underlying storage implementation. [1] Use this to ensure Redis connection failures are fast (connect timeout, socket timeout) and to control pooling (max connections), depending on your redis client/version.

limiter = Limiter(
    key_func=get_remote_address,
    storage_uri="redis://redis:6379/0",
    storage_options={
        "socket_connect_timeout": 1,
        "socket_timeout": 1,
        # optionally:
        # "max_connections": 50,
    },
    in_memory_fallback_enabled=True,
)

4) What to catch if you implement your own handling

If you’re interacting with Redis directly (or diagnosing failures), Redis documents that connection failures show up as ConnectionError / TimeoutError / AuthenticationError and are typically recoverable with retry/backoff or fallback logic. [2]


Sources: SlowAPI Limiter parameters (storage_options, swallow_errors, in_memory_fallback, in_memory_fallback_enabled). [1] Redis error-handling guidance for connection errors and fallback/retry patterns. [2]


🏁 Script executed:

# Let's examine the actual rate_limiter.py file, especially around lines 108-120
cat -n ./codeframe/lib/rate_limiter.py

Repository: frankbria/codeframe

Length of output: 10270


Use SlowAPI's built-in fallback mechanisms for Redis connection resilience.

The code catches ImportError when the Redis module is unavailable, but runtime connection failures (e.g., Redis server unreachable, timeouts) are not handled. These errors would occur during limit checks, not during Limiter initialization, and could cause request failures if Redis becomes unavailable.

SlowAPI's Limiter provides explicit fallback support for this scenario. Configure it with in_memory_fallback_enabled=True to automatically fall back to in-memory storage on connection failures, and optionally swallow_errors=True to gracefully suppress limiter errors:

Example configuration
_limiter = Limiter(
    key_func=get_rate_limit_key,
    storage_uri=config.redis_url,
    in_memory_fallback_enabled=True,
    swallow_errors=True,
)

You can also set connection timeouts via storage_options to ensure failures occur quickly:

_limiter = Limiter(
    key_func=get_rate_limit_key,
    storage_uri=config.redis_url,
    storage_options={"socket_connect_timeout": 1, "socket_timeout": 1},
    in_memory_fallback_enabled=True,
)
🤖 Prompt for AI Agents
In `@codeframe/lib/rate_limiter.py` around lines 108 - 120, The current Limiter
initialization only catches ImportError and doesn't handle runtime Redis
failures; update the code that constructs _limiter (the Limiter call using
key_func=get_rate_limit_key and storage_uri=config.redis_url) to enable
SlowAPI's built-in fallback by passing in_memory_fallback_enabled=True and
optionally swallow_errors=True, and consider adding storage_options (e.g., short
socket_connect_timeout/socket_timeout) so Redis connection problems
automatically fall back to in-memory and limiter errors are gracefully
suppressed.

Apply rate limiting decorators to all 16 v2 API routers:
- @rate_limit_standard() (100/min) for CRUD operations
- @rate_limit_ai() (20/min) for AI-heavy operations (diagnose, discovery, execute)

Renamed body request parameters from 'request' to 'body' to avoid
conflict with FastAPI's Request object required by rate limiting.
@claude

claude Bot commented Feb 2, 2026

Copy link
Copy Markdown

Code Review - Rate Limiting Implementation

This PR adds comprehensive API rate limiting to the FastAPI server layer. Overall, this is a well-implemented, production-ready feature with good test coverage and proper architectural alignment.

✅ Strengths

  1. Correct architectural placement: Rate limiting belongs in the server layer (codeframe/lib and codeframe/config), not core. This properly treats the FastAPI server as an optional adapter over the headless core - excellent alignment with v2 principles.

  2. Environment-driven configuration: Using RateLimitConfig.from_environment() with proper defaults and validation is the right pattern for deployment flexibility.

  3. Smart key extraction: Using user ID for authenticated requests and IP for anonymous (get_rate_limit_key) provides proper per-user rate limiting while handling unauthenticated traffic.

  4. Comprehensive test coverage: 23 new tests covering configuration parsing, decorator behavior, integration with FastAPI, and edge cases (disabled limiter, missing client, etc.).

  5. Proper 429 responses: Includes Retry-After header and structured JSON error body with audit logging integration.

  6. Category-based limits: Separate limits for auth/standard/AI/websocket endpoints makes sense for different usage patterns.

🔍 Issues & Recommendations

1. CRITICAL: Rate limiting only applies to legacy v1 routes

Looking at the Strategic Roadmap (docs/V2_STRATEGIC_ROADMAP.md:132), issue #167 is listed as a Phase 2 deliverable. However, this implementation only decorates v1 legacy routes in codeframe/ui/routers/.

Problem: The v2 API routes (*_v2.py routers like blockers_v2.py, prd_v2.py, tasks_v2.py) are not rate limited. This creates an inconsistent security posture.

Recommendation:

  • Apply rate limiting to all v2 routers following the same pattern
  • Consider adding a test that verifies v2 routes have rate limiting applied
  • Document which routes are protected in the PR description

2. Decorator pattern adds implicit request dependency

In codeframe/lib/rate_limiter.py:221, the slowapi decorator expects a request: Request parameter in the decorated function. This is implicit and could break if someone forgets to add it.

# Current pattern in routers:
@rate_limit_ai()
async def chat_with_lead(
    request: Request,  # Required by decorator but not obvious
    project_id: int,
    ...
)

Recommendation: Add a docstring note to each decorator explaining the Request parameter requirement, or consider adding runtime validation that raises a clear error if request is missing.

3. Redis fallback is silent in production

In codeframe/config/rate_limits.py:86-91, when RATE_LIMIT_STORAGE=redis but REDIS_URL is not set, the code falls back to memory storage with only a warning log.

Problem: In a distributed deployment, this silent fallback could cause rate limits to not work across instances (each instance has separate memory), potentially allowing abuse.

Recommendation:

  • Consider making this a hard error in production (check an ENV variable)
  • Or add a metric/alert when running in degraded mode
  • Document the fallback behavior in .env.example more prominently

4. Audit logging failure is silently ignored

In rate_limit_exceeded_handler (codeframe/lib/rate_limiter.py:169-171), audit logging failures are caught with a debug log.

except Exception as e:
    logger.debug(f"Failed to log rate limit event to audit log: {e}")

Concern: Using debug level means this failure won't be visible in production. Rate limit violations are security events and logging failures should be investigated.

Recommendation: Change to logger.warning() at minimum, or add metrics to track audit logging failures.

5. Test isolation could be improved

In tests/lib/test_rate_limiter.py, several tests modify global state via _reset_rate_limit_config() and reset_rate_limiter(). While cleanup is present, there's risk of test pollution if a test fails mid-execution.

Recommendation: Use pytest fixtures with autouse cleanup:

@pytest.fixture(autouse=True)
def reset_rate_limiter_state():
    yield
    reset_rate_limiter()
    _reset_rate_limit_config()

6. Missing documentation

The .env.example additions are good, but consider adding:

  • A section in docs/ explaining rate limiting strategy and how to tune limits
  • Examples of monitoring rate limit metrics
  • Guidance on Redis deployment for distributed setups

📊 Alignment with CLAUDE.md Guidelines

No core pollution: Rate limiting correctly stays in server layer
Environment-driven config: Follows v2 pattern
Test coverage: Good coverage of happy/edge cases
⚠️ Phase 2 scope: Only covers legacy routes, not v2 routes
⚠️ Production readiness: Silent fallbacks need attention

🎯 Action Items

Before merge:

  1. Apply rate limiting to v2 API routes (*_v2.py routers)
  2. Document the Request parameter requirement in decorator docstrings
  3. Decide on Redis fallback behavior (error vs. warning + metric)
  4. Change audit logging failure to warning level

Nice to have:
5. Add rate limiting documentation to docs/
6. Improve test isolation with autouse fixtures
7. Add integration test verifying v2 routes have rate limiting

Summary

This is solid work with good engineering practices. The main gap is incomplete v2 coverage - the PR protects legacy routes but leaves the new v2 API routes unprotected. Once v2 routes are covered and the Redis fallback behavior is clarified, this is ready to merge.

Recommendation: Request changes for v2 route coverage before merging.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (16)
codeframe/ui/routers/gates_v2.py (1)

104-105: ⚠️ Potential issue | 🟡 Minor

Stale docstring references old parameter name.

The docstring still refers to request but the parameter was renamed to body. Update for accuracy.

📝 Proposed fix
     Args:
-        request: Gate run options
+        body: Gate run options (RunGatesRequest)
         workspace: v2 Workspace
codeframe/ui/routers/blockers_v2.py (2)

194-202: ⚠️ Potential issue | 🟡 Minor

Docstring parameter name is outdated.

The docstring refers to request: Blocker creation request but the parameter was renamed to body. Update for consistency.

📝 Proposed fix
     """Create a new blocker.

     Args:
-        request: Blocker creation request
+        body: Blocker creation request
         workspace: v2 Workspace

     Returns:
         Created blocker
     """

227-244: ⚠️ Potential issue | 🟡 Minor

Docstring parameter name is outdated.

The docstring refers to request: Answer request but the parameter was renamed to body. Update for consistency.

📝 Proposed fix
     """Answer a blocker.

     Answering a blocker also resets the associated task to READY status,
     so it can be restarted with `cf work start <task-id> --execute`.

     Args:
         blocker_id: Blocker to answer (can be partial ID)
-        request: Answer request
+        body: Answer request
         workspace: v2 Workspace

     Returns:
         Updated blocker
codeframe/ui/routers/checkpoints_v2.py (1)

80-99: ⚠️ Potential issue | 🟡 Minor

Docstring parameter name mismatch.

Line 94 documents request: Checkpoint creation request but the parameter was renamed to body. Update the docstring to match.

📝 Proposed fix
     Args:
-        request: Checkpoint creation request
+        body: Checkpoint creation request
         workspace: v2 Workspace
codeframe/ui/routers/discovery_v2.py (2)

206-209: ⚠️ Potential issue | 🟡 Minor

Docstring references outdated parameter name.

The docstring still refers to request but the parameter was renamed to body to avoid conflict with the FastAPI Request object.

📝 Proposed fix
     Args:
         session_id: Discovery session ID
-        request: Answer request with answer text
+        body: Answer request with answer text
         workspace: v2 Workspace

252-255: ⚠️ Potential issue | 🟡 Minor

Docstring references outdated parameter name.

The docstring still refers to request but the parameter was renamed to body.

📝 Proposed fix
     Args:
         session_id: Discovery session ID (must be complete)
-        request: Optional template selection
+        body: Optional template selection
         workspace: v2 Workspace
codeframe/ui/routers/batches_v2.py (2)

177-208: ⚠️ Potential issue | 🟡 Minor

Docstring references outdated parameter name.

The docstring at line 197 still documents request: Stop options but the parameter was renamed to body. This should be updated for consistency.

📝 Proposed fix for docstring
     Args:
         batch_id: Batch to stop
-        request: Stop options
+        body: Stop options
         workspace: v2 Workspace

227-250: ⚠️ Potential issue | 🟡 Minor

Docstring references outdated parameter name.

Same issue as stop_batch - the docstring at line 239 documents request: Resume options but the parameter is now body.

📝 Proposed fix for docstring
     Args:
         batch_id: Batch to resume
-        request: Resume options
+        body: Resume options
         workspace: v2 Workspace
codeframe/ui/routers/prd_v2.py (1)

244-249: ⚠️ Potential issue | 🟠 Major

Avoid leaking internal error details in 500 responses.
Returning str(e) exposes internal state to clients. Keep details in logs and return a generic error message instead.

🔧 Suggested fix
-        raise HTTPException(
-            status_code=500,
-            detail=api_error("Failed to create PRD", ErrorCodes.EXECUTION_FAILED, str(e)),
-        )
+        raise HTTPException(
+            status_code=500,
+            detail=api_error(
+                "Failed to create PRD",
+                ErrorCodes.EXECUTION_FAILED,
+                "An unexpected error occurred",
+            ),
+        )
-        raise HTTPException(
-            status_code=500,
-            detail=api_error("Failed to create version", ErrorCodes.EXECUTION_FAILED, str(e)),
-        )
+        raise HTTPException(
+            status_code=500,
+            detail=api_error(
+                "Failed to create version",
+                ErrorCodes.EXECUTION_FAILED,
+                "An unexpected error occurred",
+            ),
+        )

Also applies to: 376-381

codeframe/ui/routers/templates_v2.py (1)

195-219: ⚠️ Potential issue | 🟡 Minor

Update apply_template docstring to match new parameters.

The docstring still describes request as the payload and omits body. This can mislead readers now that the payload is in body.

✏️ Proposed docstring fix
-    Args:
-        request: Template application request
-        workspace: v2 Workspace
+    Args:
+        request: FastAPI Request (needed for rate limiting)
+        body: Template application request
+        workspace: v2 Workspace
codeframe/ui/routers/git_v2.py (1)

163-176: ⚠️ Potential issue | 🟡 Minor

Fix the create_commit docstring parameter description.
The docstring still describes request as the commit payload; it should document body as the CommitRequest and request as the FastAPI Request.

✏️ Suggested docstring update
     Args:
-        request: CommitRequest with files and message
+        request: FastAPI Request (required for rate limiting)
+        body: CommitRequest with files and message
         workspace: v2 Workspace

Also applies to: 182-186

codeframe/ui/routers/projects_v2.py (1)

79-93: ⚠️ Potential issue | 🟡 Minor

Update docstrings to include the new request parameter.

The signature includes request, but the Args section still lists only workspace.

✏️ Suggested docstring update
     Args:
+        request: Request for the current HTTP call.
         workspace: v2 Workspace
codeframe/ui/routers/workspace_v2.py (1)

224-252: ⚠️ Potential issue | 🟡 Minor

Allow explicit null to clear tech_stack.

The current code if body.tech_stack is not None: prevents calling update_workspace_tech_stack when a client explicitly sends null, making it impossible to clear the tech stack. The function signature explicitly documents support for None to clear the value (tech_stack: Optional[str] with comment "or None to clear"), but the conditional blocks this. Use model_dump(exclude_unset=True) to distinguish between "field not provided" and "field provided as null":

Suggested fix (Pydantic v2)
-        if body.tech_stack is not None:
-            updated = ws.update_workspace_tech_stack(workspace.repo_path, body.tech_stack)
+        data = body.model_dump(exclude_unset=True)
+        if "tech_stack" in data:
+            updated = ws.update_workspace_tech_stack(workspace.repo_path, data.get("tech_stack"))
codeframe/ui/routers/pr_v2.py (1)

211-231: ⚠️ Potential issue | 🟡 Minor

Update Args docs to reflect the body payload rename.

Both create and merge docstrings still describe the payload as request, which now refers to a different parameter. This makes the endpoint docs misleading.

📝 Suggested docstring updates
@@
-    Args:
-        request: PR creation request
-        workspace: v2 Workspace (for context)
+    Args:
+        request: Request context (rate limiting)
+        body: PR creation request payload
+        workspace: v2 Workspace (for context)
@@
-    Args:
-        pr_number: PR number to merge
-        request: Merge options
-        workspace: v2 Workspace (for context)
+    Args:
+        pr_number: PR number to merge
+        request: Request context (rate limiting)
+        body: Merge options
+        workspace: v2 Workspace (for context)

Also applies to: 253-269

codeframe/ui/routers/tasks_v2.py (2)

235-288: ⚠️ Potential issue | 🟡 Minor

Return 404 when status updates target a missing task.

tasks.update_status can raise a not-found error, but the inner handler currently maps it to “Invalid status transition” (400). That makes status-only updates on missing tasks return the wrong status code.

🔧 Proposed fix
-            except ValueError as e:
-                if "Invalid status" in str(e) or "not a valid" in str(e).lower():
+            except ValueError as e:
+                error_msg = str(e)
+                if "not found" in error_msg.lower():
+                    raise HTTPException(
+                        status_code=404,
+                        detail=api_error("Task not found", ErrorCodes.NOT_FOUND, error_msg),
+                    )
+                if "Invalid status" in error_msg or "not a valid" in error_msg.lower():
                     raise HTTPException(
                         status_code=400,
                         detail=api_error(
                             f"Invalid status: {body.status}",
                             ErrorCodes.VALIDATION_ERROR,
                             f"Valid values: {[s.value for s in TaskStatus]}",
                         ),
                     )
                 # Status transition error
                 raise HTTPException(
                     status_code=400,
-                    detail=api_error("Invalid status transition", ErrorCodes.INVALID_STATE, str(e)),
+                    detail=api_error("Invalid status transition", ErrorCodes.INVALID_STATE, error_msg),
                 )

465-526: ⚠️ Potential issue | 🟠 Major

Avoid treating an explicit empty task list as “run all”.

Using body.task_ids or ... treats [] as falsy and falls back to READY tasks. If a client passes an empty list (e.g., filtered out), this would start tasks unexpectedly. Prefer checking for None so an empty list results in the existing “No tasks to execute” error.

🔧 Proposed fix
-        task_ids = body.task_ids or runtime.get_ready_task_ids(workspace)
+        task_ids = (
+            runtime.get_ready_task_ids(workspace)
+            if body.task_ids is None
+            else body.task_ids
+        )
🤖 Fix all issues with AI agents
In `@codeframe/ui/routers/diagnose_v2.py`:
- Around line 106-112: The diagnose_task endpoint's docstring is out of date:
the function diagnose_task now accepts a FastAPI Request parameter and a
separate body parameter of type DiagnoseRequest, but the docstring still claims
request contains diagnosis options and omits body; update the docstring for
diagnose_task to describe both parameters (request: FastAPI Request for
HTTP/context info and body: DiagnoseRequest containing diagnosis
options/payload) and adjust any wording that references request as the sole
source of options; apply the same docstring correction to the other
diagnose_task variant/duplicate in this module so both reflect the new
signature.
- Around line 181-185: The docstring for get_diagnostic_report lacks
documentation for the new request: Request parameter; update the function's
docstring Args section to include a line for request (type Request) describing
that it carries the incoming HTTP request/context (used by rate
limiting/middleware or to access headers/session), and ensure the parameter list
and description match the signature of get_diagnostic_report so docs and static
analyzers stay consistent.
🧹 Nitpick comments (2)
codeframe/ui/routers/environment_v2.py (1)

125-127: Docstrings are now slightly out of sync with the new parameters.

The Args sections don’t mention request: Request, and install_tool still describes the tool name as coming from request instead of body. Consider updating for accuracy and clarity.

Also applies to: 156-158, 194-196

codeframe/ui/routers/discovery_v2.py (1)

244-244: Consider explicit Optional type hint for clarity.

While body: GeneratePrdRequest = None works, adding Optional makes the nullable intent clearer to readers and static analysis tools.

📝 Suggested improvement
-    body: GeneratePrdRequest = None,
+    body: Optional[GeneratePrdRequest] = None,

Comment thread codeframe/ui/routers/diagnose_v2.py
Comment thread codeframe/ui/routers/diagnose_v2.py
Update Args sections in v2 router endpoint docstrings:
- Add 'request: HTTP request for rate limiting' documentation
- Rename 'request' body references to 'body' for consistency
@claude

claude Bot commented Feb 3, 2026

Copy link
Copy Markdown

Code Review: Rate Limiting Implementation

Thank you for this comprehensive rate limiting implementation! The code quality is excellent with strong test coverage. However, I've identified some critical architectural concerns and security issues that need to be addressed.


🚨 Critical Issues

1. Architecture Misalignment with v2 Roadmap

Severity: CRITICAL

This PR implements server-side rate limiting, which is Phase 2 work according to our v2 roadmap. Per CLAUDE.md:

  • Current phase: Phase 1 - CLI Completion (CLI-first, headless core)
  • This PR: Server layer enhancements (FastAPI middleware/routers)
  • Target phase: Phase 2 - Server Layer rebuild

From docs/V2_STRATEGIC_ROADMAP.md:

Phase 2: Server Layer - Rebuild FastAPI server as thin adapter over core

Recommendation:

  • Retarget this PR to a Phase 2 feature branch, OR
  • Clearly mark as v1 legacy maintenance (not part of v2 architecture)
  • Do NOT merge to main until Phase 1 CLI work is complete

Reference: CLAUDE.md lines 1-50, docs/V2_STRATEGIC_ROADMAP.md


2. Duplicate Configuration Systems

Severity: IMPORTANT

You've created codeframe/config/rate_limits.py with its own configuration loading, but rate limiting config also exists in codeframe/core/config.py:

Existing code (core/config.py:399-401):

# Rate Limiting Configuration
rate_limit_enabled: bool = Field(True, alias="RATE_LIMIT_ENABLED")
rate_limit_storage: str = Field("memory", alias="RATE_LIMIT_STORAGE")
redis_url: Optional[str] = Field(None, alias="REDIS_URL")

New code (config/rate_limits.py:109-117):

enabled_str = os.getenv("RATE_LIMIT_ENABLED", "true")
storage = os.getenv("RATE_LIMIT_STORAGE", "memory")
redis_url = os.getenv("REDIS_URL")

Problems:

  • Two sources of truth for the same environment variables
  • Different validation mechanisms (Pydantic vs manual)
  • Potential for inconsistent behavior

Recommendation:

  • Consolidate into GlobalConfig in core/config.py, OR
  • Document why duplication is necessary and how they interact

⚠️ Security Issues

3. X-Forwarded-For Spoofing Vulnerability

Severity: MODERATE

Location: codeframe/lib/rate_limiter.py:313-316

forwarded_for = request.headers.get("X-Forwarded-For")
if forwarded_for:
    # Return first IP in the chain (real client)
    return forwarded_for.split(",")[0].strip()

Problem:

  • Clients can spoof X-Forwarded-For headers if CodeFRAME is deployed without a reverse proxy
  • Attackers can bypass rate limits by setting arbitrary IPs
  • No validation that request is from a trusted proxy

Recommendation:

def get_client_ip(request: Request, trusted_proxies: Optional[List[str]] = None) -> str:
    # Only trust X-Forwarded-For from trusted proxies
    if trusted_proxies and request.client.host in trusted_proxies:
        forwarded_for = request.headers.get("X-Forwarded-For")
        if forwarded_for:
            return forwarded_for.split(",")[0].strip()
    
    # Fall back to direct connection
    return request.client.host or "unknown"

Add configuration:

RATE_LIMIT_TRUSTED_PROXIES=10.0.0.0/8,172.16.0.0/12

4. "unknown" IP Fallback Allows DoS

Severity: MODERATE

Location: codeframe/lib/rate_limiter.py:362

return "unknown"

Problem:

  • All requests without determinable IPs share the same rate limit bucket
  • One user can exhaust the "unknown" bucket and block others
  • Enables denial of service

Recommendation:

  • Log warning when IP is "unknown" (security monitoring)
  • Apply more aggressive rate limits to "unknown" clients
  • Document that this indicates proxy misconfiguration

5. Silent Audit Log Failures

Severity: LOW

Location: codeframe/lib/rate_limiter.py:432-434

except Exception as e:
    # Don't let audit logging failure affect the rate limit response
    logger.debug(f"Failed to log rate limit event to audit log: {e}")

Problem:

  • logger.debug may not be visible in production
  • Security teams won't know if rate limit events aren't being logged

Recommendation:

  • Change to logger.warning or logger.error
  • Consider emitting metrics/alerts on audit log failures

💡 Code Quality Issues

6. Global Mutable State

Location: codeframe/lib/rate_limiter.py:293-295

_limiter: Optional[Limiter] = None
_logged_disabled: bool = False

Problem:

  • Global state makes testing harder
  • Race conditions possible in concurrent scenarios

Recommendation:

  • Use dependency injection: store limiter on app.state
  • Access via FastAPI dependency
  • Remove _logged_disabled - let logging config handle deduplication

7. Middleware Ordering Not Enforced

Location: codeframe/lib/rate_limiter.py:342-345

# Check if user is authenticated (set by auth middleware)
user = getattr(getattr(request, "state", None), "user", None)

Problem:

  • Assumes auth middleware sets request.state.user
  • If rate limiting runs before auth, will always use IP (weaker security)
  • Order matters but isn't enforced/documented

Recommendation:

  • Document middleware order requirements
  • Add assertion/warning if rate limit middleware runs before auth

✅ Positive Aspects

  1. Excellent test coverage - 23 comprehensive tests with edge cases
  2. Clean code structure - Well-documented, clear separation of concerns
  3. Audit logging integration - Good security monitoring
  4. Flexible configuration - Environment variables for all limits
  5. Tiered limits - Auth < AI < Standard makes sense for threat model

📋 Missing Test Coverage

Add tests for:

  1. X-Forwarded-For spoofing from untrusted clients
  2. Multiple requests sharing "unknown" IP bucket
  3. Audit log failure paths
  4. Redis connection failures
  5. Middleware ordering (rate limit after auth)

🎯 Recommendations

Must Address Before Merge:

  1. ✅ Retarget to Phase 2 feature branch OR mark as v1 legacy
  2. ✅ Consolidate duplicate configuration or document relationship
  3. ✅ Fix X-Forwarded-For spoofing vulnerability
  4. ✅ Add trusted proxy configuration

Should Address:

  1. Refactor global state to dependency injection
  2. Add security-focused tests
  3. Document middleware ordering requirements
  4. Improve "unknown" IP handling

Nice to Have:

  1. Integration tests with actual routers
  2. Deployment/security guide
  3. Dynamic configuration reload support

📊 Review Summary

Category Score Notes
Code Quality 8/10 Excellent tests, clean structure
Security 6/10 Good baseline, needs hardening
Architecture 2/10 Violates v2 CLI-first principles
Test Coverage 8/10 Comprehensive, missing security tests

Overall: NEEDS REVISION

This is high-quality code that will be valuable for Phase 2, but it needs:

  1. Proper roadmap alignment (Phase 2 target)
  2. Security hardening (X-Forwarded-For)
  3. Configuration consolidation

Once these issues are addressed, this will be an excellent addition to the Phase 2 server layer! 🚀


Files Reviewed: 25+ files, ~1000 lines of changes

  • New: config/rate_limits.py, lib/rate_limiter.py, test files
  • Modified: 20 routers, audit_logger.py, .env.example, server.py

cc: Per CLAUDE.md guidelines and v2 architectural principles

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (7)
codeframe/ui/routers/gates_v2.py (2)

119-124: ⚠️ Potential issue | 🟠 Major

Avoid returning raw exception messages to clients.

str(e) can expose internal paths/commands or sensitive error details. Log the exception, but return a redacted/constant message in the API payload.

Suggested fix
     except Exception as e:
         logger.error(f"Failed to run gates: {e}", exc_info=True)
         raise HTTPException(
             status_code=500,
-            detail=api_error("Gate execution failed", ErrorCodes.EXECUTION_FAILED, str(e)),
+            detail=api_error(
+                "Gate execution failed",
+                ErrorCodes.EXECUTION_FAILED,
+                "Internal error. See server logs for details.",
+            ),
         )

133-139: ⚠️ Potential issue | 🟡 Minor

Docstring missing request argument.

The endpoint now accepts request: Request, but the Args list omits it.

Suggested fix
     Args:
+        request: HTTP request for rate limiting
         workspace: v2 Workspace
codeframe/ui/routers/workspace_v2.py (3)

176-189: ⚠️ Potential issue | 🟡 Minor

Allow explicit empty tech_stack updates for existing workspaces.
Line 187 uses if tech_stack: which skips empty strings, so an explicit empty value can’t be applied (unlike update_current_workspace, which uses is not None). Consider allowing empty strings here as well.

🛠️ Suggested fix
-        if tech_stack:
-            workspace = ws.update_workspace_tech_stack(repo_path, tech_stack)
+        if tech_stack is not None:
+            workspace = ws.update_workspace_tech_stack(repo_path, tech_stack)

204-217: ⚠️ Potential issue | 🟡 Minor

Docstring missing the new request parameter.
get_current_workspace now accepts request: Request but the Args list doesn’t mention it.

📝 Suggested doc update
     Args:
+        request: HTTP request for rate limiting
         workspace: v2 Workspace (resolved by dependency)

263-283: ⚠️ Potential issue | 🟡 Minor

Avoid double workspace_exists calls.
ws.workspace_exists(path) is called twice, which adds extra I/O and allows a small TOCTOU window.

♻️ Suggested refactor
-    return {
-        "exists": ws.workspace_exists(path),
-        "path": str(path),
-        "state_dir": str(path / ".codeframe") if ws.workspace_exists(path) else None,
-    }
+    exists = ws.workspace_exists(path)
+    return {
+        "exists": exists,
+        "path": str(path),
+        "state_dir": str(path / ".codeframe") if exists else None,
+    }
codeframe/ui/routers/projects_v2.py (1)

129-141: ⚠️ Potential issue | 🟡 Minor

Update docstrings to include the new request parameter.
These endpoints now accept request: Request, but their Args sections omit it, which makes the docs inconsistent.

Also applies to: 160-171, 192-206, 227-239

codeframe/ui/routers/tasks_v2.py (1)

466-529: ⚠️ Potential issue | 🟠 Major

Avoid executing all READY tasks when an empty list is supplied.
body.task_ids or ... treats [] as falsy, so an explicit empty list will execute all READY tasks instead of triggering the “No tasks to execute” error. Use an explicit None check to preserve the default only when the field is omitted.

🛠️ Proposed fix
-        task_ids = body.task_ids or runtime.get_ready_task_ids(workspace)
+        task_ids = body.task_ids if body.task_ids is not None else runtime.get_ready_task_ids(workspace)
🧹 Nitpick comments (4)
codeframe/ui/routers/discovery_v2.py (1)

245-245: Use Optional type annotation for nullable body parameter.

The type annotation body: GeneratePrdRequest = None is inconsistent—the type says GeneratePrdRequest but the default is None. This will cause type checker warnings and is not idiomatic FastAPI.

♻️ Proposed fix
 async def generate_prd(
     request: Request,
     session_id: str,
-    body: GeneratePrdRequest = None,
+    body: Optional[GeneratePrdRequest] = None,
     workspace: Workspace = Depends(get_v2_workspace),
 ) -> GeneratePrdResponse:
codeframe/ui/routers/git_v2.py (1)

89-97: Minor docstring inconsistency (optional nit).

The create_commit endpoint documents the request parameter in its docstring (lines 175-176), but the GET endpoints like get_git_status don't. Consider adding a brief note about the request parameter to other docstrings for consistency, or removing it from create_commit since it's a technical implementation detail for rate limiting rather than a meaningful API parameter.

codeframe/ui/routers/prd_v2.py (1)

346-358: Docstring references outdated parameter name.

The docstring at line 350 refers to request: Version creation request but the parameter was renamed to body. This should be updated for consistency.

     """Create a new version of a PRD.
 
     Args:
         prd_id: ID of the parent PRD
-        request: Version creation request
+        body: Version creation request
         workspace: v2 Workspace
 
     Returns:
codeframe/ui/routers/batches_v2.py (1)

182-183: Make body params explicitly optional for typing clarity.

Defaulting to None is fine, but the type hints should reflect optionality to keep static typing and OpenAPI expectations aligned.

♻️ Suggested tweak
-    body: StopBatchRequest = None,
+    body: Optional[StopBatchRequest] = None,
@@
-    body: ResumeBatchRequest = None,
+    body: Optional[ResumeBatchRequest] = None,

Also applies to: 233-234

Security fixes:
- Add trusted proxy validation to prevent X-Forwarded-For spoofing
- Only trust proxy headers when direct connection is from configured proxy
- Make "unknown" IP rate limit keys unique to prevent shared bucket DoS
- Add warning logs for proxy header spoofing attempts

Configuration consolidation:
- Add rate limit config fields to GlobalConfig (single source of truth)
- Update RateLimitConfig.from_global_config() to delegate to GlobalConfig
- Remove duplicate environment variable parsing
- Add get_global_config() and reset_global_config() module-level functions

Test improvements:
- Add reset_caches fixture to all rate limiter test classes
- Add TestTrustedProxyValidation with exact IP and CIDR matching tests
- Add test_key_for_unknown_ip_is_unique for DoS mitigation
- Update tests to use from_global_config() instead of from_environment()
@claude

claude Bot commented Feb 3, 2026

Copy link
Copy Markdown

Pull Request Review: Rate Limiting Implementation

I've completed a comprehensive review of the rate limiting implementation. Overall, this is well-architected and follows security best practices. The code quality is high with good test coverage. However, there are important architectural concerns regarding v1 vs v2 strategy.


🏗️ CRITICAL: V1 vs V2 Architecture Alignment

According to CLAUDE.md and docs/V2_STRATEGIC_ROADMAP.md:

  • v2 Philosophy: "Core-first, CLI-first" with server as thin adapter
  • Phase 2 Status: Server layer refactor in progress
  • Rule: "Don't add new HTTP endpoints/features before Golden Path works"

The Issue

This PR adds significant functionality to the v1 server layer (codeframe/ui/server.py, legacy routers) when the project is actively refactoring toward v2. From the roadmap:

Phase 2: Server Layer as Thin Adapter - IN PROGRESS

Deliverables:

  1. Server audit and refactor ([Phase 2] Server audit and refactor - routes delegating to core modules #322) - IN PROGRESS
  2. V2 routers created following thin adapter pattern

Questions for Discussion

  1. Should rate limiting be implemented in core first? If rate limiting is critical for production, should it live in codeframe/core/ and be called by CLI/server?

  2. Is this v1-only or hybrid? The implementation touches both v1 routers (tasks.py, projects.py, agents.py, chat.py) and v2 routers (tasks_v2.py, projects_v2.py, etc.). Should we focus only on v2 routers during Phase 2?

  3. What's the migration strategy? When v1 is deprecated, will this rate limiting code move wholesale to v2, or does it need refactoring?

Recommendation

If rate limiting is needed now:

  • Keep the current implementation for v1/v2 routers that exist
  • Document that this is server-only (not CLI) until Phase 3
  • Create tracking issue for future "rate limiting in core" if CLI needs it

If it can wait:

  • ⏸️ Pause this PR until Phase 2 server refactor completes
  • 🔄 Redesign to integrate cleanly with finalized v2 architecture

Code Quality - Excellent

Strengths

  1. Security-focused design

    • ✅ Trusted proxy validation prevents header spoofing (lines 44-86 in rate_limiter.py)
    • ✅ CIDR notation support for proxy networks
    • ✅ "unknown" IP handling prevents shared bucket DoS (lines 122-128)
    • ✅ Proper audit logging integration
  2. Clean separation of concerns

    • config/rate_limits.py - Configuration layer
    • lib/rate_limiter.py - Middleware implementation
    • ✅ Delegates to GlobalConfig as single source of truth
  3. Comprehensive test coverage

    • ✅ 8 configuration tests
    • ✅ 15 middleware tests (from description)
    • ✅ Tests cover edge cases (invalid IPs, CIDR matching, trusted proxies)
  4. Flexible configuration

    • ✅ Per-category limits (auth, standard, ai, websocket)
    • ✅ Environment-driven configuration
    • ✅ Redis support for distributed deployments

🐛 Potential Issues

1. Decorator Pattern Complexity (Medium Priority)

Location: rate_limiter.py:238-268

The _create_rate_limit_decorator creates a decorator that returns a wrapper that checks config on every call:

def decorator():
    def wrapper(func: Callable) -> Callable:
        limiter = get_rate_limiter()  # Called every time
        config = get_rate_limit_config()  # Cached, but still checked
        if limiter is None or not config.enabled:
            return func
        limit_value = getattr(config, limit_key, "100/minute")
        return limiter.limit(limit_value)(func)
    return wrapper

Issue: If rate limiting is disabled, the original function is returned without slowapi's decorator applied. This means:

  • Functions get different identities depending on config
  • Potential issues with FastAPI's dependency injection

Recommendation:

def wrapper(func: Callable) -> Callable:
    limiter = get_rate_limiter()
    if limiter is None:
        return func  # OK - rate limiting globally disabled
    
    limit_value = getattr(config, limit_key, "100/minute")
    # Always apply the decorator, let slowapi handle the logic
    return limiter.limit(limit_value)(func)

2. Error Handling in Audit Logging (Low Priority)

Location: rate_limiter.py:194-214

The audit logging swallows all exceptions with a debug log:

except Exception as e:
    logger.debug(f"Failed to log rate limit event to audit log: {e}")

Issue: This is correct behavior (don't fail the request), but logger.debug might hide important issues in production.

Recommendation: Use logger.warning instead to ensure visibility in production logs.

3. Missing Rate Limit Headers (Medium Priority)

Location: rate_limiter.py:168-235

The exception handler returns:

headers = {
    "Retry-After": str(exc.detail) if hasattr(exc, "detail") else "60",
}
if hasattr(exc, "limit"):
    headers["X-RateLimit-Limit"] = str(exc.limit)

Missing: Standard rate limit headers:

  • X-RateLimit-Remaining - Requests remaining in window
  • X-RateLimit-Reset - Unix timestamp when limit resets

Recommendation: Add these headers for better client behavior. Clients can implement exponential backoff or smart retry logic.

4. Lack of Rate Limit Bypass for Tests (Low Priority)

Location: N/A

Issue: Integration tests will hit rate limits when running large test suites.

Recommendation: Add a test fixture or environment variable (RATE_LIMIT_TESTING=true) that disables rate limiting:

# In conftest.py
@pytest.fixture(autouse=True)
def disable_rate_limiting_in_tests():
    with patch.dict(os.environ, {"RATE_LIMIT_ENABLED": "false"}):
        yield

🚀 Performance Considerations

1. In-Memory Storage for Single Instance

Good choice for default. In-memory is fast and sufficient for self-hosted single-instance deployments.

2. Redis for Distributed Deployments

Redis support is present but requires manual configuration. Consider documenting:

  • Redis version requirements
  • Connection pool configuration
  • Failover behavior (what happens if Redis goes down?)

3. Cache Usage

get_rate_limit_config() uses @lru_cache(maxsize=1) for singleton pattern. This is efficient.


🔒 Security Assessment

Strengths ✅

  1. Header spoofing prevention - Only trusts proxy headers from configured trusted IPs/networks
  2. DoS mitigation - "unknown" IPs get unique buckets to prevent shared rate limit exhaustion
  3. Audit logging - Rate limit events are logged for security monitoring
  4. Granular limits - Different rates for auth (10/min) vs standard (100/min) vs AI (20/min)

Recommendations

  1. Document proxy configuration in .env.example with examples:

    # If behind AWS ALB
    RATE_LIMIT_TRUSTED_PROXIES=10.0.0.0/8
    
    # If behind nginx on specific IP
    RATE_LIMIT_TRUSTED_PROXIES=172.31.0.10
    
    # Multiple proxies
    RATE_LIMIT_TRUSTED_PROXIES=10.0.0.1,192.168.1.1,172.16.0.0/12
  2. Add rate limit monitoring dashboard (future enhancement) - Track:

    • Top rate-limited IPs/users
    • Endpoints with most 429s
    • Potential abuse patterns

📋 Test Coverage

What's Tested ✅

  • Configuration parsing and validation
  • Trusted proxy IP/CIDR matching
  • Client IP extraction with/without proxy headers
  • Rate limit key generation (user ID vs IP)
  • Decorator behavior when enabled/disabled

What's Missing

  1. Integration tests with actual FastAPI endpoints - Test that decorators work with real routes
  2. Redis storage backend tests - Currently only in-memory is tested
  3. Concurrent request tests - Verify rate limiting works under load
  4. 429 response format validation - Ensure proper headers and JSON body

Recommendation

Add integration test:

def test_rate_limit_with_real_endpoint(client):
    """Test rate limiting on actual endpoint."""
    # Make 11 requests to auth endpoint (limit: 10/minute)
    for i in range(11):
        response = client.post("/auth/login", json={...})
        if i < 10:
            assert response.status_code in [200, 401]  # Normal response
        else:
            assert response.status_code == 429
            assert "Retry-After" in response.headers
            assert response.json()["error"] == "rate_limit_exceeded"

📝 Documentation

Strengths ✅

  • Good docstrings in rate_limits.py and rate_limiter.py
  • .env.example has comprehensive comments
  • PR description includes configuration examples

Recommendations

  1. Add to main README or create docs/RATE_LIMITING.md:

    • Why rate limiting was added
    • Default limits and how to customize
    • Redis setup for multi-instance deployments
    • Monitoring and troubleshooting
  2. Add ADR (Architecture Decision Record) documenting:

    • Why slowapi was chosen over alternatives
    • Why different rate limit categories
    • Security threat model (what attacks does this prevent?)

🎯 Summary

Must Address Before Merge

  1. V1/V2 architecture alignment - Clarify strategy and document decision
  2. ⚠️ Decorator wrapper behavior - Fix function identity issue when rate limiting is disabled

Should Address Before Merge

  1. 🔧 Missing rate limit headers - Add X-RateLimit-Remaining and X-RateLimit-Reset
  2. 📊 Integration tests - Add at least one end-to-end test with real endpoints

Nice to Have (Future PRs)

  1. 📚 Documentation - Add docs/RATE_LIMITING.md
  2. 🧪 Redis backend tests - Verify distributed rate limiting
  3. 📈 Monitoring - Rate limit metrics and dashboard

🎉 Conclusion

This is high-quality, production-ready code with excellent security practices and test coverage. The main concern is architectural fit within the v1→v2 migration strategy. Once the v2 alignment question is resolved, this PR is very close to mergeable.

Great work on the implementation! 👏

Integration tests:
- Add TestRateLimitingIntegration class with 5 tests
- Verify rate limiting allows requests within limit
- Verify 429 response when limit exceeded
- Verify Retry-After header and response body format
- Test rate limiting on real v2 endpoint

Decorator pattern fix:
- Remove redundant config.enabled check (get_rate_limiter already
  returns None when disabled)
- Simplify to: if limiter is None, return func; otherwise apply decorator
- This ensures consistent function identity regardless of config state
Comment on lines +122 to +127
if client_ip == "unknown":
# Use a unique key per request for unknown IPs
# This effectively gives each "unknown" request its own bucket
# preventing the shared bucket DoS attack
request_id = id(request)
return f"ip:unknown:{request_id}"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium

codeframe/lib/rate_limiter.py:122

Using id(request) creates a unique bucket per request, which completely bypasses rate limiting for unknown IPs. Consider using a shared key like "ip:unknown" instead—a shared bucket is safer than no rate limiting at all.

Suggested change
if client_ip == "unknown":
# Use a unique key per request for unknown IPs
# This effectively gives each "unknown" request its own bucket
# preventing the shared bucket DoS attack
request_id = id(request)
return f"ip:unknown:{request_id}"
if client_ip == "unknown":
return "ip:unknown"

🚀 Want me to fix this? Reply ex: "fix it for me".

@claude

claude Bot commented Feb 3, 2026

Copy link
Copy Markdown

Code Review - API Rate Limiting Implementation

I've reviewed PR #327. The implementation is well-structured with good security practices, but there are some important considerations:

Strengths

  1. Security-First Design: Excellent trusted proxy validation preventing header spoofing, smart unknown IP handling, proper audit logging
  2. Architecture: Clean separation of concerns, delegates to GlobalConfig, graceful degradation to memory storage
  3. Test Coverage: 23 new tests with good coverage of key extraction, decorators, and edge cases

Critical Issue: Architecture Alignment

According to CLAUDE.md, this codebase is in v2 refactor phase (CLI-first, headless). The stated principle is:

Rule 0: If a change does not directly support Golden Path, do not implement it.

This PR adds significant server/UI functionality (2055 additions across 20+ router files), which appears to conflict with v2 architecture that states FastAPI is optional and CLI must not require a server.

Question: Is this PR targeting v1 maintenance or v2? If v1, that's fine. If v2, rate limiting may need to wait for Phase 2 (Server Layer) per the strategic roadmap.

Other Issues

  1. Bug (Medium): rate_limiter.py:158 catches only ImportError but Redis init can raise other exceptions (connection errors, auth failures) - should catch broader exceptions
  2. Code Quality (Medium): rate_limits.py:112-117 duplicates validation already in Pydantic validator - can be removed
  3. Enhancement (Low): Missing standard rate limit headers (X-RateLimit-Remaining, X-RateLimit-Reset)
  4. Testing Gap (Low): No integration test for Redis storage path
  5. Code Quality (Low): Unnecessary nesting in _create_rate_limit_decorator

Summary

Strong implementation overall. Main concern is architectural alignment with v2 direction. Please clarify target version.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@tests/ui/test_v2_routers_integration.py`:
- Line 890: Remove the unused import "patch" from the unittest.mock import
statement in tests/ui/test_v2_routers_integration.py (the line currently reading
"from unittest.mock import patch"); locate any references to "patch" (none
expected) and delete only the unused symbol so the file still imports any other
needed names or nothing if not required, then run lint/tests to confirm F401 is
resolved.
🧹 Nitpick comments (2)
codeframe/lib/rate_limiter.py (1)

122-128: Clarify the security tradeoff for "unknown" IP handling.

Using id(request) as a unique key per request effectively disables rate limiting for requests with undeterminable IPs—each gets its own bucket. The comment explains this prevents shared-bucket DoS, but the tradeoff is that requests from misconfigured proxies or edge cases bypass rate limits entirely.

Consider adding a comment documenting this tradeoff, or alternatively applying a very aggressive shared limit for unknown IPs (e.g., 1/second) rather than no limit at all.

💡 Alternative approach with aggressive shared limit
     # Mark "unknown" IPs specially for potential stricter handling
     if client_ip == "unknown":
-        # Use a unique key per request for unknown IPs
-        # This effectively gives each "unknown" request its own bucket
-        # preventing the shared bucket DoS attack
-        request_id = id(request)
-        return f"ip:unknown:{request_id}"
+        # Use shared bucket for unknown IPs with aggressive limit
+        # This is a tradeoff: attackers could fill this bucket, but
+        # leaving unknown IPs unlimited is worse for DoS protection.
+        # If this causes issues, ensure trusted proxies are configured.
+        return "ip:unknown"

This would require a separate, more restrictive limit for unknown IPs configured elsewhere.

tests/ui/test_v2_routers_integration.py (1)

926-929: Use a context manager with yield in the fixture to ensure proper TestClient cleanup.

TestClient leaves background threads and lifespan shutdown handlers running without proper cleanup. The fixture should use with TestClient(app) as client: and yield client to ensure deterministic cleanup when the test finishes. This pattern is already used elsewhere in the test suite and is a FastAPI testing best practice.

♻️ Proposed refactor
-        client = TestClient(app)
-        client.workspace = test_workspace
-
-        return client
+        with TestClient(app) as client:
+            client.workspace = test_workspace
+            yield client

Comment thread tests/ui/test_v2_routers_integration.py Outdated
@claude

claude Bot commented Feb 3, 2026

Copy link
Copy Markdown

Code Review: API Rate Limiting Implementation

I've reviewed PR #327 which adds comprehensive API rate limiting using slowapi. Overall, this is a well-designed and thoroughly tested implementation with strong security considerations.

Strengths

  1. Excellent Security Design - Trusted proxy validation prevents header spoofing, unique keys for unknown IPs prevent DoS
  2. Comprehensive Test Coverage - 23 new tests covering key extraction, decorators, integration, and disabled mode
  3. Flexible Configuration - Environment-driven with memory/Redis support and per-category limits
  4. Proper HTTP Responses - Standard 429 with Retry-After header and audit logging

Critical Issues

1. Redis Connection Handling (rate_limiter.py:152-160)

Only catches ImportError when initializing Redis. Connection failures will crash the app instead of falling back to memory.

Recommendation: Catch broader exceptions for graceful fallback.

2. Rate Limiter Initialization Race (server.py:263-268)

Initialized twice - module level before env vars are loaded (line 263) and in lifespan (line 213). This causes configuration inconsistencies.

Recommendation: Remove module-level init, move middleware addition into lifespan function.

3. Missing Rate Limit Format Validation (config.py:402-411)

Invalid formats like '10/week' or 'abc/minute' only fail at runtime.

Recommendation: Add Pydantic field validator for rate limit format strings.

Medium-Priority

  1. Audit logging sets limit_category=None - consider storing category for better insights
  2. Inconsistent naming: _reset_rate_limit_config (private) vs reset_global_config (public)
  3. Missing integration test for trusted proxy with actual FastAPI requests

Testing

Complete manual testing checklist:

  • Test with actual Redis (not just in-memory)
  • Test trusted proxy configuration in deployment
  • Test Redis unavailability mid-operation

Summary

Solid production-ready implementation! Must fix issues #1 and #2 before merge. Great work on security and test coverage!

@frankbria frankbria merged commit e7f9d01 into main Feb 3, 2026
14 checks passed
@frankbria frankbria deleted the feature/api-rate-limiting branch February 3, 2026 04:43
frankbria pushed a commit that referenced this pull request Feb 3, 2026
- Update status badge to Phase 2 In Progress (4285 tests)
- Add API key authentication section with CLI commands
- Add rate limiting configuration documentation
- Document new V2 API endpoints
- Add Security & API section to Key Features
- Update roadmap to show Phase 2 at 90% complete
- Mark #322, #325, #326, #327 as complete
- Update current focus to remaining items (WebSocket, OpenAPI, pagination)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Phase 2] Implement API Rate Limiting for Security

1 participant