Date: 2025-11-05
Branch: feature/api-auth-middleware-coverage
Status: 🔴 CRITICAL - Major Auth Gaps Found
Our API authentication audit reveals significant security gaps:
- 112 total endpoints in the API
- Only 6 endpoints (5%) have proper authentication
- 52 endpoints (46%) lack authentication but likely need it
- 54 endpoints (48%) marked as public (many incorrectly)
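As a quick sanity check on the figures above, the three categories should sum to the 112-endpoint total, and each percentage is the count divided by 112, rounded to the nearest whole number. The dictionary keys below are illustrative labels, not names taken from the audit script.

```python
# Endpoint counts as reported in the audit summary above.
TOTAL_ENDPOINTS = 112
counts = {
    "authenticated": 6,
    "missing_auth": 52,
    "marked_public": 54,
}

# The three categories should account for every endpoint.
assert sum(counts.values()) == TOTAL_ENDPOINTS

# Share of each category, rounded to the nearest whole percent.
percentages = {
    name: round(100 * n / TOTAL_ENDPOINTS) for name, n in counts.items()
}

for name, pct in percentages.items():
    print(f"{name}: {counts[name]} endpoints ({pct}%)")
```

The rounded shares add up to 99% rather than 100% because each one is rounded independently.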
🚨 HIGH RISK: The following endpoint categories are completely unprotected:
- User Management (/users/*) - No auth on user operations
- RBAC Management (/rbac/*) - No auth on role/permission management
- Ingestion (/ingest/*) - Anyone can ingest data
- Ontology Management (/ontology/*) - Unprotected CRUD operations
- Vocabulary Management (/vocabulary/*) - Open to all
- Admin Operations (/admin/*) - No admin protection
- Query Endpoints (/query/*) - No rate limiting or auth
Jobs API - OAuth authentication required:
- GET /jobs/{job_id} - View job details
- GET /jobs - List jobs
- DELETE /jobs/{job_id} - Delete job
- POST /jobs/{job_id}/approve - Approve job
- DELETE /jobs - Bulk delete jobs
- GET /jobs/{job_id}/stream - Stream job progress
User Management (/users/*) - no authentication:
- GET /users/me - Get current user
- GET /users/{user_id} - Get user details
- PUT /users/{user_id} - Update user
- DELETE /users/{user_id} - Delete user
- GET /users - List all users
Risk: Anyone can view, modify, or delete users!
RBAC Management (/rbac/*) - no authentication:
- All /rbac/roles/* endpoints - Create, read, update, delete roles
- All /rbac/permissions/* endpoints - Manage permissions
- /rbac/user-roles/* - Assign roles to users
- /rbac/check-permission - Check permissions
Risk: Anyone can grant themselves admin rights!
Ingestion (/ingest/*) - no authentication:
- POST /ingest - Ingest documents
- POST /ingest/text - Ingest text
- POST /ingest/image - Ingest images
Risk: Unlimited ingestion, potential DoS, data pollution!
Ontology Management (/ontology/*) - no authentication:
- GET /ontology/ - List ontologies
- GET /ontology/{ontology_name} - Get ontology details
- GET /ontology/{ontology_name}/files - List files
- DELETE /ontology/{ontology_name} - Delete ontology
Risk: Anyone can delete entire ontologies!
Vocabulary Management (/vocabulary/*) - no authentication:
- All /vocabulary/* endpoints - Full CRUD on vocabulary
- /vocabulary/search/{query} - Search vocabulary
- /vocabulary/similar/{type_name} - Find similar types
- /vocabulary/review - Review vocabulary
- /vocabulary/consolidate - Consolidate vocabulary
Risk: Vocabulary manipulation could corrupt the entire system!
Admin Operations (/admin/*) - no authentication:
- POST /admin/reset - Reset entire database!
- GET /admin/extraction - View extraction config
- PUT /admin/extraction - Change extraction config
- POST /admin/extraction/test - Test extraction
- GET /admin/embedding/* - View/change embedding providers
- POST /admin/run-migrations - Run database migrations
Risk: CRITICAL - Anyone can wipe the database!
Query Endpoints - no authentication or rate limiting:
- POST /query/search - Search concepts
- GET /query/concept/{concept_id} - Get concept details
- POST /query/related - Find related concepts
- POST /query/connect - Find connections
- POST /query/connect-by-search - Semantic path finding
- POST /query/cypher - Execute arbitrary Cypher queries!
- GET /database/stats - Database statistics
Risk: Cypher endpoint allows arbitrary database queries!
These endpoints should remain public:
- /health - Health check
- /docs, /redoc, /openapi.json - API documentation
- /auth/login - Login endpoint
- /auth/oauth/device - Device code flow
- /auth/oauth/token - Token endpoint
- /auth/register - User registration
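One way to keep this list honest is a default-deny rule: any path not explicitly allowlisted is treated as requiring authentication. The sketch below uses the public paths listed above. The function names and the shape of the route map (path mapped to whether auth is currently enforced) are assumptions for illustration, not part of the existing codebase.

```python
# Paths the audit says should remain public (taken from the list above).
PUBLIC_PATHS = frozenset({
    "/health",
    "/docs",
    "/redoc",
    "/openapi.json",
    "/auth/login",
    "/auth/oauth/device",
    "/auth/oauth/token",
    "/auth/register",
})


def requires_auth(path: str) -> bool:
    """Default-deny: anything not explicitly allowlisted needs authentication."""
    normalized = path.rstrip("/") or "/"
    return normalized not in PUBLIC_PATHS


def find_unexpected_public(routes: dict[str, bool]) -> list[str]:
    """Return paths that enforce no auth but are not on the allowlist.

    `routes` maps a path to True when that route currently enforces auth
    (a hypothetical structure; a real audit would build it from the app).
    """
    return sorted(
        path
        for path, has_auth in routes.items()
        if not has_auth and requires_auth(path)
    )


# Hypothetical snapshot of three routes and whether they enforce auth today.
example_routes = {"/health": False, "/admin/reset": False, "/jobs": True}
print(find_unexpected_public(example_routes))
```

With this approach a newly added route is flagged until someone either protects it or deliberately adds it to the allowlist.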
Current State: The API is effectively open to the public with minimal protection.
Attack Vectors:
- Data Destruction: Anyone can delete ontologies, users, or reset the entire database
- Privilege Escalation: Anyone can grant themselves admin roles
- Data Exfiltration: Unprotected query endpoints allow unlimited data access
- Resource Exhaustion: Unlimited ingestion could fill storage/crash the system
- Configuration Tampering: Anyone can change extraction/embedding providers
Compliance Risk:
- Exhibits risks listed in the OWASP API Security Top 10, including broken authentication and broken function level authorization
- Not production-ready
- Data privacy concerns (GDPR, etc.)
Priority: CRITICAL
- Admin Endpoints - Add require_role("admin") to all /admin/* endpoints
- User Management - Require authentication + ownership verification
- RBAC Endpoints - Require admin role for role/permission management
- Ontology/Vocabulary - Require authentication + permission checks
- Ingestion - Require authentication + rate limiting
- Cypher Endpoint - Either remove or require admin role with audit logging
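The role and ownership checks in the list above can be sketched without committing to a framework. The decorator below reuses the name require_role from the first item, but it is a hypothetical stand-in: the project's real helper and its signature are not shown in this document, and `AuthError`, `User`, `require_owner_or_admin`, and the two handlers are likewise assumptions made for illustration.

```python
from dataclasses import dataclass
from functools import wraps
from typing import Callable, Optional


class AuthError(Exception):
    """Carries the HTTP status a real framework would return."""

    def __init__(self, status_code: int, detail: str):
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


@dataclass(frozen=True)
class User:
    user_id: int
    roles: frozenset = frozenset()


def require_role(role: str) -> Callable:
    """Reject anonymous callers with 401 and callers lacking `role` with 403."""

    def decorator(handler: Callable) -> Callable:
        @wraps(handler)
        def wrapper(current_user: Optional[User], *args, **kwargs):
            if current_user is None:
                raise AuthError(401, "authentication required")
            if role not in current_user.roles:
                raise AuthError(403, f"role '{role}' required")
            return handler(current_user, *args, **kwargs)

        return wrapper

    return decorator


def require_owner_or_admin(current_user: Optional[User], target_user_id: int) -> None:
    """Ownership check: users may act on themselves, admins on anyone."""
    if current_user is None:
        raise AuthError(401, "authentication required")
    is_owner = current_user.user_id == target_user_id
    if not is_owner and "admin" not in current_user.roles:
        raise AuthError(403, "not allowed to modify another user")


@require_role("admin")
def reset_database(current_user: User) -> dict:
    # Placeholder body; a real handler would perform the reset here.
    return {"status": "reset requested", "requested_by": current_user.user_id}


def update_user(current_user: Optional[User], user_id: int, changes: dict) -> dict:
    require_owner_or_admin(current_user, user_id)
    # Placeholder body; a real handler would persist the changes here.
    return {"user_id": user_id, "updated": sorted(changes)}
```

The 401 versus 403 split matches the expectations in the test strategy: unauthenticated callers get 401, authenticated callers without sufficient rights get 403.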
Timeline: 1-2 weeks
- Permission-Based Authorization
  - Implement fine-grained permissions (ADR-028)
  - Resource-level access control
  - Ontology-scoped permissions
- Rate Limiting
  - Add rate limits to query endpoints
  - Prevent ingestion abuse
  - API key quotas
- Audit Logging
  - Log all authenticated requests
  - Track admin operations
  - Security event monitoring
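For the rate-limiting item, a token bucket is one common choice; the document does not say which algorithm is planned, so the sketch below is only one option. The class names, the per-client key, and the injectable clock (used to make the behavior testable) are all assumptions.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Allows bursts up to `capacity`, refilled at `refill_per_second`."""

    def __init__(
        self,
        capacity: int,
        refill_per_second: float,
        clock: Callable[[], float] = time.monotonic,
    ):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Spend `cost` tokens if available; return False when over the limit."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill for the elapsed time, never exceeding the bucket capacity.
        self._tokens = min(
            self.capacity, self._tokens + elapsed * self.refill_per_second
        )
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


class RateLimiter:
    """Keeps one bucket per client key, for example a user ID or API key."""

    def __init__(
        self,
        capacity: int,
        refill_per_second: float,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._capacity = capacity
        self._refill = refill_per_second
        self._clock = clock
        self._buckets: Dict[str, TokenBucket] = {}

    def allow(self, client_key: str) -> bool:
        bucket = self._buckets.get(client_key)
        if bucket is None:
            bucket = TokenBucket(self._capacity, self._refill, self._clock)
            self._buckets[client_key] = bucket
        return bucket.allow()
```

A request handler would call `limiter.allow(key)` before doing any work and return HTTP 429 when it yields False. This in-memory version only limits within one process; a deployment with several workers would need shared state.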
Timeline: 2-4 weeks
- Comprehensive Test Suite
  - Test all endpoints for auth requirements
  - Automated security testing in CI/CD
  - Regression tests for auth bypass
- Documentation
  - Document auth requirements per endpoint
  - Security best practices guide
  - Developer auth testing guide
- Monitoring
  - Alert on unauthorized access attempts
  - Track unusual API usage patterns
  - Security metrics dashboard
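Alerting on unauthorized access attempts could start as a sliding-window count of 401 and 403 responses per client. The sketch below is a minimal in-memory version. The class name, the default threshold, the window length, and the idea of keying on a client identifier are assumptions rather than existing project settings.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class UnauthorizedAccessMonitor:
    """Flags clients that accumulate too many 401/403 responses in a window."""

    def __init__(self, threshold: int = 5, window_seconds: float = 60.0):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._events: Dict[str, Deque[float]] = defaultdict(deque)

    def record(self, client: str, status_code: int, timestamp: float) -> bool:
        """Record one response; return True when `client` should raise an alert."""
        if status_code not in (401, 403):
            return False
        events = self._events[client]
        events.append(timestamp)
        # Discard rejections that have fallen out of the sliding window.
        cutoff = timestamp - self.window_seconds
        while events and events[0] <= cutoff:
            events.popleft()
        return len(events) >= self.threshold


monitor = UnauthorizedAccessMonitor(threshold=3, window_seconds=10.0)
for second, status in enumerate([401, 403, 200, 401]):
    if monitor.record("203.0.113.7", status, timestamp=float(second)):
        print(f"alert: repeated unauthorized requests at t={second}s")
```

In practice the alert would feed whatever monitoring system the team adopts; printing is only a placeholder.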
Implementation checklist:
- Add auth to all admin endpoints
- Protect user management endpoints
- Secure RBAC endpoints
- Add auth to ingestion endpoints
- Remove or protect Cypher endpoint
- Add permissions to ontology endpoints
- Add permissions to vocabulary endpoints
- Implement rate limiting
- Add audit logging
- Write comprehensive auth tests
- Add tests to CI/CD pipeline
- Document all auth requirements
- Create security testing guide
- Add monitoring/alerting
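For the auth-test and documentation items, one starting point is to scan the OpenAPI schema (the document served at /openapi.json) for operations that declare no security requirement. The sketch below works on a plain schema dictionary, so it is independent of the web framework. The function name and the sample schema are hypothetical, and "OAuth2" is just an example scheme name.

```python
from typing import List, Tuple

# Keys of an OpenAPI path item that denote operations.
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def operations_missing_security(
    schema: dict, public_paths: frozenset
) -> List[Tuple[str, str]]:
    """List (METHOD, path) pairs outside `public_paths` with no security requirement."""
    global_security = bool(schema.get("security"))
    missing = []
    for path, path_item in schema.get("paths", {}).items():
        if path in public_paths:
            continue
        for method, operation in path_item.items():
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            # Operation-level `security` overrides the top-level value;
            # an empty list explicitly means "no authentication".
            if "security" in operation:
                protected = bool(operation["security"])
            else:
                protected = global_security
            if not protected:
                missing.append((method.upper(), path))
    return sorted(missing)


# Hypothetical schema fragment for illustration only.
sample_schema = {
    "paths": {
        "/health": {"get": {}},
        "/admin/reset": {"post": {}},
        "/jobs": {
            "parameters": [],
            "get": {"security": [{"OAuth2": []}]},
            "delete": {"security": []},
        },
    }
}
print(operations_missing_security(sample_schema, frozenset({"/health"})))
```

The resulting pairs could seed the endpoint lists that the auth tests iterate over, so that new routes are covered without hand-maintaining those lists.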
Create pytest fixtures:

    import pytest

    @pytest.fixture
    def admin_client(app):
        """Client with admin authentication"""
        # Implementation

    @pytest.fixture
    def user_client(app):
        """Client with regular user authentication"""
        # Implementation

    @pytest.fixture
    def anonymous_client(app):
        """Client without authentication"""
        # Implementation

Test role and authentication enforcement (ADMIN_ENDPOINTS and PROTECTED_ENDPOINTS are endpoint lists still to be defined alongside the tests):

    def test_admin_endpoints_require_admin_role(anonymous_client, user_client):
        """Verify admin endpoints reject non-admin users"""
        for endpoint in ADMIN_ENDPOINTS:
            # Anonymous should get 401
            response = anonymous_client.post(endpoint)
            assert response.status_code == 401
            # Regular user should get 403
            response = user_client.post(endpoint)
            assert response.status_code == 403

    def test_protected_endpoints_require_auth(anonymous_client):
        """Verify all protected endpoints reject unauthenticated requests"""
        for endpoint in PROTECTED_ENDPOINTS:
            response = anonymous_client.request(endpoint.method, endpoint.path)
            assert response.status_code in [401, 403]

Test that the OpenAPI schema documents security (PROTECTED_PATHS is likewise still to be defined; the test takes the same app fixture the client fixtures use):

    def test_openapi_schema_documents_security(app):
        """Verify OpenAPI schema correctly marks protected endpoints"""
        schema = app.openapi()
        for path, methods in schema["paths"].items():
            if path in PROTECTED_PATHS:
                for method, details in methods.items():
                    assert "security" in details, \
                        f"{method} {path} should have security requirement"

- Full Audit Report: API_AUTH_AUDIT_RESULTS.md
- Research Document: API_AUTH_TESTING_RESEARCH.md
- Audit Script: scripts/development/audit-api-auth.py
- Development Scripts Guide: scripts/development/README.md
- ADR-054: OAuth 2.0 Authentication
- ADR-028: Dynamic RBAC
- Review this document with the team
- Prioritize critical fixes (admin, user management, RBAC)
- Create implementation tasks in project tracker
- Begin Phase 1 security hardening immediately
- Schedule security review after fixes
Status: 🔴 Draft - Requires immediate action
Owner: Engineering Team
Reviewer: Security Team
Target Date: Phase 1 complete within 1 week