Commit 8e290d4

Merge pull request #1584 from cnoe-io/prebuild/collapse-rbac-kb-prs
feat(rbac): add fine-grained Knowledge Base access controls
2 parents 1c61c61 + f653f0a commit 8e290d4

85 files changed

Lines changed: 6329 additions & 2377 deletions

.dockerignore

Lines changed: 4 additions & 0 deletions
@@ -159,3 +159,7 @@ docs/_build/
 .env
 .env.local
 .env.*.local
+ui/.env
+ui/.env.local
+ui/.env.*.local
+!ui/.env.example

Makefile

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@ APP_NAME ?= ai-platform-engineering
 # or unset to restore unlimited parallelism. Used by every target below that
 # invokes `docker compose ... up --build`.
 COMPOSE_PARALLEL_LIMIT ?= 4
-DOCKER_COMPOSE_BUILD_ENV := COMPOSE_PARALLEL_LIMIT=$(COMPOSE_PARALLEL_LIMIT) BUILDKIT_MAX_PARALLELISM=$(COMPOSE_PARALLEL_LIMIT)
+DOCKER_COMPOSE_BUILD_ENV := DOCKER_BUILDKIT=1 COMPOSE_PARALLEL_LIMIT=$(COMPOSE_PARALLEL_LIMIT) BUILDKIT_MAX_PARALLELISM=$(COMPOSE_PARALLEL_LIMIT)
 
 ## -------------------------------------------------
 .PHONY: \

ai_platform_engineering/knowledge_bases/rag/README.md

Lines changed: 8 additions & 14 deletions
@@ -20,7 +20,7 @@
 ## Quick Start
 
 ```bash
-# Start all services (development mode with trusted network access)
+# Start all services
 docker compose --profile apps up
 ```
 
@@ -31,35 +31,29 @@ docker compose --profile apps up
 
 ### Authentication
 
-**Development (Trusted Network - Default):**
-Services trust localhost connections without authentication.
-
-**Production (OIDC/OAuth2):**
-Configure environment variables for JWT-based authentication:
+Configure environment variables for JWT authentication and OpenFGA authorization:
 
 ```bash
 # UI authentication (OIDC)
 OIDC_ISSUER=https://your-keycloak.com/realms/production
 OIDC_CLIENT_ID=rag-ui
 OIDC_CLIENT_SECRET=xxx
-OIDC_GROUP_CLAIM=groups # Optional: auto-detects if empty; supports comma-separated (e.g., "groups,members,roles")
 
 # Ingestor authentication (OAuth2 client credentials)
 INGESTOR_OIDC_ISSUER=https://your-keycloak.com/realms/production
 INGESTOR_OIDC_CLIENT_ID=rag-ingestor
 INGESTOR_OIDC_CLIENT_SECRET=xxx
 
-# Disable trusted network in production
-ALLOW_TRUSTED_NETWORK=false
-
-# Role-based access control (map groups to roles)
-RBAC_ADMIN_GROUPS=rag-admins,platform-admins
-RBAC_INGESTONLY_GROUPS=rag-ingestors
-RBAC_READONLY_GROUPS=rag-readers
+# Human KB authorization
+OPENFGA_HTTP=http://openfga:8080
 ```
 
 **Supported OIDC Providers:** Keycloak, Azure AD, Okta, AWS Cognito
 
+RAG treats human JWTs as identity-only. KB, data-source, and tool authorization
+comes from OpenFGA relationships rather than AD/OIDC groups or Keycloak realm
+roles.
+
 If you have Claude code, VS code, Cursor etc. you can connect upto the MCP server running at http://localhost:9446/mcp
 
 **Documentation:**
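
Reviewer sketch for the README change above: since KB authorization now comes from OpenFGA relationships, a caller would ask OpenFGA whether a subject holds a relation on a knowledge base. The example below builds a request for OpenFGA's HTTP Check endpoint (`POST /stores/{store_id}/check`). The store ID, the `user` and `knowledge_base` type names, and the `reader` relation are assumptions for illustration; the commit excerpt does not show the authorization model or how the RAG server calls OpenFGA.

```python
import json
import os
import urllib.request
from typing import Dict, Tuple

# OPENFGA_HTTP is the variable named in the README; the default mirrors its example value.
OPENFGA_HTTP = os.environ.get("OPENFGA_HTTP", "http://openfga:8080")


def build_check_request(store_id: str, subject: str, relation: str, kb_id: str,
                        base_url: str = OPENFGA_HTTP) -> Tuple[str, Dict]:
    """Return the URL and JSON body for an OpenFGA Check call."""
    url = f"{base_url.rstrip('/')}/stores/{store_id}/check"
    body = {
        "tuple_key": {
            "user": f"user:{subject}",            # hypothetical type name
            "relation": relation,                  # hypothetical relation, e.g. "reader"
            "object": f"knowledge_base:{kb_id}",   # hypothetical type name
        }
    }
    return url, body


def is_allowed(store_id: str, subject: str, relation: str, kb_id: str) -> bool:
    """Send the Check request and read the `allowed` flag from the response."""
    url, body = build_check_request(store_id, subject, relation, kb_id)
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return bool(json.load(response).get("allowed", False))
```

Here `subject` would be the JWT subject, which matches the README's statement that human JWTs are treated as identity-only.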

ai_platform_engineering/knowledge_bases/rag/common/src/common/ingestor.py

Lines changed: 12 additions & 17 deletions
@@ -21,9 +21,7 @@ class Client:
     """
     Client bindings for RAG server REST API - handles ingestor lifecycle and data ingestion
 
-    Supports two authentication modes:
-    1. OAuth2 Client Credentials (production) - via INGESTOR_OIDC_* env vars
-    2. Trusted Network (development) - no authentication required
+    Uses OAuth2 Client Credentials via INGESTOR_OIDC_* env vars.
     """
 
     def __init__(self, ingestor_name: str, ingestor_type: str, ingestor_description: str = "", ingestor_metadata: Optional[Dict[str, Any]] = {}):
@@ -83,7 +81,7 @@ def __init__(self, ingestor_name: str, ingestor_type: str, ingestor_description:
             logger.info(f" - INGESTOR_OIDC_DISCOVERY_URL: {'SET' if has_discovery_url else 'NOT SET'}")
             logger.info(f" - INGESTOR_OIDC_CLIENT_ID: {'SET' if has_client_id else 'NOT SET'}")
             logger.info(f" - INGESTOR_OIDC_CLIENT_SECRET: {'SET' if has_client_secret else 'NOT SET'}")
-            logger.info(" - Authentication mode: TRUSTED NETWORK (will send unauthenticated requests)")
+            logger.info(" - Authentication mode: NOT CONFIGURED (requests will fail before send)")
 
         # Note: Health check will be done during initialize() with aiohttp
 
@@ -212,20 +210,22 @@ async def _fetch_discovery(self, discovery_url: str) -> str:
         logger.info(f"Ingestor '{self.ingestor_name}': ✓ Discovered token endpoint: {self._token_endpoint}")
         return self._token_endpoint
 
-    async def _get_access_token(self) -> Optional[str]:
+    async def _get_access_token(self) -> str:
         """
         Get valid OAuth2 access token using client credentials flow.
 
         Token is cached and automatically refreshed before expiry.
-        Returns None if OAuth2 is not configured (trusted network mode).
 
         Returns:
-            Access token string or None
+            Access token string
         """
         # Check if OAuth2 is configured (need either issuer or discovery URL, plus client credentials)
         if not (self.oidc_issuer or self.oidc_discovery_url) or not self.oidc_client_id or not self.oidc_client_secret:
-            # Trusted network mode - no token needed
-            return None
+            raise RuntimeError(
+                "Ingestor OAuth2 client credentials are required: configure "
+                "INGESTOR_OIDC_ISSUER or INGESTOR_OIDC_DISCOVERY_URL, "
+                "INGESTOR_OIDC_CLIENT_ID, and INGESTOR_OIDC_CLIENT_SECRET"
+            )
 
         # Check if cached token is still valid (with 60s buffer)
         if self._access_token and self._token_expiry:
@@ -275,8 +275,7 @@ async def _get_auth_headers(self) -> Dict[str, str]:
         """
         Get authentication headers for RAG server requests.
 
-        Returns headers with Authorization Bearer token if OAuth2 is configured,
-        otherwise returns basic headers for trusted network mode.
+        Returns headers with an Authorization Bearer token.
 
         Also includes X-Ingestor-Type and X-Ingestor-Name headers for identification.
 
@@ -289,13 +288,9 @@ async def _get_auth_headers(self) -> Dict[str, str]:
         headers["X-Ingestor-Type"] = self.ingestor_type
         headers["X-Ingestor-Name"] = self.ingestor_name
 
-        # Get access token (None if trusted network mode)
         token = await self._get_access_token()
-        if token:
-            headers["Authorization"] = f"Bearer {token}"
-            logger.debug(f"Ingestor '{self.ingestor_name}': Sending AUTHENTICATED request (OAuth2 Bearer token)")
-        else:
-            logger.debug(f"Ingestor '{self.ingestor_name}': Sending UNAUTHENTICATED request (trusted network mode)")
+        headers["Authorization"] = f"Bearer {token}"
+        logger.debug(f"Ingestor '{self.ingestor_name}': Sending AUTHENTICATED request (OAuth2 Bearer token)")
 
         return headers
 
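
Reviewer sketch of the behavior change in `ingestor.py`: the client no longer falls back to unauthenticated requests, so incomplete OAuth2 settings raise before any request is sent, and every request carries a Bearer token. The standalone, synchronous version below is a simplification under stated assumptions: the helper names are hypothetical, and representing the token expiry as epoch seconds is an assumption, since the excerpt only shows that a 60-second buffer is applied.

```python
import time
from typing import Dict, Optional


def require_oauth_config(issuer: Optional[str], discovery_url: Optional[str],
                         client_id: Optional[str], client_secret: Optional[str]) -> None:
    """Raise if client-credentials settings are incomplete (no unauthenticated fallback)."""
    if not (issuer or discovery_url) or not client_id or not client_secret:
        raise RuntimeError(
            "Ingestor OAuth2 client credentials are required: configure "
            "INGESTOR_OIDC_ISSUER or INGESTOR_OIDC_DISCOVERY_URL, "
            "INGESTOR_OIDC_CLIENT_ID, and INGESTOR_OIDC_CLIENT_SECRET"
        )


def token_is_fresh(expiry_epoch: Optional[float], now: Optional[float] = None,
                   buffer_seconds: float = 60.0) -> bool:
    """Treat a cached token as valid only if it outlives the buffer (assumed epoch seconds)."""
    if expiry_epoch is None:
        return False
    current = time.time() if now is None else now
    return current < expiry_epoch - buffer_seconds


def build_auth_headers(token: str, ingestor_type: str, ingestor_name: str) -> Dict[str, str]:
    """Build identification headers plus the now-mandatory Authorization header."""
    return {
        "X-Ingestor-Type": ingestor_type,
        "X-Ingestor-Name": ingestor_name,
        "Authorization": f"Bearer {token}",
    }
```

Raising instead of returning `None` is what lets the return type tighten from `Optional[str]` to `str`, which in turn removes the `if token:` branch from the header builder.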

ai_platform_engineering/knowledge_bases/rag/common/src/common/models/rbac.py

Lines changed: 10 additions & 49 deletions
@@ -1,65 +1,36 @@
-"""
-Shared RBAC models for the RAG system.
-
-Includes legacy role-based models (Role, UserContext) and 098 Enterprise RBAC
-models (TeamKbOwnership, TeamRagToolConfig) for team-scoped RAG tool management.
-"""
+"""Shared RBAC models for the RAG system."""
 from datetime import datetime
 from typing import List, Optional
 from pydantic import BaseModel, Field
 
 
-class KeycloakRole:
-    """Realm role name constants used in Keycloak JWT ``roles`` claim (098 Enterprise RBAC)."""
-
-    ADMIN = "admin"
-    KB_ADMIN = "kb_admin"
-    TEAM_MEMBER = "team_member"
-    CHAT_USER = "chat_user"
-    DENIED = "denied"
-    # Spec 104 — platform-wide admin (assigned to BOOTSTRAP_ADMIN_EMAILS users
-    # by init-idp.sh). Treated as a synonym for ADMIN by the RAG mapper so a
-    # single Keycloak role grants both AgentGateway admin and RAG admin.
-    ADMIN_USER = "admin_user"
-
-
-class KbPermission(BaseModel):
-    """Per-knowledge-base permission parsed from realm roles such as ``kb_reader:my-kb``."""
-
-    kb_id: str
-    scope: str
-
-    class Config:
-        frozen = True
-
-
 class Role:
     """
     Role definitions with hierarchical permissions.
 
     Hierarchy (higher level inherits lower level permissions):
-    0. ANONYMOUS - No access (unauthenticated users)
     1. READONLY - Read-only access (GET, query, explore)
     2. INGESTONLY - Read + ingest data (POST ingest, manage jobs)
     3. ADMIN - Full access including deletions and bulk operations
     """
 
-    ANONYMOUS = "anonymous"
     READONLY = "readonly"
     INGESTONLY = "ingestonly"
     ADMIN = "admin"
 
 
 class UserContext(BaseModel):
-    """User authentication and authorization context"""
+    """Authenticated identity context.
+
+    Human resource authorization is resolved through OpenFGA using ``subject``.
+    Static IdP groups, AD groups, and Keycloak realm roles are intentionally not
+    carried in this model.
+    """
 
     subject: Optional[str] = None
     email: str
-    groups: List[str]
     role: str
     is_authenticated: bool
-    kb_permissions: List[KbPermission] = Field(default_factory=list)
-    realm_roles: List[str] = Field(default_factory=list)
 
     class Config:
         frozen = True # Immutable for security
@@ -71,35 +42,25 @@ class UserInfoResponse(BaseModel):
     email: str
     role: str
     is_authenticated: bool
-    groups: List[str]
     permissions: List[str] # List of permissions: ["read", "ingest", "delete"]
-    in_trusted_network: bool
-
-
-# ============================================================================
-# 098 Enterprise RBAC — Team-scoped RAG models (data-model.md)
-# ============================================================================
 
 
 class TeamKbOwnership(BaseModel):
     """
-    Team/KB ownership assignment stored in MongoDB (FR-009, FR-015).
+    Team/KB ownership metadata stored in MongoDB.
 
-    Defines which knowledge bases and datasources a team is permitted to access.
-    The ``keycloak_role`` field links this assignment to the Keycloak realm role
-    that gates access (e.g. ``team_member(team-a)``).
+    Runtime RAG authorization decisions are made through OpenFGA relationships.
     """
     team_id: str
     tenant_id: str
     kb_ids: List[str] = Field(default_factory=list)
     allowed_datasource_ids: List[str] = Field(default_factory=list)
-    keycloak_role: str
     updated_at: datetime = Field(default_factory=datetime.utcnow)
 
 
 class TeamRagToolConfig(BaseModel):
     """
-    Team-scoped RAG tool configuration stored in MongoDB (FR-009).
+    Team-scoped RAG tool configuration stored in MongoDB.
 
     Validation rules:
     - ``datasource_ids`` must be a subset of the owning team's
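
Reviewer sketch of the remaining `Role` hierarchy: with `ANONYMOUS` removed, three levels are left, and each level inherits the permissions of the levels below it. The example derives the `permissions` list (`"read"`, `"ingest"`, `"delete"`, as in the `UserInfoResponse` comment) from a role string. The numeric levels follow the docstring, but the mapping tables and the `permissions_for` helper are hypothetical; the excerpt does not show how the repository computes permissions.

```python
from typing import Dict, List

# Levels taken from the Role docstring; unknown roles fall back to level 0 (no access).
ROLE_LEVELS: Dict[str, int] = {"readonly": 1, "ingestonly": 2, "admin": 3}

# Minimum level required for each permission (assumed mapping for illustration).
PERMISSION_MIN_LEVEL: Dict[str, int] = {"read": 1, "ingest": 2, "delete": 3}


def permissions_for(role: str) -> List[str]:
    """Return every permission whose minimum level the role meets or exceeds."""
    level = ROLE_LEVELS.get(role, 0)
    return [name for name, minimum in PERMISSION_MIN_LEVEL.items() if level >= minimum]


print(permissions_for("ingestonly"))  # ['read', 'ingest']
```

A role string that is not in the table, such as the removed `"anonymous"`, yields an empty list here rather than an error.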
