| 1 | +# AI Chat API Gateway |
| 2 | + |
| 3 | +## Executive Summary |
| 4 | + |
| 5 | +This application provides a secure, enterprise-ready API gateway for AI-powered chat interactions. It enables organizations to offer domain-specific AI assistance with granular access control, so that sensitive conversations in areas like finance, healthcare, and taxes are handled with appropriate security and compliance measures. |
| 6 | + |
| 7 | +### Key Business Value |
| 8 | + |
| 9 | +- **Domain-Specific Intelligence**: Tailored AI responses for finance, medicine, taxes, vacation rentals, and general inquiries |
| 10 | +- **Security & Compliance**: Multi-tier security model with role-based access control (read-only, read-write, admin) |
| 11 | +- **Enterprise Authentication**: OAuth 2.0 implementation with JWT tokens for secure API access |
| 12 | +- **Flexible Integration**: Compatible with multiple AI models through LiteLLM integration |
| 13 | +- **Operational Excellence**: Health monitoring, structured error handling, and production-ready architecture |
| 14 | + |
| 15 | +### Use Cases |
| 16 | + |
| 17 | +- **Financial Services**: Provide compliant financial guidance without offering investment advice |
| 18 | +- **Healthcare**: Deliver medical information with appropriate disclaimers and safety guardrails |
| 19 | +- **Tax Services**: Assist with tax-related questions while maintaining legal boundaries |
| 20 | +- **Property Management**: Support vacation rental inquiries with domain-specific knowledge |
| 21 | +- **General Support**: Handle diverse customer service interactions with context-aware responses |
| 22 | + |
| 23 | +--- |
| 24 | + |
| 25 | +## Quick Start |
| 26 | + |
| 27 | +### Prerequisites |
| 28 | + |
| 29 | +- Node.js 20+ |
| 30 | +- Docker and Docker Compose (for LiteLLM) |
| 31 | +- npm or yarn |
| 32 | + |
| 33 | +### Installation |
| 34 | + |
| 35 | +1. **Install dependencies** (from the root of the cloned repository): |
| 36 | + ```bash |
| 37 | + npm install |
| 38 | + ``` |
| 39 | + |
| 40 | +2. **Configure environment variables:** |
| 41 | + Copy the `.env.example` file to `.env` and update with your credentials: |
| 42 | + ```bash |
| 43 | + cp .env.example .env |
| 44 | + ``` |
| 45 | + |
| 46 | +3. **Start LiteLLM server:** |
| 47 | + ```bash |
| 48 | + docker-compose up -d |
| 49 | + ``` |
| 50 | + |
| 51 | +4. **Build and start the application:** |
| 52 | + ```bash |
| 53 | + npm run build |
| 54 | + npm start |
| 55 | + ``` |
| 56 | + |
| 57 | + Or for development: |
| 58 | + ```bash |
| 59 | + npm run dev |
| 60 | + ``` |
| 61 | + |
| 62 | +The API will be available at `http://localhost:3000` by default (the port is set by `PORT`). |
| 63 | + |
| 64 | +--- |
| 65 | + |
| 66 | +## API Overview |
| 67 | + |
| 68 | +### Endpoints |
| 69 | + |
| 70 | +#### Public Endpoints |
| 71 | + |
| 72 | +- **`POST /:level/chat`** - Unauthenticated chat endpoint (level: `minnow` or `shark`) |
| 73 | +- **`GET /health`** - Health check endpoint |
| 74 | + |
| 75 | +#### Authenticated Endpoints |
| 76 | + |
| 77 | +- **`POST /authorized/:level/chat`** - Authenticated chat endpoint (requires Bearer token, level: `minnow` or `shark`) |
| 78 | + |
| 79 | +#### Security Levels |
| 80 | + |
| 81 | +Security levels are specified using fish names in the URL path: |
| 82 | +- **`minnow`** - Standard security level (equivalent to "insecure") |
| 83 | +- **`shark`** - Enhanced security level (equivalent to "secure") with compliance guardrails |
| 84 | + |
| 85 | +#### OAuth Endpoints |
| 86 | + |
| 87 | +- **`POST /oauth/token`** - OAuth 2.0 token endpoint (client credentials grant) |
| 88 | +- **`GET /.well-known/jwks.json`** - JSON Web Key Set for token verification |
| 89 | + |
| 90 | +### Authentication Flow |
| 91 | + |
| 92 | +1. Obtain an access token from `/oauth/token` using client credentials |
| 93 | +2. Include the token in the `Authorization` header: `Bearer <token>` |
| 94 | +3. Access protected endpoints with the token |
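
The three steps above can be sketched in TypeScript as two small request builders. This is a minimal illustration, not the gateway's own client code: the names `TokenRequest`, `buildTokenRequest`, and `buildAuthHeaders` are hypothetical, and only the `/oauth/token` path, the `client_credentials` grant, and the `Bearer` header format come from this README.

```typescript
// Hypothetical shape for a prepared HTTP request (not part of the gateway).
interface TokenRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Step 1: build the form-encoded client credentials request for /oauth/token.
function buildTokenRequest(
  baseUrl: string,
  clientId: string,
  clientSecret: string
): TokenRequest {
  const fields: Array<[string, string]> = [
    ["grant_type", "client_credentials"],
    ["client_id", clientId],
    ["client_secret", clientSecret],
  ];
  // Percent-encode each pair so secrets containing "&" or "=" survive intact.
  const body = fields
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join("&");
  return {
    url: `${baseUrl}/oauth/token`,
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body,
  };
}

// Steps 2 and 3: headers for a call to a protected chat endpoint.
function buildAuthHeaders(accessToken: string): Record<string, string> {
  return {
    Authorization: `Bearer ${accessToken}`,
    "Content-Type": "application/json",
  };
}
```

The returned object can be passed to any HTTP client; the `access_token` field of the token response is what goes into `buildAuthHeaders`.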
| 95 | + |
| 96 | +--- |
| 97 | + |
| 98 | +## Configuration |
| 99 | + |
| 100 | +### Environment Variables |
| 101 | + |
| 102 | +| Variable | Description | Default | |
| 103 | +|----------|-------------|---------| |
| 104 | +| `PORT` | Server port | `3000` | |
| 105 | +| `LITELLM_SERVER_URL` | LiteLLM server URL | `http://localhost:4000` | |
| 106 | +| `OAUTH_TOKEN_EXPIRES_IN` | Token expiration (seconds) | `3600` | |
| 107 | +| `OAUTH_CLIENT_ID_READONLY` | Read-only client ID | Required | |
| 108 | +| `OAUTH_CLIENT_SECRET_READONLY` | Read-only client secret | Required | |
| 109 | +| `OAUTH_CLIENT_ID_READWRITE` | Read-write client ID | Required | |
| 110 | +| `OAUTH_CLIENT_SECRET_READWRITE` | Read-write client secret | Required | |
| 111 | +| `OAUTH_CLIENT_ID_ADMIN` | Admin client ID | Required | |
| 112 | +| `OAUTH_CLIENT_SECRET_ADMIN` | Admin client secret | Required | |
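
As a rough sketch of how the table above might be turned into a typed configuration object, the loader below applies the documented defaults and fails fast on the variables marked Required. The `GatewayConfig` shape and the `loadConfig` and `requireVar` names are assumptions made for illustration; the application's actual startup code may differ.

```typescript
// Hypothetical config shape; variable names and defaults come from the table.
type ClientRole = "readonly" | "readwrite" | "admin";

interface GatewayConfig {
  port: number;
  litellmServerUrl: string;
  tokenExpiresInSeconds: number;
  clients: Array<{ role: ClientRole; id: string; secret: string }>;
}

type Env = Record<string, string | undefined>;

// Throws when a variable marked "Required" is missing or empty.
function requireVar(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

function loadConfig(env: Env): GatewayConfig {
  const roles = ["readonly", "readwrite", "admin"] as const;
  return {
    port: Number(env["PORT"] ?? "3000"),
    litellmServerUrl: env["LITELLM_SERVER_URL"] ?? "http://localhost:4000",
    tokenExpiresInSeconds: Number(env["OAUTH_TOKEN_EXPIRES_IN"] ?? "3600"),
    // One ID/secret pair per role, e.g. OAUTH_CLIENT_ID_READONLY.
    clients: roles.map((role) => ({
      role,
      id: requireVar(env, `OAUTH_CLIENT_ID_${role.toUpperCase()}`),
      secret: requireVar(env, `OAUTH_CLIENT_SECRET_${role.toUpperCase()}`),
    })),
  };
}
```

In a Node.js process this would typically be called once at startup as `loadConfig(process.env)`.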
| 113 | + |
| 114 | +### Client Roles |
| 115 | + |
| 116 | +- **readonly**: Read-only access (tokens include `role: "readonly"`) |
| 117 | +- **readwrite**: Read-write access (tokens include `role: "readwrite"`) |
| 118 | +- **admin**: Administrative access (tokens include `role: "admin"`) |
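
One way an endpoint could act on the `role` claim is a minimum-role check like the sketch below. It assumes the roles are ordered (`readonly` < `readwrite` < `admin`) so that a higher role includes the lower ones; this README does not state that ordering, and `hasRequiredRole` is a hypothetical helper, not an existing function in the codebase.

```typescript
type Role = "readonly" | "readwrite" | "admin";

// Assumed ordering: each role includes the permissions of the ones before it.
const ROLE_RANK: Record<Role, number> = { readonly: 0, readwrite: 1, admin: 2 };

// Narrows an untrusted claim value to one of the three documented roles.
function isRole(value: unknown): value is Role {
  return value === "readonly" || value === "readwrite" || value === "admin";
}

// Returns true when the `role` claim of an already-verified token meets the
// minimum role. Unknown or missing roles are rejected.
function hasRequiredRole(claims: { role?: unknown }, minimum: Role): boolean {
  const role = claims.role;
  return isRole(role) && ROLE_RANK[role] >= ROLE_RANK[minimum];
}
```

The check only makes sense after the token's signature and expiry have been verified; it does not validate the token itself.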
| 119 | + |
| 120 | +--- |
| 121 | + |
| 122 | +## Usage Examples |
| 123 | + |
| 124 | +### Get Access Token |
| 125 | + |
| 126 | +```bash |
| 127 | +curl -X POST http://localhost:3000/oauth/token \ |
| 128 | + -H "Content-Type: application/x-www-form-urlencoded" \ |
| 129 | + -d "grant_type=client_credentials&client_id=YOUR_CLIENT_ID&client_secret=YOUR_CLIENT_SECRET" |
| 130 | +``` |
| 131 | + |
| 132 | +Response: |
| 133 | +```json |
| 134 | +{ |
| 135 | + "access_token": "eyJhbGciOiJSUzI1NiIs...", |
| 136 | + "token_type": "Bearer", |
| 137 | + "expires_in": 3600, |
| 138 | + "scope": "chat" |
| 139 | +} |
| 140 | +``` |
| 141 | + |
| 142 | +### Chat Request (Unauthenticated) |
| 143 | + |
| 144 | +```bash |
| 145 | +# Using minnow (standard) security level |
| 146 | +curl -X POST "http://localhost:3000/minnow/chat?domain=general" \ |
| 147 | + -H "Content-Type: application/json" \ |
| 148 | + -d '{ |
| 149 | + "messages": [ |
| 150 | + {"role": "user", "content": "Hello!"} |
| 151 | + ] |
| 152 | + }' |
| 153 | + |
| 154 | +# Using shark (enhanced) security level for sensitive domains |
| 155 | +curl -X POST "http://localhost:3000/shark/chat?domain=finance" \ |
| 156 | + -H "Content-Type: application/json" \ |
| 157 | + -d '{ |
| 158 | + "messages": [ |
| 159 | + {"role": "user", "content": "What is a 401(k)?"} |
| 160 | + ] |
| 161 | + }' |
| 162 | +``` |
| 163 | + |
| 164 | +### Chat Request (Authenticated) |
| 165 | + |
| 166 | +```bash |
| 167 | +# Authenticated request with shark security level |
| 168 | +curl -X POST "http://localhost:3000/authorized/shark/chat?domain=medicine" \ |
| 169 | + -H "Authorization: Bearer YOUR_ACCESS_TOKEN" \ |
| 170 | + -H "Content-Type: application/json" \ |
| 171 | + -d '{ |
| 172 | + "messages": [ |
| 173 | + {"role": "user", "content": "What are the symptoms of diabetes?"} |
| 174 | + ] |
| 175 | + }' |
| 176 | + |
| 177 | +# Authenticated request with minnow security level |
| 178 | +curl -X POST "http://localhost:3000/authorized/minnow/chat?domain=general" \ |
| 179 | + -H "Authorization: Bearer YOUR_ACCESS_TOKEN" \ |
| 180 | + -H "Content-Type: application/json" \ |
| 181 | + -d '{ |
| 182 | + "messages": [ |
| 183 | + {"role": "user", "content": "Hello!"} |
| 184 | + ] |
| 185 | + }' |
| 186 | +``` |
| 187 | + |
| 188 | +### Query Parameters |
| 189 | + |
| 190 | +- **`domain`** (optional): Domain for the chat (`general`, `finance`, `medicine`, `taxes`, `vacation-rental`). Defaults to `general`. |
| 191 | +- **`model`** (optional): AI model to use. If not specified, LiteLLM will use its default model. |
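
The defaulting rules for these two parameters can be illustrated with a small parser. This sketch uses plain TypeScript rather than the Zod schemas the application relies on, and it assumes that an unrecognized `domain` is rejected rather than silently replaced; `parseChatQuery` is a hypothetical name.

```typescript
// The five domains listed in this README.
const DOMAINS = ["general", "finance", "medicine", "taxes", "vacation-rental"] as const;
type Domain = (typeof DOMAINS)[number];

interface ChatQuery {
  domain: Domain;
  model?: string; // left unset so LiteLLM can pick its default model
}

function parseChatQuery(query: { domain?: string; model?: string }): ChatQuery {
  // `domain` is optional and defaults to "general".
  const requested = query.domain ?? "general";
  const domain = DOMAINS.find((d) => d === requested);
  if (domain === undefined) {
    // Assumption: unknown domains are an error, not a fallback to "general".
    throw new Error(`Unknown domain: ${requested}`);
  }
  // Only include `model` when the caller supplied a non-empty value.
  return query.model ? { domain, model: query.model } : { domain };
}
```

For example, `parseChatQuery({})` yields the `general` domain with no model, while `parseChatQuery({ domain: "legal" })` throws.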
| 192 | + |
| 193 | +--- |
| 194 | + |
| 195 | +## Technical Architecture |
| 196 | + |
| 197 | +### Technology Stack |
| 198 | + |
| 199 | +- **Runtime**: Node.js with TypeScript |
| 200 | +- **Framework**: Express.js |
| 201 | +- **Authentication**: OAuth 2.0 (client credentials grant) with JWT (RS256) |
| 202 | +- **Validation**: Zod for request validation |
| 203 | +- **AI Integration**: LiteLLM proxy for multi-model support |
| 204 | +- **Security**: RSA 2048-bit key pairs for JWT signing |
| 205 | + |
| 206 | +### Project Structure |
| 207 | + |
| 208 | +``` |
| 209 | +src/ |
| 210 | +├── domains/ # Domain-specific prompts and configurations |
| 211 | +│ ├── finance/ |
| 212 | +│ ├── medicine/ |
| 213 | +│ ├── taxes/ |
| 214 | +│ ├── vacation-rental/ |
| 215 | +│ └── general/ |
| 216 | +├── middleware/ # Express middleware |
| 217 | +│ └── auth.ts # JWT authentication middleware |
| 218 | +├── routes/ # API route handlers |
| 219 | +│ ├── chat.ts # Chat endpoint handlers |
| 220 | +│ └── oauth.ts # OAuth token and JWKS endpoints |
| 221 | +├── types/ # TypeScript type definitions |
| 222 | +│ └── express.d.ts # Express Request extensions |
| 223 | +├── utils/ # Utility functions |
| 224 | +│ └── jwt-keys.ts # RSA key generation and JWKS conversion |
| 225 | +└── server.ts # Application entry point |
| 226 | +``` |
| 227 | + |
| 228 | +### Security Features |
| 229 | + |
| 230 | +1. **OAuth 2.0 Implementation** |
| 231 | + - Client credentials grant flow |
| 232 | + - RSA-signed JWT tokens (RS256) |
| 233 | + - Token expiration and validation |
| 234 | + - JWKS endpoint for public key distribution |
| 235 | + |
| 236 | +2. **Role-Based Access Control** |
| 237 | + - Three-tier access model (readonly, readwrite, admin) |
| 238 | + - Role information embedded in JWT claims |
| 239 | + - Extensible for future endpoint-level restrictions |
| 240 | + |
| 241 | +3. **Domain Security Levels** |
| 242 | + - Security levels are specified in the URL path using fish names: |
| 243 | + - **`minnow`**: Standard security level (equivalent to "insecure") - Standard prompts for general use |
| 244 | + - **`shark`**: Enhanced security level (equivalent to "secure") - Enhanced prompts with compliance guardrails for sensitive domains |
| 245 | + - The fish names obscure the actual security level from end users while maintaining internal consistency |
| 246 | + |
| 247 | +### Domain-Specific Prompts |
| 248 | + |
| 249 | +The application supports five domains, each with two security levels: |
| 250 | + |
| 251 | +- **General**: Broad-purpose AI assistance |
| 252 | +- **Finance**: Financial information with compliance guardrails |
| 253 | +- **Medicine**: Medical information with appropriate disclaimers |
| 254 | +- **Taxes**: Tax-related guidance with legal boundaries |
| 255 | +- **Vacation Rental**: Property management and rental inquiries |
| 256 | + |
| 257 | +Each domain includes: |
| 258 | +- `insecure.txt`: Standard prompt configuration (used when `minnow` level is specified) |
| 259 | +- `secure.txt`: Enhanced prompt with domain-specific safety measures (used when `shark` level is specified) |
| 260 | +- `prompts.ts`: TypeScript module exporting prompt configurations |
| 261 | + |
| 262 | +The security level is specified in the URL path (`minnow` or `shark`), which internally maps to `insecure` or `secure` respectively. |
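
The mapping from URL level to prompt file might look like the following sketch. The `src/domains/<domain>/<variant>.txt` path is inferred from the project structure shown earlier, and `resolvePromptFile` is a hypothetical helper; the real `prompts.ts` modules may load prompts differently.

```typescript
type Level = "minnow" | "shark";
type PromptVariant = "insecure" | "secure";

// Fish name in the URL path -> internal prompt variant.
const LEVEL_TO_VARIANT: Record<Level, PromptVariant> = {
  minnow: "insecure",
  shark: "secure",
};

// Resolves the prompt file for a request such as POST /shark/chat?domain=finance.
function resolvePromptFile(level: string, domain: string): string {
  if (level !== "minnow" && level !== "shark") {
    throw new Error(`Unknown security level: ${level}`);
  }
  // Assumed layout: one directory per domain under src/domains/.
  return `src/domains/${domain}/${LEVEL_TO_VARIANT[level]}.txt`;
}
```

With this layout, `resolvePromptFile("shark", "finance")` points at `src/domains/finance/secure.txt`.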
| 263 | + |
| 264 | +### Key Generation |
| 265 | + |
| 266 | +- RSA 2048-bit key pairs generated on server startup |
| 267 | +- Keys stored in memory and regenerated on restart, so tokens issued before a restart no longer verify afterwards |
| 268 | +- Public keys exposed via JWKS endpoint for token verification |
| 269 | +- Private keys used exclusively for token signing |
| 270 | + |
| 271 | +### Error Handling |
| 272 | + |
| 273 | +- Structured error responses following OAuth 2.0 standards |
| 274 | +- Comprehensive validation using Zod schemas |
| 275 | +- Detailed error messages for debugging |
| 276 | +- Graceful error handling with appropriate HTTP status codes |
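
For the token endpoint, "following OAuth 2.0 standards" presumably means the error format of RFC 6749, section 5.2: a JSON body with an `error` code and an optional `error_description`, returned with HTTP 400, or 401 for `invalid_client`. The helper below is a minimal sketch of that format; the `oauthError` name and the `status`/`body` wrapper are assumptions, not the gateway's actual code.

```typescript
// Error codes defined for the token endpoint in RFC 6749, section 5.2.
type OAuthErrorCode =
  | "invalid_request"
  | "invalid_client"
  | "invalid_grant"
  | "unauthorized_client"
  | "unsupported_grant_type"
  | "invalid_scope";

interface OAuthErrorResponse {
  status: number;
  body: { error: OAuthErrorCode; error_description?: string };
}

function oauthError(code: OAuthErrorCode, description?: string): OAuthErrorResponse {
  // Failed client authentication may use 401; other errors use 400.
  const status = code === "invalid_client" ? 401 : 400;
  const body: OAuthErrorResponse["body"] = { error: code };
  if (description !== undefined) {
    body.error_description = description;
  }
  return { status, body };
}
```

In an Express handler the result would be sent with something like `res.status(err.status).json(err.body)`.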
| 277 | + |
| 278 | +--- |
| 279 | + |
| 280 | +## Development |
| 281 | + |
| 282 | +### Scripts |
| 283 | + |
| 284 | +- `npm run build` - Compile TypeScript to JavaScript |
| 285 | +- `npm start` - Run compiled application |
| 286 | +- `npm run dev` - Run with ts-node for development |
| 287 | +- `npm run watch` - Watch mode for TypeScript compilation |
| 288 | + |
| 289 | +### Building |
| 290 | + |
| 291 | +```bash |
| 292 | +npm run build |
| 293 | +``` |
| 294 | + |
| 295 | +Output is compiled to the `dist/` directory. |
| 296 | + |
| 297 | +### Type Safety |
| 298 | + |
| 299 | +The application uses TypeScript with strict mode enabled. Request and response payloads are additionally validated at runtime with Zod schemas, so the compile-time types and the data actually received stay in agreement. |
| 300 | + |
| 301 | +--- |
| 302 | + |
| 303 | +## Production Considerations |
| 304 | + |
| 305 | +### Security |
| 306 | + |
| 307 | +- **Never commit `.env` files** - Use secure secret management in production |
| 308 | +- **Rotate client secrets regularly** - Implement secret rotation policies |
| 309 | +- **Use HTTPS** - Always use TLS in production environments |
| 310 | +- **Key Management** - Consider using a key management service for production deployments |
| 311 | +- **Rate Limiting** - Implement rate limiting for API endpoints |
| 312 | +- **Monitoring** - Set up logging and monitoring for security events |
| 313 | + |
| 314 | +### Scalability |
| 315 | + |
| 316 | +- Stateless design enables horizontal scaling |
| 317 | +- Key generation on startup may impact cold start times |
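| 318 | +- Consider persistent, shared key storage for production: keys are currently in-memory and generated per process, so a token signed by one instance will not verify against another instance's keys |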
| 319 | + |
| 320 | +### Deployment |
| 321 | + |
| 322 | +The application is designed to run in containerized environments. Ensure: |
| 323 | +- Environment variables are properly configured |
| 324 | +- LiteLLM service is accessible |
| 325 | +- Network policies allow necessary connections |
| 326 | +- Health check endpoint is monitored |
| 327 | + |
| 328 | +--- |
| 329 | + |
| 330 | +## License |
| 331 | + |
| 332 | +Copyright (c) 2024 Promptfoo. All rights reserved. |
| 333 | + |
| 334 | +This software is proprietary and confidential. Unauthorized copying, modification, distribution, or use of this software, via any medium, is strictly prohibited without the express written permission of Promptfoo. |
| 335 | + |
| 336 | +See [LICENSE](LICENSE) for full terms. |
| 337 | + |