|
| 1 | +# CLAUDE.md |
| 2 | + |
| 3 | +This file provides guidance to AI assistants when working with code in this repository. |
| 4 | + |
| 5 | +## Project Overview |
| 6 | + |
| 7 | +RERUM API v1 is an open source Node.js/Express RESTful API server for the RERUM digital object repository. It stores any valid JSON object but prefers JSON-LD objects such as Web Annotations (https://www.w3.org/TR/annotation-model/) and IIIF Presentation API (https://iiif.io/api/presentation/3.0/) resources. The system emphasizes open access, attribution, versioning, and compliance with Linked Data standards. Its responses follow RESTful best practices (https://restfulapi.net/http-status-codes/). It is maintained by the Research Computing Group at Saint Louis University (https://www.slu.edu/research/faculty-resources/research-computing/index.php). |
| 8 | + |
| 9 | +It is hosted on the web as a centralized API using a centralized database that many applications read from and write to concurrently. It promotes and encourages open source development, and can be a cheap option as an API and back end for a web application. A sandbox API and client application called TinyThings (https://tiny.rerum.io) gives developers an easy rapid prototyping option. The TinyThings repository can be found at https://github.com/CenterForDigitalHumanities/TinyNode. |
| 10 | + |
| 11 | +Users register with the RERUM API by signing up through Auth0. Signing up generates a refresh token and an access token. The access token is sent as the Bearer token on requests, which lets the RERUM API verify that a request comes from a registered RERUM application. The refresh token can be used to get new access tokens through the RERUM API. All data created and updated gets a `__rerum.generatedBy` property that is an Agent URI encoded in that token. In this way, all data created and updated is attributed to a specific application. The API encourages applications to go a step further and attribute data to specific users with a user system and a `creator` property. |
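As a rough illustration of the token flow described above, the sketch below builds the arguments for a write request that carries the access token as a Bearer token. The helper name `buildCreateRequest`, the placeholder token, and the local base URL are assumptions for illustration; only the `/v1/api/create` path and the Bearer scheme come from this document.

```javascript
// Hypothetical helper: builds fetch() arguments for a create request.
// It is not part of the RERUM codebase; it only illustrates sending the
// access token as a Bearer token.
function buildCreateRequest(baseUrl, accessToken, body) {
  // Guard clause: write operations need a token
  if (!accessToken) throw new Error("An access token is required for write operations")
  return {
    url: `${baseUrl}/v1/api/create`,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "Authorization": `Bearer ${accessToken}`
      },
      body: JSON.stringify(body)
    }
  }
}

const request = buildCreateRequest("http://localhost:3001", "PLACEHOLDER_ACCESS_TOKEN", { label: "example" })
// To actually send it: fetch(request.url, request.options)
console.log(request.options.headers.Authorization)
```

The request is only built here, not sent, so the sketch runs without a server or a real token.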
| 12 | + |
| 13 | +In production and development it is registered in a pm2 instance running on a 4-core RHEL VM. It is started with `pm2 start -i max`, and so load balances across 4 instances. The MongoDB that stores all the data is hosted through MongoDB Atlas. The .github folder contains CI/CD for production and development deployment pipelines. |
| 14 | + |
| 15 | + |
| 16 | +**Key Principles:** |
| 17 | +- Save an object, retrieve an object—metadata lives in private `__rerum` property |
| 18 | +- Trust the application, not the user—Auth0 JWT tokens for write operations |
| 19 | +- Open and Free—no charge to read or write, all contributions exposed immediately |
| 20 | +- Attributed and Versioned—all objects track ownership and transaction history |
| 21 | + |
| 22 | +## Development Commands |
| 23 | + |
| 24 | +### Setup and Installation |
| 25 | +```bash |
| 26 | +npm install # Install dependencies (2-5 seconds) |
| 27 | +``` |
| 28 | + |
| 29 | +### Running the Application |
| 30 | +```bash |
| 31 | +npm start # Start server (http://localhost:3001 by default) |
| 32 | +``` |
| 33 | + |
| 34 | +### Testing |
| 35 | +```bash |
| 36 | +npm run runtest # Run full test suite (25+ minutes, requires MongoDB) |
| 37 | +npm run runtest -- __tests__/routes_mounted.test.js # Run route mounting tests (30 seconds, no DB needed) |
| 38 | +npm run runtest -- routes/__tests__/create.test.js # Run specific test file |
| 39 | +``` |
| 40 | + |
| 41 | +**Important:** Use `npm run runtest` (not `npm test`) as it enables experimental VM modules required for ES6 imports in Jest. |
| 42 | + |
| 43 | +### Development Workflow |
| 44 | +```bash |
| 45 | +# After making routing changes |
| 46 | +npm run runtest -- __tests__/routes_mounted.test.js |
| 47 | + |
| 48 | +# Test server startup |
| 49 | +npm start # Should display "LISTENING ON 3001" (or configured PORT) |
| 50 | + |
| 51 | +# In another terminal, test endpoints |
| 52 | +curl -I http://localhost:3001/v1/API.html |
| 53 | +curl -X POST http://localhost:3001/v1/api/query -H "Content-Type: application/json" -d '{"test":"value"}' |
| 54 | +``` |
| 55 | + |
| 56 | +## Architecture |
| 57 | + |
| 58 | +### High-Level Structure |
| 59 | + |
| 60 | +The application follows a **layered architecture** with clear separation of concerns: |
| 61 | + |
| 62 | +``` |
| 63 | +app.js (Express setup, middleware) |
| 64 | + ↓ |
| 65 | +routes/api-routes.js (route mounting & definitions) |
| 66 | + ↓ |
| 67 | +routes/*.js (individual route handlers with JWT auth) |
| 68 | + ↓ |
| 69 | +db-controller.js (controller aggregator) |
| 70 | + ↓ |
| 71 | +controllers/*.js (business logic modules) |
| 72 | + ↓ |
| 73 | +database/index.js (MongoDB connection & operations) |
| 74 | +``` |
| 75 | + |
| 76 | +### Key Architectural Components |
| 77 | + |
| 78 | +**1. Request Flow:** |
| 79 | +- Client → Express middleware (CORS, logging, body parsing) |
| 80 | +- → Auth middleware (JWT validation via Auth0) |
| 81 | +- → Route handlers (routes/*.js) |
| 82 | +- → Controllers (controllers/*.js with business logic) |
| 83 | +- → Database operations (MongoDB via database/index.js) |
| 84 | +- → Response with proper Linked Data HTTP headers |
| 85 | + |
| 86 | +**2. Versioning System:** |
| 87 | +- Every object has a `__rerum` property with versioning metadata |
| 88 | +- `history.prime`: Root object ID (or "root" if this is the prime) |
| 89 | +- `history.previous`: Immediate parent version |
| 90 | +- `history.next[]`: Array of child versions |
| 91 | +- Updates create new objects with new IDs, maintaining version chains |
| 92 | +- Released objects are immutable (isReleased !== "") |
| 93 | + |
| 94 | +**3. Controllers Organization:** |
| 95 | +The `db-controller.js` is a facade that imports from specialized controller modules: |
| 96 | +- `controllers/crud.js`: Core create, query, id operations |
| 97 | +- `controllers/update.js`: PUT/PATCH update operations (putUpdate, patchUpdate, patchSet, patchUnset, overwrite) |
| 98 | +- `controllers/delete.js`: Delete operations |
| 99 | +- `controllers/history.js`: Version history and since queries, HEAD request handlers |
| 100 | +- `controllers/release.js`: Object release (immutability) |
| 101 | +- `controllers/bulk.js`: Bulk create and update operations |
| 102 | +- `controllers/search.js`: MongoDB text search (searchAsWords, searchAsPhrase) |
| 103 | +- `controllers/gog.js`: Gallery of Glosses specific operations (fragments, glosses, expand) |
| 104 | +- `controllers/utils.js`: Shared utilities (ID generation, slug handling, agent claims) |
| 105 | + |
| 106 | +**4. Authentication & Authorization:** |
| 107 | +- **Provider:** Auth0 JWT bearer tokens |
| 108 | +- **Middleware:** `auth/index.js` with express-oauth2-jwt-bearer |
| 109 | +- **Flow:** checkJwt array includes READONLY check, Auth0 validation, token error handling, user extraction |
| 110 | +- **Agent Matching:** Write operations verify `req.user` matches `__rerum.generatedBy` |
| 111 | +- **Bot Access:** Special bot tokens (BOT_TOKEN, BOT_AGENT) bypass some checks |
| 112 | + |
| 113 | +**5. Special Features:** |
| 114 | +- **Slug IDs:** Optional human-readable IDs via Slug header (e.g., "my-annotation") |
| 115 | +- **PATCH Override:** X-HTTP-Method-Override header allows POST to emulate PATCH for clients without PATCH support |
| 116 | +- **GOG Routes:** Specialized endpoints for Gallery of Glosses project (`/gog/fragmentsInManuscript`, `/gog/glossesInManuscript`) |
| 117 | +- **Content Negotiation:** Handles both `@id`/`@context` (JSON-LD) and `id` (plain JSON) patterns |
| 118 | + |
| 119 | +### Directory Structure |
| 120 | + |
| 121 | +``` |
| 122 | +/bin/ Entry point (rerum_v1.js creates HTTP server) |
| 123 | +/routes/ Route handlers (one file per endpoint typically) |
| 124 | +/controllers/ Business logic organized by domain |
| 125 | +/auth/ Authentication middleware and token handling |
| 126 | +/database/ MongoDB connection and utilities |
| 127 | +/public/ Static files (API.html docs, context.json) |
| 128 | +/utils.js Core utilities (__rerum configuration, header generation) |
| 129 | +/rest.js REST error handling and messaging |
| 130 | +/app.js Express app setup and middleware configuration |
| 131 | +/db-controller.js Controller facade exporting all operations |
| 132 | +``` |
| 133 | + |
| 134 | +## Important Patterns and Conventions |
| 135 | + |
| 136 | +### 1. __rerum Property Management |
| 137 | +Never trust client-provided `__rerum` data. Always use `utils.configureRerumOptions()` to set: |
| 138 | +- `APIversion`, `createdAt`, `generatedBy` |
| 139 | +- `history`: {prime, previous, next[]} |
| 140 | +- `releases`: {previous, next[], replaces} |
| 141 | +- `isOverwritten`, `isReleased`, `slug` |
| 142 | + |
| 143 | +### 2. ID Handling |
| 144 | +Objects have both MongoDB `_id` and JSON-LD `@id` or `id`: |
| 145 | +- `_id`: MongoDB ObjectId (or slug if provided) |
| 146 | +- `@id`: Full URI like `{RERUM_ID_PREFIX}{_id}` |
| 147 | +- Use `idNegotiation()` to handle @context variations (some contexts prefer `id` over `@id`) |
| 148 | +- Use `parseDocumentID()` to extract _id from full URIs |
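The relationship between `_id` and `@id` can be sketched as follows. The functions `toUri` and `extractId` are hypothetical stand-ins written for this example; the repository's own `parseDocumentID()` may behave differently, and the prefix value is assumed from the sample `.env` in this document.

```javascript
// Assumed value of RERUM_ID_PREFIX, taken from the sample configuration
const ID_PREFIX = "http://localhost:3001/v1/id/"

// Hypothetical: build the full URI from a MongoDB _id or slug
function toUri(_id) {
  return `${ID_PREFIX}${_id}`
}

// Hypothetical: recover the _id from a full URI by taking its last path segment
function extractId(uri) {
  // Guard clause: only non-empty strings can be parsed
  if (typeof uri !== "string" || uri.length === 0) return null
  const trimmed = uri.endsWith("/") ? uri.slice(0, -1) : uri
  return trimmed.split("/").pop() || null
}

console.log(extractId(toUri("my-annotation")))
```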
| 149 | + |
| 150 | +### 3. Error Handling |
| 151 | +- Use `createExpressError(err)` from controllers/utils.js to format errors |
| 152 | +- Let errors propagate to `rest.js` messenger middleware—don't res.send() in controllers |
| 153 | +- Messenger adds helpful context based on status code (401, 403, 404, 405, 409, 500, 503) |
| 154 | + |
| 155 | +### 4. Headers |
| 156 | +- Use `utils.configureWebAnnoHeadersFor(obj)` for single objects (Content-Type, Link, Allow) |
| 157 | +- Use `utils.configureLDHeadersFor(obj)` for arrays/query results |
| 158 | +- Use `utils.configureLastModifiedHeader(obj)` for caching support |
| 159 | +- Always set Location header on 201 Created responses |
| 160 | + |
| 161 | +### 5. Maintenance Mode |
| 162 | +Check `process.env.DOWN` and `process.env.READONLY`: |
| 163 | +- DOWN="true": Return 503 for all requests |
| 164 | +- READONLY="true": Block write operations (create/update/delete) with 503 |
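One way to express these two flags is sketched below. `maintenanceStatus` and `maintenanceGuard` are hypothetical names, the response text is made up, and the guard responds directly for brevity; the actual checks in this repository may be organized differently (for example inside the `checkJwt` array).

```javascript
// Hypothetical: decide whether a request should be refused.
// Returns 503 when the request must be blocked, otherwise null.
function maintenanceStatus(env, isWriteOperation) {
  if (env.DOWN === "true") return 503
  if (env.READONLY === "true" && isWriteOperation) return 503
  return null
}

// Hypothetical Express-style middleware factory. `isWriteOperation` is set
// per route because some read endpoints (such as /query) also use POST.
function maintenanceGuard(isWriteOperation, env = process.env) {
  return (req, res, next) => {
    const status = maintenanceStatus(env, isWriteOperation)
    if (status === null) return next()
    res.status(status).send("The service is temporarily unavailable.")
  }
}

console.log(maintenanceStatus({ DOWN: "false", READONLY: "true" }, true))
```

Passing the environment in as a parameter keeps the decision testable without touching `process.env`.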
| 165 | + |
| 166 | +### 6. Versioning Logic |
| 167 | +When updating (PUT/PATCH): |
| 168 | +1. Clone original object with its @id |
| 169 | +2. Pass to `configureRerumOptions(generator, cloned, true, false)` |
| 170 | +3. Insert as new object with new _id |
| 171 | +4. Update original's `history.next[]` array to include new version's @id |
| 172 | +5. Never modify released objects (isReleased check) |
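The steps above can be sketched with an in-memory store. This is a simplified illustration, not the repository's implementation: the `Map` stands in for MongoDB, `newId`, `createObject`, and `versionObject` are hypothetical, and the real code delegates the `__rerum` setup to `configureRerumOptions()`.

```javascript
// In-memory stand-in for the MongoDB collection (assumption for illustration)
const store = new Map()
const ID_PREFIX = "http://localhost:3001/v1/id/" // assumed RERUM_ID_PREFIX
let counter = 0
const newId = () => `example${++counter}` // hypothetical ID generator

function createObject(body) {
  const _id = newId()
  const obj = {
    ...body,
    _id,
    "@id": ID_PREFIX + _id,
    __rerum: { isReleased: "", history: { prime: "root", previous: "", next: [] } }
  }
  store.set(_id, obj)
  return obj
}

function versionObject(originalId, changes) {
  const original = store.get(originalId)
  if (!original) throw new Error("Not found")
  // Step 5: released objects are never modified
  if (original.__rerum.isReleased !== "") throw new Error("Released objects are immutable")
  const history = original.__rerum.history
  const _id = newId()
  // Steps 1-3: clone, reconfigure server-side metadata, store under a new _id.
  // Client-supplied __rerum in `changes` is overridden, never trusted.
  const updated = {
    ...structuredClone(original),
    ...changes,
    _id,
    "@id": ID_PREFIX + _id,
    __rerum: {
      ...structuredClone(original.__rerum),
      history: {
        prime: history.prime === "root" ? original["@id"] : history.prime,
        previous: original["@id"],
        next: []
      }
    }
  }
  store.set(_id, updated)
  // Step 4: the original now points at its new descendant
  history.next.push(updated["@id"])
  return updated
}

const first = createObject({ label: "draft" })
const second = versionObject(first._id, { label: "revised" })
console.log(second.__rerum.history)
```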
| 173 | + |
| 174 | +### 7. Deleted Objects |
| 175 | +Deleted objects are transformed: `{"@id": "{id}", "__deleted": {original object properties, "time": ISO-date}}`. The history trees they were a part of are healed to remain connected (this cannot be undone). They are removed from /query and /search results, but deleted objects can always be retrieved by the URI id and will be returned in their deleted form. |
| 176 | + |
| 177 | +## Configuration |
| 178 | + |
| 179 | +Create `.env` file in root with: |
| 180 | + |
| 181 | +```bash |
| 182 | +RERUM_API_VERSION=1.1.0 |
| 183 | +RERUM_BASE=http://localhost:3001 |
| 184 | +RERUM_PREFIX=http://localhost:3001/v1/ |
| 185 | +RERUM_ID_PREFIX=http://localhost:3001/v1/id/ |
| 186 | +RERUM_AGENT_CLAIM=http://localhost:3001/agent |
| 187 | +RERUM_CONTEXT=http://localhost:3001/v1/context.json |
| 188 | +RERUM_API_DOC=http://localhost:3001/v1/API.html |
| 189 | +MONGO_CONNECTION_STRING=mongodb://localhost:27017 |
| 190 | +MONGODBNAME=rerum |
| 191 | +MONGODBCOLLECTION=objects |
| 192 | +DOWN=false |
| 193 | +READONLY=false |
| 194 | +PORT=3001 |
| 195 | + |
| 196 | +# Auth0 Configuration (contact research.computing@slu.edu) |
| 197 | +AUDIENCE=your-audience |
| 198 | +ISSUER_BASE_URL=https://your-tenant.auth0.com/ |
| 199 | +CLIENTID=your-client-id |
| 200 | +RERUMSECRET=your-secret |
| 201 | +BOT_TOKEN=your-bot-token |
| 202 | +BOT_AGENT=your-bot-agent-url |
| 203 | +``` |
| 204 | + |
| 205 | +## Testing Notes |
| 206 | + |
| 207 | +- **Route tests** (`__tests__/routes_mounted.test.js`): Work without MongoDB, verify routing and static files |
| 208 | +- **Controller tests** (`routes/__tests__/*.test.js`): Require a MongoDB connection, otherwise they time out after 5 seconds |
| 209 | +- Tests use experimental VM modules, hence `npm run runtest` instead of `npm test` |
| 210 | +- "Jest did not exit" warnings are normal—tests complete successfully despite this |
| 211 | +- Most tests expect Auth0 to be configured; mock tokens are used in test environment |
| 212 | + |
| 213 | +## Working Without MongoDB |
| 214 | + |
| 215 | +**What works:** |
| 216 | +- Server startup |
| 217 | +- Static file serving (/v1/API.html, /v1/context.json, etc.) |
| 218 | +- Route mounting and basic request handling |
| 219 | +- Authentication handling (returns proper 401 errors) |
| 220 | + |
| 221 | +**What fails:** |
| 222 | +- All database operations return "Topology is closed" errors |
| 223 | +- /query, /create, /update, /delete, /id/{id}, /history, /since |
| 224 | + |
| 225 | +**Development tip:** Use route mounting tests to validate routing changes without requiring database setup. |
| 226 | + |
| 227 | +## Common Gotchas |
| 228 | + |
| 229 | +1. **Semicolons:** This codebase avoids unnecessary semicolons—follow existing style |
| 230 | +2. **Guard clauses:** Prefer early returns over nested if/else for clarity |
| 231 | +3. **Optional chaining:** Use `?.` and `??` operators when appropriate |
| 232 | +4. **Node version:** Requires Node.js 22.20.0+ (specified in package.json engines). Prefer the active Node LTS release. |
| 233 | +5. **ES2015 syntax:** Uses modern ES2015 JavaScript syntax |
| 234 | +6. **Import statements:** Uses ES6 modules (`import`), not CommonJS (`require`) |
| 235 | +7. **Controller returns:** Controllers call `next(err)` for errors and `res.json()` for success; don't mix both |
| 236 | +8. **Version chains:** History and since queries follow bidirectional version relationships through prime, previous, and next properties |
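Several of the style points above (no semicolons, guard clauses, optional chaining, and `next(err)` versus `res.json()`) can be seen together in a small sketch. The controller `getGenerator`, the injected `lookup` function, and the `status` property on the error are assumptions made for this example, not code from the repository.

```javascript
// Hypothetical controller factory. `lookup` stands in for a database call
// and is injected so the sketch runs without MongoDB.
function makeGetGenerator(lookup) {
  return async function getGenerator(req, res, next) {
    const id = req.params?.id
    // Guard clause: return early instead of nesting the happy path
    if (!id) return next(Object.assign(new Error("An id is required"), { status: 400 }))
    try {
      const obj = await lookup(id)
      if (!obj) return next(Object.assign(new Error(`No object found for ${id}`), { status: 404 }))
      // Optional chaining and nullish coalescing cover missing properties
      return res.json({ id, generatedBy: obj.__rerum?.generatedBy ?? null })
    } catch (err) {
      // Errors go to next(); the response is never written on this path
      return next(err)
    }
  }
}
```

Each path ends in exactly one of `next(err)` or `res.json()`, which is the point of gotcha 7.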
| 237 | + |
| 238 | +## API Endpoints Reference |
| 239 | + |
| 240 | +Full documentation at http://localhost:3001/v1/API.html when server is running. |
| 241 | + |
| 242 | +**Key endpoints:** |
| 243 | +- POST `/v1/api/create` - Create new object |
| 244 | +- PUT `/v1/api/update` - Version existing object via replacement |
| 245 | +- PATCH `/v1/api/patch` - Version existing object via property update |
| 246 | +- PATCH `/v1/api/set` - Add properties to existing object |
| 247 | +- PATCH `/v1/api/unset` - Remove properties from existing object |
| 248 | +- POST `/v1/api/overwrite` - Overwrite object without versioning |
| 249 | +- DELETE `/v1/api/delete` - Mark object as deleted |
| 250 | +- POST `/v1/api/query` - Query objects by properties |
| 251 | +- POST `/v1/api/search` - Full-text search |
| 252 | +- GET `/v1/id/{id}` - Retrieve object by ID or slug |
| 253 | +- GET `/v1/history/{id}` - Get version history (ancestors) |
| 254 | +- GET `/v1/since/{id}` - Get version descendants |
| 255 | +- POST `/v1/api/release` - Lock object as immutable |
| 256 | +- POST `/v1/api/bulkCreate` - Create multiple objects |
| 257 | +- POST `/v1/api/bulkUpdate` - Update multiple objects |
| 258 | + |
| 259 | +## Additional Resources |
| 260 | + |
| 261 | +- Project homepage: https://rerum.io |
| 262 | +- Public production instance: https://store.rerum.io |
| 263 | +- Public development instance: https://devstore.rerum.io |
| 264 | +- RERUM API Document: https://store.rerum.io/API.html |
| 265 | +- Repository: https://github.com/CenterForDigitalHumanities/rerum_server_nodejs |
| 266 | +- Auth0 setup: Contact research.computing@slu.edu |
| 267 | + |
| 268 | +## Additional Developer Preferences for AI Assistant Behavior |
| 269 | + |
| 270 | +1. Do not automatically commit or push code. Developers prefer to do this themselves when the time is right. |
| 271 | + - Make the code changes as requested. |
| 272 | + - Explain what changed and why. |
| 273 | + - Stop before committing. The developer will decide at what point to commit changes on their own. You do not need to keep track of it. |
| 274 | +2. No auto compacting. We will compact ourselves if the context gets too big. |
| 275 | +3. When creating documentation do not add Claude as an @author. |
| 276 | +4. Prefer using current libraries and native JavaScript/ExpressJS/Node capabilities instead of installing new npm packages to solve a problem. |
| 277 | + - However, we understand that sometimes we need a package or a package is perfectly designed to solve our problem. Ask if we want to use them in these cases. |
| 278 | +5. We like colors in our terminals! Be diverse and color text in the terminal for the different purposes of the text. (ex. errors red, success green, logs bold white, etc.) |
| 279 | +6. We like to see logs from running code, so expose those logs in the terminal logs as much as possible. |
| 280 | +7. Use JSDoc style for code documentation. Clean up, fix, or generate documentation for the code you work on as you encounter it. |
| 281 | +8. We use `npm start` often to run the app locally. However, do not make code edits based on this assumption. Production and development load balance the app with pm2, not by using `npm start`. |