
Commit 8ce3ad5

Merge pull request #26 from oss-slu/Issue-24
merged the upstream changes into current branch
2 parents e6c0ef0 + 1de2c69 commit 8ce3ad5

21 files changed

Lines changed: 4236 additions & 5575 deletions

.claude/CLAUDE.md

Lines changed: 281 additions & 0 deletions
@@ -0,0 +1,281 @@
1+
# CLAUDE.md
2+
3+
This file provides guidance to AI assistants when working with code in this repository.
4+
5+
## Project Overview
6+
7+
RERUM API v1 is an open source Node.js/Express RESTful API server for the RERUM digital object repository. It stores any valid JSON object but prefers JSON-LD objects such as Web Annotations (https://www.w3.org/TR/annotation-model/) and IIIF Presentation API (https://iiif.io/api/presentation/3.0/) resources. The system emphasizes open access, attribution, versioning, and compliance with Linked Data standards. Its responses follow RESTful best practices (https://restfulapi.net/http-status-codes/). It is maintained by the Research Computing Group at Saint Louis University (https://www.slu.edu/research/faculty-resources/research-computing/index.php).
8+
9+
It is hosted on the web as a centralized API using a centralized database that many applications read from and write to concurrently. It promotes and encourages open source development, and can be a cheap option as an API and back end for a web application. A sandbox API and client application called TinyThings (https://tiny.rerum.io) gives developers an easy option for rapid prototyping. That repo can be found at https://github.com/CenterForDigitalHumanities/TinyNode.
10+
11+
Users register with the RERUM API by signing up through Auth0, which generates a refresh token and an access token. The access token is sent as the Bearer token on requests, and the RERUM API uses it to verify that the request comes from a registered RERUM application. The refresh token can be used to get new access tokens through the RERUM API. All data created or updated gets a `__rerum.generatedBy` property that is an Agent URI encoded in that token. In this way, all data created or updated is attributed to a specific application. The API encourages applications to go a step further and attribute data to specific users with a user system and a `creator` property.
12+
13+
In production and development it is registered in a pm2 instance running on a 4-core RHEL VM. It is started with `pm2 start -i max`, and so load balances across 4 instances. The MongoDB that stores all the data is hosted through MongoDB Atlas. The .github folder contains CI/CD for production and development deployment pipelines.
14+
15+
16+
**Key Principles:**
17+
- Save an object, retrieve an object—metadata lives in private `__rerum` property
18+
- Trust the application, not the user—Auth0 JWT tokens for write operations
19+
- Open and Free—no charge to read or write, all contributions exposed immediately
20+
- Attributed and Versioned—all objects track ownership and transaction history
21+
22+
## Development Commands
23+
24+
### Setup and Installation
25+
```bash
26+
npm install # Install dependencies (2-5 seconds)
27+
```
28+
29+
### Running the Application
30+
```bash
31+
npm start # Start server (http://localhost:3001 by default)
32+
```
33+
34+
### Testing
35+
```bash
36+
npm run runtest # Run full test suite (25+ minutes, requires MongoDB)
37+
npm run runtest -- __tests__/routes_mounted.test.js # Run route mounting tests (30 seconds, no DB needed)
38+
npm run runtest -- routes/__tests__/create.test.js # Run specific test file
39+
```
40+
41+
**Important:** Use `npm run runtest` (not `npm test`) as it enables experimental VM modules required for ES6 imports in Jest.
42+
43+
### Development Workflow
44+
```bash
45+
# After making routing changes
46+
npm run runtest -- __tests__/routes_mounted.test.js
47+
48+
# Test server startup
49+
npm start # Should display "LISTENING ON 3001" (or configured PORT)
50+
51+
# In another terminal, test endpoints
52+
curl -I http://localhost:3001/v1/API.html
53+
curl -X POST http://localhost:3001/v1/api/query -H "Content-Type: application/json" -d '{"test":"value"}'
54+
```
55+
56+
## Architecture
57+
58+
### High-Level Structure
59+
60+
The application follows a **layered architecture** with clear separation of concerns:
61+
62+
```
63+
app.js (Express setup, middleware)
64+
65+
routes/api-routes.js (route mounting & definitions)
66+
67+
routes/*.js (individual route handlers with JWT auth)
68+
69+
db-controller.js (controller aggregator)
70+
71+
controllers/*.js (business logic modules)
72+
73+
database/index.js (MongoDB connection & operations)
74+
```
75+
76+
### Key Architectural Components
77+
78+
**1. Request Flow:**
79+
- Client → Express middleware (CORS, logging, body parsing)
80+
- → Auth middleware (JWT validation via Auth0)
81+
- → Route handlers (routes/*.js)
82+
- → Controllers (controllers/*.js with business logic)
83+
- → Database operations (MongoDB via database/index.js)
84+
- → Response with proper Linked Data HTTP headers
85+
86+
**2. Versioning System:**
87+
- Every object has a `__rerum` property with versioning metadata
88+
- `history.prime`: Root object ID (or "root" if this is the prime)
89+
- `history.previous`: Immediate parent version
90+
- `history.next[]`: Array of child versions
91+
- Updates create new objects with new IDs, maintaining version chains
92+
- Released objects are immutable (isReleased !== "")
93+
94+
**3. Controllers Organization:**
95+
The `db-controller.js` is a facade that imports from specialized controller modules:
96+
- `controllers/crud.js`: Core create, query, id operations
97+
- `controllers/update.js`: PUT/PATCH update operations (putUpdate, patchUpdate, patchSet, patchUnset, overwrite)
98+
- `controllers/delete.js`: Delete operations
99+
- `controllers/history.js`: Version history and since queries, HEAD request handlers
100+
- `controllers/release.js`: Object release (immutability)
101+
- `controllers/bulk.js`: Bulk create and update operations
102+
- `controllers/search.js`: MongoDB text search (searchAsWords, searchAsPhrase)
103+
- `controllers/gog.js`: Gallery of Glosses specific operations (fragments, glosses, expand)
104+
- `controllers/utils.js`: Shared utilities (ID generation, slug handling, agent claims)
105+
106+
**4. Authentication & Authorization:**
107+
- **Provider:** Auth0 JWT bearer tokens
108+
- **Middleware:** `auth/index.js` with express-oauth2-jwt-bearer
109+
- **Flow:** checkJwt array includes READONLY check, Auth0 validation, token error handling, user extraction
110+
- **Agent Matching:** Write operations verify `req.user` matches `__rerum.generatedBy`
111+
- **Bot Access:** Special bot tokens (BOT_TOKEN, BOT_AGENT) bypass some checks
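
As a rough illustration of the agent-matching rule above, the sketch below compares the agent URI carried in a decoded token with an object's `__rerum.generatedBy`. The function name `isGeneratedBy`, the claim key, and the shape of `user` are assumptions made for this example; the repository's own check may be structured differently.

```javascript
// Assumed claim key; the real one would come from RERUM_AGENT_CLAIM in .env
const AGENT_CLAIM = "http://localhost:3001/agent"

/**
 * Return true when the agent URI in the decoded token matches the object's generator.
 * `user` is assumed to be the decoded JWT payload placed on req.user.
 */
function isGeneratedBy(user, obj, claim = AGENT_CLAIM) {
  const agent = user?.[claim]
  const generator = obj?.__rerum?.generatedBy
  if (!agent || !generator) return false
  return agent === generator
}

const user = { [AGENT_CLAIM]: "http://localhost:3001/v1/id/agent123" }
const mine = { __rerum: { generatedBy: "http://localhost:3001/v1/id/agent123" } }
const theirs = { __rerum: { generatedBy: "http://localhost:3001/v1/id/agent999" } }

console.log(isGeneratedBy(user, mine))   // true
console.log(isGeneratedBy(user, theirs)) // false
```

A write handler could use a check like this as a guard clause and hand a 401 or 403 error to `next()` when it fails.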
112+
113+
**5. Special Features:**
114+
- **Slug IDs:** Optional human-readable IDs via Slug header (e.g., "my-annotation")
115+
- **PATCH Override:** X-HTTP-Method-Override header allows POST to emulate PATCH for clients without PATCH support
116+
- **GOG Routes:** Specialized endpoints for Gallery of Glosses project (`/gog/fragmentsInManuscript`, `/gog/glossesInManuscript`)
117+
- **Content Negotiation:** Handles both `@id`/`@context` (JSON-LD) and `id` (plain JSON) patterns
118+
119+
### Directory Structure
120+
121+
```
122+
/bin/ Entry point (rerum_v1.js creates HTTP server)
123+
/routes/ Route handlers (one file per endpoint typically)
124+
/controllers/ Business logic organized by domain
125+
/auth/ Authentication middleware and token handling
126+
/database/ MongoDB connection and utilities
127+
/public/ Static files (API.html docs, context.json)
128+
/utils.js Core utilities (__rerum configuration, header generation)
129+
/rest.js REST error handling and messaging
130+
/app.js Express app setup and middleware configuration
131+
/db-controller.js Controller facade exporting all operations
132+
```
133+
134+
## Important Patterns and Conventions
135+
136+
### 1. __rerum Property Management
137+
Never trust client-provided `__rerum` data. Always use `utils.configureRerumOptions()` to set:
138+
- `APIversion`, `createdAt`, `generatedBy`
139+
- `history`: {prime, previous, next[]}
140+
- `releases`: {previous, next[], replaces}
141+
- `isOverwritten`, `isReleased`, `slug`
142+
143+
### 2. ID Handling
144+
Objects have both MongoDB `_id` and JSON-LD `@id` or `id`:
145+
- `_id`: MongoDB ObjectId (or slug if provided)
146+
- `@id`: Full URI like `{RERUM_ID_PREFIX}{_id}`
147+
- Use `idNegotiation()` to handle @context variations (some contexts prefer `id` over `@id`)
148+
- Use `parseDocumentID()` to extract _id from full URIs
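
The relationship between `_id`, `@id`, and `id` can be sketched as below. These helpers (`toUri`, `extractId`, `readUri`) are hypothetical stand-ins written for illustration; they are not the repository's `idNegotiation()` or `parseDocumentID()`, whose signatures are not shown here.

```javascript
// Assumed prefix; the real value would come from RERUM_ID_PREFIX in .env
const ID_PREFIX = "http://localhost:3001/v1/id/"

/** Build the full URI for a stored _id. */
function toUri(_id, prefix = ID_PREFIX) {
  return `${prefix}${_id}`
}

/** Extract the trailing _id (or slug) from a full URI. */
function extractId(uri) {
  return String(uri).split("/").filter(Boolean).pop() ?? ""
}

/** Read the identifier from an object that may use `@id` or `id`. */
function readUri(obj) {
  return obj?.["@id"] ?? obj?.id ?? null
}

const uri = toUri("abc123")
console.log(uri)                          // http://localhost:3001/v1/id/abc123
console.log(extractId(uri))               // abc123
console.log(readUri({ id: uri }) === uri) // true
```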
149+
150+
### 3. Error Handling
151+
- Use `createExpressError(err)` from controllers/utils.js to format errors
152+
- Let errors propagate to `rest.js` messenger middleware—don't res.send() in controllers
153+
- Messenger adds helpful context based on status code (401, 403, 404, 405, 409, 500, 503)
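
A minimal sketch of the error-propagation pattern described above, assuming a controller that receives its data lookup as a parameter so it can run without a database. `toExpressError` is a hypothetical stand-in for `createExpressError`, and `getById` and `lookup` are illustrative names, not the repository's code.

```javascript
/** Hypothetical formatter; stands in for whatever createExpressError produces. */
function toExpressError(err) {
  const formatted = new Error(err?.message ?? "Unknown error")
  formatted.status = err?.status ?? 500
  return formatted
}

/** Controller sketch: res.json() on success, next(err) for every failure. */
async function getById(req, res, next, lookup) {
  let found
  try {
    found = await lookup(req.params.id)
  } catch (err) {
    return next(toExpressError(err))
  }
  if (!found) return next(toExpressError({ message: `No object with id '${req.params.id}'`, status: 404 }))
  res.json(found)
}

// Minimal fakes to exercise both paths
const calls = []
const res = { json: (body) => calls.push(["json", body]) }
const next = (err) => calls.push(["next", err.status])
const store = { abc: { id: "abc" } }

getById({ params: { id: "abc" } }, res, next, async (id) => store[id])
  .then(() => getById({ params: { id: "nope" } }, res, next, async (id) => store[id]))
  .then(() => console.log(calls.map(([kind]) => kind))) // [ 'json', 'next' ]
```

Because the controller never writes an error response itself, a single downstream error middleware can decide how each status code is reported.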
154+
155+
### 4. Headers
156+
- Use `utils.configureWebAnnoHeadersFor(obj)` for single objects (Content-Type, Link, Allow)
157+
- Use `utils.configureLDHeadersFor(obj)` for arrays/query results
158+
- Use `utils.configureLastModifiedHeader(obj)` for caching support
159+
- Always set Location header on 201 Created responses
160+
161+
### 5. Maintenance Mode
162+
Check `process.env.DOWN` and `process.env.READONLY`:
163+
- DOWN="true": Return 503 for all requests
164+
- READONLY="true": Block write operations (create/update/delete) with 503
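
One way to picture these two flags is as small Express-style middleware functions, sketched below. The factory name `gateOn` and the error messages are assumptions for illustration. Since read endpoints such as `/query` also use POST, the read-only gate is assumed to be attached per write route rather than keyed on the HTTP method.

```javascript
/** Build middleware that rejects with 503 when the given env flag equals the string "true". */
function gateOn(flag, message, env = process.env) {
  return (req, res, next) => {
    if (env[flag] !== "true") return next()
    const err = new Error(message)
    err.status = 503
    next(err)
  }
}

// DOWN would guard every route; READONLY only the create/update/delete routes
const downGate = gateOn("DOWN", "RERUM is down for maintenance.")
const readonlyGate = gateOn("READONLY", "RERUM is in read-only mode.")

// Hypothetical wiring, shown as comments because it needs an Express app:
// app.use(downGate)
// router.post("/api/create", readonlyGate, createHandler)
```

The flag is read on each request, so changing the value of `env[flag]` takes effect without rebuilding the middleware.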
165+
166+
### 6. Versioning Logic
167+
When updating (PUT/PATCH):
168+
1. Clone original object with its @id
169+
2. Pass to `configureRerumOptions(generator, cloned, true, false)`
170+
3. Insert as new object with new _id
171+
4. Update original's `history.next[]` array to include new version's @id
172+
5. Never modify released objects (isReleased check)
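
The steps above can be sketched with an in-memory `Map` standing in for the MongoDB collection. This is a simplified illustration under stated assumptions: it inlines the history bookkeeping instead of calling `configureRerumOptions`, uses an empty string for a root's `previous`, and the names `create`, `update`, `db`, and `PREFIX` are made up for the example.

```javascript
const PREFIX = "http://localhost:3001/v1/id/" // assumed RERUM_ID_PREFIX
const db = new Map()                          // stands in for the MongoDB collection
let nextId = 1

/** Insert a first version; it is its own prime, marked "root". */
function create(body) {
  const _id = String(nextId++)
  const doc = {
    ...body,
    _id,
    "@id": PREFIX + _id,
    __rerum: { history: { prime: "root", previous: "", next: [] }, isReleased: "" }
  }
  db.set(_id, doc)
  return doc
}

/** Version an existing object: the original is kept and a new object is inserted. */
function update(originalId, changes) {
  const original = db.get(originalId)
  if (!original) throw new Error("Not found")
  if (original.__rerum.isReleased !== "") throw new Error("Released objects are immutable")
  const _id = String(nextId++)
  const { prime: oldPrime } = original.__rerum.history
  const prime = oldPrime === "root" ? original["@id"] : oldPrime
  const doc = {
    ...original,
    ...changes,
    _id,
    "@id": PREFIX + _id,
    // Server-side metadata is rebuilt here, never taken from the client
    __rerum: { history: { prime, previous: original["@id"], next: [] }, isReleased: "" }
  }
  db.set(_id, doc)
  original.__rerum.history.next.push(doc["@id"])
  return doc
}

const v1 = create({ label: "first" })
const v2 = update(v1._id, { label: "second" })
console.log(v2.__rerum.history.previous === v1["@id"]) // true
console.log(v1.__rerum.history.next)                   // [ 'http://localhost:3001/v1/id/2' ]
```

Note that `v1` keeps its original `label`; only its `history.next` array changes when a new version is added.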
173+
174+
### 7. Deleted Objects
175+
Deleted objects are transformed: `{"@id": "{id}", "__deleted": {original object properties, "time": ISO-date}}`. The history trees they were a part of are healed to remain connected (this cannot be undone). They are removed from /query and /search results, but deleted objects can always be retrieved by the URI id and will be returned in their deleted form.
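
A small sketch of the deleted form described above. It assumes the original properties are copied directly into `__deleted` alongside `time`; the exact layout in the repository may differ, and `toDeleted` and `visibleInQuery` are illustrative names. History healing is not shown.

```javascript
/**
 * Wrap an object in the deleted form.
 * Assumption: the original properties sit inside `__deleted` next to `time`.
 */
function toDeleted(obj, now = new Date()) {
  const uri = obj["@id"] ?? obj.id
  return { "@id": uri, __deleted: { ...obj, time: now.toISOString() } }
}

/** Deleted objects stay retrievable by id but would be filtered from query results. */
function visibleInQuery(obj) {
  return obj.__deleted === undefined
}

const original = { "@id": "http://localhost:3001/v1/id/abc123", label: "old" }
const deleted = toDeleted(original)
console.log(visibleInQuery(original)) // true
console.log(visibleInQuery(deleted))  // false
```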
176+
177+
## Configuration
178+
179+
Create `.env` file in root with:
180+
181+
```bash
182+
RERUM_API_VERSION=1.1.0
183+
RERUM_BASE=http://localhost:3001
184+
RERUM_PREFIX=http://localhost:3001/v1/
185+
RERUM_ID_PREFIX=http://localhost:3001/v1/id/
186+
RERUM_AGENT_CLAIM=http://localhost:3001/agent
187+
RERUM_CONTEXT=http://localhost:3001/v1/context.json
188+
RERUM_API_DOC=http://localhost:3001/v1/API.html
189+
MONGO_CONNECTION_STRING=mongodb://localhost:27017
190+
MONGODBNAME=rerum
191+
MONGODBCOLLECTION=objects
192+
DOWN=false
193+
READONLY=false
194+
PORT=3001
195+
196+
# Auth0 Configuration (contact research.computing@slu.edu)
197+
AUDIENCE=your-audience
198+
ISSUER_BASE_URL=https://your-tenant.auth0.com/
199+
CLIENTID=your-client-id
200+
RERUMSECRET=your-secret
201+
BOT_TOKEN=your-bot-token
202+
BOT_AGENT=your-bot-agent-url
203+
```
204+
205+
## Testing Notes
206+
207+
- **Route tests** (`__tests__/routes_mounted.test.js`): Work without MongoDB, verify routing and static files
208+
- **Controller tests** (`routes/__tests__/*.test.js`): Require a MongoDB connection or will time out after 5 seconds
209+
- Tests use experimental VM modules, hence `npm run runtest` instead of `npm test`
210+
- "Jest did not exit" warnings are normal—tests complete successfully despite this
211+
- Most tests expect Auth0 to be configured; mock tokens are used in test environment
212+
213+
## Working Without MongoDB
214+
215+
**What works:**
216+
- Server startup
217+
- Static file serving (/v1/API.html, /v1/context.json, etc.)
218+
- Route mounting and basic request handling
219+
- Authentication handling (returns proper 401 errors)
220+
221+
**What fails:**
222+
- All database operations return "Topology is closed" errors
223+
- /query, /create, /update, /delete, /id/{id}, /history, /since
224+
225+
**Development tip:** Use route mounting tests to validate routing changes without requiring database setup.
226+
227+
## Common Gotchas
228+
229+
1. **Semicolons:** This codebase avoids unnecessary semicolons—follow existing style
230+
2. **Guard clauses:** Prefer early returns over nested if/else for clarity
231+
3. **Optional chaining:** Use `?.` and `??` operators when appropriate
232+
4. **Node version:** Requires Node.js 22.20.0+ (specified in package.json engines). Prefer the active Node LTS release.
233+
5. **ES2015 syntax:** Uses modern ES2015 JavaScript syntax
234+
6. **Import statements:** Uses ES6 modules (`import`), not CommonJS (`require`)
235+
7. **Controller returns:** Controllers call `next(err)` for errors and `res.json()` for success—don't mix both
236+
8. **Version chains:** History and since queries follow bidirectional version relationships through prime, previous, and next properties
237+
238+
## API Endpoints Reference
239+
240+
Full documentation at http://localhost:3001/v1/API.html when server is running.
241+
242+
**Key endpoints:**
243+
- POST `/v1/api/create` - Create new object
244+
- PUT `/v1/api/update` - Version existing object via replacement
245+
- PATCH `/v1/api/patch` - Version existing object via property update
246+
- PATCH `/v1/api/set` - Add properties to existing object
247+
- PATCH `/v1/api/unset` - Remove properties from existing object
248+
- POST `/v1/api/overwrite` - Overwrite object without versioning
249+
- DELETE `/v1/api/delete` - Mark object as deleted
250+
- POST `/v1/api/query` - Query objects by properties
251+
- POST `/v1/api/search` - Full-text search
252+
- GET `/v1/id/{id}` - Retrieve object by ID or slug
253+
- GET `/v1/history/{id}` - Get version history (ancestors)
254+
- GET `/v1/since/{id}` - Get version descendants
255+
- POST `/v1/api/release` - Lock object as immutable
256+
- POST `/v1/api/bulkCreate` - Create multiple objects
257+
- POST `/v1/api/bulkUpdate` - Update multiple objects
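
As a hedged example of calling one of these endpoints from JavaScript, the helper below only assembles the URL and `fetch` options for a create request; it makes no network call. The function name `buildCreateRequest` and the placeholder token are assumptions for illustration.

```javascript
/** Build the URL and fetch options for a create request (nothing is sent here). */
function buildCreateRequest(base, token, body) {
  return {
    url: `${base}/v1/api/create`,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${token}`
      },
      body: JSON.stringify(body)
    }
  }
}

const { url, options } = buildCreateRequest("http://localhost:3001", "<access-token>", { type: "Annotation" })
console.log(url) // http://localhost:3001/v1/api/create
// With a running server and a real token: const res = await fetch(url, options)
```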
258+
259+
## Additional Resources
260+
261+
- Project homepage: https://rerum.io
262+
- Public production instance: https://store.rerum.io
263+
- Public development instance: https://devstore.rerum.io
264+
- RERUM API Document: https://store.rerum.io/API.html
265+
- Repository: https://github.com/CenterForDigitalHumanities/rerum_server_nodejs
266+
- Auth0 setup: Contact research.computing@slu.edu
267+
268+
## Additional Developer Preferences for AI Assistant Behavior
269+
270+
1. Do not automatically commit or push code. Developers prefer to do this themselves when the time is right.
271+
- Make the code changes as requested.
272+
- Explain what changed and why.
273+
- Stop before committing. The developer will decide at what point to commit changes on their own. You do not need to keep track of it.
274+
2. No auto compacting. We will compact ourselves if the context gets too big.
275+
3. When creating documentation do not add Claude as an @author.
276+
4. Prefer using current libraries and native JavaScript/ExpressJS/Node capabilities instead of installing new npm packages to solve a problem.
277+
- However, we understand that sometimes we need a package or a package is perfectly designed to solve our problem. Ask if we want to use them in these cases.
278+
5. We like colors in our terminals! Use a variety of colors to distinguish the different purposes of terminal text (e.g., errors red, success green, logs bold white).
279+
6. We like to see logs from running code, so expose those logs in the terminal logs as much as possible.
280+
7. Use JSDoc style for code documentation. Clean up, fix, or generate documentation for the code you work on as you encounter it.
281+
8. We use `npm start` often to run the app locally. However, do not make code edits based on this assumption. Production and development load balance the app with pm2, not by using `npm start`.

.github/copilot-instructions.md

Lines changed: 1 addition & 1 deletion
@@ -28,7 +28,7 @@ RERUM API v1 is a NodeJS web service for interaction with the RERUM digital obje
2828
3. **Create .env configuration file** (required for operation):
2929
```bash
3030
# Create .env file in repository root
31-
RERUM_API_VERSION=1.0.0
31+
RERUM_API_VERSION=1.1.0
3232
RERUM_BASE=http://localhost:3005
3333
RERUM_PREFIX=http://localhost:3005/v1/
3434
RERUM_ID_PREFIX=http://localhost:3005/v1/id/

.github/workflows/cd_dev.yaml

Lines changed: 0 additions & 73 deletions
This file was deleted.
