Separated db-controller.js into modules by cubap · Pull Request #201 · CenterForDigitalHumanities/rerum_server_nodejs

cubap · 2025-07-03T21:36:40Z

🎯 Refactor monolithic db-controller.js into modular controller structure

📋 Summary

This PR addresses the technical debt of the large, monolithic db-controller.js file by breaking it down into smaller, logically grouped modules. The refactoring improves maintainability and readability and follows the separation-of-concerns principle.

🔄 Changes Made

New Controller Modules Created:

controllers/utils.js - Utility functions (idNegotiation, generateSlugId, etc.)
controllers/crud.js - CRUD operations (create, query, id)
controllers/delete.js - Delete operations and history helpers
controllers/update.js - Update operations aggregator
controllers/putUpdate.js - PUT update and import operations
controllers/patchUpdate.js - PATCH update operations
controllers/patchSet.js - PATCH set operations
controllers/patchUnset.js - PATCH unset operations
controllers/overwrite.js - Overwrite operations with optimistic locking
controllers/bulk.js - Bulk operations (bulkCreate, bulkUpdate)
controllers/history.js - History operations (since, history, head requests)
controllers/release.js - Release operations
controllers/gog.js - Gallery of Glosses endpoints

Updated Files:

db-controller.js - Now serves as a clean aggregator that imports and re-exports from new modules
routes/__tests__/overwrite-optimistic-locking.test.js - Added test for optimistic locking

Backup Files:

db-controller-backup.js - Complete backup of original file
db-controller.js.backup - Additional backup for safety

🎯 Key Benefits

📦 Modular Structure: Each controller has a single responsibility
🔧 Easier Maintenance: Smaller files are easier to understand and modify
🧪 Better Testability: Individual modules can be tested in isolation
⚡ Potential Performance Gain: Smaller modules can be imported individually, though db-controller.js still imports and re-exports all of them
👥 Better Collaboration: Multiple developers can work on different modules simultaneously
📚 ES6 Modules: Modern import/export syntax throughout

🔒 Backward Compatibility

✅ All existing function exports remain unchanged
✅ No breaking changes to existing API
✅ All route handlers continue to work as before
✅ Test compatibility maintained
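
The compatibility claims above could be checked mechanically. The following is a minimal sketch, not code from this PR: the stub objects stand in for the new controller modules, `aggregate` and `missingExports` are hypothetical helper names, and `release` is used only as an example of a name that is absent from the stubs.

```javascript
// Hypothetical stand-ins for two of the new controller modules.
const crud = { create() {}, query() {}, id() {} }
const bulk = { bulkCreate() {}, bulkUpdate() {} }

// Merge module namespaces the way an aggregator re-exports them,
// failing loudly if two modules export the same name.
function aggregate(...modules) {
    const merged = {}
    for (const mod of modules) {
        for (const [name, fn] of Object.entries(mod)) {
            if (Object.hasOwn(merged, name)) {
                throw new Error(`Duplicate export: ${name}`)
            }
            merged[name] = fn
        }
    }
    return merged
}

// List the names from the old public surface that are missing
// from the namespace or are no longer functions.
function missingExports(expectedNames, namespace) {
    return expectedNames.filter(
        name => !Object.hasOwn(namespace, name) || typeof namespace[name] !== "function"
    )
}

const controller = aggregate(crud, bulk)
console.log(missingExports(["create", "query", "id", "bulkCreate", "bulkUpdate"], controller))
console.log(missingExports(["create", "release"], controller))
```

In a real check, the expected names would come from the export list of the original db-controller.js, and the namespace would be the result of importing the new aggregator.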

🧪 Testing

All existing tests continue to pass
New test added for optimistic locking functionality
No new test failures introduced

📁 File Structure

controllers/ 
├── utils.js # Utility functions 
├── crud.js # Create, Read operations 
├── delete.js # Delete operations 
├── update.js # Update operations aggregator 
├── putUpdate.js # PUT update operations 
├── patchUpdate.js # PATCH update operations 
├── patchSet.js # PATCH set operations 
├── patchUnset.js # PATCH unset operations 
├── overwrite.js # Overwrite operations 
├── bulk.js # Bulk operations 
├── history.js # History operations 
├── release.js # Release operations 
└── gog.js # Gallery of Glosses operations

🚀 Technical Implementation

ES6 Modules: Consistent use of modern import/export syntax
Clean Aggregation: Main controller cleanly imports and re-exports all functions
Minimal Code Changes: Function logic preserved with minimal modifications
Error Handling: All error handling patterns maintained
Performance: No performance impact, potentially improved due to better modularity

👨‍💻 Contributors

Patrick Cuba (@cubap) - Original implementation and architecture
GitHub Copilot - Assisted with refactoring and modularization
thehabes - Project maintenance and code review

🔍 Review Notes

This refactoring maintains 100% backward compatibility while significantly improving code organization. The changes are primarily structural - moving code from one large file into multiple smaller, focused files without altering the core business logic.

Ready for Review! 🎉

Copilot

Pull Request Overview

Refactors the large db-controller.js into multiple focused controller modules while maintaining existing exports and behavior.

Splits controllers into 13 ES6 modules (utils, crud, delete, update, putUpdate, patchUpdate, patchSet, patchUnset, overwrite, bulk, history, release, gog)
Updates db-controller.js to aggregate and re-export these modules
Adds a test for optimistic locking in overwrite

Reviewed Changes

Copilot reviewed 14 out of 17 changed files in this pull request and generated 4 comments.

File	Description
db-controller-backup.js	Imports and re-exports all controller modules
controllers/utils.js	Core utility and helper functions
controllers/crud.js	Create and read operations
controllers/delete.js	Soft-delete and history healing
controllers/update.js	Aggregates update operations
controllers/putUpdate.js	PUT-based update and import
controllers/patchUpdate.js	PATCH update logic
controllers/patchSet.js	PATCH set logic
controllers/patchUnset.js	PATCH unset logic
controllers/overwrite.js	Overwrite with optimistic locking
controllers/bulk.js	Bulk create and update operations
controllers/history.js	History and HEAD request handlers
controllers/release.js	Release operations and tree management
controllers/gog.js	Gallery of Glosses endpoints

Comments suppressed due to low confidence (1)

controllers/utils.js:73

The generateSlugId helper isn’t directly covered by existing tests. Adding unit tests for slug generation collision and error paths would improve confidence.

const generateSlugId = async function(slug_id="", next){

Copilot · 2025-07-03T21:42:55Z

+    const skip = parseInt(req.query.skip ?? 0)
+    let err = { message: `` }
+    // This request can only be made my Gallery of Glosses production apps.
+    if (!agentID === "61043ad4ffce846a83e700dd") {


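The check !agentID === "..." never does what was intended: !agentID is evaluated first and yields a boolean, which is never strictly equal to a string, so the condition is always false and the guard never runs. It should be agentID !== "61043ad4ffce846a83e700dd" to enforce the intended comparison.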

Suggested change

-    if (!agentID === "61043ad4ffce846a83e700dd") {
+    if (agentID !== "61043ad4ffce846a83e700dd") {
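
To make the precedence issue concrete, here is a small sketch that isolates the two conditions. The function names `buggyRejects` and `fixedRejects` are hypothetical and exist only for this illustration; the agent ID string is the one from the diff.

```javascript
const GOG_AGENT = "61043ad4ffce846a83e700dd"

// Original guard: `!agentID` is evaluated first and yields a boolean,
// and a boolean is never strictly equal to a string.
function buggyRejects(agentID) {
    return !agentID === GOG_AGENT
}

// Suggested guard: reject every agent except the expected one.
function fixedRejects(agentID) {
    return agentID !== GOG_AGENT
}

console.log(buggyRejects("someone-else")) // false: the request is let through
console.log(fixedRejects("someone-else")) // true: the request is rejected
console.log(fixedRejects(GOG_AGENT))      // false: the expected agent passes
```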

Copilot · 2025-07-03T21:42:56Z

+        const fragmentSet = new Set(witnessFragments)
+        witnessFragments = Array.from(fragmentSet.values())


Using a Set on an array of objects won't dedupe by object content but by reference. Consider deduplicating by a unique key (e.g., @id) to avoid redundant entries.

Suggested change

-        const fragmentSet = new Set(witnessFragments)
-        witnessFragments = Array.from(fragmentSet.values())
+        const fragmentMap = new Map()
+        witnessFragments.forEach(fragment => {
+            if (fragment['@id']) {
+                fragmentMap.set(fragment['@id'], fragment)
+            }
+        })
+        witnessFragments = Array.from(fragmentMap.values())
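
The difference between the two approaches can be seen on a small sample. This is an illustrative sketch with made-up fragment objects and example.org URIs. Note one deliberate departure from the suggested change: the suggestion silently drops fragments that have no @id, while the hypothetical `dedupeById` helper below keeps them.

```javascript
// Made-up sample data: two distinct objects share an @id, one object
// appears twice by reference, and one object has no @id at all.
const a1 = { "@id": "https://example.org/fragment/1", text: "first copy" }
const a2 = { "@id": "https://example.org/fragment/1", text: "second copy" }
const b = { "@id": "https://example.org/fragment/2" }
const noId = { text: "no identifier" }
const fragments = [a1, a2, b, b, noId]

// A Set only removes repeated references, so a1 and a2 both survive.
const byReference = Array.from(new Set(fragments))

// Keyed by @id, the last object seen for each @id wins.
// Objects without an @id are kept as-is (an assumption of this sketch).
function dedupeById(items) {
    const byId = new Map()
    const withoutId = []
    for (const item of items) {
        if (item["@id"]) byId.set(item["@id"], item)
        else withoutId.push(item)
    }
    return [...byId.values(), ...withoutId]
}

console.log(byReference.length)           // 4
console.log(dedupeById(fragments).length) // 3
```

Whether the first or the last object should win for a repeated @id depends on how the fragments are gathered; the Map-based version keeps the last one.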

where credit is due

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

+        return
+    }
+    try {
+        let matches = await db.find(props).limit(limit).skip(skip).toArray()


To fix the issue, we need to ensure that the user-provided input (req.body) is sanitized or validated before being used in the database query. Specifically:

Use MongoDB's $eq operator to ensure that the user input is treated as a literal value and not as a query object.

Alternatively, validate that props contains only literal values (e.g., strings, numbers) and does not include any special query operators.

The best approach in this case is to validate props to ensure it contains only literal values. This ensures that the query is safe while preserving the intended functionality.
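
A minimal sketch of the literal-values check described above, written as a pure function so it can be tried without a database. The name `findUnsafeKey` is hypothetical. The rejection of keys that begin with `$` is an addition of this sketch and is not part of the Autofix text; it is included because a top-level operator key can carry a plain string value and would otherwise pass a values-only check.

```javascript
const LITERAL_TYPES = ["string", "number", "boolean"]

// Return the first key whose name or value could be read as a query
// operator, or null when every entry is a plain literal.
// Assumes props is already known to be a non-null, non-array object.
function findUnsafeKey(props) {
    for (const [key, value] of Object.entries(props)) {
        if (key.startsWith("$")) return key                      // operator used as a key
        if (!LITERAL_TYPES.includes(typeof value)) return key    // object, array, null, ...
    }
    return null
}

console.log(findUnsafeKey({ type: "Annotation", page: 2 }))  // null
console.log(findUnsafeKey({ name: { $ne: null } }))          // "name"
console.log(findUnsafeKey({ $where: "true" }))               // "$where"
```

A check this strict also rejects nested objects and arrays, so it only fits routes where queries on nested values are not expected.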

+    res.set("Content-Type", "application/json; charset=utf-8")
+    let props = req.body
+    try {
+        let matches = await db.find(props).toArray()


To fix the vulnerability, we need to ensure that the user-provided req.body is sanitized or validated before being used in the MongoDB query. The best approach is to enforce that props is a literal value and not a query object with special operators. This can be achieved by:

Validating the structure of req.body to ensure it contains only expected fields and values.

Using MongoDB's $eq operator to treat user input as literal values.

The fix involves modifying the queryHeadRequest function to validate req.body and use the $eq operator in the query. This ensures that untrusted input is interpreted as literal values, mitigating the risk of NoSQL injection.
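
The `$eq` wrapping step can be sketched as a pure function that builds the filter before it is handed to `db.find`. The name `toEqualityFilter` is hypothetical. The explicit `null` test is an assumption of this sketch, added because `typeof null` is also `"object"`.

```javascript
// Build a filter in which every user-supplied value is compared literally.
function toEqualityFilter(props) {
    if (props === null || typeof props !== "object" || Array.isArray(props)) {
        throw new TypeError("Query must be a JSON object")
    }
    const filter = {}
    for (const [key, value] of Object.entries(props)) {
        // An operator-shaped value such as { $gt: 1 } becomes the operand
        // of $eq instead of being used directly as the field's condition.
        filter[key] = { $eq: value }
    }
    return filter
}

console.log(JSON.stringify(toEqualityFilter({ type: "Annotation" })))
console.log(JSON.stringify(toEqualityFilter({ count: { $gt: 1 } })))
```

Wrapping values does nothing about operator keys at the top level of the request body, so a key check may still be needed alongside it.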

+         * This is the because some of the @ids have different RERUM URL patterns on them.
+         **/
+    //All the children of this object will have its @id in __rerum.history.prime
+    ls_versions = await db.find({ "__rerum.history.prime": rootObj['@id'] }).toArray()


To fix the issue, we need to ensure that the user-controlled data (rootObj['@id']) is sanitized or validated before being used in the MongoDB query. The best approach is to use the $eq operator to enforce that the input is treated as a literal value. Additionally, we can validate the input to ensure it is a string or a valid identifier before constructing the query.

Steps to fix:

Modify the query in getAllVersions (line 212 of controllers/utils.js) to use the $eq operator.

Add a validation step to ensure that rootObj['@id'] is a string or a valid identifier before using it in the query.

+        safe_descendant.__rerum.releases.previous = releasing["@id"]
+        let result
+        try {
+            result = await db.replaceOne({ "_id": d_id }, safe_descendant)


To fix the issue, we need to ensure that the _id field used in the query object { "_id": d_id } is sanitized or validated to prevent NoSQL injection. The best approach is to use MongoDB's $eq operator to ensure that the value is treated as a literal and not as a query object. Additionally, we should validate that d_id is a string or a valid ObjectID before using it in the query.

Changes will be made in the establishReleasesTree function in controllers/utils.js to:

Use the $eq operator in the query object.

Validate that d_id is a string or a valid ObjectID.
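
The two changes listed above can be combined into one small helper. This is a sketch under stated assumptions: `idFilter` is a hypothetical name, and instead of importing the driver's ObjectID class it recognises ObjectID-like values by duck typing on a `toHexString` method that returns 24 hexadecimal characters.

```javascript
const HEX_24 = /^[0-9a-fA-F]{24}$/

// Return a replaceOne-style filter for a trusted-looking _id,
// or null when the value is neither a string nor ObjectID-like.
function idFilter(id) {
    const isString = typeof id === "string" && id.length > 0
    const isObjectIdLike =
        id !== null &&
        typeof id === "object" &&
        typeof id.toHexString === "function" &&
        HEX_24.test(id.toHexString())
    if (!isString && !isObjectIdLike) return null
    return { _id: { $eq: id } }
}

console.log(idFilter("61043ad4ffce846a83e700dd")) // { _id: { '$eq': '61043ad4ffce846a83e700dd' } }
console.log(idFilter({ $ne: null }))              // null
```

A caller would skip the write and report an error when the helper returns null, rather than passing the raw value to the database.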

+        }
+        let result
+        try {
+            result = await db.replaceOne({ "_id": a_id }, safe_ancestor)


To fix the issue, we need to ensure that the user-controlled input is sanitized or validated before being used in the MongoDB query. Specifically:

Use MongoDB's $eq operator to ensure that the _id field is treated as a literal value and not as a query object.

Alternatively, validate the _id field to ensure it is a valid string or ObjectID before using it in the query.

The best approach in this case is to use the $eq operator in the query object. This ensures that the input is treated as a literal value, mitigating the risk of NoSQL injection.

+            }
+            let result
+            try {
+                result = await db.replaceOne({ "_id": d_id }, safe_descendant)


To fix the issue, we need to ensure that the d_id value used in the query object { "_id": d_id } is properly validated or sanitized. Since MongoDB queries are susceptible to NoSQL injection, we can use the $eq operator to ensure that the _id field is treated as a literal value. This approach prevents malicious input from being interpreted as a query object.

Steps to implement the fix:

Modify the query on line 384 in controllers/utils.js to use the $eq operator for the _id field.

Ensure that the d_id value is validated to confirm it is a valid MongoDB ObjectID or a string, depending on the expected format.

+        safe_ancestor.__rerum.releases.next = ancestorNextArray
+        let result
+        try {
+            result = await db.replaceOne({ "_id": a_id }, safe_ancestor)


cubap · 2025-07-07T20:08:15Z

boo:

cubap · 2025-07-07T20:08:35Z

Looks like we're catching 500 instead of 409...

cubap · 2025-07-08T15:31:08Z

Looks like we're catching 500 instead of 409...

Great news team. I think this is actually a Tiny problem™ and not the RERUM errors:

fetch(OVERWRITE_URL, {
    method: 'PUT',
    body: JSON.stringify(obj),
    headers: {
        'Content-Type': 'application/json; charset=utf-8'
    }
})
.then(response => {
    if (response.ok) { return response.json() }
    throw response
})
.then(resultObj => {
    delete resultObj.new_obj_state
    _customEvent("rerum-result", `URI ${uri} overwritten. See resulting object below:`, resultObj)
})
.catch(err => {
    _customEvent("rerum-error", "There was an error trying to overwrite object at " + uri, {}, err)
})
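
Because the snippet throws the response itself, the `.catch` handler receives an object with a `status` property whenever the server answered at all. A minimal sketch of telling a conflict apart from other failures follows. It assumes, as the comment above implies, that 409 is the status sent when the optimistic-locking check fails; `describeOverwriteFailure` and its return labels are hypothetical.

```javascript
// Classify a failed overwrite from the thrown value.
// Network failures reject with an Error that has no numeric status.
function describeOverwriteFailure(err) {
    if (typeof err?.status !== "number") return "network-error"
    if (err.status === 409) return "conflict"      // assumed optimistic-locking status
    if (err.status >= 500) return "server-error"
    return "request-error"
}

console.log(describeOverwriteFailure({ status: 409 }))           // "conflict"
console.log(describeOverwriteFailure({ status: 500 }))           // "server-error"
console.log(describeOverwriteFailure(new TypeError("offline")))  // "network-error"
```

If an intermediate service turns the upstream 409 into a 500, a check like this would report "server-error", which matches the symptom described in the comments.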

cubap · 2025-07-08T16:00:14Z

recall we changed TinyPen, but not TinyThings

cubap · 2025-07-08T16:50:16Z

Success:

thehabes

Manually tested using TinyNode. I did a limited review of the new controllers to make sure the code was copied out correctly, and I did not notice any errors while using it.

Separated db-controller.js into modules

5d24486

cubap requested a review from thehabes as a code owner July 3, 2025 21:36

cubap requested a review from Copilot July 3, 2025 21:40

Copilot AI reviewed Jul 3, 2025

cubap and others added 6 commits July 3, 2025 16:43

Delete overwrite-optimistic-locking.test.js

d1f1c4f

Update update.js

d4f38a6

where credit is due

Update controllers/delete.js

cdafefa

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update controllers/utils.js

f14ee07

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Merge branch 'main' into db-controller-bustamonte

2a07434

Create db-controller-clean.js

8f8ef5a

github-advanced-security AI found potential problems Jul 7, 2025

cubap and others added 2 commits July 8, 2025 13:23

Remove unused db-controller backup and clean scripts

059f675

changes from testing and reviewing

6b91f6f

thehabes approved these changes Jul 8, 2025

thehabes merged commit f6696dc into main Jul 8, 2025
5 of 6 checks passed

thehabes deleted the db-controller-bustamonte branch November 6, 2025 21:07

@@ -78,2 +78,13 @@
                 }
+                // Validate props to ensure it contains only literal values
+                for (const key in props) {
+                    if (typeof props[key] !== "string" && typeof props[key] !== "number" && typeof props[key] !== "boolean") {
+                        let err = {
+                            message: `Invalid value for key '${key}'. Only string, number, or boolean values are allowed.`,
+                            status: 400
+                        }
+                        next(createExpressError(err))
+                        return
+                    }
+                }
                 try {

@@ -118,4 +118,16 @@
                 let props = req.body
+                if (typeof props !== 'object' || Array.isArray(props)) {
+                    let err = {
+                        "message": "Invalid query format. Request body must be a JSON object.",
+                        "status": 400
+                    }
+                    next(createExpressError(err))
+                    return
+                }
                 try {
-                    let matches = await db.find(props).toArray()
+                    let sanitizedProps = {}
+                    for (let key in props) {
+                        sanitizedProps[key] = { $eq: props[key] }
+                    }
+                    let matches = await db.find(sanitizedProps).toArray()
                     if (matches.length) {

@@ -211,3 +211,6 @@
                 //All the children of this object will have its @id in __rerum.history.prime
-                ls_versions = await db.find({ "__rerum.history.prime": rootObj['@id'] }).toArray()
+                if (typeof rootObj['@id'] !== 'string') {
+                    throw new Error("Invalid @id: must be a string");
+                }
+                ls_versions = await db.find({ "__rerum.history.prime": { $eq: rootObj['@id'] } }).toArray()
                 //The root object is a version, prepend it in

@@ -311,2 +311,6 @@
                     let d_id = safe_descendant._id
+                    if (typeof d_id !== "string" && !ObjectID.isValid(d_id)) {
+                        console.error("Invalid _id detected:", d_id)
+                        return false
+                    }
                     safe_descendant.__rerum.releases.previous = releasing["@id"]
@@ -314,3 +318,3 @@
                     try {
-                        result = await db.replaceOne({ "_id": d_id }, safe_descendant)
+                        result = await db.replaceOne({ "_id": { $eq: d_id } }, safe_descendant)
                     }

-    if (!agentID === "61043ad4ffce846a83e700dd") {
+    if (agentID !== "61043ad4ffce846a83e700dd") {

-        const fragmentSet = new Set(witnessFragments)
-        witnessFragments = Array.from(fragmentSet.values())
+        const fragmentMap = new Map()
+        witnessFragments.forEach(fragment => {
+            if (fragment['@id']) {
+                fragmentMap.set(fragment['@id'], fragment)
+            }
+        })
+        witnessFragments = Array.from(fragmentMap.values())

4 participants
