Skip to content

perf(orm): batch many-to-many relation manipulation#2486

Merged
ymc9 merged 1 commit intodevfrom
perf/batch-relation
Mar 16, 2026
Merged

perf(orm): batch many-to-many relation manipulation#2486
ymc9 merged 1 commit intodevfrom
perf/batch-relation

Conversation

@ymc9
Copy link
Copy Markdown
Member

@ymc9 ymc9 commented Mar 15, 2026

Summary by CodeRabbit

  • Refactor
    • Optimized many-to-many relationship operations to use batch processing instead of individual item processing, improving performance and reducing the number of database queries required.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 15, 2026

📝 Walkthrough

Walkthrough

The ORM client's many-to-many relation handler was refactored from per-item single-entity processing to batch operations. Changes include parameter signature updates (single entity to array), return type modification (UpdateResult/DeleteResult to void), bulk insert/delete logic replacing iterative operations, and introduction of a batch ID-fetching helper method.

Changes

Cohort / File(s) Summary
Many-to-Many Batch Operations Refactor
packages/orm/src/client/crud/operations/base.ts
Refactored handleManyToManyRelation to process multiple entities in batch: changed parameter from rightEntity: any to rightEntities: any[], return type from Promise<UpdateResult | DeleteResult | undefined> to Promise<void>. Replaced per-record inserts/deletes with bulk operations, removed sorting logic, added getEntitiesIds helper for optimized batch ID lookups. Updated creation flow to pass [createdEntity] as array. Removed unused DeleteResult and UpdateResult imports.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A burrow of entities, bundled with care,
No longer one-by-one through the database air,
In batches they flow, like carrots in bulk,
Our queries grow swift, our logic's no hulk!
All void of delay, efficiency's cheer,
A batch-processed spring is finally here! 🥕✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'perf(orm): batch many-to-many relation manipulation' accurately describes the main change: refactoring M2M operations from per-item to batch processing for performance improvements.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch perf/batch-relation
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@claude
Copy link
Copy Markdown
Contributor

claude bot commented Mar 15, 2026

Claude finished @ymc9's task —— View job


Code Review: Performance optimization for many-to-many relation manipulation

Review Tasks

  • Read repository guidelines and setup
  • Analyze the modified file for code changes
  • Review code quality and best practices
  • Check for potential bugs or issues
  • Assess performance considerations
  • Evaluate security concerns
  • Consider test coverage needs
  • Provide comprehensive feedback

Summary

This PR introduces a significant performance optimization for many-to-many relation operations by batching database queries instead of executing them individually. The changes are well-implemented and follow good practices.

✅ Strengths

Performance Improvements

  • Batched entity lookups: New getEntitiesIds() method replaces N individual queries with a single OR query
  • Batch inserts/deletes: Uses single SQL statements with multiple values instead of loops
  • Reduced database round trips: Significant performance gain for operations involving multiple M2M relations

Code Quality

  • Clean refactoring: Simplified logic with better separation of concerns
  • Consistent error handling: Maintains existing error patterns and validation
  • Proper SQL injection protection: Uses parameterized queries and expression builders
  • Database compatibility: Preserves INSERT IGNORE/ON CONFLICT handling for different dialects

🔍 Technical Analysis

Key Changes

  1. Method signature change: handleManyToManyRelation() now accepts rightEntities[] instead of single rightEntity
  2. Batch processing: All connect/disconnect operations process multiple entities at once
  3. Optimized queries: Single WHERE ... IN for deletes, batch INSERT with multiple values

Security Considerations

No security concerns identified

  • Uses Kysely expression builders and parameterized queries
  • No raw SQL concatenation
  • Maintains existing input validation patterns

🎯 Specific Feedback

packages/orm/src/client/crud/operations/base.ts:687-697

The batch insert implementation is excellent:

const values = rightIdValues.map((rightId) => ({
    A: leftFirst ? leftIdValue : rightId,
    B: leftFirst ? rightId : leftIdValue,
}));

This correctly handles the field ordering logic while preparing all values for a single insert.

packages/orm/src/client/crud/operations/base.ts:703-706

Good optimization for deletes:

.where(eb(`${joinTable}.${leftCol}`, '=', leftIdValue))
.where(eb(`${joinTable}.${rightCol}`, 'in', rightIdValues))

Using IN clause instead of individual deletes significantly reduces query overhead.

packages/orm/src/client/crud/operations/base.ts:2513-2517

The new getEntitiesIds() method is well-designed:

  • Leverages existing read() infrastructure
  • Uses OR conditions for batch lookup
  • Returns only ID fields to minimize data transfer

📊 Performance Impact

Excellent optimization - For N M2M operations:

  • Before: N+1 queries (N entity lookups + N insert/delete operations)
  • After: 2 queries (1 batch lookup + 1 batch insert/delete)
  • Expected improvement: 10-100x faster for bulk M2M operations

🧪 Test Coverage

The changes affect core M2M functionality. Existing tests in /tests/e2e/orm/policy/migrated/connect-disconnect.test.ts should cover the basic functionality, but consider adding specific tests for:

  • Batch operations with large datasets
  • Performance regression tests
  • Edge cases (empty arrays, duplicate connections)

🏁 Recommendation

✅ APPROVE - This is a high-quality performance optimization that:

  • Maintains backward compatibility
  • Follows established patterns
  • Includes proper error handling
  • Shows significant performance benefits
  • Introduces no security or correctness issues

The code is production-ready and represents a meaningful improvement to the ZenStack ORM's M2M relation handling.


Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/orm/src/client/crud/operations/base.ts (1)

2048-2061: ⚠️ Potential issue | 🟠 Major

Performance issue: disconnect: true on M2M fetches all entities unnecessarily.

When data === true, disconnectConditions becomes [true], and getEntitiesIds(model, [true]) attempts to fetch entities matching { OR: [true] }. This likely fetches all entities from the model, which is:

  1. Potentially very slow for large tables
  2. Semantically incorrect - "disconnect all" should only affect entities connected to this parent, not query all entities in the model

The existing resetManyToManyRelation method (lines 710-722) already handles this case efficiently by deleting directly from the join table without fetching entity IDs.

💡 Suggested fix
 if (m2m) {
+    // Handle "disconnect all" case efficiently
+    if (disconnectConditions.length === 1 && disconnectConditions[0] === true) {
+        await this.resetManyToManyRelation(kysely, fromRelation.model, fromRelation.field, fromRelation.ids);
+        return;
+    }
+
     // handle many-to-many relation: batch fetch all entity IDs in one query
     const allIds = await this.getEntitiesIds(kysely, model, disconnectConditions);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/orm/src/client/crud/operations/base.ts` around lines 2048 - 2061,
The current M2M disconnect branch should avoid calling getEntitiesIds when data
=== true (or disconnectConditions indicate "disconnect all"); instead detect
that case and use the existing resetManyToManyRelation logic to remove
join-table links directly (so only relationships tied to the parent are affected
and you avoid scanning the whole target table). Update the branch that currently
calls getEntitiesIds(...) and handleManyToManyRelation(...) to short-circuit to
resetManyToManyRelation(...) when fromRelation.data === true (or when
disconnectConditions is the "all" sentinel), leaving the existing path for
filtered disconnects that still need getEntitiesIds; reference getEntitiesIds,
handleManyToManyRelation, and resetManyToManyRelation to locate and apply the
change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/orm/src/client/crud/operations/base.ts`:
- Around line 2145-2161: The length check after fetching IDs can falsely trigger
createNotFoundError when _data contains duplicate filters; before calling
getEntitiesIds and comparing lengths, deduplicate the input filters used for
lookup (the _data array) or derive a unique key set from _data so you compare
the number of unique lookup inputs to the returned allIds; update the block
around getEntitiesIds and the subsequent allIds.length !== _data.length check in
the same scope where getEntitiesIds, handleManyToManyRelation, and fromRelation
are used (i.e., in the method containing getEntitiesIds and
handleManyToManyRelation) so you validate against unique inputs rather than raw
_data duplicates.
- Around line 1921-1936: The length check using allIds vs _data can fail when
_data contains duplicate filters because getEntitiesIds returns unique IDs;
update the logic around getEntitiesIds, allIds and the comparison to account for
duplicate filters by either deduplicating _data (or its filter keys/values)
before calling getEntitiesIds or by comparing allIds.length to the count of
unique filters instead of _data.length, and only call createNotFoundError if
there are missing unique entities; adjust the flow around getEntitiesIds,
allIds, createNotFoundError and subsequent handleManyToManyRelation accordingly
so duplicates are handled consistently (or explicitly reject duplicates with a
clear error).

---

Outside diff comments:
In `@packages/orm/src/client/crud/operations/base.ts`:
- Around line 2048-2061: The current M2M disconnect branch should avoid calling
getEntitiesIds when data === true (or disconnectConditions indicate "disconnect
all"); instead detect that case and use the existing resetManyToManyRelation
logic to remove join-table links directly (so only relationships tied to the
parent are affected and you avoid scanning the whole target table). Update the
branch that currently calls getEntitiesIds(...) and
handleManyToManyRelation(...) to short-circuit to resetManyToManyRelation(...)
when fromRelation.data === true (or when disconnectConditions is the "all"
sentinel), leaving the existing path for filtered disconnects that still need
getEntitiesIds; reference getEntitiesIds, handleManyToManyRelation, and
resetManyToManyRelation to locate and apply the change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 55ed4aef-1afc-440d-ad84-24a0699b14a1

📥 Commits

Reviewing files that changed from the base of the PR and between 0778e49 and 4e34084.

📒 Files selected for processing (1)
  • packages/orm/src/client/crud/operations/base.ts

@ymc9 ymc9 merged commit a6ce140 into dev Mar 16, 2026
11 checks passed
@ymc9 ymc9 deleted the perf/batch-relation branch March 16, 2026 04:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant