perf: bulk membership creation with INSERT ON CONFLICT DO NOTHING#149
perf: bulk membership creation with INSERT ON CONFLICT DO NOTHING#149kneckinator wants to merge 2 commits into19.0from
Conversation
Replace per-record ORM creates and Command.create() tuples with raw SQL INSERT ... ON CONFLICT (unique_cols) DO NOTHING for bulk membership creation. Duplicates are silently skipped and the inserted count is returned via cursor.rowcount. Updates _import_registrants and _add_beneficiaries to use the new skip_duplicates path, with ORM cache invalidation after raw SQL inserts.
There was a problem hiding this comment.
Code Review
This pull request introduces high-performance bulk membership creation for both programs and cycles by implementing raw SQL INSERT statements with ON CONFLICT DO NOTHING to efficiently handle duplicates. The cycle and eligibility managers have been updated to utilize these new methods, and a comprehensive test suite has been added to verify the bulk insertion logic. The review feedback highlights the need to manually handle computed fields like enrollment_date when bypassing the ORM, suggests optimizing performance by moving date function calls outside of loops, and recommends using the @api.model_create_multi decorator for consistency with Odoo's batch processing patterns.
| for i in range(0, len(vals_list), chunk_size): | ||
| batch = vals_list[i : i + chunk_size] | ||
| values = [] | ||
| params = [] | ||
| for v in batch: | ||
| values.append("(%s, %s, %s, %s, %s, now(), now())") | ||
| params.extend( | ||
| [ | ||
| v["partner_id"], | ||
| v["program_id"], | ||
| v.get("state", "draft"), | ||
| uid, | ||
| uid, | ||
| ] | ||
| ) | ||
|
|
||
| sql = """ | ||
| INSERT INTO spp_program_membership | ||
| (partner_id, program_id, state, | ||
| create_uid, write_uid, create_date, write_date) | ||
| VALUES {} | ||
| ON CONFLICT (partner_id, program_id) DO NOTHING | ||
| """.format( # noqa: S608 # nosec B608 | ||
| ", ".join(values) | ||
| ) |
There was a problem hiding this comment.
The enrollment_date field in spp.program.membership is a stored computed field that depends on the state. Since this raw SQL insert bypasses the ORM, the field will not be populated for records inserted with state='enrolled'. To ensure data integrity, you should include enrollment_date in the SQL insert and set it to the current timestamp when the state is enrolled.
for i in range(0, len(vals_list), chunk_size):
batch = vals_list[i : i + chunk_size]
values = []
params = []
for v in batch:
state = v.get("state", "draft")
enrollment_date = fields.Datetime.now() if state == "enrolled" else None
values.append("(%s, %s, %s, %s, %s, %s, now(), now())")
params.extend(
[
v["partner_id"],
v["program_id"],
state,
enrollment_date,
uid,
uid,
]
)
sql = """
INSERT INTO spp_program_membership
(partner_id, program_id, state, enrollment_date,
create_uid, write_uid, create_date, write_date)
VALUES {}
ON CONFLICT (partner_id, program_id) DO NOTHING
""".format(
", ".join(values)
)| }, | ||
| } | ||
|
|
||
| @api.model |
There was a problem hiding this comment.
It is recommended to use the @api.model_create_multi decorator for methods that accept a list of dictionaries (vals_list). This ensures that the method is always called with a list, providing better consistency with Odoo's standard batch processing patterns.
| @api.model | |
| @api.model_create_multi |
| total_inserted = 0 | ||
|
|
||
| for i in range(0, len(vals_list), chunk_size): | ||
| batch = vals_list[i : i + chunk_size] | ||
| values = [] | ||
| params = [] | ||
| for v in batch: | ||
| values.append("(%s, %s, %s, %s, %s, %s, now(), now())") | ||
| params.extend( | ||
| [ | ||
| v["partner_id"], | ||
| v["cycle_id"], | ||
| v.get("state", "draft"), | ||
| v.get("enrollment_date", fields.Date.today()), | ||
| uid, | ||
| uid, | ||
| ] | ||
| ) |
There was a problem hiding this comment.
Calling fields.Date.today() inside the loop for every record is inefficient, especially for large batches. It should be called once outside the loop and stored in a variable.
| total_inserted = 0 | |
| for i in range(0, len(vals_list), chunk_size): | |
| batch = vals_list[i : i + chunk_size] | |
| values = [] | |
| params = [] | |
| for v in batch: | |
| values.append("(%s, %s, %s, %s, %s, %s, now(), now())") | |
| params.extend( | |
| [ | |
| v["partner_id"], | |
| v["cycle_id"], | |
| v.get("state", "draft"), | |
| v.get("enrollment_date", fields.Date.today()), | |
| uid, | |
| uid, | |
| ] | |
| ) | |
| total_inserted = 0 | |
| today = fields.Date.today() | |
| for i in range(0, len(vals_list), chunk_size): | |
| batch = vals_list[i : i + chunk_size] | |
| values = [] | |
| params = [] | |
| for v in batch: | |
| values.append("(%s, %s, %s, %s, %s, %s, now(), now())") | |
| params.extend( | |
| [ | |
| v["partner_id"], | |
| v["cycle_id"], | |
| v.get("state", "draft"), | |
| v.get("enrollment_date", today), | |
| uid, | |
| uid, | |
| ] | |
| ) |
|
|
||
| @api.model_create_multi | ||
| def bulk_create_memberships(self, vals_list, chunk_size=1000): | ||
| @api.model |
There was a problem hiding this comment.
It is recommended to use the @api.model_create_multi decorator for methods that accept a list of dictionaries (vals_list). This ensures that the method is always called with a list, providing better consistency with Odoo's standard batch processing patterns.
| @api.model | |
| @api.model_create_multi |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## 19.0 #149 +/- ##
==========================================
+ Coverage 71.06% 71.11% +0.04%
==========================================
Files 925 925
Lines 54704 54774 +70
==========================================
+ Hits 38876 38952 +76
+ Misses 15828 15822 -6
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
Set enrollment_date to current timestamp when state is 'enrolled' in the program membership SQL insert (computed field not triggered by raw SQL). Hoist fields.Date.today() outside the loop in cycle membership to avoid repeated calls per record.
gonzalesedwin1123
left a comment
There was a problem hiding this comment.
@kneckinator I think if audit rules are defined using spp_audit for spp.program.membership and spp.cycle.membership are added, the raw SQL will bypass it entirely. Though I think the performance gain is significant, I am just concerned about the audit logs.
Summary
skip_duplicates=Truemode tobulk_create_memberships()on bothspp.program.membershipandspp.cycle.membershipINSERT ... ON CONFLICT (partner_id, program_id/cycle_id) DO NOTHINGto silently skip duplicates instead of raising IntegrityError or doing per-record search() checkscursor.rowcount_import_registrants()and_add_beneficiaries()callers to use the new pathChanges
program_membership.py: Addskip_duplicatesparam +_bulk_insert_on_conflict()helper using raw SQLcycle_membership.py: Same pattern for cycle membershipseligibility_manager.py: ReplaceCommand.create()+program.update()withbulk_create_memberships(skip_duplicates=True)cycle_manager_base.py: Replace[0, 0, {}]tuples +cycle.update()withbulk_create_memberships(skip_duplicates=True)test_bulk_membership.py(new): 13 tests covering insert, duplicate skipping, empty lists, chunking, ORM fallback, and caller integrationContext
Phase 7 of 9 in the
spp_programsperformance optimization effort. Depends on phase 1 - #116 being merged.Test plan
./scripts/test_single_module.sh spp_programs— 591 tests, 0 failurespre-commit run --files <changed_files>— all checks passODOO_INIT_MODULES=spp_programs docker compose --profile ui up -d