Fix #9607: Resolve N+1 queries with PrimaryKeyRelatedField many=True#9947
Fix #9607: Resolve N+1 queries with PrimaryKeyRelatedField many=True#9947MuraveyApp wants to merge 1 commit intoencode:mainfrom
Conversation
|
Do you think adding failing test cases for your changes would be nice ? |
There was a problem hiding this comment.
Pull request overview
This PR aims to eliminate N+1 queries during deserialization when using PrimaryKeyRelatedField(many=True) by performing a single batched queryset lookup for all incoming primary keys.
Changes:
- Added
PrimaryKeyRelatedField.many_to_internal_value()to batch-resolve related objects viapk__in. - Updated
ManyRelatedField.to_internal_value()to use the child relation’smany_to_internal_value()when available.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| pks = [] | ||
| for item in data: | ||
| if self.pk_field is not None: | ||
| item = self.pk_field.to_internal_value(item) | ||
| if isinstance(item, bool): | ||
| self.fail('incorrect_type', data_type=type(item).__name__) | ||
| pks.append(item) | ||
| queryset = self.get_queryset() | ||
| try: | ||
| objs = {str(obj.pk): obj for obj in queryset.filter(pk__in=pks)} | ||
| except (ValueError, TypeError): | ||
| # Fall back to per-item validation to surface correct error messages | ||
| return [self.to_internal_value(pk) for pk in pks] | ||
| result = [] | ||
| for pk in pks: | ||
| obj = objs.get(str(pk)) |
There was a problem hiding this comment.
Building the lookup map with string keys (str(obj.pk) / str(pk)) can incorrectly raise does_not_exist for inputs that Django will coerce successfully but whose string form differs from the canonical obj.pk string (e.g. UUIDs with uppercase hex, integer PKs with leading zeros or surrounding whitespace). Consider normalizing each incoming pk to the model PK type (e.g. via queryset.model._meta.pk.to_python(...) when pk_field is None) and keying the map by the normalized value (or using queryset.in_bulk(normalized_pks)), so lookups match Django’s coercion behavior.
| pks = [] | |
| for item in data: | |
| if self.pk_field is not None: | |
| item = self.pk_field.to_internal_value(item) | |
| if isinstance(item, bool): | |
| self.fail('incorrect_type', data_type=type(item).__name__) | |
| pks.append(item) | |
| queryset = self.get_queryset() | |
| try: | |
| objs = {str(obj.pk): obj for obj in queryset.filter(pk__in=pks)} | |
| except (ValueError, TypeError): | |
| # Fall back to per-item validation to surface correct error messages | |
| return [self.to_internal_value(pk) for pk in pks] | |
| result = [] | |
| for pk in pks: | |
| obj = objs.get(str(pk)) | |
| queryset = self.get_queryset() | |
| pks = [] | |
| try: | |
| for item in data: | |
| if self.pk_field is not None: | |
| item = self.pk_field.to_internal_value(item) | |
| if isinstance(item, bool): | |
| self.fail('incorrect_type', data_type=type(item).__name__) | |
| if self.pk_field is None: | |
| item = queryset.model._meta.pk.to_python(item) | |
| pks.append(item) | |
| objs = queryset.in_bulk(pks) | |
| except (ValueError, TypeError): | |
| # Fall back to per-item validation to surface correct error messages | |
| return [self.to_internal_value(pk) for pk in data] | |
| result = [] | |
| for pk in pks: | |
| obj = objs.get(pk) |
| except (TypeError, ValueError): | ||
| self.fail('incorrect_type', data_type=type(data).__name__) | ||
|
|
||
| def many_to_internal_value(self, data): |
There was a problem hiding this comment.
This introduces a new optimized code path for many=True deserialization. Please add tests that (1) assert it does a single batched DB query (no N+1) and (2) covers pk normalization cases (e.g. UUID string casing / integer strings with leading zeros) to prevent regressions in the new mapping logic.
| queryset = self.get_queryset() | ||
| try: | ||
| objs = {str(obj.pk): obj for obj in queryset.filter(pk__in=pks)} | ||
| except (ValueError, TypeError): | ||
| # Fall back to per-item validation to surface correct error messages | ||
| return [self.to_internal_value(pk) for pk in pks] |
There was a problem hiding this comment.
The batched queryset.filter(pk__in=pks) path can raise django.core.exceptions.ValidationError for values that would previously only error if/when that specific item was validated (e.g. invalid UUID strings). Since the current except only catches (ValueError, TypeError), this bypasses the intended per-item fallback and can change which error is surfaced based on list contents/order. Consider also catching Django's ValidationError here and falling back to per-item validation to preserve previous error behavior.
Summary
Test plan
Generated by OwlMind