Skip to content

Remove n+1 queries from learning_resources app#3200

Merged
mbertrand merged 6 commits into
mainfrom
mb/nplusone_lr
Apr 17, 2026
Merged

Remove n+1 queries from learning_resources app#3200
mbertrand merged 6 commits into
mainfrom
mb/nplusone_lr

Conversation

@mbertrand

@mbertrand mbertrand commented Apr 14, 2026

Copy link
Copy Markdown
Member

What are the relevant tickets?

Closes https://github.com/mitodl/hq/issues/8259

Description (What does it do?)

Removes n+1 queries from learning_resources

How can this be tested?

  • Unit tests should all pass
  • The learning_resources API endpoints should all still work
  • You should still be able to CRUD learning paths and userlists

Copilot AI review requested due to automatic review settings April 14, 2026 19:09
@github-actions

github-actions Bot commented Apr 14, 2026

Copy link
Copy Markdown

OpenAPI Changes

7 changes: 0 error, 6 warning, 1 info

View full changelog

Unexpected changes? Ensure your branch is up-to-date with main (consider rebasing).

Comment thread learning_resources/serializers.py

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR targets performance in the learning_resources Django app by reducing N+1 query patterns during API serialization, and by re-enabling N+1 detection on a number of previously-exempt tests.

Changes:

  • Introduces/expands for_serialization() QuerySets/managers and updates viewsets/actions to use them for prefetch-heavy API responses.
  • Refactors serializer code to avoid per-object refetching and centralize “reload with prefetches” behavior after writes.
  • Removes @pytest.mark.skip_nplusone_check from multiple API tests and adds coverage around topic channel_url serialization for user list actions.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
learning_resources/views_userlist_test.py Removes N+1 skip markers and adds assertions around serialized topic channel_url in user list item create/bulk update responses.
learning_resources/views_test.py Removes N+1 skip markers from multiple endpoint tests to re-enable N+1 detection.
learning_resources/views_learningpath_test.py Removes N+1 skip markers for some learning path endpoint tests.
learning_resources/views.py Updates relationship/list viewset querysets and action responses to use serialization-optimized querysets/prefetches.
learning_resources/serializers.py Adds reload helpers, refactors related access, removes ContentFile per-instance refetch, and returns prefetched instances after create/update.
learning_resources/permissions.py Adjusts learning path editor group membership check.
learning_resources/models.py Expands for_serialization() prefetches (notably for content files) and adds for_serialization() QuerySets/managers for UserList and UserListRelationship.
drf_lint_baseline.json Removes baseline entries corresponding to refactored serializer ORM patterns.
docs/articles-cdn-purge.md Minor markdown formatting/spacing adjustments.

Comment thread learning_resources/views_userlist_test.py Outdated
Comment thread learning_resources/views_userlist_test.py
Comment thread learning_resources/serializers.py
Comment thread learning_resources/views.py Outdated
Comment thread learning_resources/views.py
Comment thread learning_resources/views.py
@mbertrand mbertrand changed the title Remove most n+1 queries from learning_resources app Remove n+1 queries from learning_resources app Apr 15, 2026
@mbertrand mbertrand requested a review from Copilot April 15, 2026 17:43

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR targets N+1 query reductions across the learning_resources Django app by introducing serialization-focused queryset helpers/managers, refetching created/updated objects from prefetched querysets for response serialization, and updating tests/specs to reflect the adjusted API behavior.

Changes:

  • Add for_serialization() querysets/managers and model helpers to reuse prefetch caches and reduce query growth during API serialization.
  • Adjust several DRF viewset create/update flows to serialize responses from prefetched querysets (and update learning path item creation schema to derive parent from the nested route).
  • Update/extend tests to remove N+1 skips and assert stable query behavior / topic channel serialization.

Reviewed changes

Copilot reviewed 11 out of 12 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
openapi/specs/v1.yaml Renames learning path item create request schema and removes client-supplied parent.
learning_resources/views_userlist_test.py Adds query-count stability test; adds topic/channel serialization assertions; removes N+1 skip markers.
learning_resources/views_test.py Removes N+1 skip markers across multiple endpoint tests.
learning_resources/views_learningpath_test.py Removes N+1 skip markers; adds assertions for bulk learning path relationship patch behavior.
learning_resources/views.py Refactors querysets/prefetching and create/update response serialization for learning paths and user lists; updates nested learning path item creation.
learning_resources/serializers_test.py Adds coverage for content files that only have direct_learning_resource.
learning_resources/serializers.py Introduces helper to prefetch program child resources; switches serializers to prefetch-cache-aware accessors; adds nested learning path item create serializer.
learning_resources/permissions.py Adjusts learning path editor group membership check to use an exists() query.
learning_resources/models.py Adds for_serialization() querysets/managers and model helper methods to leverage prefetch caches and reduce N+1s.
frontends/api/src/generated/v1/api.ts Regenerates API client/types to reflect OpenAPI schema rename/removal of parent on create request.
drf_lint_baseline.json Removes baseline entries related to ORM linting in learning_resources/serializers.py.
docs/articles-cdn-purge.md Formatting-only markdown spacing adjustments.

Comment thread learning_resources/serializers.py Outdated
Comment thread learning_resources/models.py Outdated
Comment thread learning_resources/models.py Outdated
Comment thread learning_resources/models.py Outdated

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR focuses on eliminating N+1 query patterns in the learning_resources API by centralizing “serialization-ready” querysets/prefetch behavior and adjusting endpoints/tests accordingly, with a small API schema update for nested learning path item creation.

Changes:

  • Introduces/uses for_serialization() querysets and model helper methods (e.g., topics_for_serialization, direct_content_files_for_serialization) to keep query counts stable during serialization.
  • Updates several viewset create/update paths to re-fetch objects from prefetched querysets before responding, and tightens learning path relationship scoping by relation type.
  • Updates OpenAPI + generated TS client types to use a dedicated nested create request schema for learning path items (parent derived from route).

Reviewed changes

Copilot reviewed 11 out of 12 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
openapi/specs/v1.yaml Updates learning path nested POST request schema to LearningPathRelationshipCreateRequest (no client-supplied parent).
frontends/api/src/generated/v1/api.ts Regenerates API client typings/param names to match the new OpenAPI request schema.
learning_resources/models.py Adds serialization helpers + new for_serialization() managers for relationships/userlists/content files to reduce N+1s.
learning_resources/serializers.py Refactors serializers to use serialization-aware helpers; removes ContentFileSerializer internal refetching.
learning_resources/views.py Updates querysets and CRUD responses to serialize from prefetched instances; scopes learning path relationship operations.
learning_resources/permissions.py Optimizes learning path editor membership check to an exists() query.
learning_resources/views_test.py Removes several skip_nplusone_check markers.
learning_resources/views_userlist_test.py Adds query-count regression test and topic/channel_url serialization assertions for userlists.
learning_resources/views_learningpath_test.py Improves learning-path item position expectations and adds regression tests for relationship scoping/visibility.
learning_resources/serializers_test.py Adds tests for new serialization helpers and position behavior.
drf_lint_baseline.json Removes baseline entries corresponding to resolved ORM lint findings in serializers.
docs/articles-cdn-purge.md Formatting-only markdown spacing adjustments.

Comment on lines 1359 to +1363
def get_learning_resource(self, instance):
if instance.run:
return instance.run.learning_resource
return instance.learning_resource
if instance.learning_resource:
return instance.learning_resource

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Reduces N+1 query patterns in the learning_resources Django app by standardizing “serialization-ready” querysets/prefetching and adjusting view/task code paths to serialize from prefetched instances. Also updates the Learning Path nested create request schema to reflect that parent is derived from the route.

Changes:

  • Introduces/extends for_serialization() querysets and model helper methods to leverage prefetch caches during serialization.
  • Updates multiple viewsets and background tasks to fetch objects via serialization-ready querysets before serializing responses/documents.
  • Updates OpenAPI + generated TS client types for Learning Path item create requests; adds tests and removes several N+1 skip markers.

Reviewed changes

Copilot reviewed 12 out of 13 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
vector_search/tasks.py Uses content_files.for_serialization() during healthcheck serialization.
openapi/specs/v1.yaml Renames Learning Path item create request schema; removes parent from required/request body.
learning_resources_search/tasks.py Fetches ContentFiles with for_serialization() prior to indexing.
learning_resources/views_userlist_test.py Adds query-count regression and topic-channel serialization tests; removes N+1 skips.
learning_resources/views_test.py Removes N+1 skip markers across multiple endpoint tests.
learning_resources/views_learningpath_test.py Adjusts learning path item position expectations; adds bulk-set behavior tests; removes N+1 skips.
learning_resources/views.py Refetches created/updated resources from prefetched querysets; improves relationship serialization/prefetching.
learning_resources/serializers_test.py Adds tests for relationship positioning and serialization fallbacks.
learning_resources/serializers.py Removes ContentFile refetch-in-serializer; adds serialization helpers and create serializer for learning path items; updates topic/price/child accessors.
learning_resources/permissions.py Refactors learning path editor group membership check.
learning_resources/models.py Adds for_serialization() querysets/managers and model helper methods to use prefetch caches.
frontends/api/src/generated/v1/api.ts Regenerates client types for renamed Learning Path item create request.
drf_lint_baseline.json Removes baseline entries for ORM warnings addressed by refactors.
docs/articles-cdn-purge.md Markdown formatting/spacing improvements.
Comments suppressed due to low confidence (1)

learning_resources_search/tasks.py:115

  • upsert_content_file assumes every ContentFile has a run (uses content_file_obj.run.learning_resource_id for routing). But ContentFile can be associated via learning_resource or direct_learning_resource with run=None, which will raise an AttributeError here (and serialize_content_file_for_update currently has the same assumption when learning_resource is null). Compute the routing/parent id using whichever association is present (run → run.learning_resource_id, else learning_resource_id, else direct_learning_resource_id).
    content_file_obj = ContentFile.objects.for_serialization().get(id=file_id)
    content_file_data = serialize_content_file_for_update(content_file_obj)
    api.upsert_document(
        gen_content_file_id(content_file_obj.id),
        content_file_data,
        COURSE_TYPE,
        retry_on_conflict=settings.INDEXING_ERROR_RETRIES,
        routing=content_file_obj.run.learning_resource_id,
    )

Comment thread learning_resources/serializers.py

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to eliminate N+1 query patterns in the learning_resources domain by introducing serialization-oriented querysets/helpers and updating API viewsets/tasks/tests to use them (with noted exceptions around learning path CRUD).

Changes:

  • Introduces for_serialization() querysets/managers and *_for_serialization() model helpers to centralize/select/prefetch logic.
  • Updates several API viewsets (learning paths, user lists, nested relationship endpoints) to re-fetch created/updated instances via prefetched querysets before serializing responses.
  • Updates OpenAPI + generated TS client types for learning path item creation (parent derived from route).

Reviewed changes

Copilot reviewed 12 out of 13 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
vector_search/tasks.py Uses ContentFile.for_serialization() when iterating run content files to avoid N+1 in healthcheck serialization.
learning_resources_search/tasks.py Fetches content files via for_serialization() before indexing/upserting.
learning_resources/models.py Adds serialization querysets/managers + model helper methods leveraging prefetch caches.
learning_resources/serializers.py Switches serializers to use model serialization helpers; refactors some relationship creation/position logic; adds create-only learning path relationship serializer.
learning_resources/views.py Refetches created/updated resources for response serialization; updates relationship endpoints to use serialization querysets; adjusts learning path item creation flow.
learning_resources/permissions.py Optimizes learning path editor group check using exists().
learning_resources/views_userlist_test.py Adds query-count regression test and topic-channel serialization coverage for userlist endpoints.
learning_resources/views_test.py Removes skip_nplusone_check from various tests.
learning_resources/views_learningpath_test.py Adjusts learning path item position assertions; adds learning-path relationship behavior tests.
learning_resources/serializers_test.py Adds tests for relationship positioning and serialization fallbacks.
openapi/specs/v1.yaml Updates request schema for nested learning path relationship creation (no client-supplied parent).
frontends/api/src/generated/v1/api.ts Regenerates API client types reflecting the OpenAPI change.
drf_lint_baseline.json Removes now-resolved DRF lint baseline entries tied to the old patterns.

Comment thread learning_resources/views.py
Comment thread learning_resources/views_userlist_test.py Outdated
Comment thread learning_resources/serializers.py
Comment thread learning_resources/views.py
@mbertrand mbertrand added Needs Review An open Pull Request that is ready for review and removed Work in Progress labels Apr 15, 2026
@abeglova abeglova self-assigned this Apr 16, 2026
@abeglova

abeglova commented Apr 17, 2026

Copy link
Copy Markdown
Contributor

This works! I get expected data and no n+1 queries for any of the api calls I tested/

I do get

web-1                        | 2026-04-17T16:54:10.997115Z [error    ] Potential unnecessary eager load detected on `LearningResource.image` [nplusone]
web-1                        | 2026-04-17T16:54:10.998539Z [error    ] Potential unnecessary eager load detected on `LearningResource._topics` [nplusone]
web-1                        | 2026-04-17T16:54:10.999711Z [error    ] Potential unnecessary eager load detected on `LearningResource._resource_prices` [nplusone]
web-1                        | 2026-04-17T16:54:11.000844Z [error    ] Potential unnecessary eager load detected on `LearningResource._published_runs` [nplusone]
web-1                        | 2026-04-17T16:54:11.002012Z [error    ] Potential unnecessary eager load detected on `LearningResource._podcasts` [nplusone]
web-1                        | 2026-04-17T16:54:11.003206Z [error    ] Potential unnecessary eager load detected on `LearningResource._playlists` [nplusone]
web-1                        | 2026-04-17T16:54:11.004403Z [error    ] Potential unnecessary eager load detected on `LearningResource._children` [nplusone]
web-1                        | 2026-04-17T16:54:11.005605Z [error    ] Potential unnecessary eager load detected on `LearningResource._direct_content_files` [nplusone]
web-1                        | 2026-04-17T16:54:11.006804Z [error    ] Potential unnecessary eager load detected on `LearningResource.course` [nplusone]
web-1                        | 2026-04-17T16:54:11.008231Z [error    ] Potential unnecessary eager load detected on `LearningResource.program` [nplusone]
web-1                        | 2026-04-17T16:54:11.009546Z [error    ] Potential unnecessary eager load detected on `LearningResource.podcast` [nplusone]
web-1                        | 2026-04-17T16:54:11.010848Z [error    ] Potential unnecessary eager load detected on `LearningResource.podcast_episode` [nplusone]
web-1                        | 2026-04-17T16:54:11.012241Z [error    ] Potential unnecessary eager load detected on `LearningResource.video` [nplusone]
web-1                        | 2026-04-17T16:54:11.013520Z [error    ] Potential unnecessary eager load detected on `LearningResource.video_playlist`

In the logs for most requests

@mbertrand mbertrand force-pushed the mb/nplusone_lr branch 2 times, most recently from 6b1e95c to e653a69 Compare April 17, 2026 17:04
@mbertrand mbertrand merged commit d92c0ed into main Apr 17, 2026
14 checks passed
@mbertrand mbertrand deleted the mb/nplusone_lr branch April 17, 2026 17:20
@odlbot odlbot mentioned this pull request Apr 20, 2026
16 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Needs Review An open Pull Request that is ready for review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants