Skip to content

Story 2096: Implement Wagtail Integration#2100

Open
jlchilders11 wants to merge 17 commits into
boostorg:developfrom
jlchilders11:jc/wagtail-integration
Open

Story 2096: Implement Wagtail Integration#2100
jlchilders11 wants to merge 17 commits into
boostorg:developfrom
jlchilders11:jc/wagtail-integration

Conversation

@jlchilders11

@jlchilders11 jlchilders11 commented Feb 25, 2026

Copy link
Copy Markdown
Collaborator

Implement Wagtail Integrations:

Mixins

  • TaggableMixin - Adds the ability to tag pages for categorization

Routing

  • RoutableHomePage - New home page for all wagtail interactions that allows for forwarding routing to child pages, e.g. if you get to wagtail through /outreach/, you will be routed through the OutreachHomePage

Posts

This is the bulk of the work, and is a porting of the existing Entry Model to Wagtail

  • PostIndexPage - Index Page for Posts, implements most of the logic in formatting found in the news.views.EntryListView. NOTE: this uses queryparams for filtering the child pages, rather than seperate URLs
  • PostPage - Functional translation of Entry, but uses a stream field rather than having separate models for different types of content. These can be filtered by the content of the stream field block

NOTES:

  • Additional work/discussion is needed on the PollBlock. Currently this is not implemented, as it is disabled in the existing architecture, and will be non trivial to implement in Wagtail
  • Additional work is needed for the video block to both automatically save the thumbnail, and display it instead of autoloading the video.

DEPLOYMENT NOTES:
When deploying this, a RoutableHomePage should be created and the existing OutreachHomePage should be moved under it to respect the new routing methods.

Summary by CodeRabbit

  • New Features

    • New Pages app with routable post management, tagging, and a migration tool to import legacy posts.
    • Automated AI-generated summaries and video thumbnail generation for posts.
  • Improvements

    • Better markdown and custom-video rendering, dynamic client-side post filtering, list/detail styling refinements, and more robust image handling.

@jlchilders11 jlchilders11 changed the title Story 2096: Implment Wagtail Integration Story 2096: Impliment Wagtail Integration Feb 26, 2026

@gregjkal gregjkal left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per our conversation in standup: please test the homepage to make sure it won't break without manually migrating the existing testimonial and/or news items. If it does break, we need to either make these changes backwards-compatible (my strong preference!), with a follow-up change that cleans any bridge code up, or add a data migration so it can be fixed during the deploy instead of crashing until manual fixes.

Comment thread pages/migrations/0001_initial.py
@julioest julioest changed the title Story 2096: Impliment Wagtail Integration Story 2096: Implement Wagtail Integration May 10, 2026

@julhoang julhoang left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @jlchilders11 ! This is awesome work! I just left a few of clarifying questions below.

Besides them, it looks like the HTML templates are now outdated. Do you have a next-step recommendation for us to update those templates to V3 versions? Also, do we need to update the V3 Create Post page and various other existing queries to query from Wagtail's PostPage instead of Entry?

Thanks so much for your work here!

Comment thread pages/management/commands/convert_news_entries.py Outdated
Comment thread templates/pages/routable_home_page.html
Comment thread config/urls.py
Comment thread pages/models.py Outdated
Comment thread pages/mixins.py Outdated
@jlchilders11 jlchilders11 requested a review from julhoang May 21, 2026 17:01
@jlchilders11

Copy link
Copy Markdown
Collaborator Author

Hi @jlchilders11 ! This is awesome work! I just left a few of clarifying questions below.

Besides them, it looks like the HTML templates are now outdated. Do you have a next-step recommendation for us to update those templates to V3 versions? Also, do we need to update the V3 Create Post page and various other existing queries to query from Wagtail's PostPage instead of Entry?

Thanks so much for your work here!

For updating these to the V3 and existing things, I believe that should be handled in the data integration phase of those pages, which I believe are still outstanding tickets. If not, I do think that that work should be handled as part of a separate ticket, as it somewhat escapes the "migration" scope of this ticket.

@jlchilders11 jlchilders11 linked an issue May 21, 2026 that may be closed by this pull request

@herzog0 herzog0 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @jlchilders11 this is looking great!
I just have a couple of questions that I'll write in a Slack thread to keep it as a better reference for the future

Comment thread pages/management/commands/convert_news_entries.py

@julhoang julhoang left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @jlchilders11 , I've left a couple of suggestions below – please let me know your thoughts!

A few more things:
1/ I noticed that a few video entries are looking like this when they have a empty summary – can we make sure to not render that block in that case?

Image

2/ The filters on the http://localhost:8000/pages/news/ are not redirecting to the new /pages path – I understand if this makes more sense to tackle in this ticket instead: #2378. If that's the case, would you mind adding a note for Katty so that she's aware?

Comment thread templates/news/v3/detail.html
Comment thread pages/models.py Outdated
Comment thread pages/models.py
Comment thread pages/models.py Outdated
ctx["is_paginated"] = True
ctx["entry_list"] = pag.get_page(request.GET.get("page", 1))
ctx["page_obj"] = ctx["entry_list"]
ctx["filters"] = POST_CONTENT_TYPES

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the UI consuming this ctx["filters"]? It seems like it's only using ctx["filter_terms"]

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call. The filter terms should eventually be generated from the POST_CONTENT_TYPES, but that can be addressed once all content types have been established. Added a TODO to address that.

Comment thread pages/management/commands/convert_news_entries.py Outdated

def get_or_create_page(entry: Entry, index_page: PostIndexPage) -> PostPage:
try:
page = index_page.get_children().get(title=entry.title).specific

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Hmm, I'm curious if there's a different key that can be more robust than title here – just a thought, no real action needed!

Comment thread templates/blocks/custom_video_block.html Outdated
Comment thread pages/management/commands/convert_news_entries.py
@jlchilders11

Copy link
Copy Markdown
Collaborator Author

Hi @jlchilders11 , I've left a couple of suggestions below – please let me know your thoughts!

A few more things: 1/ I noticed that a few video entries are looking like this when they have a empty summary – can we make sure to not render that block in that case?
Image

2/ The filters on the http://localhost:8000/pages/news/ are not redirecting to the new /pages path – I understand if this makes more sense to tackle in this ticket instead: #2378. If that's the case, would you mind adding a note for Katty so that she's aware?

  1. was actually doable in this PR, so check that functionality out!

@jlchilders11 jlchilders11 requested a review from julhoang June 3, 2026 18:06

@julhoang julhoang left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @jlchilders11 , I have 1 questions regarding the summary_dispatcher, others are just optional nits 🙏

Comment thread pages/models.py Outdated

if not self.summary:
logger.info(f"Passing {self.pk=} to dispatcher")
summary_dispatcher.delay(self.pk)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this summary_dispatcher and the helper function that it calls set_summary_for_event_entry both are writing to the Entry model instead of our new Page model – should we handle issue in this PR?

Comment thread pages/models.py Outdated
Comment on lines +58 to 60
{% with body=object.get_content|default:object.visible_content %}
{% if body %}{{ body|markdown|urlize }}{% endif %}
{% endwith %}

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tiny nit since this will not be a likely issue in Prod:

While testing after the Wagtail migration script, I noticed a rendering issue with a sample post I'd previously created via the V3 Create Post page. Using WYSIWYG, the post body was originally saved as markdown, but after migration it's now wrapped in HTML, so the UI no longer renders it correctly (screenshots below). While this might not be an issue in Prod, I think it'd be nice if we can update the script to retain the markdown format just so that any other dev's markdown posts will still render correctly – totally up to you 😄

"Entry" V3 page Wagtail V3 (/pages) Wagtail Admin
Image Image Image

@coderabbitai

coderabbitai Bot commented Jun 9, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: a2b89d4c-76ef-413b-a502-d9f78344859a

📥 Commits

Reviewing files that changed from the base of the PR and between 3383747 and 29dc2a1.

📒 Files selected for processing (1)
  • news/tasks.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • news/tasks.py

📝 Walkthrough

Walkthrough

Adds a Wagtail "pages" app (models, blocks, migrations, app config), a migration-based PostPage index, conversion tooling, and integrates PostPage with news tasks, utils, templates, routing, and settings while gating v3 pages via a waffle flag.

Changes

Pages App and Dual-Model Content System

Layer / File(s) Summary
App config, admin, blocks, and mixins
pages/apps.py, pages/admin.py, pages/blocks.py, pages/mixins.py
Adds PagesConfig, admin placeholder, PollBlock and POST_BLOCKS registry, tagging snippet ContentTag, through-model TaggedContent, TaggableMixin, and abstract BasePage.
Page models and helpers
pages/models.py
Adds RoutableHomePage, PostIndexPage, PostPage, content-type mappings, StreamField-backed single-block content, cached helpers, save() hooks to dispatch summary/thumbnail tasks, and Wagtail content_panels.
Database migrations (0001–0004)
pages/migrations/0001_initial.py, 0002_postpage_video_thumbnail.py, 0003_alter_postpage_content.py, 0004_alter_postpage_content.py
Creates tag and page models, adds video_thumbnail, and alters PostPage content StreamField block definitions across migrations.
Content migration command
pages/management/commands/convert_news_entries.py
Management command to convert legacy BlogPost/News/Video/Link entries into Wagtail PostPage instances under PostIndexPage, converting text, copying/creating images, and setting video thumbnails.
News tasks, utils, and backpopulate
news/tasks.py, news/utils.py, news/management/commands/backpopulate_thumbnails.py
Generalizes set_video_thumbnail for `Video
Legacy news model accessors
news/models.py
Adds Entry.image_url property and Entry.get_content() to expose image URL and raw content for templates/tasks.
Routing, settings, V3 mixin, marketing changes
config/settings.py, config/urls.py, core/mixins.py, marketing/models.py
Registers pages and wagtail.contrib.routable_page in INSTALLED_APPS; mounts Wagtail at outreach/ and pages/ (removes root catch-all); adds V3Mixin.serve() enforcing the v3 waffle flag; allows pages.RoutableHomePage parent and adjusts OutreachHomePage.route() unpacking.
Views, templates, CSS and small template fixes
news/views.py, templates/news/v3/detail.html, templates/v3/includes/_post_list_card.html, templates/v3/posts_list.html, templates/blocks/custom_video_block.html, static/css/v3/post-detail.css, static/css/v3/posts-list.css
Wires filter term URLs in list views, switches templates to use image_url and get_content, generates client-side filter cases from filter_terms, renders video block output, and adjusts CSS for markdown and summary paragraph padding.
Placeholders
pages/tests.py, pages/views.py
Adds placeholder comments for future tests and views.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

  • herzog0
  • julioest
  • julhoang
  • kattyode

Poem

🐰 A new pages app hops into place,
Wagtail posts now share the space,
Routes and tags and thumbnails queued,
Old entries gently migrated too,
The site's small rabbit gives a happy pace.

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Description check ⚠️ Warning The PR description is missing the required Issue number, Figma link, component link, and structured Changes/Risks sections from the template. It provides narrative content but lacks the formal structure required. Restructure the description to match the template: add '# Issue: #[number]' header, include Figma link and component link fields, format Changes as a bulleted list, add Risks & Considerations section, and include a Self-review Checklist.
Docstring Coverage ⚠️ Warning Docstring coverage is 10.64% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Story 2096: Implement Wagtail Integration' clearly and concisely describes the primary change - implementing Wagtail integration across multiple components.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
⚔️ Resolve merge conflicts
  • Resolve merge conflict in branch jc/wagtail-integration

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
news/utils.py (2)

27-35: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add timeouts to both outbound thumbnail requests.

Lines 28 and 35 call external URLs without a timeout. These run in Celery/backfill workers, so one slow upstream can pin a worker indefinitely and stall the queue.

🛡️ Proposed fix
-    response = requests.get(url)
+    response = requests.get(url, timeout=10)
@@
-    thumbnail_file = File(
-        BytesIO(requests.get(json.get("thumbnail_url")).content),
+    thumbnail_response = requests.get(json.get("thumbnail_url"), timeout=10)
+    thumbnail_response.raise_for_status()
+    thumbnail_file = File(
+        BytesIO(thumbnail_response.content),
         name={f"{video.slug} Thumbnail"},
     )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@news/utils.py` around lines 27 - 35, The outbound requests to the OEmbed
endpoint and to fetch the thumbnail are missing timeouts and can hang workers;
update both requests.get calls (the one that uses YOUTUBE_OEMBED_ENDPOINT +
f"?url={video.external_url}" and the one inside the thumbnail_file =
File(BytesIO(requests.get(json.get("thumbnail_url")).content), ...)) to pass a
timeout parameter (e.g. timeout=10 or a module-level/configurable constant like
REQUEST_TIMEOUT), and optionally wrap them in a try/except for
requests.Timeout/requests.RequestException to raise or log a clear error if the
call times out.

34-37: ⚠️ Potential issue | 🔴 Critical

Pass a string filename to File(...).

Line 36 uses name={f"{video.slug} Thumbnail"}, which creates a set; Django file handling expects name to be a string (so saving to a FileField can error).

🐛 Proposed fix
-    thumbnail_file = File(
-        BytesIO(requests.get(json.get("thumbnail_url")).content),
-        name={f"{video.slug} Thumbnail"},
-    )
+    thumbnail_file = File(
+        BytesIO(requests.get(json.get("thumbnail_url")).content),
+        name=f"{video.slug} Thumbnail",
+    )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@news/utils.py` around lines 34 - 37, The thumbnail_file creation passes a set
to File(...).name by using name={f"{video.slug} Thumbnail"}; change this to pass
a string instead (e.g. name=f"{video.slug} Thumbnail") so Django receives a
proper filename string; update the File(...) invocation in the thumbnail_file
creation (the call that wraps
BytesIO(requests.get(json.get("thumbnail_url")).content)) to use a plain
formatted string for name and ensure the
BytesIO/requests/get/json.get/video.slug references remain unchanged.
🧹 Nitpick comments (4)
marketing/models.py (1)

164-164: ⚡ Quick win

Remove the leftover request-path print.

This writes every outreach request path to stdout and bypasses the structured logging configured in settings. If you still need rollout visibility here, switch it to a logger debug call instead.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@marketing/models.py` at line 164, Remove the stray print(path_components)
call that emits request paths to stdout; replace it with a structured logger
debug call instead (e.g., use the module-level logger or
logging.getLogger(__name__)). Locate the print(path_components) usage
(referenced symbol: path_components / print(path_components)) and change it to
logger.debug with a clear message and the path_components as structured data so
it uses the configured logging backend rather than stdout.
pages/models.py (2)

338-343: 💤 Low value

Optional: Use iterable unpacking for content_panels.

Consider using iterable unpacking instead of concatenation for consistency with modern Python style and to match the suggestion in mixins.py.

♻️ Proposed refactor
-    content_panels = BasePage.content_panels + [
+    content_panels = [
+        *BasePage.content_panels,
         "content",
         "image",
         "summary",
         "video_thumbnail",
     ]
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pages/models.py` around lines 338 - 343, The content_panels assignment
currently concatenates BasePage.content_panels with a list; change it to use
iterable unpacking so it mirrors the style used in mixins.py. Update the
content_panels definition on the Page model to produce a single list by
unpacking BasePage.content_panels (e.g., using the * operator) followed by the
added panel names ("content", "image", "summary", "video_thumbnail"), ensuring
the symbol names content_panels and BasePage.content_panels are used so the
intent and ordering remain identical.

Source: Linters/SAST tools


144-185: ⚡ Quick win

Reduce duplication by generating filter_terms from POST_CONTENT_TYPES.

The filter terms are hard-coded here and duplicated in news/views.py. The TODO comment on line 143 notes this should be generated from POST_CONTENT_TYPES. Consider implementing this now to reduce duplication and ensure consistency across the codebase.

♻️ Proposed refactor to generate filter_terms
+        # Generate filter terms from POST_CONTENT_TYPES
+        filter_terms = []
+        for content_type in POST_CONTENT_TYPES:
+            if content_type.filter_name:
+                filter_terms.append({
+                    "label": content_type.content_type + "s" if content_type.content_type != "News" else content_type.content_type,
+                    "value": content_type.filter_name,
+                    "url": self.get_url() + f"?type={content_type.filter_name}",
+                })
+            else:
+                # "All" filter
+                filter_terms.insert(0, {
+                    "label": "All",
+                    "value": "all",
+                    "url": self.get_url(),
+                })
+        
+        # Add placeholder filters (Discussions, Achievements, Issues)
+        # TODO: Implement these content types when ready
+        filter_terms.extend([
             {
                 "label": "Discussions",
                 "value": "discussions",
                 "url": self.get_url(),
             },
             {
                 "label": "Achievements",
                 "value": "achievements",
                 "url": self.get_url(),
             },
             {
                 "label": "Issues",
                 "value": "issues",
                 "url": self.get_url(),
             },
-        ]
+        ])
+        ctx["filter_terms"] = filter_terms
-        ctx["filter_terms"] = [
-            {
-                "label": "All",
-                "value": "all",
-                "url": self.get_url(),
-            },
-            {
-                "label": "News",
-                "value": "news",
-                "url": self.get_url() + "?type=news",
-            },
-            {
-                "label": "Blogs",
-                "value": "blogpost",
-                "url": self.get_url() + "?type=blogpost",
-            },
-            {
-                "label": "Links",
-                "value": "link",
-                "url": self.get_url() + "?type=link",
-            },
-            {
-                "label": "Videos",
-                "value": "video",
-                "url": self.get_url() + "?type=video",
-            },
-            {
-                "label": "Discussions",
-                "value": "discussions",
-                "url": self.get_url(),
-            },
-            {
-                "label": "Achievements",
-                "value": "achievements",
-                "url": self.get_url(),
-            },
-            {
-                "label": "Issues",
-                "value": "issues",
-                "url": self.get_url(),
-            },
-        ]
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pages/models.py` around lines 144 - 185, The filter_terms list in the pages
model is hard-coded and duplicates logic in news/views.py; instead iterate over
the POST_CONTENT_TYPES mapping to build each term so they stay consistent.
Update the code that sets ctx["filter_terms"] (and any related method using
get_url()) to loop through POST_CONTENT_TYPES items, creating dicts with "label"
from the human-readable name, "value" from the content type key, and "url" using
self.get_url() with "?type={value}" conditionally appended (ensure the "all"
entry still maps to self.get_url() without a query). Replace the static list
with this generated list so filter_terms is derived solely from
POST_CONTENT_TYPES.
pages/mixins.py (1)

41-41: 💤 Low value

Optional: Use iterable unpacking for content_panels.

Consider using iterable unpacking instead of concatenation for better readability and consistency with modern Python style.

♻️ Proposed refactor
-    content_panels = Page.content_panels + ["tags"]
+    content_panels = [*Page.content_panels, "tags"]
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pages/mixins.py` at line 41, Replace the list concatenation used to extend
content_panels with iterable unpacking for clarity: change the expression that
sets content_panels (currently using Page.content_panels + ["tags"]) to use
unpacking of Page.content_panels and the "tags" item so the result is built via
a single list literal with *Page.content_panels and "tags"; update the
assignment to the content_panels attribute in the pages/mixins.py mixin where
content_panels and Page.content_panels are referenced.

Source: Linters/SAST tools

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@config/urls.py`:
- Around line 410-412: The new Wagtail mount at path("outreach/",
include(wagtail_urls)) is being shadowed by an earlier route handled by
WhitePaperView (and the existing outreach/<slug>/<slug> pattern) so requests
never reach marketing.OutreachHomePage.route(); fix by making the routing
non-overlapping — either move the Wagtail include for "outreach/" so it appears
before the WhitePaperView route in urls.py, or tighten the WhitePaperView
pattern (or its view logic) to exclude the Wagtail path/shape (e.g., require a
different prefix or more specific regex) so marketing.OutreachHomePage.route()
can receive outreach topic/detail requests.

In `@news/tasks.py`:
- Around line 191-200: set_summary_for_event_page currently grabs
page.content[0] (a StreamChild) and passes it to summarize_content, which can
crash because PostPage.content is a StreamField; instead iterate the StreamField
(PostPage.content), extract plain text from text-like blocks (e.g.
RichTextBlock, MarkdownBlock, TextBlock) by reading each StreamChild's
value/text, skip non-text blocks (e.g. PollBlock, URLBlock), join those pieces
into a single str, then use that string for the length check and as the first
arg to summarize_content; update usage sites in set_summary_for_event_page (and
keep save_page_summary_value as the callback) so summarize_content.apply_async
receives the cleaned plain text.

In `@news/utils.py`:
- Around line 22-25: The PostPage guard in set_thumbnail_for_video_page is
inverted: change the condition for PostPage from "not video.post_content_type !=
'video'" to check inequality directly so only non-video pages raise;
specifically, locate the PostPage branch that currently reads "elif
isinstance(video, PostPage) and not video.post_content_type != 'video'" and
replace it with a direct inequality check ("and video.post_content_type !=
'video'") so valid video pages proceed and only pages whose post_content_type is
not "video" trigger the Exception.

In `@pages/blocks.py`:
- Around line 10-14: CustomVideoBlock is missing the thumbnail child so the
StreamField StructBlock schema no longer matches existing migrations and stored
data; restore schema-compatibility by adding a thumbnail field to
CustomVideoBlock (e.g. thumbnail = ImageChooserBlock(required=False)) and import
ImageChooserBlock at the top, ensuring the class definition (CustomVideoBlock)
and templates remain compatible with PostPage.video_thumbnail; alternatively, if
you intend to move thumbnails off the block, create a migration that removes
thumbnail from the content StreamField and include a data migration to strip
thumbnail entries from existing stored JSON, but the simplest fix is to re-add
the thumbnail ImageChooserBlock to CustomVideoBlock.

In `@pages/management/commands/convert_news_entries.py`:
- Around line 71-73: The code currently raises a bare Exception ("No Post Index
Page found...") which should be replaced with a command-native exception to
produce cleaner CLI output; update the raise in convert_news_entries.py (the
raise Exception(...) call) to raise a click.ClickException (or Django's
CommandError if this is a Django management command) and add the appropriate
import (click or CommandError from django.core.management.base) so the command
fails with a concise, user-friendly error message.
- Around line 49-52: Replace title-based dedupe with a stable file identity:
change the Image.objects.get_or_create call that currently uses
get_or_create(title=image.name, ...) to use the image file as the lookup key
(e.g., get_or_create(file=image.name) or
get_or_create(file=<file-field-or-filename>)) so you match existing Image
records by their file, not title; keep the same defaults (width, height, file)
and assign the result to wagtail_image as before (reference:
Image.objects.get_or_create, wagtail_image, image).

In `@pages/tests.py`:
- Line 1: pages/tests.py is empty so conversion logic in convert_news_entries
lacks coverage; add tests that exercise idempotent upsert behavior, content
block mapping, and image/video transfer. Write unit tests targeting the
convert_news_entries function: create fixture source entries with varied content
blocks and media, run convert_news_entries once and assert created targets and
correct mapped blocks/media, run it again and assert no duplicates (idempotent
upsert), and include edge cases (missing media, empty blocks) to validate
conversions and error handling. Ensure tests use the same symbols
convert_news_entries and any helper mappers/uploaders used in the diff so they
fail fast if mapping logic changes.

In `@templates/blocks/custom_video_block.html`:
- Line 1: The template currently bypasses escaping by outputting the EmbedBlock
with "{{ self.video|safe }}"; instead render the EmbedBlock with Wagtail's
include_block so the block's own safe rendering is used (use the {%
include_block %} tag targeting self.video), removing the "|safe" usage and
ensuring the EmbedBlock template is invoked (look for occurrences of
"self.video" and the template "templates/blocks/custom_video_block.html" to
update).

---

Outside diff comments:
In `@news/utils.py`:
- Around line 27-35: The outbound requests to the OEmbed endpoint and to fetch
the thumbnail are missing timeouts and can hang workers; update both
requests.get calls (the one that uses YOUTUBE_OEMBED_ENDPOINT +
f"?url={video.external_url}" and the one inside the thumbnail_file =
File(BytesIO(requests.get(json.get("thumbnail_url")).content), ...)) to pass a
timeout parameter (e.g. timeout=10 or a module-level/configurable constant like
REQUEST_TIMEOUT), and optionally wrap them in a try/except for
requests.Timeout/requests.RequestException to raise or log a clear error if the
call times out.
- Around line 34-37: The thumbnail_file creation passes a set to File(...).name
by using name={f"{video.slug} Thumbnail"}; change this to pass a string instead
(e.g. name=f"{video.slug} Thumbnail") so Django receives a proper filename
string; update the File(...) invocation in the thumbnail_file creation (the call
that wraps BytesIO(requests.get(json.get("thumbnail_url")).content)) to use a
plain formatted string for name and ensure the
BytesIO/requests/get/json.get/video.slug references remain unchanged.

---

Nitpick comments:
In `@marketing/models.py`:
- Line 164: Remove the stray print(path_components) call that emits request
paths to stdout; replace it with a structured logger debug call instead (e.g.,
use the module-level logger or logging.getLogger(__name__)). Locate the
print(path_components) usage (referenced symbol: path_components /
print(path_components)) and change it to logger.debug with a clear message and
the path_components as structured data so it uses the configured logging backend
rather than stdout.

In `@pages/mixins.py`:
- Line 41: Replace the list concatenation used to extend content_panels with
iterable unpacking for clarity: change the expression that sets content_panels
(currently using Page.content_panels + ["tags"]) to use unpacking of
Page.content_panels and the "tags" item so the result is built via a single list
literal with *Page.content_panels and "tags"; update the assignment to the
content_panels attribute in the pages/mixins.py mixin where content_panels and
Page.content_panels are referenced.

In `@pages/models.py`:
- Around line 338-343: The content_panels assignment currently concatenates
BasePage.content_panels with a list; change it to use iterable unpacking so it
mirrors the style used in mixins.py. Update the content_panels definition on the
Page model to produce a single list by unpacking BasePage.content_panels (e.g.,
using the * operator) followed by the added panel names ("content", "image",
"summary", "video_thumbnail"), ensuring the symbol names content_panels and
BasePage.content_panels are used so the intent and ordering remain identical.
- Around line 144-185: The filter_terms list in the pages model is hard-coded
and duplicates logic in news/views.py; instead iterate over the
POST_CONTENT_TYPES mapping to build each term so they stay consistent. Update
the code that sets ctx["filter_terms"] (and any related method using get_url())
to loop through POST_CONTENT_TYPES items, creating dicts with "label" from the
human-readable name, "value" from the content type key, and "url" using
self.get_url() with "?type={value}" conditionally appended (ensure the "all"
entry still maps to self.get_url() without a query). Replace the static list
with this generated list so filter_terms is derived solely from
POST_CONTENT_TYPES.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 73ccd1cf-0c21-49fd-839a-61d4d5efbef2

📥 Commits

Reviewing files that changed from the base of the PR and between 560be4d and ec276f1.

📒 Files selected for processing (30)
  • config/settings.py
  • config/urls.py
  • core/mixins.py
  • marketing/models.py
  • news/management/commands/backpopulate_thumbnails.py
  • news/models.py
  • news/tasks.py
  • news/utils.py
  • news/views.py
  • pages/__init__.py
  • pages/admin.py
  • pages/apps.py
  • pages/blocks.py
  • pages/management/commands/convert_news_entries.py
  • pages/migrations/0001_initial.py
  • pages/migrations/0002_postpage_video_thumbnail.py
  • pages/migrations/0003_alter_postpage_content.py
  • pages/migrations/__init__.py
  • pages/mixins.py
  • pages/models.py
  • pages/tests.py
  • pages/views.py
  • pages/wagtail_hooks.py
  • static/css/v3/post-detail.css
  • static/css/v3/posts-list.css
  • templates/blocks/custom_video_block.html
  • templates/news/v3/detail.html
  • templates/pages/routable_home_page.html
  • templates/v3/includes/_post_list_card.html
  • templates/v3/posts_list.html

Comment thread config/urls.py
Comment on lines +410 to +412
+ [
path("outreach/", include(wagtail_urls)),
path("pages/", include(wagtail_urls)),

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

The new /outreach/... Wagtail mount is still shadowed by WhitePaperView.

Line 133 already matches outreach/<slug>/<slug>, which is the same URL shape marketing.OutreachHomePage.route() now expects for topic/detail pages. Those requests will never reach this Wagtail include block, so the new outreach detail routing stays unreachable until the URL shapes stop overlapping.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@config/urls.py` around lines 410 - 412, The new Wagtail mount at
path("outreach/", include(wagtail_urls)) is being shadowed by an earlier route
handled by WhitePaperView (and the existing outreach/<slug>/<slug> pattern) so
requests never reach marketing.OutreachHomePage.route(); fix by making the
routing non-overlapping — either move the Wagtail include for "outreach/" so it
appears before the WhitePaperView route in urls.py, or tighten the
WhitePaperView pattern (or its view logic) to exclude the Wagtail path/shape
(e.g., require a different prefix or more specific regex) so
marketing.OutreachHomePage.route() can receive outreach topic/detail requests.

Comment thread news/tasks.py
Comment on lines +191 to +200
page = PostPage.objects.get(pk=pk)
content = page.content[0]
logger.info(f"dispatching summarize task for {pk=} with {content[:40]=}...")
if content and len(content) < CONTENT_SUMMARIZATION_THRESHOLD:
logger.warning(f"Content too short to summarize for {pk=}, skipping.")
return
logger.info(f"handing off {pk=} to summarize_content task")
summarize_content.apply_async(
(content, page.title, "gpt-oss-120b"),
link=save_page_summary_value.s(pk),

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Locate and print relevant parts of the file around the reported lines
sed -n '150,240p' news/tasks.py | nl -ba

# Find PostPage model definition and inspect the 'content' field type/semantics
rg -n "class PostPage|PostPage" -S . -g'*.py' --hidden
# Try common locations
ls -la news || true

fd -g'*.py' PostPage . 2>/dev/null | head

# Search for 'content' field definition related to PostPage
rg -n "content\s*=|StreamField|page\.content" news -S --hidden -g'*.py'
rg -n "StreamField" -S . -g'*.py' --hidden | head -n 50

# Search for any accessor that converts content to text
rg -n "get_.*text|to_text|stream.*text|content.*text|content_text|body_text" -S news . --hidden -g'*.py'

Repository: boostorg/website-v2

Length of output: 106


🏁 Script executed:

#!/bin/bash
set -euo pipefail

python3 - <<'PY'
from pathlib import Path
p = Path("news/tasks.py")
start, end = 160, 220
lines = p.read_text().splitlines()
for i in range(start, min(end, len(lines)) + 1):
    print(f"{i:4d}: {lines[i-1]}")
PY

echo "---- set_summary_for_event_page context ----"
rg -n "def set_summary_for_event_page|set_summary_for_event_page" news/tasks.py

echo "---- search for PostPage definition ----"
rg -n "class PostPage\b" -S . -g'*.py' --hidden

echo "---- find PostPage.content field definition ----"
rg -n "PostPage.*content\s*=|content\s*=\s*StreamField|content\s*=" -S . -g'*.py' --hidden

echo "---- find direct indexing usage of page.content ----"
rg -n "page\.content\[[0-9]+\]" -S . -g'*.py' --hidden

echo "---- locate StreamField usage and any text conversion helpers ----"
rg -n "StreamField" -S . -g'*.py' --hidden | head -n 200
rg -n "to_text|get_.*text|stream.*text|content.*text|content_text|body_text|plain.*text" -S . -g'*.py' --hidden | head -n 200

echo "---- locate summarize_content task signature/implementation ----"
rg -n "def summarize_content|summarize_content" -S news -g'*.py' --hidden

Repository: boostorg/website-v2

Length of output: 29717


🏁 Script executed:

#!/bin/bash
set -euo pipefail

python3 - <<'PY'
from pathlib import Path
p = Path("pages/models.py")
lines = p.read_text().splitlines()
# Show around POST_BLOCKS and PostPage.content definition
for start,end in [(1,140),(170,260),(260,360)]:
    for i in range(start, min(end, len(lines))+1):
        if i-1 < len(lines):
            print(f"{i:4d}: {lines[i-1]}")
    print("----")
PY

echo "---- search for PostPage text/accessor helpers ----"
rg -n "class PostPage\b|def get_.*(text|content)|content_.*text|to_text|stream_.*text|plain.*text" pages/models.py -S

echo "---- search for POST_BLOCKS definition usage ----"
rg -n "POST_BLOCKS" pages/models.py -S

echo "---- inspect summarize_content implementation for input expectations ----"
sed -n '1,260p' news/tasks.py | nl -ba | head -n 220 2>/dev/null || true

# nl isn't available; print with python if needed
python3 - <<'PY'
from pathlib import Path
p = Path("news/tasks.py")
lines = p.read_text().splitlines()
for i in range(1,120):
    if i-1 < len(lines):
        print(f"{i:4d}: {lines[i-1]}")
PY

Repository: boostorg/website-v2

Length of output: 16714


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "---- inspect pages/blocks.py (POST_BLOCKS) ----"
python3 - <<'PY'
from pathlib import Path
p=Path("pages/blocks.py")
lines=p.read_text().splitlines()
print(f"total lines: {len(lines)}")
# print first 220 lines; if smaller fine
for i in range(1, min(260, len(lines))+1):
    print(f"{i:4d}: {lines[i-1]}")
PY

echo "---- find any conversion helpers to plain text in pages/models.py / blocks / news/helpers ----"
rg -n "extract_content|get_text|plain\s*text|to_text|stream_.*text|as_text" -S pages news core -g'*.py' --hidden | head -n 200

echo "---- inspect news/helpers.py extract_content ----"
sed -n '1,220p' news/helpers.py

echo "---- inspect news/tasks.py around set_summary_for_event_page for how it expects content ----"
python3 - <<'PY'
from pathlib import Path
p=Path("news/tasks.py")
lines=p.read_text().splitlines()
for i in range(170, 230):
    if 0<=i-1<len(lines):
        print(f"{i:4d}: {lines[i-1]}")
PY

echo "---- inspect any PostPage template usage of content/text ----"
rg -n "visible_content|get_content|content\[0\]|stream_content_type" -S pages/templates news/templates -g'*.html' --hidden | head -n 200

Repository: boostorg/website-v2

Length of output: 6524


Fix StreamField handling in set_summary_for_event_page (pass plain text to the summarizer)
set_summary_for_event_page uses content = page.content[0], but PostPage.content is a StreamField over POST_BLOCKS (e.g., RichTextBlock, MarkdownBlock, URLBlock, PollBlock). The code then slices/len()s content and forwards it to summarize_content, which expects a str (and itself does content[:100]), so this can crash or send the wrong data. Extract the underlying text from the StreamChild (and/or skip non-text block types) before the length check and apply_async.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@news/tasks.py` around lines 191 - 200, set_summary_for_event_page currently
grabs page.content[0] (a StreamChild) and passes it to summarize_content, which
can crash because PostPage.content is a StreamField; instead iterate the
StreamField (PostPage.content), extract plain text from text-like blocks (e.g.
RichTextBlock, MarkdownBlock, TextBlock) by reading each StreamChild's
value/text, skip non-text blocks (e.g. PollBlock, URLBlock), join those pieces
into a single str, then use that string for the length check and as the first
arg to summarize_content; update usage sites in set_summary_for_event_page (and
keep save_page_summary_value as the callback) so summarize_content.apply_async
receives the cleaned plain text.

Comment thread news/utils.py
Comment on lines +22 to +25
if isinstance(video, Video) and not video.is_video:
raise Exception(f"{video} is not a video, cannot set thumbnail.")
elif isinstance(video, PostPage) and not video.post_content_type != "video":
raise Exception(f"{video}'s content is not a video, cannot set thumbnail.")

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Fix the PostPage video-type guard.

not video.post_content_type != "video" is equivalent to video.post_content_type == "video", so valid video pages raise here and set_thumbnail_for_video_page never succeeds.

🐛 Proposed fix
-    elif isinstance(video, PostPage) and not video.post_content_type != "video":
+    elif isinstance(video, PostPage) and video.post_content_type != "video":
         raise Exception(f"{video}'s content is not a video, cannot set thumbnail.")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@news/utils.py` around lines 22 - 25, The PostPage guard in
set_thumbnail_for_video_page is inverted: change the condition for PostPage from
"not video.post_content_type != 'video'" to check inequality directly so only
non-video pages raise; specifically, locate the PostPage branch that currently
reads "elif isinstance(video, PostPage) and not video.post_content_type !=
'video'" and replace it with a direct inequality check ("and
video.post_content_type != 'video'") so valid video pages proceed and only pages
whose post_content_type is not "video" trigger the Exception.

Comment thread pages/blocks.py Outdated
Comment on lines +10 to +14
class CustomVideoBlock(StructBlock):
video = EmbedBlock()

class Meta:
template = "blocks/custom_video_block.html"

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check the serialized block structure in migration 0003
rg -A 10 "StructBlock.*video.*thumbnail" pages/migrations/

Repository: boostorg/website-v2

Length of output: 45


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== List migrations in pages/migrations =="
ls -la pages/migrations | sed -n '1,200p'

echo
echo "== Show migration 0003 file =="
MIG_FILE="$(ls pages/migrations | rg '^0003_.*\.py$' || true)"
if [ -z "${MIG_FILE}" ]; then
  echo "Could not find 0003_*.py in pages/migrations"
  exit 0
fi
echo "Found: ${MIG_FILE}"
sed -n '1,200p' "pages/migrations/${MIG_FILE}"

echo
echo "== Show lines around where the review comment claims line 34 is =="
sed -n '20,80p' "pages/migrations/${MIG_FILE}"

echo
echo "== Inspect CustomVideoBlock in pages/blocks.py =="
sed -n '1,120p' pages/blocks.py

echo
echo "== Search for 'CustomVideoBlock' usage across repo =="
rg -n "CustomVideoBlock" -S .

echo
echo "== Search for thumbnail references in migration files =="
rg -n "thumbnail" pages/migrations -S

echo
echo "== Search for StructBlock definition of CustomVideoBlock in blocks.py =="
rg -n "class CustomVideoBlock|EmbedBlock\(|thumbnail|ImageChooserBlock" pages/blocks.py -n

Repository: boostorg/website-v2

Length of output: 4590


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Search for PostPage.video_thumbnail usage =="
rg -n "video_thumbnail" pages -S

echo
echo "== Inspect CustomVideoBlock template =="
fd -a "custom_video_block\.html" -S . | head -n 20 || true
# If found, print it (bounded)
TEMPLATE_PATH="$(fd -a "custom_video_block\.html" -S . | head -n 1 || true)"
if [ -n "${TEMPLATE_PATH}" ]; then
  echo "Using: ${TEMPLATE_PATH}"
  sed -n '1,200p' "${TEMPLATE_PATH}"
fi

echo
echo "== Inspect PostPage model field definitions =="
# Try likely files first
for f in pages/models.py pages/models/*.py pages/models/__init__.py; do
  if [ -f "$f" ]; then
    echo "== File: $f =="
    rg -n "class PostPage|video_thumbnail|content\s*=" "$f" -S || true
  fi
done

# Also search for "class PostPage" across pages/ safely
echo
echo "== Locate class PostPage definition =="
rg -n "class PostPage" pages -S
POSTPAGE_FILE="$(rg -l "class PostPage" pages -S | head -n 1 || true)"
if [ -n "${POSTPAGE_FILE}" ]; then
  echo "Using: ${POSTPAGE_FILE}"
  sed -n '1,240p' "${POSTPAGE_FILE}"
fi

Repository: boostorg/website-v2

Length of output: 8618


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Locate custom_video_block.html in repo =="
fd -t f "custom_video_block\.html$" . -a || true

echo
echo "== Search for custom_video_block.html references =="
rg -n "custom_video_block\.html" . -S || true

echo
echo "== Search for 'video_thumbnail' usage in templates/static =="
rg -n "video_thumbnail" . -S --glob '!**/migrations/**' || true

echo
echo "== Inspect management/commands/convert_news_entries.py (thumbnail handling) =="
sed -n '80,160p' pages/management/commands/convert_news_entries.py || true
# (If path differs, list likely command files)
if [ ! -f pages/management/commands/convert_news_entries.py ]; then
  fd -t f "convert_news_entries\.py$" . -a || true
fi

echo
echo "== Inspect news/tasks/set_thumbnail_for_video_page =="
rg -n "set_thumbnail_for_video_page" -S . || true
TASK_FILE="$(rg -l "set_thumbnail_for_video_page" -S . | head -n 1 || true)"
if [ -n "${TASK_FILE}" ]; then
  sed -n '1,200p' "${TASK_FILE}"
fi

echo
echo "== Inspect migration 0002 and 0001 for video/thumbnail block schema =="
sed -n '1,120p' pages/migrations/0002_postpage_video_thumbnail.py
sed -n '120,220p' pages/migrations/0001_initial.py

echo
echo "== Inspect migration 0003 (full block_lookup snippet) =="
sed -n '1,120p' pages/migrations/0003_alter_postpage_content.py

Repository: boostorg/website-v2

Length of output: 15302


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Inspect templates/blocks/custom_video_block.html =="
sed -n '1,200p' templates/blocks/custom_video_block.html

echo
echo "== Inspect pages/models.py around video thumbnail logic =="
sed -n '220,320p' pages/models.py

echo
echo "== Inspect news/utils.py set_video_thumbnail =="
sed -n '1,220p' news/utils.py

echo
echo "== Search templates for '.thumbnail' inside the video block include =="
rg -n "thumbnail" templates/blocks templates/v3 -S || true

echo
echo "== Search for StreamField access to video block thumbnail key =="
rg -n "content.*thumbnail|video.*thumbnail|thumbnail.*video" pages news templates -S || true

Repository: boostorg/website-v2

Length of output: 8576


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Search for any usage of stream-block thumbnail key in templates/python =="
rg -n "content.*thumbnail|thumbnail.*content|block\.thumbnail|\.\s*thumbnail\b" pages news templates -S || true

echo
echo "== Search for video block field access (self.video / value.video) =="
rg -n "content.*\bvideo\b|\.video\b" templates/blocks templates/v3 pages news -S || true

echo
echo "== Search for StreamField access to thumbnail specifically within video struct =="
rg -n "content__0__.*thumbnail|content\[['\"].*thumbnail|content__.*thumbnail" pages news -S || true

Repository: boostorg/website-v2

Length of output: 1871


Critical: CustomVideoBlock missing thumbnail so the PostPage.content StreamField video StructBlock schema won’t match existing stored data/migrations.

File: pages/blocks.py (Lines 10-14)

class CustomVideoBlock(StructBlock):
    video = EmbedBlock()

    class Meta:
        template = "blocks/custom_video_block.html"

The video block schema in pages/migrations/0001_initial.py and pages/migrations/0003_alter_postpage_content.py defines a StructBlock with both ("video", ...) and ("thumbnail", ...). The current CustomVideoBlock defines only video, while the templates currently render the thumbnail from PostPage.video_thumbnail (model-level FK), not from the struct block.

🔧 Proposed fix: Align code vs StreamField schema

Option 1 (schema-compat / simplest): add the missing thumbnail child block back to CustomVideoBlock (and import it), e.g. thumbnail = ImageChooserBlock(required=False).

Option 2 (if you intend to store thumbnails only on PostPage.video_thumbnail): create a migration that removes thumbnail from the content StreamField video StructBlock and includes a data migration to strip thumbnail from existing content JSON values (editing 0003 alone won’t migrate existing rows).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pages/blocks.py` around lines 10 - 14, CustomVideoBlock is missing the
thumbnail child so the StreamField StructBlock schema no longer matches existing
migrations and stored data; restore schema-compatibility by adding a thumbnail
field to CustomVideoBlock (e.g. thumbnail = ImageChooserBlock(required=False))
and import ImageChooserBlock at the top, ensuring the class definition
(CustomVideoBlock) and templates remain compatible with
PostPage.video_thumbnail; alternatively, if you intend to move thumbnails off
the block, create a migration that removes thumbnail from the content
StreamField and include a data migration to strip thumbnail entries from
existing stored JSON, but the simplest fix is to re-add the thumbnail
ImageChooserBlock to CustomVideoBlock.

Comment on lines +49 to +52
wagtail_image, _ = Image.objects.get_or_create(
title=image.name,
defaults={"width": image.width, "height": image.height, "file": image},
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use a stable image identity; title-based dedupe can corrupt mappings.

At Line 49, get_or_create(title=image.name, ...) can bind the wrong existing Image when different files share the same title/name, causing incorrect media on migrated posts.

Proposed fix
-    wagtail_image, _ = Image.objects.get_or_create(
-        title=image.name,
-        defaults={"width": image.width, "height": image.height, "file": image},
-    )
+    # Avoid title-only matching; prefer file identity.
+    wagtail_image = Image.objects.filter(file=image.name).first()
+    if not wagtail_image:
+        wagtail_image = Image.objects.create(
+            title=image.name,
+            width=image.width,
+            height=image.height,
+            file=image,
+        )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pages/management/commands/convert_news_entries.py` around lines 49 - 52,
Replace title-based dedupe with a stable file identity: change the
Image.objects.get_or_create call that currently uses
get_or_create(title=image.name, ...) to use the image file as the lookup key
(e.g., get_or_create(file=image.name) or
get_or_create(file=<file-field-or-filename>)) so you match existing Image
records by their file, not title; keep the same defaults (width, height, file)
and assign the result to wagtail_image as before (reference:
Image.objects.get_or_create, wagtail_image, image).

Comment thread pages/management/commands/convert_news_entries.py Outdated
Comment thread pages/tests.py
@@ -0,0 +1 @@
# Create your tests here.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Add migration/conversion test coverage before merge.

pages/tests.py is still a placeholder (Line 1), so critical conversion paths in convert_news_entries are untested (idempotent upsert, content block mapping, image/video transfer). This is a high-risk gap for a data migration feature.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pages/tests.py` at line 1, pages/tests.py is empty so conversion logic in
convert_news_entries lacks coverage; add tests that exercise idempotent upsert
behavior, content block mapping, and image/video transfer. Write unit tests
targeting the convert_news_entries function: create fixture source entries with
varied content blocks and media, run convert_news_entries once and assert
created targets and correct mapped blocks/media, run it again and assert no
duplicates (idempotent upsert), and include edge cases (missing media, empty
blocks) to validate conversions and error handling. Ensure tests use the same
symbols convert_news_entries and any helper mappers/uploaders used in the diff
so they fail fast if mapping logic changes.

@@ -0,0 +1 @@
{{ self.video|safe }}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

In current Wagtail docs, what is the recommended way to render an EmbedBlock value inside a StructBlock template: {{ value.video }}, {{ value.video|safe }}, or {% include_block value.video %}?

💡 Result:

Recommended: use {% include_block value.video %} (not {{ value.video }} and not {{ value.video|safe }}). In Wagtail templates, {{ value.video }} outputs the plain Python value; for StreamField/StructBlock child access, Wagtail documents that children of a StructBlock (e.g. value.heading) return a plain value, so {% include_block value.heading %} would render as plain text unless you explicitly include the block’s HTML rendering via the correct include mechanism [1]. The Wagtail docs describe using {% include_block %} to include a block’s HTML representation in templates [2]. Therefore, inside your StructBlock template, render the EmbedBlock with {% include_block value.video %} so Wagtail can use the EmbedBlock’s own template/HTML output [2].

Citations:


🌐 Web query:

Wagtail EmbedBlock template rendering include_block safe filter discouraged

💡 Result:

In Wagtail, the use of the safe filter with {% include_block %} is generally discouraged because it bypasses Django's automatic HTML escaping, which is a critical security measure to prevent Cross-Site Scripting (XSS) attacks [1]. Wagtail has implemented strict HTML escaping for the {% include_block %} tag to ensure that content authored in the Wagtail admin—particularly in blocks like CharBlock or TextBlock that do not have custom templates—is safely rendered [1]. If you explicitly mark content as safe or use {% autoescape off %}, you risk allowing arbitrary HTML or script injection if those blocks are editable by users [1]. Regarding the rendering of blocks: 1. Recommended Approach: Use the {% include_block %} tag directly on your block instances [2][3]. This respects Django's autoescaping rules [4][5]. If a block has a custom template defined, Wagtail will use it; otherwise, it will use appropriate default rendering [2][3]. 2. EmbedBlock Specifics: You should use {% include_block %} for EmbedBlock instances within a StreamField [2][3]. Do not attempt to use the {% embed %} tag on an EmbedBlock directly, as the block already holds the rendered HTML fragment, not just the URL [6]. 3. Security Context: The need for escaping became particularly relevant after a vulnerability (GHSA-xfrw-hxr5-ghqf) where improper escaping in basic field blocks could be exploited by authors [1]. Site implementors are advised to keep autoescaping enabled [1]. If you have a legitimate need to render raw HTML (e.g., from a RawHTMLBlock), Wagtail's block system handles this automatically because such blocks are designed to return 'safe' marked strings [4][5]. If you find yourself reaching for the safe filter, it is likely an indication that you should instead use a block type specifically designed to hold safe/raw HTML or define a custom template for your block.

Citations:


Avoid bypassing escaping for EmbedBlock output

Line 1 uses |safe, which defeats Django/Wagtail’s HTML escaping protections. Render the EmbedBlock via Wagtail’s {% include_block %} so the EmbedBlock’s own template/HTML rendering is used safely.

Suggested fix
-{{ self.video|safe }}
+{% load wagtailcore_tags %}
+{% include_block self.video %}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
{{ self.video|safe }}
{% load wagtailcore_tags %}
{% include_block self.video %}
🧰 Tools
🪛 HTMLHint (1.9.2)

[error] 1-1: Doctype must be declared before any non-comment content.

(doctype-first)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@templates/blocks/custom_video_block.html` at line 1, The template currently
bypasses escaping by outputting the EmbedBlock with "{{ self.video|safe }}";
instead render the EmbedBlock with Wagtail's include_block so the block's own
safe rendering is used (use the {% include_block %} tag targeting self.video),
removing the "|safe" usage and ensuring the EmbedBlock template is invoked (look
for occurrences of "self.video" and the template
"templates/blocks/custom_video_block.html" to update).

@javiercoronadonarvaez javiercoronadonarvaez self-requested a review June 10, 2026 14:13

@javiercoronadonarvaez javiercoronadonarvaez left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with the latest PR comments. Just a matter of solving the merge conflicts.

@herzog0 herzog0 self-requested a review June 10, 2026 15:59

@herzog0 herzog0 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heya, sorry for the delay, still looking through some things. It's a bit difficult to test all cases as this integration is a bit large. But I found some things that, unfortunately, can't be postponed to a follow-up ticket for improvements.

Comment thread news/tasks.py Outdated
Comment thread pages/models.py
Comment thread news/tasks.py
@jlchilders11 jlchilders11 requested a review from herzog0 June 12, 2026 14:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

wagtail implimentation

6 participants