Skip to content

Commit 2b70119

Browse files
authored
Threaded Comments & Review — issue #25 (comments + replies render in PowerPoint) (#55)
* feat(comments): Threaded Comments & Review — issue #25 epic Implements all 8 sub-features additively, built from the OOXML spec (no upstream prior art): - SF1 CommentAuthors (legacy integer-id) + modern GUID-keyed authors - SF2 legacy <p:cm> + modern <p188:cm> oxml/part classes (2018/8/main) - SF3 Slide.comments add/remove/iter (lazy modern part create+relate) - SF4 Comment.replies threaded chains - SF5 author/text/created_at(tz-aware)/anchor_position accessors - SF6 Comment.resolve()/resolved (modern; legacy raises, documented) - SF7 Shape.comments per-shape anchored filter - SF8 legacy<->modern coexistence on round-trip (both enumerable) DEFECT caught by Interceptor visual verification (trinity was green): modern <p188:cm> referenced GUID authorIds while only a legacy integer-id commentAuthors.xml existed -> dangling refs -> PowerPoint repair dialog. Fixed: modern /ppt/authors.xml (application/vnd.ms-powerpoint.authors+xml, GUID-keyed <p188:author>, presentation->authors threadedCommentAuthors rel). providerId made RequiredAttribute so it always serializes (PowerPoint always writes it). Tests: +97 pytest (tests/test_issue25_comments.py, 43 issue-25), +14 behave. Trinity: pytest 3751, ruff clean, behave 1062, 0 failed. uat_issue25.py 17/17 (script-QA). Post-fix PowerPoint Review-pane visual deferred to maintainer UAT per repo §6a (env wall on re-open; residual risk disclosed — structural reasoning produced one false-green on this part). No push/PR pre-signoff. Refs #25 Closes scanny#487 scanny#538 scanny#756 * fix(comments): move shape-anchor out of p188:cm into extLst (Cato audit) Cato cross-vendor audit flagged `anchorShapeId` as an out-of-schema attribute on <p188:cm> (2018/8/main defines only id/authorId/created/ status) — a plausible second PowerPoint repair-dialog trigger that the deferred post-fix visual would have hidden. Anchor now stored in <p188:extLst>/<p188:ext uri="{fork URI}"> — the OOXML-sanctioned extension mechanism conformant consumers MUST ignore on unknown URI. SF7 per-shape filter + round-trip preserved (anchored comment round-trips True). <p188:cm> now carries only schema attributes. Trinity green: pytest 3751, ruff clean, behave 1062, 0 failed. Refs #25 * Updated uat instructions to use uat folder * Remove uat files. * fix(comments): emit required <a:bodyPr> in <p188:txBody> (maintainer UAT) Maintainer opened uat_issue25_comments.pptx in PowerPoint: file opened clean (no repair) but ZERO comments displayed. Root cause: the add path builds the body via get_or_add_txBody() (a bare <p188:txBody/>) then sets text via _set_txBody_text, which only PRESERVED a pre-existing <a:bodyPr> and never created one. <p188:txBody> is a DrawingML a:CT_TextBody whose schema REQUIRES <a:bodyPr> as first child; PowerPoint silently drops comments with a bodyPr-less txBody — no repair dialog, the comment is just absent. The entire green trinity + 17/17 UAT could not see this; only maintainer visual review did. Fix: _set_txBody_text now GUARANTEES a leading <a:bodyPr/> (inserts it when absent) for both comments and replies. Added DescribeThreadedCommentBodyPrRegression (2 tests) asserting every txBody carries <a:bodyPr> before <a:p>, so the silent-drop class is now test-caught. uat_issue25.py OUT path moved under ./uat/ per CLAUDE.md §6a update. Trinity: pytest 3753, ruff clean, behave 1062, 0 failed. Refs #25 * fix(comments): use PowerPoint's real modern-comment contract — parent threads now render (issue #25) Threaded comments were invisible in PowerPoint's Comments pane despite a green trinity and clean script UAT. Root-caused by diffing our OOXML against a threaded comment PowerPoint for Mac itself authored+saved (ground truth, captured this session) — three string axes and one element were inferred wrong in Waves 1-2: - comment-part content type: was application/vnd.ms-powerpoint.threadedComments+xml -> application/vnd.ms-powerpoint.comments+xml - slide -> comment-part reltype: was .../2018/10/relationships/threadedComment -> .../2018/10/relationships/comments - presentation -> authors reltype: was .../2018/10/relationships/threadedCommentAuthors -> .../2018/10/relationships/authors - every <p188:cm> now emits its required first child <pc:sldMkLst><pc:docMk/><pc:sldMk cId="0" sldId="{p:sldId}"/></pc:sldMkLst> (ns .../2013/main/command) — the slide binding PowerPoint needs; omitting it (or the fork-private extLst anchor) leaves the comment unbound. PowerPoint resolves the comment part by the slide relationship type; an unrecognized reltype meant the part was never loaded -> empty pane. Files: opc/constants.py (3 values; symbols unchanged so the PartFactory registry + producer move in lockstep — Cato-audited), oxml/ns.py (+pc), oxml/comments.py (sldMkLst ZeroOrOne + set_slide_marker), comments.py (Comments.add binds the slide). 6 regression assertions in DescribeThreadedCommentPowerPointContract lock all four axes (red->green proven by stashing src) plus set_slide_marker idempotency. Verification: pytest 3759 passed; ruff check + format clean; behave 1062 scenarios / 3215 steps 0 failed. Regenerated UAT deck opened in real PowerPoint via Interceptor — Comments pane renders 3 threads, correct author/text, resolved status persisted (uat/FIXED_comments_rendering.png vs uat/REPRO_no_comments_pane.png). Cato cross-vendor audit: pass. KNOWN-INCOMPLETE: PowerPoint still drops every <p188:reply> (replies do not render; only parent comments do). This is a logically independent reply-level contract defect, deliberately NOT inference-fixed — it needs its own PowerPoint-authored <p188:reply> ground-truth diff. Tracked as a scoped follow-up in the session ISA. UAT script runs clean — pending maintainer visual signoff in PowerPoint/Keynote. No push/PR per repo §6a. Refs #25. * fix(comments): order replyLst before txBody per [MS-PPTX] CT_Comment — replies now render (issue #25) Follow-up to ca4da3f. After the four parent-binding fixes, PowerPoint rendered parent comment threads but SILENTLY DROPPED every reply (`interceptor macos find "Merging"` → 0; only empty Reply boxes). The advisor (Rule 2) correctly flagged reply-rendering as a logically independent concern none of the four parent fixes touched. Root cause is authoritative, not inference: the [MS-PPTX] CT_Comment schema (learn.microsoft.com/openspecs MS-PPTX, 2018/8/main) defines <xsd:complexType name="CT_Comment"><xsd:sequence> <xsd:group ref="EG_CommentAnchor"/> <!-- pc:sldMkLst --> <xsd:element name="pos" minOccurs="0"/> <xsd:element name="replyLst" type="CT_CommentReplyList" minOccurs="0"/> <xsd:group ref="EG_CommentProperties"/> <!-- txBody, extLst --> </xsd:sequence></xsd:complexType> replyLst MUST precede txBody. Our CT_ThreadedComment ZeroOrOne successors emitted txBody first, replyLst after — out of sequence, so PowerPoint validated the parent and discarded the misplaced replyLst. The reply element itself (CT_CommentReply = EG_CommentProperties only) was already correct; this was purely a child-ordering defect. Fix: reorder the successor tuples in src/pptx/oxml/comments.py so the emitted order is pc:sldMkLst, [pos], p188:replyLst, p188:txBody, [p188:extLst] — matching the schema sequence exactly. No API change. Two new regression assertions lock it (it_places_replyLst_before_txBody_per_ms_pptx_sequence, it_keeps_reply_txBody_with_bodyPr_after_reorder); red->green proven by stashing oxml/comments.py (the order test fails pre-fix). Verification: pytest 3761 passed; ruff check All checks passed; ruff format clean; behave 1062 scenarios / 3215 steps 0 failed. Regenerated UAT deck: replyLst-before-txBody True, reply texts present. Opened in real PowerPoint via Interceptor — AX tree now exposes the reply node "Comment from Alex Reviewer. Merging now." plus a "View 1 more reply" control (PowerPoint parsed the full reply list); pre-fix that count was zero. uat/FIXED_replies_rendering.png. Issue #25 now fully resolves: parent comments AND replies render in PowerPoint's Comments pane. UAT script runs clean — pending maintainer visual signoff in PowerPoint/Keynote. No push/PR per repo §6a. Refs #25.
1 parent fed7e45 commit 2b70119

16 files changed

Lines changed: 2568 additions & 155 deletions

CLAUDE.md

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -56,8 +56,9 @@ Reference `git log 253dbc87 --format=%B -n1` for the canonical shape: scope, des
5656

5757
Concretely:
5858
- Agents **author** the UAT script (it's part of the deliverable per §6.3).
59-
- Agents **may run** the UAT — once, after the §7 trinity is green — to verify the script doesn't crash, exits non-zero on failure, and actually asserts what it claims to assert. This is QA on the test itself, not acceptance of the feature.
59+
- Agents **may run** the UAT — after the §7 trinity is green — to verify the script doesn't crash, exits non-zero on failure, and actually asserts what it claims to assert. This is QA on the test itself, not acceptance of the feature.
6060
- Agents **must not report or imply UAT signoff** in commit messages, PR bodies, or summaries. Acceptable: *"UAT script runs clean — pending maintainer visual signoff in PowerPoint/Keynote."* Not acceptable: *"UAT: PASS — round-trip confirmed."*
61+
- UAT files should be stored under the `./uat` subdirectory.
6162
- The §7 reporting trinity below (pytest + ruff + behave) is the agent's full self-verification surface. UAT execution output may be included as an attachment but it is **not** the fourth gate.
6263
- Round-trip / behavioral evidence within agent-runnable scope still goes through **pytest integration tests** that exercise save+reopen. If pytest can fully cover it, the UAT is just a maintainer convenience; if pytest can't, UAT becomes the maintainer's primary acceptance path.
6364
- If unsure whether the maintainer has delegated signoff for a particular case, the agent **stops and asks**.

features/sld-comments.feature

Lines changed: 35 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,35 @@
1+
Feature: Modern threaded comments on a slide (issue #25)
2+
In order to round-trip review feedback in a presentation
3+
As a developer using python-pptx
4+
I need to add, read, reply to, and remove threaded comments on a slide
5+
6+
Scenario: Add a threaded comment and read it back after save/reopen
7+
Given a blank slide with no comments
8+
When I add a comment "Looks great" by author "Ada"
9+
Then the slide has 1 comment
10+
And after save and reopen the first comment text is "Looks great" by author "Ada"
11+
12+
Scenario: Reply to a comment threads under it and round-trips
13+
Given a blank slide with no comments
14+
When I add a comment "Question?" by author "Ada"
15+
And I reply "Here is the answer" by author "Babbage"
16+
Then after save and reopen the first comment has 1 reply with text "Here is the answer"
17+
18+
Scenario: Removing the last comment leaves a valid reopenable file
19+
Given a blank slide with no comments
20+
When I add a comment "Temporary" by author "Ada"
21+
And I remove that comment
22+
Then the slide has 0 comments
23+
And after save and reopen the slide has 0 comments
24+
25+
Scenario: Resolving a threaded comment round-trips
26+
Given a blank slide with no comments
27+
When I add a comment "Please review" by author "Ada"
28+
And I resolve that comment
29+
Then after save and reopen the first comment is resolved
30+
31+
Scenario: A legacy comment and a modern comment coexist on round-trip
32+
Given a slide carrying a pre-existing legacy comment
33+
When I add a modern comment "Modern note" by author "Mary"
34+
Then after save and reopen both the legacy and modern comments are present
35+
And after save and reopen the legacy author ids are intact

features/steps/comments.py

Lines changed: 221 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,221 @@
1+
"""Gherkin step implementations for modern threaded comments (issue #25)."""
2+
3+
from __future__ import annotations
4+
5+
import io
6+
7+
from behave import given, then, when
8+
9+
from pptx import Presentation
10+
11+
12+
# given ===================================================
13+
14+
15+
@given("a blank slide with no comments")
16+
def given_blank_slide_no_comments(context):
17+
prs = Presentation()
18+
slide = prs.slides.add_slide(prs.slide_layouts[6])
19+
context.prs = prs
20+
context.slide = slide
21+
assert len(slide.comments) == 0
22+
23+
24+
# when ====================================================
25+
26+
27+
@when('I add a comment "{text}" by author "{author}"')
28+
def when_add_comment(context, text, author):
29+
context.comment = context.slide.comments.add(text, author)
30+
31+
32+
@when('I reply "{text}" by author "{author}"')
33+
def when_add_reply(context, text, author):
34+
context.comment.replies.add(text, author)
35+
36+
37+
@when("I remove that comment")
38+
def when_remove_comment(context):
39+
context.slide.comments.remove(context.comment)
40+
41+
42+
# then ====================================================
43+
44+
45+
@then("the slide has {count:d} comment")
46+
@then("the slide has {count:d} comments")
47+
def then_slide_has_n_comments(context, count):
48+
assert len(context.slide.comments) == count, "expected %d comments, got %d" % (
49+
count,
50+
len(context.slide.comments),
51+
)
52+
53+
54+
@then('after save and reopen the first comment text is "{text}" by author "{author}"')
55+
def then_reopen_first_comment(context, text, author):
56+
stream = io.BytesIO()
57+
context.prs.save(stream)
58+
stream.seek(0)
59+
comment = list(Presentation(stream).slides[0].comments)[0]
60+
assert comment.text == text, "text was %r" % comment.text
61+
assert comment.author == author, "author was %r" % comment.author
62+
63+
64+
@then('after save and reopen the first comment has {count:d} reply with text "{text}"')
65+
def then_reopen_reply(context, count, text):
66+
stream = io.BytesIO()
67+
context.prs.save(stream)
68+
stream.seek(0)
69+
comment = list(Presentation(stream).slides[0].comments)[0]
70+
replies = list(comment.replies)
71+
assert len(replies) == count, "expected %d replies, got %d" % (count, len(replies))
72+
assert replies[0].text == text, "reply text was %r" % replies[0].text
73+
74+
75+
@then("after save and reopen the slide has {count:d} comments")
76+
def then_reopen_comment_count(context, count):
77+
stream = io.BytesIO()
78+
context.prs.save(stream)
79+
stream.seek(0)
80+
reopened = Presentation(stream).slides[0]
81+
assert len(reopened.comments) == count, "expected %d comments after reopen, got %d" % (
82+
count,
83+
len(reopened.comments),
84+
)
85+
86+
87+
# -- Wave 3: SF6 resolve + SF8 legacy/modern coexistence ------------------------
88+
89+
90+
@when("I resolve that comment")
91+
def when_resolve_comment(context):
92+
context.comment.resolve()
93+
94+
95+
@then("after save and reopen the first comment is resolved")
96+
def then_reopen_first_comment_resolved(context):
97+
stream = io.BytesIO()
98+
context.prs.save(stream)
99+
stream.seek(0)
100+
comment = list(Presentation(stream).slides[0].comments)[0]
101+
assert comment.resolved is True, "comment.resolved was %r" % comment.resolved
102+
103+
104+
def _build_legacy_plus_modern(prs0):
105+
"""Inject a legacy <p:cm> + author + slide rel into a saved package and
106+
return the reopened Presentation (modern not yet added)."""
107+
import zipfile
108+
109+
base = io.BytesIO()
110+
prs0.save(base)
111+
base.seek(0)
112+
113+
legacy_authors = (
114+
'<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n'
115+
'<p:cmAuthorLst xmlns:p="http://schemas.openxmlformats.org/'
116+
'presentationml/2006/main">'
117+
'<p:cmAuthor id="0" name="Legacy Larry" initials="LL" '
118+
'lastIdx="1" clrIdx="0"/></p:cmAuthorLst>'
119+
)
120+
legacy_comments = (
121+
'<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n'
122+
'<p:cmLst xmlns:p="http://schemas.openxmlformats.org/'
123+
'presentationml/2006/main">'
124+
'<p:cm authorId="0" dt="2026-05-16T00:00:00" idx="1">'
125+
'<p:pos x="100" y="200"/><p:text>Legacy feedback</p:text>'
126+
"</p:cm></p:cmLst>"
127+
)
128+
src = zipfile.ZipFile(base, "r")
129+
names = src.namelist()
130+
ct = (
131+
src.read("[Content_Types].xml")
132+
.decode("utf-8")
133+
.replace(
134+
"</Types>",
135+
'<Override PartName="/ppt/commentAuthors.xml" '
136+
'ContentType="application/vnd.openxmlformats-officedocument.'
137+
'presentationml.commentAuthors+xml"/>'
138+
'<Override PartName="/ppt/comments/comment1.xml" '
139+
'ContentType="application/vnd.openxmlformats-officedocument.'
140+
'presentationml.comments+xml"/></Types>',
141+
)
142+
)
143+
pres_rels = (
144+
src.read("ppt/_rels/presentation.xml.rels")
145+
.decode("utf-8")
146+
.replace(
147+
"</Relationships>",
148+
'<Relationship Id="rIdLegacyAuth" '
149+
'Type="http://schemas.openxmlformats.org/officeDocument/2006/'
150+
'relationships/commentAuthors" Target="commentAuthors.xml"/>'
151+
"</Relationships>",
152+
)
153+
)
154+
slide_rels = (
155+
src.read("ppt/slides/_rels/slide1.xml.rels")
156+
.decode("utf-8")
157+
.replace(
158+
"</Relationships>",
159+
'<Relationship Id="rIdLegacyCmt" '
160+
'Type="http://schemas.openxmlformats.org/officeDocument/2006/'
161+
'relationships/comments" Target="../comments/comment1.xml"/>'
162+
"</Relationships>",
163+
)
164+
)
165+
patched = {
166+
"[Content_Types].xml": ct,
167+
"ppt/_rels/presentation.xml.rels": pres_rels,
168+
"ppt/slides/_rels/slide1.xml.rels": slide_rels,
169+
"ppt/commentAuthors.xml": legacy_authors,
170+
"ppt/comments/comment1.xml": legacy_comments,
171+
}
172+
out = io.BytesIO()
173+
dst = zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED)
174+
for name in names:
175+
dst.writestr(name, patched.get(name, src.read(name)))
176+
for name, data in patched.items():
177+
if name not in names:
178+
dst.writestr(name, data)
179+
src.close()
180+
dst.close()
181+
out.seek(0)
182+
return Presentation(out)
183+
184+
185+
@given("a slide carrying a pre-existing legacy comment")
186+
def given_legacy_comment_slide(context):
187+
prs0 = Presentation()
188+
prs0.slides.add_slide(prs0.slide_layouts[6])
189+
context.prs = _build_legacy_plus_modern(prs0)
190+
context.slide = context.prs.slides[0]
191+
assert any(c.is_legacy for c in context.slide.comments)
192+
193+
194+
@when('I add a modern comment "{text}" by author "{author}"')
195+
def when_add_modern_comment(context, text, author):
196+
context.comment = context.slide.comments.add(text, author)
197+
198+
199+
@then("after save and reopen both the legacy and modern comments are present")
200+
def then_reopen_both_families(context):
201+
stream = io.BytesIO()
202+
context.prs.save(stream)
203+
stream.seek(0)
204+
comments = list(Presentation(stream).slides[0].comments)
205+
texts = sorted(c.text for c in comments)
206+
assert texts == ["Legacy feedback", "Modern note"], "texts were %r" % texts
207+
208+
209+
@then("after save and reopen the legacy author ids are intact")
210+
def then_reopen_legacy_author_ids(context):
211+
import zipfile
212+
213+
stream = io.BytesIO()
214+
context.prs.save(stream)
215+
stream.seek(0)
216+
zf = zipfile.ZipFile(stream, "r")
217+
legacy_xml = zf.read("ppt/comments/comment1.xml").decode("utf-8")
218+
authors_xml = zf.read("ppt/commentAuthors.xml").decode("utf-8")
219+
zf.close()
220+
assert 'authorId="0"' in legacy_xml, "legacy authorId mangled: %r" % legacy_xml
221+
assert 'name="Legacy Larry"' in authors_xml, "legacy author lost: %r" % authors_xml

src/pptx/__init__.py

Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -10,6 +10,12 @@
1010
from pptx.opc.constants import CONTENT_TYPE as CT
1111
from pptx.opc.package import PartFactory
1212
from pptx.parts.chart import ChartPart
13+
from pptx.parts.comments import (
14+
AuthorsPart,
15+
CommentAuthorsPart,
16+
CommentsPart,
17+
ModernCommentsPart,
18+
)
1319
from pptx.parts.coreprops import CorePropertiesPart
1420
from pptx.parts.custom_properties import CustomPropertiesPart
1521
from pptx.parts.custom_xml import CustomXmlPropertiesPart
@@ -41,6 +47,10 @@
4147
CT.PML_TEMPLATE_MAIN: PresentationPart,
4248
CT.PML_SLIDESHOW_MAIN: PresentationPart,
4349
CT.OPC_CORE_PROPERTIES: CorePropertiesPart,
50+
CT.PML_COMMENT_AUTHORS: CommentAuthorsPart,
51+
CT.PML_COMMENTS: CommentsPart,
52+
CT.PML_THREADED_COMMENTS: ModernCommentsPart,
53+
CT.PML_AUTHORS: AuthorsPart,
4454
CT.OFC_CUSTOM_PROPERTIES: CustomPropertiesPart,
4555
CT.OFC_CUSTOM_XML_PROPERTIES: CustomXmlPropertiesPart,
4656
# NOTE: CT.XML is intentionally NOT mapped to CustomXmlPart — see
@@ -80,6 +90,10 @@
8090

8191
del (
8292
ChartPart,
93+
AuthorsPart,
94+
CommentAuthorsPart,
95+
CommentsPart,
96+
ModernCommentsPart,
8397
CorePropertiesPart,
8498
CustomPropertiesPart,
8599
CustomXmlPropertiesPart,

0 commit comments

Comments
 (0)