Commit 7420bb5

feat(text): Field authoring — _Paragraph.add_field, _Field class, CT_TextField.text setter (Phase 3)
Public Python API for the headers/footers/slide-numbers/dates epic (#20). Phase 3 adds the field-authoring surface that lets users create auto-updating slide numbers, dates, and other PowerPoint-resolved fields inside any paragraph. Builds on the Phase 1 (PR #48) OOXML primitives and the Phase 2 (PR #49) slide/master public API.

Design source: scanny#797 ("Added a:fld type to paragraphs for page numbers and datetimes"). Manually ported: per CLAUDE.md §2, this fork's master had a repo-wide ruff format pass (PR #10) while upstream did not, so a cherry-pick conflicts on whitespace across every touched file. The semantic diff was re-derived against the current ruff-formatted, post-Phase-1 source.

Changes:
- pptx.oxml.simpletypes.ST_FieldType (NEW): XsdString subclass for the `a:fld@type` attribute value, replacing the plain XsdString declaration Phase 1 used as a placeholder.
- pptx.oxml.text.CT_TextField.text: the read-only property from Phase 1 now has a setter. It writes through get_or_add_t() and routes the value through CT_TextField._escape_ctrl_chars (NEW static method), which replaces chars in `[\x00-\x08\x0B-\x1F]` with the `_xNNNN_` uppercase-hex form per OOXML §22.9.2.19, leaving `\t` (0x09) and `\n` (0x0A) alone.
- pptx.oxml.text.CT_TextParagraph.fld: ZeroOrMore("a:fld", successors=("a:endParaRPr",)) accessor; the `a:pPr` successor tuple already named `a:fld` per Phase 1 (forward declaration). xmlchemy auto-generates `_add_fld()` from the ZeroOrMore.
- pptx.text.text._Field (NEW): class wrapping `<a:fld>`, public only as the return value of add_field(). The leading-underscore private name matches `_Run` and `_Paragraph`. Properties: `font` (Font wrapping rPr), `text` (read/write, routes through the escaping setter), `type` (read/write, str | None).
- pptx.text.text._Paragraph.add_field() (NEW): appends a fresh `<a:fld>` whose id is a uuid4 GUID wrapped in braces, in uppercase hex, matching what PowerPoint's "Insert → Slide Number" writes. Returns a `_Field`; the caller sets `type` and optionally `text`.

The Run-style symmetry is deliberate: users who know `add_run()` should not have to learn a new pattern.

Out of scope for Phase 3 (deliberate):
- Field discovery during paragraph iteration: `p.runs` continues to yield only `_Run` objects. Phase 4 will surface `_Field` instances alongside, with a stable ordering rule.
- HandoutMaster Python class and watermark helper: Phase 5.
- MSO_FIELD_TYPE enum: `type` stays plain `str` for now to mirror scanny#797. An enum can land in a later cleanup once the canonical field-type list is settled.

Verification (local, CPython 3.14.4):
- python3 -m pytest tests/ -q → 3626 passed in 5.32s (+28 vs Phase 2 baseline)
- 14 new tests in tests/oxml/test_text.py (CT_TextField setter + _escape_ctrl_chars + CT_TextParagraph._add_fld)
- 14 new tests in tests/text/test_text.py (Describe_Field ×10 + Describe_Paragraph_add_field ×4)
- python3 -m ruff check src tests → All checks passed!
- python3 -m ruff format --check src tests → 216 files already formatted
- python3 -m behave features/ --no-color → 1048 scenarios, 0 failed
- python3 uat/uat_headers_footers_phase3.py → PASS (full <a:fld> with id, type, and text round-tripped through save+reopen; GUID preserved byte-for-byte at {2ED44585-07B5-4BC8-93B2-49122D50BCC2})

Refs #20.
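For orientation, the XML shape that the commit message describes can be approximated with the standard library alone. This is a rough sketch, not the fork's implementation: `build_field_element` and `GUID_RE` are hypothetical names, and plain `xml.etree.ElementTree` stands in for the lxml-based xmlchemy classes.

```python
import re
import uuid
import xml.etree.ElementTree as ET

# DrawingML main namespace, the one the `a:` prefix maps to.
A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"

# Brace-wrapped, uppercase, version-4 GUID (hypothetical validation pattern).
GUID_RE = re.compile(
    r"^\{[0-9A-F]{8}-[0-9A-F]{4}-4[0-9A-F]{3}-[89AB][0-9A-F]{3}-[0-9A-F]{12}\}$"
)


def build_field_element(field_type: str, placeholder: str) -> ET.Element:
    """Hypothetical helper: build a bare <a:fld> with id, type, and an <a:t> child."""
    fld = ET.Element("{%s}fld" % A_NS)
    # Same id expression the commit uses: uuid4, uppercased, wrapped in braces.
    fld.set("id", "{%s}" % str(uuid.uuid4()).upper())
    fld.set("type", field_type)
    t = ET.SubElement(fld, "{%s}t" % A_NS)
    t.text = placeholder
    return fld


fld = build_field_element("slidenum", "‹#›")
print(ET.tostring(fld, encoding="unicode"))
```

The printed element carries a different GUID on every run, which is why the UAT script in the verification list checks that the id survives a save and reopen unchanged.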
1 parent dfe9905 commit 7420bb5

5 files changed

Lines changed: 316 additions & 4 deletions

src/pptx/oxml/simpletypes.py

Lines changed: 12 additions & 0 deletions
@@ -368,6 +368,18 @@ class ST_Extension(XsdString):
     pass
 
 
+class ST_FieldType(XsdString):
+    """Field-type token on `<a:fld type="...">` per ECMA-376 §A.4.1.
+
+    Values are PowerPoint-defined strings such as `slidenum`, `datetime1` ..
+    `datetime13`, and `title`. Type is intentionally permissive (a plain
+    string): the schema itself does not enumerate the values, and
+    PowerPoint accepts any token.
+    """
+
+    pass
+
+
 class ST_GapAmount(BaseIntType):
     """
     String value is an integer in range 0-500, representing a percent,

src/pptx/oxml/text.py

Lines changed: 27 additions & 1 deletion
@@ -18,6 +18,7 @@
 from pptx.oxml.ns import nsdecls
 from pptx.oxml.simpletypes import (
     ST_Coordinate32,
+    ST_FieldType,
     ST_TextFontScalePercentOrPercentString,
     ST_TextFontSize,
     ST_TextIndentLevelType,
@@ -339,6 +340,7 @@ class CT_TextField(BaseOxmlElement):
     """
 
     get_or_add_rPr: Callable[[], CT_TextCharacterProperties]
+    get_or_add_t: Callable[[], BaseOxmlElement]
 
     rPr: CT_TextCharacterProperties | None = ZeroOrOne(  # pyright: ignore[reportAssignmentType]
         "a:rPr", successors=("a:pPr", "a:t")
@@ -348,7 +350,7 @@ class CT_TextField(BaseOxmlElement):
     )
     id: str = RequiredAttribute("id", XsdString)  # pyright: ignore[reportAssignmentType]
     type: str | None = OptionalAttribute(  # pyright: ignore[reportAssignmentType]
-        "type", XsdString
+        "type", ST_FieldType
     )
 
     @property
@@ -359,6 +361,28 @@ def text(self) -> str: # pyright: ignore[reportIncompatibleMethodOverride]
             return ""
         return t.text or ""
 
+    @text.setter
+    def text(self, value: str):  # pyright: ignore[reportIncompatibleMethodOverride]
+        """Replace the text of the `a:t` child, escaping control chars.
+
+        Adds an `a:t` child element if not already present. Characters in the
+        ASCII control range 0x00-0x08 and 0x0B-0x1F (everything except `\\t`
+        and `\\n`) are replaced with their `_xNNNN_` plain-text escape per
+        OOXML §22.9.2.19, matching the behavior of `CT_RegularTextRun.text`.
+        """
+        t = self.get_or_add_t()
+        t.text = self._escape_ctrl_chars(value)
+
+    @staticmethod
+    def _escape_ctrl_chars(s: str) -> str:
+        """Return str after replacing each control character with a plain-text escape.
+
+        For example, a BEL character (x07) would appear as "_x0007_". Horizontal-tab
+        (x09) and line-feed (x0A) are not escaped. All other characters in the range
+        x00-x1F are escaped.
+        """
+        return re.sub(r"([\x00-\x08\x0B-\x1F])", lambda match: "_x%04X_" % ord(match.group(1)), s)
+
 
 class CT_TextFont(BaseOxmlElement):
     """Custom element class for `a:latin`, `a:ea`, `a:cs`, and `a:sym`.
@@ -403,13 +427,15 @@ class CT_TextParagraph(BaseOxmlElement):
     get_or_add_pPr: Callable[[], CT_TextParagraphProperties]
     r_lst: list[CT_RegularTextRun]
     _add_br: Callable[[], CT_TextLineBreak]
+    _add_fld: Callable[[], CT_TextField]
     _add_r: Callable[[], CT_RegularTextRun]
 
     pPr: CT_TextParagraphProperties | None = ZeroOrOne(  # pyright: ignore[reportAssignmentType]
         "a:pPr", successors=("a:r", "a:br", "a:fld", "a:endParaRPr")
     )
     r = ZeroOrMore("a:r", successors=("a:endParaRPr",))
     br = ZeroOrMore("a:br", successors=("a:endParaRPr",))
+    fld = ZeroOrMore("a:fld", successors=("a:endParaRPr",))
     endParaRPr: CT_TextCharacterProperties | None = ZeroOrOne("a:endParaRPr", successors=())  # pyright: ignore[reportAssignmentType]
 
     def add_br(self) -> CT_TextLineBreak:
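
The escaping rule added to `CT_TextField` in this file can be exercised in isolation. The snippet below is a minimal standalone sketch that uses the same character class as the diff; `escape_ctrl_chars` and `_CTRL_CHARS` are hypothetical module-level names chosen for illustration, not names from the fork.

```python
import re

# C0 control characters except horizontal tab (0x09) and line feed (0x0A).
_CTRL_CHARS = re.compile(r"[\x00-\x08\x0B-\x1F]")


def escape_ctrl_chars(s: str) -> str:
    """Replace each matched control character with its `_xNNNN_` escape."""
    # %04X yields four uppercase hex digits, e.g. 0x1B -> "001B".
    return _CTRL_CHARS.sub(lambda m: "_x%04X_" % ord(m.group(0)), s)


print(escape_ctrl_chars("ring\x07bell"))
print(repr(escape_ctrl_chars("a\tb\nc")))
print(escape_ctrl_chars("cr\rhere"))
```

One consequence of the range worth noting: carriage return (0x0D) lies inside `\x0B-\x1F`, so it is escaped to `_x000D_` while tab and line feed pass through untouched.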

src/pptx/text/text.py

Lines changed: 76 additions & 0 deletions
@@ -2,6 +2,7 @@
 
 from __future__ import annotations
 
+import uuid
 from typing import TYPE_CHECKING, Iterator, cast
 
 from pptx.dml.fill import FillFormat
@@ -33,6 +34,7 @@
         CT_RegularTextRun,
         CT_TextBody,
         CT_TextCharacterProperties,
+        CT_TextField,
         CT_TextParagraph,
         CT_TextParagraphProperties,
     )
@@ -582,6 +584,21 @@ def add_run(self) -> _Run:
         r = self._p.add_r()
         return _Run(r, self)
 
+    def add_field(self) -> _Field:
+        """Return a new |_Field| appended after the paragraph's existing content.
+
+        The new ``<a:fld>`` element is given a fresh RFC-4122 v4 GUID `id`
+        wrapped in braces, with uppercase hex, matching the authoring format
+        PowerPoint emits when the user runs *Insert → Slide Number* or
+        *Insert → Date and Time*. The caller is expected to set `type` (e.g.
+        `"slidenum"`, `"datetime1"`) and optionally `text` (the placeholder
+        glyph PowerPoint displays for the field before it resolves the live
+        value) on the returned `_Field`.
+        """
+        f = self._p._add_fld()
+        f.id = "{%s}" % str(uuid.uuid4()).upper()
+        return _Field(f, self)
+
     @property
     def alignment(self) -> PP_PARAGRAPH_ALIGNMENT | None:
         """Horizontal alignment of this paragraph.
@@ -888,3 +905,62 @@ def text(self):
     @text.setter
     def text(self, text: str):
         self._r.text = text
+
+
+class _Field(Subshape):
+    """Field object. Corresponds to ``<a:fld>`` child element in a paragraph.
+
+    A field renders text whose value PowerPoint resolves at slide-show or open
+    time: slide numbers, the current date, the slide title, etc. The literal
+    text written to the ``<a:t>`` child is the placeholder PowerPoint shows
+    before it resolves the live value; users typically pass a glyph like
+    ``"‹#›"`` for slide numbers or the current date as a static fallback.
+
+    Not intended to be constructed directly; obtain instances from
+    :meth:`_Paragraph.add_field`.
+    """
+
+    def __init__(self, f: CT_TextField, parent: ProvidesPart):
+        super(_Field, self).__init__(parent)
+        self._f = f
+
+    @property
+    def font(self) -> Font:
+        """|Font| instance for the run-level character properties of this field.
+
+        Character properties can be and perhaps most often are inherited from
+        parent objects such as the paragraph and slide layout the field is
+        contained in. Only those specifically overridden at the field level
+        are contained in the font object.
+        """
+        rPr = self._f.get_or_add_rPr()
+        return Font(rPr)
+
+    @property
+    def text(self) -> str:
+        """Read/write. A unicode string containing the field's placeholder text.
+
+        Assignment replaces all text in the field. Control characters other
+        than tab or newline are escaped as a hex representation. For example,
+        ESC (ASCII 27) is escaped as ``"_x001B_"``.
+        """
+        return self._f.text
+
+    @text.setter
+    def text(self, text: str):
+        self._f.text = text
+
+    @property
+    def type(self) -> str | None:
+        """Read/write. The field's ``type`` attribute, e.g. ``"slidenum"``.
+
+        ECMA-376 §A.4.1 names the well-known types: ``slidenum``,
+        ``datetime1`` .. ``datetime13``, and ``title``. The OOXML schema
+        itself treats the value as a permissive string. Returns |None| when
+        no ``type`` attribute is present.
+        """
+        return self._f.type
+
+    @type.setter
+    def type(self, value: str | None):
+        self._f.type = value
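
The `_Field.type` property is a thin proxy over an optional XML attribute. A hedged stdlib sketch of that pattern follows; `FieldProxy` is a hypothetical stand-in for `_Field`, it wraps a plain `xml.etree.ElementTree` element instead of `CT_TextField`, and the choice to drop the attribute when `None` is assigned is an assumption about how the optional attribute behaves, not something this diff shows.

```python
import xml.etree.ElementTree as ET
from typing import Optional


class FieldProxy:
    """Hypothetical stand-in for `_Field`; wraps a plain ElementTree element."""

    def __init__(self, element: ET.Element) -> None:
        self._f = element

    @property
    def type(self) -> Optional[str]:
        # Element.get() returns None when the attribute is absent.
        return self._f.get("type")

    @type.setter
    def type(self, value: Optional[str]) -> None:
        if value is None:
            # Assumption: assigning None removes the attribute entirely.
            self._f.attrib.pop("type", None)
        else:
            self._f.set("type", value)


field = FieldProxy(ET.Element("fld", {"id": "fld-1"}))
initial_type = field.type
field.type = "slidenum"
assigned_type = field.type
field.type = None
print(initial_type, assigned_type, field.type)
```

Keeping `type` as `str | None` rather than an enum means any token can be written, which matches the commit's stated decision to defer an `MSO_FIELD_TYPE` enum.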

tests/oxml/test_text.py

Lines changed: 105 additions & 2 deletions
@@ -7,9 +7,9 @@
 import pytest
 
 from pptx.exc import InvalidXmlError
-from pptx.oxml.text import CT_TextField
+from pptx.oxml.text import CT_TextField, CT_TextParagraph
 
-from ..unitutil.cxml import element
+from ..unitutil.cxml import element, xml
 
 
 class DescribeCT_TextField(object):
@@ -51,3 +51,106 @@ def it_returns_empty_string_for_text_when_a_t_is_absent(self):
     def it_reads_the_text_of_its_a_t_child(self):
         fld = cast(CT_TextField, element('a:fld{id=foo,type=slidenum}/a:t"42"'))
         assert fld.text == "42"
+
+    def it_adds_an_a_t_child_on_text_assignment_when_absent(self):
+        fld = cast(CT_TextField, element("a:fld{id=foo,type=slidenum}"))
+        assert fld.t is None
+
+        fld.text = "‹#›"
+
+        assert fld.t is not None
+        assert fld.text == "‹#›"
+        assert fld.xml == xml('a:fld{id=foo,type=slidenum}/a:t"‹#›"')
+
+    def it_replaces_existing_a_t_content_on_text_assignment(self):
+        fld = cast(CT_TextField, element('a:fld{id=foo,type=slidenum}/a:t"old"'))
+
+        fld.text = "new"
+
+        assert fld.text == "new"
+        # ---only one a:t child is present; assignment replaces, not appends---
+        assert len(fld.findall("{http://schemas.openxmlformats.org/drawingml/2006/main}t")) == 1
+
+    @pytest.mark.parametrize(
+        ("input_value", "expected_a_t_text"),
+        [
+            ("hello", "hello"),
+            ("a\x07b", "a_x0007_b"),  # BEL escapes
+            ("tab\there", "tab\there"),  # tab pass-through
+            ("line1\nline2", "line1\nline2"),  # newline pass-through
+            ("esc\x1bhere", "esc_x001B_here"),  # ESC escapes, uppercase hex
+            ("", ""),
+        ],
+    )
+    def it_escapes_control_chars_when_assigning_text(
+        self, input_value: str, expected_a_t_text: str
+    ):
+        fld = cast(CT_TextField, element("a:fld{id=foo,type=slidenum}"))
+
+        fld.text = input_value
+
+        # ---round-trip: reading back returns the escaped form because the
+        # ---escape is permanent storage form, not a presentation layer---
+        assert fld.text == expected_a_t_text
+
+    def it_escapes_BEL_to_uppercase_hex_via__escape_ctrl_chars(self):
+        # ---BEL is x07; expected escape is "_x0007_" with uppercase hex---
+        assert CT_TextField._escape_ctrl_chars("ring\x07bell") == "ring_x0007_bell"
+
+    def it_passes_tab_and_newline_through__escape_ctrl_chars(self):
+        # ---x09 (HT) and x0A (LF) are explicitly excluded from the escape
+        # ---range per OOXML §22.9.2.19; all other x00..x1F characters escape.
+        assert CT_TextField._escape_ctrl_chars("a\tb\nc") == "a\tb\nc"
+
+        # ---verify x0B (VT) and x1F (US, the highest in-range value) DO escape
+        assert CT_TextField._escape_ctrl_chars("\x0b") == "_x000B_"
+        assert CT_TextField._escape_ctrl_chars("\x1f") == "_x001F_"
+
+
+class DescribeCT_TextParagraph(object):
+    """Unit-test suite for `pptx.oxml.text.CT_TextParagraph` field accessor."""
+
+    def it_can_add_an_a_fld_via__add_fld(self):
+        p = cast(CT_TextParagraph, element("a:p"))
+
+        fld = p._add_fld()
+
+        assert isinstance(fld, CT_TextField)
+        assert len(p.fld_lst) == 1
+        assert p.fld_lst[0] is fld
+
+    def it_appends_a_fld_after_existing_runs_in_document_order(self):
+        # ---fld successors=("a:endParaRPr",) places it after a:r and a:br,
+        # ---and before a:endParaRPr. Verifying the post-r position confirms
+        # ---xmlchemy honored the successors tuple correctly.
+        p = cast(CT_TextParagraph, element('a:p/(a:r/a:t"x",a:endParaRPr)'))
+
+        fld = p._add_fld()
+        fld.id = "fld-1"
+
+        # ---walk children of <a:p>; ignoring pPr (none here), expect:
+        # ---a:r, a:fld, a:endParaRPr in that order
+        tags = [child.tag.split("}")[-1] for child in p]
+        assert tags == ["r", "fld", "endParaRPr"]
+
+    def it_returns_all_fld_children_via_fld_lst(self):
+        p = cast(CT_TextParagraph, element("a:p"))
+
+        fld_a = p._add_fld()
+        fld_b = p._add_fld()
+        fld_c = p._add_fld()
+
+        assert p.fld_lst == [fld_a, fld_b, fld_c]
+
+    def it_includes_a_fld_in_content_children(self):
+        # ---content_children must surface a:r, a:br, and a:fld in document order
+        # ---so _Paragraph.text concatenates field text alongside run text.
+        p = cast(CT_TextParagraph, element("a:p"))
+        p.add_r("before")
+        fld = p._add_fld()
+        fld.id = "fld-1"
+        fld.text = "[N]"
+        p.add_r("after")
+
+        texts = [child.text for child in p.content_children]
+        assert texts == ["before", "[N]", "after"]
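
The ordering these paragraph tests check (a new `a:fld` lands after existing runs but before `a:endParaRPr`) follows from the successors tuple. A rough stdlib illustration of that insertion rule is below; `insert_before_successors` is a hypothetical helper, the tags are left un-namespaced for brevity, and this is only a guess at the general idea, not xmlchemy's actual code.

```python
import xml.etree.ElementTree as ET
from typing import Tuple


def insert_before_successors(
    parent: ET.Element, child: ET.Element, successors: Tuple[str, ...]
) -> ET.Element:
    """Insert `child` before the first existing child whose tag is a successor.

    Falls back to appending when no successor element is present.
    """
    for index, existing in enumerate(list(parent)):
        if existing.tag in successors:
            parent.insert(index, child)
            return child
    parent.append(child)
    return child


# Paragraph with one run and a trailing end-of-paragraph properties element.
p = ET.Element("p")
ET.SubElement(p, "r")
ET.SubElement(p, "endParaRPr")
insert_before_successors(p, ET.Element("fld"), ("endParaRPr",))
tags = [child.tag for child in p]

# Paragraph with no successor present: the new element is simply appended.
bare = ET.Element("p")
ET.SubElement(bare, "r")
insert_before_successors(bare, ET.Element("fld"), ("endParaRPr",))
bare_tags = [child.tag for child in bare]
print(tags, bare_tags)
```

Because `a:r`, `a:br`, and `a:fld` all name only `a:endParaRPr` as a successor, repeated adds keep call order among themselves, which is what the `content_children` test relies on.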
