
Commit aaaf181

Matthew Horoszowski authored and committed
feat: expose custom_properties and custom_xml_parts on Presentation
Phase 3 of customXml support per Plans/customxml-implementation-plan.md.

Public surface: Presentation.custom_properties returns a Mapping wrapper over
CustomPropertiesPart; Presentation.custom_xml_parts returns a Sequence wrapper
over the package's RT.CUSTOM_XML rels.

New modules
-----------
- src/pptx/custom_properties.py: CustomProperties(Mapping[str, value]).
  Type-dispatched __setitem__ (bool checked before int); explicit set_*
  setters bypass dispatch when caller wants a specific OOXML form (e.g.
  set_string('N', '42') writes <vt:lpwstr> not <vt:i4>).
- src/pptx/custom_xml.py: CustomXmlParts(Sequence[CustomXmlPart]). Walks both
  presentation-scoped and package-scoped CUSTOM_XML rels; add(xml, name=,
  datastoreItem_id=, schema_refs=, scope=) defaults to presentation scope per
  plan section 3.5; by_guid is brace- and case-tolerant; by_name
  reverse-resolves through reserved '_pptx_customxml_name_{guid}' entries in
  custom_properties.

Loading existing PPTX customXml parts
-------------------------------------
_upgrade_to_custom_xml_part() promotes a base Part loaded for an
'application/xml' content type to CustomXmlPart in-place via __class__ swap
and parsing the blob to lxml. Done lazily on first enumeration. The package's
rel graph keeps pointing at the same instance, so no rel rewriting is needed.
Per plan section 3.6 / Q4 default, CT.XML stays unmapped in PartFactory; this
is the upgrade path the plan specified.

Wiring
------
- Package.custom_properties_part: lazy create CustomPropertiesPart if absent,
  mirroring Package.core_properties pattern.
- Package.custom_properties: convenience returning a CustomProperties Mapping
  wrapping the part.
- PresentationPart.custom_properties / .custom_xml_parts: lazyproperty
  wrappers; same instance per Presentation.
- Presentation.custom_properties / .custom_xml_parts: top-level accessors
  that delegate to the part.

Convenience on parts.custom_xml.CustomXmlPart
---------------------------------------------
- .name reads from reserved custom_properties entries; returns None if the
  part has no application-assigned name.
- add_item(tag, text, **attrs): convenience for the 'flat list of items'
  shape per plan section 2.2 / Q1 default = include.

remove() leaves the data->props rel intact on the orphaned part. Once the
source rel is gone, neither the data nor the props part is reachable from
iter_parts, so both are omitted on save. Keeping the rel lets a caller still
read part.datastore_item_id on a held reference after removal.

59 new unit tests; 100/94/100/95% coverage on the new modules. End-to-end
save → reload → lookup by name verified through real Presentation() built
from the default template. Existing 2897-test suite still green (2956 total).
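
The in-place upgrade described under "Loading existing PPTX customXml parts" can be illustrated in isolation. The following is a minimal sketch, not the commit's code: `Part`, `XmlPart`, and `upgrade_in_place` are hypothetical stand-ins, and the standard library's `xml.etree.ElementTree` is used in place of lxml so the example runs on its own. It shows why no relationship rewriting is needed: every holder of the original object sees the upgraded class.

```python
import xml.etree.ElementTree as ET


class Part:
    """Hypothetical stand-in for a generic package part holding raw bytes."""

    def __init__(self, partname: str, blob: bytes):
        self.partname = partname
        self.blob = blob


class XmlPart(Part):
    """Hypothetical stand-in for a part that exposes a parsed XML root."""

    _element: ET.Element

    @property
    def root_tag(self) -> str:
        return self._element.tag


def upgrade_in_place(part: Part) -> XmlPart:
    """Promote `part` to `XmlPart` without creating a new object."""
    if isinstance(part, XmlPart):
        return part  # already upgraded; repeat calls change nothing
    # Parse before swapping the class, so a malformed blob leaves `part` as-is.
    element = ET.fromstring(part.blob)
    part.__class__ = XmlPart
    part._element = element
    return part


# Two references to one instance, as a relationship graph would hold them.
loaded = Part("/customXml/item1.xml", b"<items><item>a</item></items>")
rel_graph = {"rId1": loaded}

upgraded = upgrade_in_place(loaded)
```

After the call, `rel_graph["rId1"]` is the same object as `upgraded` and already behaves as an `XmlPart`. The trade-off of this technique is that static type checkers cannot follow the class change, which is presumably why the diff uses `cast(...)` and a `type: ignore` comment around it.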
1 parent a8cac32 commit aaaf181

8 files changed

Lines changed: 1014 additions & 0 deletions

src/pptx/custom_properties.py

Lines changed: 135 additions & 0 deletions
@@ -0,0 +1,135 @@
+"""User-facing wrapper around the Custom Document Properties part.
+
+Mapping-protocol surface that lets callers read and write the values exposed
+under `File → Properties → Advanced` in PowerPoint as if they were a `dict`.
+"""
+
+from __future__ import annotations
+
+import datetime as dt
+from typing import TYPE_CHECKING, Iterator, Mapping, Union
+
+if TYPE_CHECKING:
+    from pptx.parts.custom_properties import CustomPropertiesPart
+
+
+CustomPropertyValue = Union[str, int, float, bool, dt.datetime]
+
+
+class CustomProperties(Mapping[str, CustomPropertyValue]):
+    """Dict-like read/write access to custom document properties.
+
+    Returned by :attr:`pptx.presentation.Presentation.custom_properties`. The
+    mapping is *live* — writes go directly to the underlying
+    `CustomPropertiesPart`; the next `Presentation.save(...)` persists them.
+
+    Type dispatch on assignment is by Python type:
+
+    ==================== ===================
+    Python type          OOXML element
+    ==================== ===================
+    ``str``              ``<vt:lpwstr>``
+    ``bool``             ``<vt:bool>``
+    ``int``              ``<vt:i4>``
+    ``float``            ``<vt:r8>``
+    ``datetime.datetime`` ``<vt:filetime>``
+    ==================== ===================
+
+    For the cases where Python's type inference does the wrong thing — for
+    example, you want a string `"42"` rather than the integer 42 — use the
+    explicit :meth:`set_string` / :meth:`set_int` / etc. setters.
+    """
+
+    def __init__(self, part: "CustomPropertiesPart"):
+        self._part = part
+
+    # -- Mapping protocol --------------------------------------------------
+
+    def __getitem__(self, name: str) -> CustomPropertyValue:
+        prop = self._part.get_property(name)
+        if prop is None:
+            raise KeyError(name)
+        value = prop.value
+        if value is None:
+            # Defensive: a malformed entry with no <vt:*> child is treated as
+            # absent rather than surfacing None — keeps the Mapping contract clean.
+            raise KeyError(name)
+        return value
+
+    def __setitem__(self, name: str, value: CustomPropertyValue) -> None:
+        if not _is_supported(value):
+            raise TypeError(
+                "custom property value must be bool, int, float, str, or datetime; "
+                "got %s" % type(value).__name__
+            )
+        existing = self._part.get_property(name)
+        if existing is not None:
+            existing.value = value
+            return
+        self._part.add_property(name, value)
+
+    def __delitem__(self, name: str) -> None:
+        if not self._part.remove_property(name):
+            raise KeyError(name)
+
+    def __contains__(self, name: object) -> bool:
+        return isinstance(name, str) and self._part.get_property(name) is not None
+
+    def __iter__(self) -> Iterator[str]:
+        return iter(self._part.property_names)
+
+    def __len__(self) -> int:
+        return len(self._part)
+
+    # -- Explicit-typed setters --------------------------------------------
+
+    def set_string(self, name: str, value: str) -> None:
+        """Write `value` as `<vt:lpwstr>` regardless of Python type."""
+        if not isinstance(value, str):  # pyright: ignore[reportUnnecessaryIsInstance]
+            raise TypeError("set_string value must be str, got %s" % type(value).__name__)
+        self._set_typed(name, value)
+
+    def set_int(self, name: str, value: int) -> None:
+        """Write `value` as `<vt:i4>` regardless of Python type.
+
+        Rejects `bool` even though `bool` is-a `int` in Python — callers who
+        really want a 1/0 i4 can wrap with `int(value)` first.
+        """
+        if isinstance(value, bool) or not isinstance(value, int):
+            raise TypeError("set_int value must be int, got %s" % type(value).__name__)
+        self._set_typed(name, value)
+
+    def set_float(self, name: str, value: float) -> None:
+        """Write `value` as `<vt:r8>` regardless of Python type."""
+        if isinstance(value, bool) or not isinstance(value, (int, float)):
+            raise TypeError("set_float value must be a number, got %s" % type(value).__name__)
+        self._set_typed(name, float(value))
+
+    def set_bool(self, name: str, value: bool) -> None:
+        """Write `value` as `<vt:bool>`."""
+        if not isinstance(value, bool):  # pyright: ignore[reportUnnecessaryIsInstance]
+            raise TypeError("set_bool value must be bool, got %s" % type(value).__name__)
+        self._set_typed(name, value)
+
+    def set_datetime(self, name: str, value: dt.datetime) -> None:
+        """Write `value` as `<vt:filetime>` (UTC, ISO-8601)."""
+        if not isinstance(value, dt.datetime):  # pyright: ignore[reportUnnecessaryIsInstance]
+            raise TypeError(
+                "set_datetime value must be datetime, got %s" % type(value).__name__
+            )
+        self._set_typed(name, value)
+
+    def _set_typed(self, name: str, value: CustomPropertyValue) -> None:
+        """Replace-or-add the property; the underlying `CT_Property.value` setter
+        already dispatches on Python type cleanly, so re-using it here is safe."""
+        existing = self._part.get_property(name)
+        if existing is not None:
+            existing.value = value
+            return
+        self._part.add_property(name, value)
+
+
+def _is_supported(value: object) -> bool:
+    if isinstance(value, bool):
+        return True
+    return isinstance(value, (int, float, str, dt.datetime))
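
The commit message notes that `bool` is checked before `int` during type dispatch. The diff does not show the dispatch itself (it lives in the `CT_Property.value` setter), so the following is only a sketch of the ordering concern, using a hypothetical helper `vt_tag_for` and the element names from the docstring table above. Because `isinstance(True, int)` is true in Python, testing `int` first would write booleans as `<vt:i4>`.

```python
import datetime as dt


def vt_tag_for(value: object) -> str:
    """Return the element name a value would be written as (hypothetical helper)."""
    # bool must come first: bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "vt:bool"
    if isinstance(value, int):
        return "vt:i4"
    if isinstance(value, float):
        return "vt:r8"
    if isinstance(value, str):
        return "vt:lpwstr"
    if isinstance(value, dt.datetime):
        return "vt:filetime"
    raise TypeError(
        "custom property value must be bool, int, float, str, or datetime; "
        "got %s" % type(value).__name__
    )


tags = {
    "flag": vt_tag_for(True),
    "count": vt_tag_for(42),
    "count_as_text": vt_tag_for("42"),
    "ratio": vt_tag_for(0.5),
    "stamp": vt_tag_for(dt.datetime(2024, 1, 1)),
}
```

The `"42"` case is the one the explicit `set_string` setter exists for: the caller decides the stored form instead of relying on the Python type of whatever value happens to be passed.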

src/pptx/custom_xml.py

Lines changed: 252 additions & 0 deletions
@@ -0,0 +1,252 @@
+"""User-facing wrapper around customXml data parts.
+
+`CustomXmlParts` exposes the collection of `<ds:datastoreItem>`-tagged
+arbitrary-XML parts attached to a presentation. The user-facing element type
+is :class:`pptx.parts.custom_xml.CustomXmlPart` itself — there is no separate
+facade. Loaded base `Part` instances (which arise because `CT.XML` is not
+mapped to `CustomXmlPart` in `pptx/__init__.py` per plan §3.6) are upgraded
+in-place by `_upgrade_to_custom_xml_part(...)` on first enumeration.
+"""
+
+from __future__ import annotations
+
+from typing import TYPE_CHECKING, Iterable, Iterator, Literal, Sequence, Union, cast
+
+from pptx.opc.constants import RELATIONSHIP_TYPE as RT
+from pptx.opc.package import Part
+from pptx.oxml import parse_xml
+from pptx.oxml.xmlchemy import BaseOxmlElement
+from pptx.parts.custom_xml import CustomXmlPart, XmlPayload
+
+if TYPE_CHECKING:
+    from pptx.parts.presentation import PresentationPart
+
+
+# Reserved name-prefix used to store user-assigned customXml part names as
+# entries in the custom document properties part. The key is
+# `{prefix}{datastore_item_id}` and the value is the user-assigned name.
+NAME_PROPERTY_PREFIX = "_pptx_customxml_name_"
+
+
+class CustomXmlParts(Sequence[CustomXmlPart]):
+    """Collection of customXml data parts attached to the presentation.
+
+    Iterates both presentation-scoped (`ppt/_rels/presentation.xml.rels`) and
+    package-scoped (`/_rels/.rels`) `RT.CUSTOM_XML` relationships. Parts are
+    deduplicated by identity — a single part related from both sources appears
+    once.
+
+    Lookup:
+
+        prs.custom_xml_parts[0] # by index
+        prs.custom_xml_parts["item3.xml"] # by partname tail
+        prs.custom_xml_parts.by_guid("{...}") # by datastoreItem GUID
+        prs.custom_xml_parts.by_name("provenance") # by user-assigned name
+    """
+
+    def __init__(self, presentation_part: "PresentationPart"):
+        self._presentation_part = presentation_part
+
+    # -- Sequence-like protocol --------------------------------------------
+
+    def __iter__(self) -> Iterator[CustomXmlPart]:
+        return self._iter_parts()
+
+    def __len__(self) -> int:
+        return sum(1 for _ in self._iter_parts())
+
+    def __getitem__(self, key):  # type: ignore[override]
+        if isinstance(key, int):
+            for i, part in enumerate(self._iter_parts()):
+                if i == key:
+                    return part
+            raise IndexError("custom_xml_parts index out of range: %d" % key)
+        if isinstance(key, str):
+            for part in self._iter_parts():
+                partname = str(part.partname)
+                if partname == key or partname.endswith("/" + key):
+                    return part
+            raise KeyError("no custom_xml part with partname %r" % key)
+        raise TypeError(
+            "custom_xml_parts key must be int or str, got %s" % type(key).__name__
+        )
+
+    # -- Public lookups ----------------------------------------------------
+
+    def by_guid(self, guid: str) -> CustomXmlPart | None:
+        """Return the part whose `datastore_item_id` matches `guid`, or None.
+
+        Match is case-insensitive and curly-brace-tolerant — `"{ABCD-...}"` and
+        `"abcd-..."` both find the same part.
+        """
+        target = _normalize_guid(guid)
+        for part in self._iter_parts():
+            if _normalize_guid(part.datastore_item_id) == target:
+                return part
+        return None
+
+    def by_name(self, name: str) -> CustomXmlPart | None:
+        """Return the part previously added with `name=...`, or None.
+
+        Names are stored as reserved entries in the custom document properties
+        part keyed by datastore_item_id; this method reverse-resolves the name
+        through that table.
+        """
+        if not isinstance(name, str):  # pyright: ignore[reportUnnecessaryIsInstance]
+            raise TypeError("name must be str, got %s" % type(name).__name__)
+        cp_part = self._presentation_part.package.custom_properties_part
+        for prop in cp_part._element.property_lst:
+            if not prop.name.startswith(NAME_PROPERTY_PREFIX):
+                continue
+            if prop.value != name:
+                continue
+            guid = prop.name[len(NAME_PROPERTY_PREFIX) :]
+            return self.by_guid(guid)
+        return None
+
+    # -- Mutation ----------------------------------------------------------
+
+    def add(
+        self,
+        xml: XmlPayload,
+        *,
+        name: str | None = None,
+        datastoreItem_id: str | None = None,
+        schema_refs: Iterable[str] | None = None,
+        scope: Literal["presentation", "package"] = "presentation",
+    ) -> CustomXmlPart:
+        """Add a new customXml part with `xml` as its payload.
+
+        See module docstring for parameter semantics. Returns the new part,
+        already attached to the presentation; nothing else is required before
+        `prs.save(...)`.
+        """
+        if scope not in ("presentation", "package"):
+            raise ValueError(
+                "scope must be 'presentation' or 'package', got %r" % (scope,)
+            )
+
+        package = self._presentation_part.package
+        data_part = CustomXmlPart.new_pair(
+            package,
+            xml,
+            datastore_item_id=datastoreItem_id,
+            schema_refs=tuple(schema_refs) if schema_refs is not None else (),
+        )
+
+        if scope == "presentation":
+            self._presentation_part.relate_to(data_part, RT.CUSTOM_XML)
+        else:
+            package.relate_to(data_part, RT.CUSTOM_XML)
+
+        if name is not None:
+            cp = package.custom_properties
+            cp[NAME_PROPERTY_PREFIX + data_part.datastore_item_id] = name
+
+        return data_part
+
+    def remove(self, part: Union[CustomXmlPart, int, str]) -> None:
+        """Remove a customXml part from the presentation.
+
+        Drops the relationship from whichever scope (presentation or package)
+        currently holds it, plus any reserved name entry in custom_properties.
+        Idempotent — a second call on an already-removed part is a no-op.
+
+        The data → props rel is intentionally LEFT IN PLACE on the now-orphaned
+        data part. Once the source rel is gone, neither the data part nor the
+        props part is reachable from `iter_parts`, so both are omitted on
+        save. Keeping the rel around lets a caller still read
+        `part.datastore_item_id` on the returned reference after removal,
+        which matches the principle of least surprise for held references.
+        """
+        target = self._resolve(part)
+        if target is None:
+            return
+
+        # Drop the reserved name entry, if any. Reading datastore_item_id
+        # here requires the data → props rel to still be intact.
+        cp_part = self._presentation_part.package.custom_properties_part
+        cp_part.remove_property(NAME_PROPERTY_PREFIX + target.datastore_item_id)
+
+        # Drop the rel from whichever source holds it (presentation or package).
+        for rels in self._iter_rel_collections():
+            for rId, rel in list(rels.items()):
+                if rel.is_external or rel.reltype != RT.CUSTOM_XML:
+                    continue
+                if rel.target_part is target:
+                    rels.pop(rId)
+
+    # -- Internals ---------------------------------------------------------
+
+    def _iter_parts(self) -> Iterator[CustomXmlPart]:
+        """Yield each unique customXml data part across both rel sources."""
+        seen: set[int] = set()
+        for rels in self._iter_rel_collections():
+            for rel in rels.values():
+                if rel.is_external or rel.reltype != RT.CUSTOM_XML:
+                    continue
+                part = _upgrade_to_custom_xml_part(rel.target_part)
+                if id(part) in seen:
+                    continue
+                seen.add(id(part))
+                yield part
+
+    def _iter_rel_collections(self):
+        """Yield the two relationship collections to scan for `RT.CUSTOM_XML`.
+
+        Presentation part exposes `.rels` publicly; the package exposes the
+        same collection internally as `_rels` (it has no public API for
+        external rel inspection because most callers reach the rel graph via
+        `iter_parts`/`iter_rels` instead). We need direct rel access here to
+        find the source rel for `add(scope="package")`-attached parts.
+        """
+        yield self._presentation_part.rels
+        yield self._presentation_part.package._rels
+
+    def _resolve(
+        self, part: Union[CustomXmlPart, int, str]
+    ) -> CustomXmlPart | None:
+        if isinstance(part, CustomXmlPart):
+            return part
+        if isinstance(part, int):
+            try:
+                return self.__getitem__(part)
+            except IndexError:
+                return None
+        if isinstance(part, str):
+            try:
+                return self.__getitem__(part)
+            except KeyError:
+                return None
+        raise TypeError(
+            "remove() argument must be CustomXmlPart, int, or str; got %s"
+            % type(part).__name__
+        )
+
+
+def _upgrade_to_custom_xml_part(part: Part) -> CustomXmlPart:
+    """Upgrade a base `Part` to `CustomXmlPart` in-place via `__class__` swap.
+
+    Loaded `application/xml` parts come in as plain `Part` because plan §3.6
+    intentionally leaves `CT.XML` unmapped. On first enumeration, we promote
+    the instance: assign the `CustomXmlPart` class, parse its blob to lxml,
+    and stash the parsed root in `_element`. The package's rel graph keeps
+    pointing at the same instance, so every other reference now resolves to
+    the upgraded class with no graph rewriting.
+    """
+    if isinstance(part, CustomXmlPart):
+        return part
+    element = cast("BaseOxmlElement", parse_xml(part.blob))
+    part.__class__ = CustomXmlPart
+    part._element = element  # type: ignore[attr-defined]
+    return cast(CustomXmlPart, part)
+
+
+def _normalize_guid(guid: str) -> str:
+    """Lowercase and strip surrounding curly braces for comparison."""
+    if not isinstance(guid, str):  # pyright: ignore[reportUnnecessaryIsInstance]
+        raise TypeError("guid must be str, got %s" % type(guid).__name__)
+    s = guid.strip().lower()
+    if s.startswith("{") and s.endswith("}"):
+        s = s[1:-1]
+    return s
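
The name lookup in `by_name` goes through two steps: find the reserved property whose value equals the name, then resolve the GUID embedded in that property's key. The sketch below models this with plain data structures so it runs without the library. The `parts` list of `(guid, payload)` tuples, the `properties` dict, and the sample GUID are assumptions made for illustration; only `NAME_PROPERTY_PREFIX` and the normalization rule are taken from the diff.

```python
from typing import Mapping, Optional, Sequence, Tuple

NAME_PROPERTY_PREFIX = "_pptx_customxml_name_"  # reserved prefix from the diff


def normalize_guid(guid: str) -> str:
    """Lowercase and strip surrounding curly braces, as in the diff."""
    s = guid.strip().lower()
    if s.startswith("{") and s.endswith("}"):
        s = s[1:-1]
    return s


def by_guid(parts: Sequence[Tuple[str, str]], guid: str) -> Optional[str]:
    """Return the payload whose GUID matches, ignoring case and braces."""
    target = normalize_guid(guid)
    for part_guid, payload in parts:
        if normalize_guid(part_guid) == target:
            return payload
    return None


def by_name(
    parts: Sequence[Tuple[str, str]],
    properties: Mapping[str, object],
    name: str,
) -> Optional[str]:
    """Reverse-resolve `name` through the reserved property entries."""
    for key, value in properties.items():
        if key.startswith(NAME_PROPERTY_PREFIX) and value == name:
            return by_guid(parts, key[len(NAME_PROPERTY_PREFIX):])
    return None


# Made-up sample data: one part, one ordinary property, one reserved entry.
SAMPLE_GUID = "{1B2C3D4E-0000-4000-8000-ABCDEF012345}"
parts = [(SAMPLE_GUID, "<provenance/>")]
properties = {
    "Author": "someone",
    NAME_PROPERTY_PREFIX + SAMPLE_GUID: "provenance",
}

found = by_name(parts, properties, "provenance")
```

One consequence of storing names this way is that the reserved entries sit in the same table as user-defined properties, so a caller iterating over custom properties will see keys starting with `_pptx_customxml_name_` alongside their own.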
