feat: add page number support to v1 html partition (#4327)
This PR adds page number support when partitioning HTML with the v1
parser.
- Add `page_number` support to the v1 HTML parser by reading
`data-page-number` attributes from ancestor elements, consistent with v2
parser behavior
- Add `_page_number` cached property on Flow using efficient
parent-chain lookup (O(n) total vs O(n*depth) ancestor walk)
- Wire page number into all three element-creation paths: text elements,
images, and tables
- Malformed `data-page-number` values are skipped and fall back to the
nearest valid ancestor
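
The cached parent-chain lookup described above can be sketched roughly as follows. This is an illustration only, not the code from this PR: the `Node` class, its `attrs` and `parent` fields, and the public `page_number` property are hypothetical stand-ins for the real `Flow` class and its `_page_number` property. Each node parses its own `data-page-number` once and otherwise delegates to its parent's cached value, so resolving every node costs O(n) in total.

```python
from __future__ import annotations

from functools import cached_property
from typing import Optional


class Node:
    """Hypothetical stand-in for an HTML element (not the library's Flow class)."""

    def __init__(self, attrs: Optional[dict] = None, parent: Optional[Node] = None):
        self.attrs = attrs or {}
        self.parent = parent

    @cached_property
    def page_number(self) -> Optional[int]:
        raw = self.attrs.get("data-page-number")
        if raw is not None:
            try:
                return int(raw)
            except ValueError:
                pass  # malformed value: skip it and fall back to the ancestors
        # Delegate to the parent's cached result instead of re-walking the chain.
        return self.parent.page_number if self.parent is not None else None


# Small tree: root has a valid page, section has a malformed one.
root = Node({"data-page-number": "3"})
section = Node({"data-page-number": "oops"}, parent=root)
para = Node(parent=section)
override = Node({"data-page-number": "7"}, parent=root)
orphan = Node()
```

Here `para` and `section` both resolve to the root's page because the malformed value on `section` is skipped, `override` uses its own attribute, and `orphan` has no page number at all.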
CHANGELOG.md: 5 additions & 0 deletions
@@ -1,3 +1,8 @@
+## 0.22.18
+
+### Enhancements
+- **Add page number support to v1 HTML parser**: The v1 HTML parser now reads `data-page-number` attributes from ancestor elements and includes the page number in element metadata, consistent with the v2 parser behavior.