Commit a24e9c0
develop (#1)
* chore(planning): initialize markdown conversion fidelity improvement plan
12 tasks across 5 phases to fix TOC/anchor links, text extraction artifacts, and conversion fidelity issues identified by comparing DOCX-to-markdown output with live Microsoft Learn HTML.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(docx): rewrite table cell text extraction to use run-aware parsing
Replaced flat w:t node collection with paragraph/run-aware extraction in Get-OpenSpecOpenXmlNodeText. The old approach joined all text nodes with spaces, causing mid-word artifacts (e.g., 'W EBAUTHN', '10/8/20 10', 'technica l'). The new approach walks w:p > w:r structure and delegates to ConvertFrom-OpenSpecOpenXmlRunText which correctly handles w:br, w:tab, w:cr elements.
Tasks: TASK-001, TASK-002 | Phase: 1/5 | Progress: 2/12 (17%)
Co-authored-by: Cursor <cursoragent@cursor.com>
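The difference between flat and run-aware extraction can be sketched as follows. This is a minimal Python illustration, not the module's PowerShell code: `run_text`, `cell_text`, and `flat_text` are hypothetical names, the sample XML is invented to reproduce the artifacts quoted above, and only direct `w:p > w:r` children are handled (runs nested in hyperlinks are ignored here).

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

def run_text(run):
    """Concatenate one w:r run's children without inserting separators."""
    parts = []
    for child in run:
        tag = child.tag.replace("{" + W + "}", "w:")
        if tag == "w:t":
            parts.append(child.text or "")
        elif tag == "w:tab":
            parts.append("\t")
        elif tag in ("w:br", "w:cr"):
            parts.append("\n")
    return "".join(parts)

def cell_text(cell):
    """Join runs inside a paragraph directly; separate paragraphs with newlines."""
    paragraphs = []
    for p in cell.findall("w:p", NS):
        paragraphs.append("".join(run_text(r) for r in p.findall("w:r", NS)))
    return "\n".join(paragraphs)

def flat_text(cell):
    """The old behaviour as described: every w:t node joined with a space."""
    return " ".join(t.text or "" for t in cell.iter("{" + W + "}t"))

# Invented sample: one word and one date, each split across two runs.
SAMPLE = (
    '<w:tc xmlns:w="' + W + '">'
    "<w:p><w:r><w:t>W</w:t></w:r><w:r><w:t>EBAUTHN</w:t></w:r></w:p>"
    "<w:p><w:r><w:t>10/8/20</w:t></w:r><w:r><w:t>10</w:t></w:r></w:p>"
    "</w:tc>"
)
cell = ET.fromstring(SAMPLE)
print(repr(flat_text(cell)))  # mid-word spaces appear
print(repr(cell_text(cell)))  # runs are joined without separators
```

Word splits a single word into several runs whenever formatting or revision metadata changes mid-word, which is why joining at the `w:t` level introduces spaces that were never in the document.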
* fix(docx): rewrite TOC links to Section_X.Y anchors and strip page numbers
Rewrote Add-OpenSpecSectionAnchorsFromToc to replace _Toc anchor targets with Section_X.Y in TOC links and strip trailing DOCX page numbers from labels. TOC entries now read '[1 Introduction](#Section_1)' instead of '[1 Introduction 5](#_Toc164822728)'.
Tasks: TASK-003, TASK-004 | Phase: 2/5 | Progress: 4/12 (33%)
Co-authored-by: Cursor <cursoragent@cursor.com>
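A rough Python sketch of the TOC link rewrite described above, assuming every TOC label starts with a dotted section number and may end with a page number. `TOC_LINK` and `rewrite_toc_line` are hypothetical names; the actual logic lives in the PowerShell function Add-OpenSpecSectionAnchorsFromToc and may differ.

```python
import re

# "[<number> <title> <optional page>](#_Toc<digits>)"
TOC_LINK = re.compile(
    r"\[(?P<num>\d+(?:\.\d+)*)\s+(?P<title>.*?)(?:\s+\d+)?\]\(#_Toc\d+\)"
)

def rewrite_toc_line(line):
    """Point the link at Section_<number> and drop the trailing page number."""
    def repl(m):
        return f"[{m['num']} {m['title']}](#Section_{m['num']})"
    return TOC_LINK.sub(repl, line)

print(rewrite_toc_line("[1 Introduction 5](#_Toc164822728)"))
```

One caveat of this simplified pattern: a title that itself ends in a number and has no page number after it would lose that number, so a real implementation needs a more reliable way to tell the two apart.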
* fix(docx): remove _Toc anchor tags from heading output
Keep _Toc bookmarks during initial conversion for Section_X.Y anchor placement, then strip all _Toc anchor tags with regex. Each heading now has only bookmark GUID + Section_X.Y anchors.
Task: TASK-005 | Phase: 2/5 | Progress: 5/12 (42%)
Co-authored-by: Cursor <cursoragent@cursor.com>
* chore(planning): phase 2 complete, TASK-006 cancelled (not needed)
All MS Open Specs headings are numbered, so slug anchors for non-numbered headings are not needed. Phase 2 complete: 3 tasks done, 1 cancelled. Moving to Phase 3.
Phase: 2/5 complete | Progress: 6/12 (50%)
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(docx): prepend section numbers to heading text from TOC mapping
Word auto-numbers headings but the number isn't in the paragraph text. Added post-processing in Add-OpenSpecSectionAnchorsFromToc to inject section numbers from the TOC map into heading lines. Headings now show '# 1 Introduction' matching the live Microsoft Learn HTML.
Task: TASK-007 finding | Phase: 3/5 | Progress: 7/12 (58%)
Co-authored-by: Cursor <cursoragent@cursor.com>
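The heading post-processing (Section_X.Y anchor placement, number injection, and _Toc anchor removal) might look roughly like the Python sketch below. Several things here are assumptions rather than facts from the commit: the `<a id="..."></a>` anchor markup, the anchor sitting on the line directly before its heading, and a TOC map keyed by _Toc bookmark id. `number_headings` is a hypothetical name.

```python
import re

ANCHOR = re.compile(r'<a id="(_Toc\d+)"></a>')  # assumed anchor markup
HEADING = re.compile(r"^(#{1,6}) (.*)$")

def number_headings(lines, toc_map):
    """toc_map maps a _Toc bookmark id to its section number, e.g. {"_Toc1": "1"}."""
    out = []
    pending = None  # section number waiting for the next heading line
    for line in lines:
        anchor = ANCHOR.fullmatch(line.strip())
        if anchor:
            pending = toc_map.get(anchor.group(1))
            if pending:
                out.append(f'<a id="Section_{pending}"></a>')
            continue  # the _Toc anchor itself is never emitted
        heading = HEADING.match(line)
        if heading and pending:
            line = f"{heading.group(1)} {pending} {heading.group(2)}"
        pending = None
        out.append(line)
    return out

result = number_headings(
    ['<a id="_Toc1"></a>', "# Introduction", "Body text."],
    {"_Toc1": "1"},
)
print("\n".join(result))
```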
* feat(docx): add inline formatting and table cell formatting support
Detect bold, italic, and code formatting from OpenXML run properties (w:rPr) and emit it in the markdown output. Unicode noncharacter placeholders stand in for the markers so that adjacent runs with the same formatting can be merged safely. Whitespace is moved outside the markers for CommonMark compliance, and bold is stripped from headings.
Table cell extraction upgraded from plain text to paragraph-aware rendering, preserving bold formatting and hyperlinks within table cells.
Results across 41 specs: 20,258 bold pairs, 669 bold table rows, 796 linked table rows, 0 conversion errors. Completes TASK-009 and TASK-010. All 12 plan tasks now completed.
Co-authored-by: Cursor <cursoragent@cursor.com>
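The placeholder technique can be illustrated with a small Python sketch that handles bold only; italic and code would follow the same pattern with their own placeholder pairs. The specific code points U+FDD0 and U+FDD1 and the names `render_runs` and `_emit` are assumptions made for this example, not values taken from the commit.

```python
import re

# Unicode noncharacters, assumed never to occur in real document text.
B_OPEN, B_CLOSE = "\ufdd0", "\ufdd1"
_BOLD_SPAN = re.compile(
    re.escape(B_OPEN) + r"(\s*)(.*?)(\s*)" + re.escape(B_CLOSE), re.DOTALL
)

def _emit(match):
    lead, body, trail = match.groups()
    if not body:
        return lead + trail  # whitespace-only bold run: emit no markers
    return f"{lead}**{body}**{trail}"  # whitespace ends up outside the markers

def render_runs(runs):
    """runs is a list of (text, is_bold) pairs in document order."""
    marked = "".join(B_OPEN + t + B_CLOSE if bold else t for t, bold in runs)
    # A close immediately followed by an open means two adjacent bold runs: merge.
    marked = marked.replace(B_CLOSE + B_OPEN, "")
    return _BOLD_SPAN.sub(_emit, marked)

print(render_runs([("The ", False), ("Web", True), ("Authn ", True), ("field", False)]))
```

Wrapping each run in `**` directly would produce `**Web****Authn **`, which CommonMark does not render as a single bold word. Distinct open and close placeholders make the merge a plain string replacement, which is not possible with the symmetric `**` marker.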
* feat(tests): add anchor validation to Test-OpenSpecMarkdownFidelity
Extended fidelity tests to validate: Section_X.Y anchors present, no _Toc anchors remain, TOC links resolve to existing anchors, numbered headings exist, bold formatting detected. Fixed CRLF regex issue in table detection. All 41 specs pass.
Updated plan.md to reflect completed project status with all checkboxes checked.
Co-authored-by: Cursor <cursoragent@cursor.com>
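The anchor checks listed above could be approximated in Python as below. This is only a sketch of the idea behind Test-OpenSpecMarkdownFidelity, not its implementation: `check_anchors` is a hypothetical name and the `<a id="...">` anchor markup is an assumption.

```python
import re

def check_anchors(markdown):
    """Return a small report about section anchors and in-document links."""
    # Normalize CRLF so line-based patterns behave the same on Windows-authored files.
    text = markdown.replace("\r\n", "\n")
    anchors = set(re.findall(r'<a id="([^"]+)"', text))
    targets = re.findall(r"\]\(#([^)]+)\)", text)
    return {
        "has_section_anchors": any(a.startswith("Section_") for a in anchors),
        "toc_anchors_remaining": sorted(a for a in anchors if a.startswith("_Toc")),
        "unresolved_links": sorted({t for t in targets if t not in anchors}),
        "numbered_headings": len(
            re.findall(r"^#{1,6} \d+(?:\.\d+)*\s", text, flags=re.M)
        ),
    }

doc = (
    "[1 Introduction](#Section_1)\r\n"
    "[2 Messages](#Section_2)\r\n"
    '<a id="Section_1"></a>\r\n'
    "# 1 Introduction\r\n"
    '<a id="_Toc5"></a>\r\n'
)
report = check_anchors(doc)
print(report)
```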
* improve processing
* feat(docx): convert packet diagram tables to mermaid packet-beta syntax
Detect DOCX packet layout tables by their PacketDiagramHeaderText style and convert them to mermaid packet-beta diagrams instead of wide 32-column markdown tables. Continuation rows are merged into the previous field's bit range for correct multi-row field representation.
Co-authored-by: Cursor <cursoragent@cursor.com>
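As a sketch of the conversion, the Python below turns rows of (label, bit width) cells into mermaid packet-beta text and folds continuation rows into the preceding field. The input shape, the `"..."` continuation label, and the name `to_packet_beta` are assumptions for illustration; the commit does not say how the PowerShell code represents rows or detects continuations.

```python
def to_packet_beta(rows, continuation="..."):
    """rows: list of rows, each a list of (label, width_in_bits) cells."""
    fields = []  # each entry is [start_bit, end_bit, label]
    bit = 0
    for row in rows:
        for label, width in row:
            if label == continuation and fields:
                # Continuation cell: extend the previous field's bit range.
                fields[-1][1] += width
            else:
                fields.append([bit, bit + width - 1, label])
            bit += width
    lines = ["packet-beta"]
    for start, end, label in fields:
        bit_range = f"{start}-{end}" if end > start else f"{start}"
        lines.append(f'{bit_range}: "{label}"')
    return "\n".join(lines)

diagram = to_packet_beta([
    [("Type", 8), ("Flags", 8), ("Length", 16)],
    [("Payload", 32)],
    [("...", 32)],
])
print(diagram)
```

In the markdown output the result would sit inside a fenced block tagged `mermaid`. Labels containing double quotes are not escaped in this sketch.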
* fix(docx): detect additional packet diagram styles (Definition-Field, Packetdiagramheaderrow)
Extend packet diagram detection to match Packetdiagramheaderrow and Definition-Field/Definition-Field2 styles in addition to PacketDiagramHeaderText. This catches 230 additional packet diagrams across the RDP specs.
Co-authored-by: Cursor <cursoragent@cursor.com>
* feat: rename output to <name>.md and add root index generation
Change per-spec output filename from index.md to <ProtocolId>.md for unique editor tab names. Update cross-document link generation to match. Add Update-OpenSpecIndex command that generates a README.md catalog of all converted specs with titles and links.
Co-authored-by: Cursor <cursoragent@cursor.com>
* cleanup
* Add convert-and-publish workflow and Prepare-Publish script
Co-authored-by: Cursor <cursoragent@cursor.com>
---------
Co-authored-by: Cursor <cursoragent@cursor.com>
1 parent cf6e4eb commit a24e9c0
9 files changed
Lines changed: 875 additions & 45 deletions
File tree
- .github/workflows
- AwakeCoding.OpenSpecs
- Private
- Public
- scripts
Lines changed: 556 additions & 35 deletions
Large diffs are not rendered by default.
Lines changed: 98 additions & 4 deletions
Lines changed: 3 additions & 2 deletions
Lines changed: 39 additions & 2 deletions