| 1 | +# LaTeX Parser @todo Cleanup — Design |
| 2 | + |
| 3 | +**Date**: 2026-03-07 |
| 4 | + |
| 5 | +## Goal |
| 6 | + |
| 7 | +Address ~30 `@todo` comments across `src/compute-engine/latex-syntax/`, covering |
| 8 | +missing parse/serialize implementations (Category B) and presentation quality |
| 9 | +fixes (Category D). |
| 10 | + |
| 11 | +## Work Units |
| 12 | + |
| 13 | +### 1. Stale Derivative Comments (trivial) |
| 14 | + |
| 15 | +The `@todo` items at `definitions-core.ts:1365-1366` reference missing Leibniz |
| 16 | +and Euler derivative parsing. Investigation shows these are **already |
| 17 | +implemented**: |
| 18 | + |
| 19 | +- Leibniz ordinary: `definitions-arithmetic.ts:454-500` (outputs `D`) |
| 20 | +- Leibniz partial: `definitions-arithmetic.ts:420-451` (outputs |
| 21 | + `PartialDerivative`) |
| 22 | +- Euler `D_x f`: `definitions-core.ts:1478-1518` |
| 23 | +- Euler partial `\partial_x f`: `definitions-other.ts:127-157` |
| 24 | +- Newton `\dot{x}`: `definitions-core.ts:1428-1476` |
| 25 | +- `D` serializer: `definitions-core.ts:1381-1425` |
| 26 | + |
| 27 | +**Action**: Remove the stale `@todo` comments. |
| 28 | + |
| 29 | +### 2. Set Operations (medium) |
| 30 | + |
| 31 | +#### 2a. Set Builder Parsing |
| 32 | + |
| 33 | +**File**: `definitions-sets.ts:364` |
| 34 | + |
| 35 | +The `{...}` matchfix parser only handles enumerated sets (`\{1,2,3\}`). Add |
| 36 | +detection of `\mid`, `|`, or `\colon` as a separator to produce set-builder |
| 37 | +notation. |
| 38 | + |
| 39 | +**Parsing**: `\{x \in \R \mid x > 0\}` → |
| 40 | + |
| 41 | +```json |
| 42 | +["Set", ["Element", "x", "RealNumbers"], ["Condition", ["Greater", "x", 0]]] |
| 43 | +``` |
| 44 | + |
| 45 | +The serializer for this shape already exists at `definitions-sets.ts:571-581`. |
| 46 | + |
| 47 | +**Implementation**: Inside the matchfix parse handler, after parsing the body, |
| 48 | +check if the body contains a `\mid`/`|`/`\colon` separator. If so, split into |
| 49 | +expression + condition and wrap in `["Set", expr, ["Condition", cond]]`. |
| 50 | + |
| 51 | +The challenge is that a bare `|` is ambiguous, since it can also delimit an |
| 52 | +absolute value. Treat `\mid` and `\colon` as unambiguous triggers, and treat a bare |
| 53 | +`|` as a separator only inside the `\{...\}` matchfix context (already the case here). |
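The split step could be sketched as below. This is a minimal illustration over a flat token array, not the engine's parser API: `splitSetBuilder`, `SetBuilderParts`, and `SEPARATORS` are hypothetical names, and the sketch splits at the first top-level separator, so a body such as `|x| \mid ...` would still need extra disambiguation.

```typescript
// Hypothetical helper: illustrates the split only, not the real matchfix handler.
type SetBuilderParts = { expr: string[]; cond: string[] };

// Tokens that may separate the element expression from the condition.
const SEPARATORS = new Set<string>(["\\mid", "|", "\\colon"]);

function splitSetBuilder(body: string[]): SetBuilderParts | null {
  let depth = 0; // nesting depth of `{...}` groups inside the set body
  let i = 0;
  for (const tok of body) {
    if (tok === "{") depth++;
    else if (tok === "}") depth--;
    else if (depth === 0 && SEPARATORS.has(tok)) {
      // Split at the first top-level separator.
      return { expr: body.slice(0, i), cond: body.slice(i + 1) };
    }
    i++;
  }
  return null; // no separator found: treat the body as an enumerated set
}
```

With the tokens of `x \in \R \mid x > 0`, the helper returns `x \in \R` as the expression part and `x > 0` as the condition part; an enumerated body such as `1,2,3` returns `null`.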
| 54 | + |
| 55 | +#### 2b. `Multiple` — Defer |
| 56 | + |
| 57 | +`Multiple` has no library definition (no entry in `sets.ts`). The latex-syntax |
| 58 | +entry has a name and an empty serialize stub. Since it's not a real operator in |
| 59 | +the engine, **defer** this until `Multiple` is defined in the library. Remove the |
| 60 | +empty serialize stub to avoid confusion. |
| 61 | + |
| 62 | +#### 2c. Multi-arg `CartesianProduct` / `Complement` Serialization |
| 63 | + |
| 64 | +**File**: `definitions-sets.ts:221,228` |
| 65 | + |
| 66 | +Currently these only handle the 2-arg infix case. Extend: |
| 67 | + |
| 68 | +- `CartesianProduct(A, B, C)` → `A \times B \times C` |
| 69 | +- `Complement(A, B)` — already works as postfix `A^\complement`; the multi-arg |
| 70 | + comment may be stale. Verify and update/remove. |
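For the n-ary `CartesianProduct` case, one possible shape is to join the already-serialized operands with `\times`. The function name below is hypothetical, and the sketch assumes each operand has already been serialized (and parenthesized if its precedence requires it).

```typescript
// Hypothetical helper: joins pre-serialized operands with `\times`.
function serializeCartesianProduct(operands: string[]): string {
  return operands.join(" \\times ");
}
```

Called with `["A", "B", "C"]`, this yields `A \times B \times C`, and the two-operand case is unchanged.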
| 71 | + |
| 72 | +### 3. BigOp Step Ranges — Update Comment (trivial) |
| 73 | + |
| 74 | +**File**: `definitions-arithmetic.ts:1712` |
| 75 | + |
| 76 | +The `Element` form (`i \in S`) is already handled at lines 1720-1725. The |
| 77 | +step-range gap (`i=1..3..10`) is intentionally deferred — uncommon LaTeX |
| 78 | +notation. **Action**: Update the comment to reflect current state. |
| 79 | + |
| 80 | +### 4. Spacing Commands (small) |
| 81 | + |
| 82 | +#### 4a. Parse `\hspace`, `\hskip`, `\kern` |
| 83 | + |
| 84 | +**File**: `parse.ts:689` |
| 85 | + |
| 86 | +These take dimension arguments. Parse into |
| 87 | +`["HorizontalSpacing", "'<dimension>'"]`, keeping the dimension as a string so |
| 88 | +that the unit is preserved. |
| 89 | + |
| 90 | +- `\hspace{1em}`, `\hspace*{1em}`: group argument |
| 91 | +- `\hskip 5pt`, `\kern-3mu`: inline dimension (parse sign, number, and unit; for |
| 92 | + `\hskip`, ignore any `plus`/`minus` stretch and shrink components) |
| 93 | + |
| 94 | +Register as expression triggers in `definitions-other.ts`. The parse handler |
| 95 | +reads the dimension and returns `["HorizontalSpacing", "'<value><unit>'"]`. |
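A rough sketch of the dimension handling, written against a raw LaTeX string rather than the engine's token stream. `parseHorizontalSpacing` and the regular expressions are assumptions for illustration; the unit is loosely matched as any two lowercase letters instead of being validated against TeX's unit list.

```typescript
type HorizontalSpacing = ["HorizontalSpacing", string];

// Hypothetical helper: string-based stand-in for the real parse handler.
function parseHorizontalSpacing(latex: string): HorizontalSpacing | null {
  // `\hspace{1em}` and `\hspace*{1em}`: the dimension is a braced group.
  const group = latex.match(
    /^\\hspace\*?\s*\{\s*([+-]?\d*\.?\d+)\s*([a-z]{2})\s*\}/
  );
  // `\hskip 5pt`, `\kern-3mu`: the dimension follows the command directly.
  // Anything after the unit (e.g. `plus 1pt minus 2pt`) is ignored.
  const inline = latex.match(/^\\(?:hskip|kern)\s*([+-]?\d*\.?\d+)\s*([a-z]{2})/);
  const m = group ?? inline;
  if (m === null) return null;
  // Wrap in single quotes to mark the value as a MathJSON string.
  return ["HorizontalSpacing", `'${m[1]}${m[2]}'`];
}
```

Under these assumptions, `\hspace*{1em}` gives `["HorizontalSpacing", "'1em'"]` and `\kern-3mu` gives `["HorizontalSpacing", "'-3mu'"]`; a command that is not one of the three returns `null`.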
| 96 | + |
| 97 | +#### 4b. Serialize `HorizontalSpacing` with Math Spacing Classes |
| 98 | + |
| 99 | +**File**: `definitions-other.ts:544` |
| 100 | + |
| 101 | +The 2-arg form `["HorizontalSpacing", expr, "'bin'"]` should serialize as: |
| 102 | + |
| 103 | +- `"bin"` → `\mathbin{expr}` |
| 104 | +- `"op"` → `\mathop{expr}` |
| 105 | +- `"rel"` → `\mathrel{expr}` |
| 106 | +- `"ord"` → `\mathord{expr}` |
| 107 | +- `"open"` → `\mathopen{expr}` |
| 108 | +- `"close"` → `\mathclose{expr}` |
| 109 | +- `"punct"` → `\mathpunct{expr}` |
| 110 | + |
| 111 | +Currently the second argument is silently dropped. |
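The mapping above could be expressed as a lookup table. The names `SPACING_CLASS_COMMANDS` and `serializeSpacingClass` are hypothetical, and falling back to the bare body for an unknown class is an assumption, since the design does not specify that case.

```typescript
// Spacing class -> LaTeX command, following the list in this section.
const SPACING_CLASS_COMMANDS = new Map<string, string>([
  ["bin", "\\mathbin"],
  ["op", "\\mathop"],
  ["rel", "\\mathrel"],
  ["ord", "\\mathord"],
  ["open", "\\mathopen"],
  ["close", "\\mathclose"],
  ["punct", "\\mathpunct"],
]);

// Hypothetical helper: `body` is the already-serialized first argument.
function serializeSpacingClass(body: string, spacingClass: string): string {
  // MathJSON strings may carry surrounding single quotes, e.g. "'bin'".
  const key = spacingClass.replace(/^'|'$/g, "");
  const command = SPACING_CLASS_COMMANDS.get(key);
  // Assumption: an unknown class emits the body unchanged.
  return command === undefined ? body : `${command}{${body}}`;
}
```

For example, a body of `+` with class `"'bin'"` serializes as `\mathbin{+}`.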
| 112 | + |
| 113 | +### 5. Serializer Quality (medium) |
| 114 | + |
| 115 | +#### 5a. Skip Redundant Parens on Matchfix Operators |
| 116 | + |
| 117 | +**File**: `serializer.ts:90` |
| 118 | + |
| 119 | +`wrap()` adds parentheses around low-precedence expressions. But matchfix |
| 120 | +operators (`Abs`, `Floor`, `Ceil`, `Norm`, `Delimiter`) already have visible delimiters. |
| 121 | +Adding parens produces `\left(|x|\right)`. |
| 122 | + |
| 123 | +**Fix**: In `wrap()`, check if the expression is a matchfix operator with visible |
| 124 | +delimiters. If so, skip the wrapping. Identify matchfix by operator name: |
| 125 | +`Abs`, `Floor`, `Ceil`, `Norm`, and any `Delimiter` expression. |
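A simplified sketch of the check. The real `wrap()` signature is not shown in this document, so the version below is an assumption: `needsParens` stands in for whatever precedence comparison the serializer already performs, and `Expression` is a reduced MathJSON-like type.

```typescript
// Reduced MathJSON-like expression type, for illustration only.
type Expression = string | number | [string, ...Expression[]];

// Operators whose serialization already carries visible delimiters.
const SELF_DELIMITED = new Set<string>(["Abs", "Floor", "Ceil", "Norm", "Delimiter"]);

function operatorOf(expr: Expression): string | null {
  return Array.isArray(expr) ? expr[0] : null;
}

// Hypothetical stand-in for `wrap()`: `latex` is the serialized form of `expr`.
function wrap(latex: string, expr: Expression, needsParens: boolean): string {
  const op = operatorOf(expr);
  // Skip wrapping when precedence does not require it, or when the
  // expression is already enclosed by its own delimiters.
  if (!needsParens || (op !== null && SELF_DELIMITED.has(op))) return latex;
  return `\\left(${latex}\\right)`;
}
```

With this check, `["Abs", "x"]` serialized as `|x|` stays `|x|` even in a low-precedence position, while `["Add", "a", "b"]` is still wrapped.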
| 126 | + |
| 127 | +#### 5b. `serializeTabular()` for Environments |
| 128 | + |
| 129 | +**File**: `definitions.ts:519` |
| 130 | + |
| 131 | +Environment entries use a generic serializer. When the body is a Matrix (List of |
| 132 | +Lists), serialize as tabular: `&` between columns, `\\` between rows. |
| 133 | + |
| 134 | +**Implementation**: Add a `serializeTabular()` helper that takes a matrix |
| 135 | +expression and produces `row1col1 & row1col2 \\ row2col1 & row2col2`. Wire it |
| 136 | +into the environment default serializer when the body matches a matrix shape. |
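One way the helper might look, assuming a reduced MathJSON-like `Expression` type and a caller-supplied `serializeCell` callback (both hypothetical). It returns `null` when the body is not a List of Lists, so the caller can fall back to the generic serializer.

```typescript
// Reduced MathJSON-like expression type, for illustration only.
type Expression = string | number | [string, ...Expression[]];

function isList(expr: Expression): expr is [string, ...Expression[]] {
  return Array.isArray(expr) && expr[0] === "List";
}

// Hypothetical helper: returns null when `matrix` is not a List of Lists.
function serializeTabular(
  matrix: Expression,
  serializeCell: (cell: Expression) => string
): string | null {
  if (!isList(matrix)) return null;
  const rows = matrix.slice(1);
  if (rows.length === 0) return null;
  const lines: string[] = [];
  for (const row of rows) {
    if (!isList(row)) return null; // every row must itself be a List
    // `&` between columns
    lines.push(row.slice(1).map((cell) => serializeCell(cell)).join(" & "));
  }
  // `\\` between rows
  return lines.join(" \\\\ ");
}
```

A 2×2 matrix of symbols `a, b, c, d` serializes as `a & b \\ c & d`; a flat `["List", "a", "b"]` returns `null`.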
| 137 | + |
| 138 | +#### 5c. `groupStyle` for `\left..\right` in Matchfix |
| 139 | + |
| 140 | +**File**: `definitions.ts:531` |
| 141 | + |
| 142 | +Matchfix serialization currently emits raw delimiter strings. It should call |
| 143 | +`serializer.groupStyle(expr)` to choose between: |
| 144 | + |
| 145 | +- `"none"` → bare delimiters `(`, `)` |
| 146 | +- `"auto"` → `\left(`, `\right)` |
| 147 | +- `"big"` → `\bigl(`, `\bigr)` |
| 148 | +- etc. |
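The three styles listed above could be applied as follows. This sketch covers only `"none"`, `"auto"`, and `"big"`; the `GroupStyle` type and `applyGroupStyle` name are hypothetical, and any further styles the serializer supports are not modeled.

```typescript
// Only the styles named in this section; the real set may be larger.
type GroupStyle = "none" | "auto" | "big";

// Hypothetical helper: wraps a serialized body in styled delimiters.
function applyGroupStyle(
  open: string,
  close: string,
  body: string,
  style: GroupStyle
): string {
  switch (style) {
    case "auto":
      return `\\left${open}${body}\\right${close}`;
    case "big":
      return `\\bigl${open}${body}\\bigr${close}`;
    default:
      // "none": bare delimiters
      return `${open}${body}${close}`;
  }
}
```

For a body `x` with parentheses, `"auto"` gives `\left(x\right)`, `"big"` gives `\bigl(x\bigr)`, and `"none"` gives `(x)`.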
| 149 | + |
| 150 | +### 6. String Group Symbols (small) |
| 151 | + |
| 152 | +**File**: `parse.ts:1143` |
| 153 | + |
| 154 | +In `parseStringGroup()`, when encountering a `\`-prefixed token, check if it |
| 155 | +maps to a known Unicode symbol (Greek letters, common math symbols). Substitute |
| 156 | +the Unicode character instead of passing through the raw LaTeX command. |
| 157 | + |
| 158 | +Example: `\operatorname{\alpha-test}` → the string `"α-test"` instead of |
| 159 | +`"\\alpha-test"`. |
| 160 | + |
| 161 | +**Implementation**: Use the existing symbol dictionary to look up the mapping. |
| 162 | +Only substitute for symbols that have a single Unicode character representation |
| 163 | +(Greek letters, `\infty`, etc.). Leave unknown commands as-is. |
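A minimal sketch of the substitution. The small `UNICODE_SYMBOLS` table stands in for the existing symbol dictionary, and `substituteSymbols` is a hypothetical name; the real lookup would go through the engine's own dictionary.

```typescript
// Stand-in for the existing symbol dictionary (a few entries only).
const UNICODE_SYMBOLS = new Map<string, string>([
  ["\\alpha", "α"],
  ["\\beta", "β"],
  ["\\infty", "∞"],
]);

// Hypothetical helper: `tokens` are the tokens of a string group.
function substituteSymbols(tokens: string[]): string {
  return tokens
    // Known commands become their Unicode character; anything else,
    // including unknown commands, passes through unchanged.
    .map((tok) => UNICODE_SYMBOLS.get(tok) ?? tok)
    .join("");
}
```

The tokens of `\alpha-test` produce `α-test`, while an unknown command such as `\foo` is left as-is.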
| 164 | + |
| 165 | +## Out of Scope |
| 166 | + |
| 167 | +- `Multiple` operator (no library definition) |
| 168 | +- Step ranges in BigOp indexing (uncommon notation) |
| 169 | +- Percent notation (`types.ts:618,693`) |
| 170 | +- Domain checks for `Abs`/`Norm` (`definitions-arithmetic.ts:877,1470,1479`) |
| 171 | +- Precedence corrections vs MathML (`definitions-other.ts:54,60,110`) |
| 172 | + |
| 173 | +## Testing Strategy |
| 174 | + |
| 175 | +Each work unit gets its own test block in the appropriate test file: |
| 176 | + |
| 177 | +- Set builder: `test/compute-engine/latex-syntax/sets.test.ts` |
| 178 | +- Spacing: `test/compute-engine/latex-syntax/style.test.ts` |
| 179 | +- Serializer quality: new tests alongside existing serialize tests |
| 180 | +- String groups: `test/compute-engine/latex-syntax/stefnotch.test.ts` or a new |
| 181 | + `string-groups.test.ts` |
| 182 | + |
| 183 | +Add round-trip tests (parse → serialize → parse) for every new parse/serialize pair. |
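The round-trip property can be stated as a small generic helper. `roundTrips` is a hypothetical name, and the `parse` and `serialize` parameters are placeholders for whatever entry points the tests use; comparison is by JSON text, which assumes the parsed form is plain JSON data.

```typescript
// Hypothetical helper: checks parse -> serialize -> parse stability.
function roundTrips<T>(
  parse: (latex: string) => T,
  serialize: (expr: T) => string,
  latex: string
): boolean {
  const first = parse(latex);
  const second = parse(serialize(first));
  // Assumes parsed expressions are plain JSON values.
  return JSON.stringify(first) === JSON.stringify(second);
}
```

Comparing the two parsed forms, rather than the two LaTeX strings, lets the serializer normalize spacing or delimiters without failing the test.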