Commit ca9ba14

New package datocms-structured-text-to-markdown
1 parent 47e101f commit ca9ba14

11 files changed

Lines changed: 5025 additions & 0 deletions

CLAUDE.md

Lines changed: 173 additions & 0 deletions
@@ -0,0 +1,173 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Overview

This is a Lerna-managed monorepo for DatoCMS Structured Text (DAST) utilities. It provides TypeScript libraries for handling, converting, and rendering DatoCMS Structured Text documents across multiple formats.

## Commands

### Building

```bash
npm run build # Bootstrap all packages and build them
lerna bootstrap # Install dependencies for all packages
lerna run build # Build all packages
```

Individual packages can be built by navigating to `packages/<package-name>` and running:

```bash
npm run build # Compiles TypeScript to both CommonJS (dist/cjs) and ESM (dist/esm)
```

### Testing

```bash
npm test # Run linter and all tests
jest # Run tests only
jest <file-pattern> # Run specific test file(s)
```

Tests are located in `__tests__` directories within each package and run with Jest using the `ts-jest` preset.

### Linting & Formatting

```bash
npm run lint # ESLint check on all .ts/.tsx files
npm run prettier # Format all TypeScript and JSON files
```

A pre-commit hook automatically runs `pretty-quick --staged` to format staged files.

### Publishing

```bash
npm run publish # Build, test, and publish to npm
npm run publish-next # Publish with 'next' dist-tag
```

## Architecture

### Monorepo Structure

The repository contains 9 packages in `packages/`:

**Core:**

- `utils`: Foundation package with TypeScript types, tree manipulation, validation, rendering framework, and inspection utilities for DAST documents

**Converters (to DAST):**

- `html-to-structured-text`: Converts HTML/Hast syntax trees to DAST
- `contentful-to-structured-text`: Migrates Contentful Rich Text to DAST

**Renderers (from DAST):**

- `generic-html-renderer`: Base HTML rendering utilities (marks, spans, node rules)
- `to-html-string`: Server-side HTML string renderer
- `to-dom-nodes`: Browser-based DOM node renderer
- `to-plain-text`: Plain text extractor
- `to-markdown`: Markdown renderer

**Framework Utilities:**

- `slate-utils`: Slate.js integration helpers

### DAST (DatoCMS Abstract Syntax Tree)

The structured text format follows a tree structure defined in `packages/utils/src/types.ts`:

**Root structure:**

- Every DAST document starts with a `root` node containing block-level children
- The `schema` field is always `"dast"`

**Block nodes:** `paragraph`, `heading`, `list`, `listItem`, `blockquote`, `code`, `block`, `thematicBreak`

**Inline nodes:** `span`, `link`, `itemLink`, `inlineItem`, `inlineBlock`

**Key characteristics:**

- Some block nodes can have a custom `style` attribute (paragraph, heading)
- `span` nodes carry text marks (strong, emphasis, code, underline, strikethrough, highlight)
- `block` and `inlineBlock` reference external DatoCMS items
- `itemLink` references other DatoCMS records

Node types, allowed children, and allowed attributes are defined in `packages/utils/src/definitions.ts`.
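
As a rough illustration of the shape described above, here is a minimal sketch of a DAST document. The types are hand-written and deliberately simplified for this example (they are not the definitions from `types.ts`), and the names `DastDocument`, `sampleDoc`, and `plainText` are hypothetical.

```typescript
// Simplified, hand-written types for illustration only. The real definitions
// live in packages/utils/src/types.ts and cover more node types and attributes.
type Mark = 'strong' | 'emphasis' | 'code' | 'underline' | 'strikethrough' | 'highlight';

type Span = { type: 'span'; value: string; marks?: Mark[] };
type Heading = { type: 'heading'; level: 1 | 2 | 3 | 4 | 5 | 6; children: Span[] };
type Paragraph = { type: 'paragraph'; children: Span[] };
type Root = { type: 'root'; children: Array<Heading | Paragraph> };
type DastDocument = { schema: 'dast'; document: Root };

const sampleDoc: DastDocument = {
  schema: 'dast',
  document: {
    type: 'root',
    children: [
      { type: 'heading', level: 1, children: [{ type: 'span', value: 'Hello' }] },
      {
        type: 'paragraph',
        children: [
          { type: 'span', value: 'Structured ' },
          { type: 'span', value: 'Text', marks: ['strong'] },
        ],
      },
    ],
  },
};

// Concatenate the span values of each block, one block per line.
const plainText = sampleDoc.document.children
  .map((block) => block.children.map((span) => span.value).join(''))
  .join('\n');
```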

### Core Utilities Package

**Tree Manipulation** (`manipulation.ts`):

- `visit()`: Traverse tree with visitor pattern
- `find()`, `findAll()`: Query nodes with predicates
- `map()`, `filter()`: Transform/filter tree nodes
- `getNodePath()`, `hasPath()`: Path-based navigation
- Works with full documents (`{schema: 'dast', document: ...}`) or bare nodes
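
The idea behind non-mutating tree helpers can be sketched as follows. This is a self-contained approximation, not the code in `manipulation.ts`: `TreeNode`, `mapTree`, and `findAllNodes` are hypothetical names, and the real functions may take different arguments.

```typescript
// Simplified node shape for illustration; not the library's own type.
type TreeNode = { type: string; value?: string; children?: TreeNode[] };

// Hypothetical helper: returns a new tree with `fn` applied to every node
// (children first), leaving the input tree untouched.
function mapTree(node: TreeNode, fn: (n: TreeNode) => TreeNode): TreeNode {
  const children = node.children?.map((child) => mapTree(child, fn));
  return fn(children ? { ...node, children } : { ...node });
}

// Hypothetical helper: collects every node matching the predicate, depth-first.
function findAllNodes(node: TreeNode, predicate: (n: TreeNode) => boolean): TreeNode[] {
  const matches: TreeNode[] = predicate(node) ? [node] : [];
  for (const child of node.children ?? []) {
    matches.push(...findAllNodes(child, predicate));
  }
  return matches;
}

const sourceTree: TreeNode = {
  type: 'root',
  children: [{ type: 'paragraph', children: [{ type: 'span', value: 'hello' }] }],
};

// Upper-case every span value; `sourceTree` itself is not modified.
const shoutedTree = mapTree(sourceTree, (n) =>
  n.type === 'span' && n.value !== undefined ? { ...n, value: n.value.toUpperCase() } : n,
);
const shoutedSpans = findAllNodes(shoutedTree, (n) => n.type === 'span');
```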

**Inspector** (`inspector.ts`):

- `inspect()`: Pretty-print DAST trees with customizable formatting
- Supports custom block formatters for embedded items
- Configurable width, indentation

**Rendering Framework** (`render.ts`):

- `render()`: Generic rendering with adapter pattern
- Adapter requires `renderNode()`, `renderText()`, `renderFragment()`
- `renderRule()`: Helper for creating render rules with guard predicates
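
To make the adapter pattern concrete, here is a minimal sketch of a renderer driven by an adapter with the three methods named above. The method signatures, the `StringAdapter` interface, and the `renderWith` function are assumptions made for this example; the real signatures in `render.ts` may differ.

```typescript
type RenderableNode = { type: string; value?: string; children?: RenderableNode[] };

// Assumed adapter shape, modelled on the three method names listed above.
interface StringAdapter {
  renderNode(tag: string, children: string[]): string;
  renderText(text: string): string;
  renderFragment(children: string[]): string;
}

// Hypothetical tree walker: the traversal is fixed, the output format is
// decided entirely by the adapter.
function renderWith(node: RenderableNode, adapter: StringAdapter): string {
  if (node.type === 'span') return adapter.renderText(node.value ?? '');
  const children = (node.children ?? []).map((child) => renderWith(child, adapter));
  if (node.type === 'root') return adapter.renderFragment(children);
  return adapter.renderNode(node.type === 'paragraph' ? 'p' : node.type, children);
}

const htmlStringAdapter: StringAdapter = {
  renderNode: (tag, children) => `<${tag}>${children.join('')}</${tag}>`,
  renderText: (text) => text.replace(/&/g, '&amp;').replace(/</g, '&lt;'),
  renderFragment: (children) => children.join(''),
};

const renderedHtml = renderWith(
  {
    type: 'root',
    children: [{ type: 'paragraph', children: [{ type: 'span', value: 'a < b' }] }],
  },
  htmlStringAdapter,
);
```

Swapping in a different adapter (for example one that builds DOM nodes) would change the output type without touching the traversal, which is the point of the pattern.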

**Validation** (`validate.ts`):

- `validate()`: Ensures DAST conforms to specification
- `isValidDocument()`, `isValidNode()`: Type guards
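
One way to picture this kind of check is a table of allowed children consulted while walking the tree. The sketch below is an assumption-laden approximation: the table is partial and written from memory rather than copied from `definitions.ts`, and `checkChildren`, `CheckResult`, and `allowedChildrenSketch` are hypothetical names.

```typescript
type CheckedNode = { type: string; children?: CheckedNode[] };

// Partial, assumed table for illustration; the authoritative rules are in
// packages/utils/src/definitions.ts.
const allowedChildrenSketch: Record<string, string[]> = {
  root: ['paragraph', 'heading', 'list', 'code', 'blockquote', 'block', 'thematicBreak'],
  paragraph: ['span', 'link', 'itemLink', 'inlineItem', 'inlineBlock'],
  heading: ['span', 'link', 'itemLink', 'inlineItem', 'inlineBlock'],
  list: ['listItem'],
};

type CheckResult = { valid: true } | { valid: false; message: string };

// Hypothetical validator: fails on the first child whose type is not listed
// for its parent. Node types missing from the table are not checked.
function checkChildren(node: CheckedNode): CheckResult {
  const allowed = allowedChildrenSketch[node.type];
  for (const child of node.children ?? []) {
    if (allowed && allowed.indexOf(child.type) === -1) {
      return { valid: false, message: `"${child.type}" is not allowed inside "${node.type}"` };
    }
    const nested = checkChildren(child);
    if (!nested.valid) return nested;
  }
  return { valid: true };
}

const goodResult = checkChildren({
  type: 'root',
  children: [{ type: 'paragraph', children: [{ type: 'span' }] }],
});
const badResult = checkChildren({ type: 'root', children: [{ type: 'span' }] });
```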

**Type Guards** (`guards.ts`):

- Type predicates for all node types (e.g., `isHeading()`, `isParagraph()`)
- Enables type-safe narrowing in TypeScript
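
The narrowing these guards enable can be sketched with a hand-written predicate. The node types here are simplified stand-ins and `isHeadingNode` is a hypothetical name chosen to avoid implying it is the library's `isHeading()`; the real guards accept the library's full node union.

```typescript
type SpanNode = { type: 'span'; value: string };
type HeadingNode = { type: 'heading'; level: number; children: SpanNode[] };
type ParagraphNode = { type: 'paragraph'; children: SpanNode[] };
type BlockLevelNode = HeadingNode | ParagraphNode;

// A type predicate: when it returns true, TypeScript treats `node` as a
// HeadingNode in the calling scope.
function isHeadingNode(node: BlockLevelNode): node is HeadingNode {
  return node.type === 'heading';
}

function describeBlock(node: BlockLevelNode): string {
  if (isHeadingNode(node)) {
    // Narrowed to HeadingNode, so `level` is accessible without a cast.
    return `h${node.level}`;
  }
  // Narrowed to ParagraphNode by elimination.
  return `paragraph with ${node.children.length} span(s)`;
}
```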

### Package Build System

Each package uses a dual build:

- `tsconfig.json`: Compiles to CommonJS (`dist/cjs/`)
- `tsconfig.esnext.json`: Compiles to ES modules (`dist/esm/`)

Both produce TypeScript declarations in `dist/types/`.

The root `tsconfig.json` provides shared compiler options (strict mode, ES2015+ libs).

### Inter-package Dependencies

Most packages depend on `datocms-structured-text-utils` for core types and utilities. Lerna manages workspace linking during development. When publishing, packages reference specific versions of dependencies.

## Development Notes

- TypeScript strict mode is enabled, including `strictNullChecks`
- Target is ES5 for broad compatibility
- Both CommonJS and ESM outputs are produced
- All packages export typings for TypeScript consumers
- Generic HTML renderer provides base utilities used by specific renderers
- Tree manipulation functions are non-mutating: they return new trees rather than modifying their input

## Package-Specific Notes

**html-to-structured-text:**

- Uses rehype/hast for HTML parsing
- Supports both browser (DOMParser) and Node.js (parse5) environments
- Configurable handlers for custom HTML elements
- Can restrict allowed blocks, heading levels, and marks

**generic-html-renderer:**

- Provides `renderSpanValue()` for handling text spans with marks
- Mark-to-tag mapping (emphasis→em, underline→u, strikethrough→s, highlight→mark)
- Used as the foundation by `to-html-string` and `to-dom-nodes`
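
The mark-to-tag mapping above could be applied roughly like this. The sketch is not the package's `renderSpanValue()`; `wrapWithMarks` and `tagForMark` are hypothetical names, and mapping `strong` and `code` to same-named tags is an assumption, since the list above only states the other four.

```typescript
type TextMark = 'strong' | 'code' | 'emphasis' | 'underline' | 'strikethrough' | 'highlight';

// The last four entries come from the mapping listed above; `strong` and
// `code` mapping to same-named tags is an assumption.
const tagForMark: Record<TextMark, string> = {
  strong: 'strong',
  code: 'code',
  emphasis: 'em',
  underline: 'u',
  strikethrough: 's',
  highlight: 'mark',
};

// Hypothetical helper: wraps already-escaped text in one element per mark,
// with the first mark in the array ending up outermost.
function wrapWithMarks(text: string, marks: TextMark[]): string {
  return marks.reduceRight<string>((inner, mark) => {
    const tag = tagForMark[mark];
    return `<${tag}>${inner}</${tag}>`;
  }, text);
}

const markedUp = wrapWithMarks('hi', ['strong', 'emphasis']);
```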

**utils:**

- `update-links.js` script updates GitHub links in README.md to match current line numbers
- Tree manipulation supports custom type parameters for block/inline item types

README.md

Lines changed: 2 additions & 0 deletions
@@ -26,6 +26,8 @@ Monorepo with Typescript libraries for handling and rendering [DatoCMS Structure
 - Plain text renderer for the Structured Text document.
 - [`datocms-structured-text-to-html-string`](https://github.com/datocms/structured-text/tree/master/packages/to-html-string)
 - HTML renderer for the DatoCMS Structured Text field type.
+- [`datocms-structured-text-to-markdown`](https://github.com/datocms/structured-text/tree/master/packages/to-markdown)
+- Markdown renderer for the DatoCMS Structured Text field type.
 - [`<StructuredText />`](https://github.com/datocms/react-datocms#structured-text)
 - React component that you can use to render Structured Text documents.
 - [`<datocms-structured-text />`](https://github.com/datocms/vue-datocms#structured-text)

packages/to-markdown/LICENSE.md

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
MIT License

Copyright (c) [year] [fullname]

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
