Commit c03a9b5

chore: fix repo URLs, update README/CI/docs, add CI/CD publishing guide

- Fix repository URLs: opendataloader-project → raphaelmansuy across Cargo.toml (workspace), pdf-cos/Cargo.toml, all npm package.json files, pyproject.toml, and all mission/docs markdown files
- Fix edgeparse-cli docs.rs URL: /edgeparse → /edgeparse-cli
- Fix Docker workflow: ghcr.io/opendataloader-project → ghcr.io/raphaelmansuy
- Update release-rust.yml: add pdf-cos publish step (dependency order fix)
- Update release-node.yml: add 'environment: npm', fix step names
- Update README.md: add crates.io/PyPI/npm badges, fix Node.js package name throughout, add Rust library install section with docs.rs links
- Update docs/06-sdk-integration.md: @edgeparse/pdf → edgeparse throughout
- Add docs/07-cicd-publishing.md: registry setup, secrets, workflows, troubleshooting guide for all publish pipelines
- gitignore: add .playwright-mcp/
1 parent e091ed3 commit c03a9b5

25 files changed

Lines changed: 571 additions & 35 deletions

.github/workflows/release-docker.yml

Lines changed: 2 additions & 2 deletions
@@ -40,7 +40,7 @@ jobs:
  uses: docker/metadata-action@v5
  with:
  images: |
- ghcr.io/opendataloader-project/edgeparse
+ ghcr.io/raphaelmansuy/edgeparse
  docker.io/raphaelmansuy/edgeparse
  tags: |
  type=semver,pattern={{version}}
@@ -64,7 +64,7 @@ jobs:
  - name: Security scan (Trivy)
  uses: aquasecurity/trivy-action@master
  with:
- image-ref: 'ghcr.io/opendataloader-project/edgeparse:${{ github.ref_name }}'
+ image-ref: 'ghcr.io/raphaelmansuy/edgeparse:${{ github.ref_name }}'
  format: 'sarif'
  output: 'trivy-results.sarif'
  severity: 'HIGH,CRITICAL'

.github/workflows/release-node.yml

Lines changed: 3 additions & 1 deletion
@@ -75,6 +75,7 @@ jobs:
  name: Publish to npm
  needs: build-native
  runs-on: ubuntu-latest
+ environment: npm
  steps:
  - uses: actions/checkout@v4
  - uses: actions/setup-node@v4
@@ -100,6 +101,7 @@ jobs:
  const p = \`\${dir}/package.json\`;
  if (fs.existsSync(p)) {
  const pkg = JSON.parse(fs.readFileSync(p, 'utf8'));
+ // Update optionalDependencies versions
  Object.keys(pkg.optionalDependencies || {}).forEach(k => {
  pkg.optionalDependencies[k] = version;
  });
@@ -123,7 +125,7 @@ jobs:
  (cd "$dir" && npm publish --access public) || echo "::warning::Failed to publish $dir"
  done

- - name: Publish @edgeparse/pdf
+ - name: Publish edgeparse (main package)
  env:
  NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
  run: |
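
Note on the version-sync hunk above: the inline script sets every entry in `optionalDependencies` to the release version before publishing. A minimal standalone sketch of that step is shown here, assuming the package manifest is already parsed into an object. The function name `syncOptionalDependencies` and the sample package names are hypothetical and do not come from the workflow.

```javascript
// Hypothetical helper (not part of the workflow): return a copy of a parsed
// package.json object with every optional dependency pinned to `version`.
function syncOptionalDependencies(pkg, version) {
  if (!pkg.optionalDependencies) {
    // Nothing to pin; return a shallow copy without adding the key.
    return { ...pkg };
  }
  const optionalDependencies = {};
  for (const name of Object.keys(pkg.optionalDependencies)) {
    optionalDependencies[name] = version;
  }
  return { ...pkg, optionalDependencies };
}

// Illustrative manifest; the platform package names are assumptions.
const pkg = {
  name: 'edgeparse',
  optionalDependencies: {
    'edgeparse-linux-x64': '0.0.0',
    'edgeparse-darwin-arm64': '0.0.0',
  },
};

const synced = syncOptionalDependencies(pkg, '0.1.0');
console.log(JSON.stringify(synced.optionalDependencies, null, 2));
```

Returning a new object instead of mutating the input keeps the step easy to test in isolation; the workflow's inline version mutates `pkg` in place, which is equally valid for a one-shot script.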

.github/workflows/release-rust.yml

Lines changed: 8 additions & 0 deletions
@@ -30,6 +30,14 @@ jobs:
  uses: taiki-e/install-action@git-cliff
  - run: git-cliff --tag "$GITHUB_REF_NAME" --latest -o RELEASE_NOTES.md

+ - name: Publish pdf-cos (lopdf fork)
+ env:
+ CARGO_REGISTRY_TOKEN: ${{ secrets.CARGO_REGISTRY_TOKEN }}
+ run: cargo publish -p pdf-cos
+
+ - name: Wait for crates.io index (pdf-cos)
+ run: sleep 30
+
  - name: Publish edgeparse-core
  env:
  CARGO_REGISTRY_TOKEN: ${{ secrets.CARGO_REGISTRY_TOKEN }}
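
Note on the wait step above: the workflow uses a fixed `sleep 30` between publishing `pdf-cos` and `edgeparse-core`. One possible alternative, sketched here and not used by this commit, is to poll the crates.io API until the new version is visible. The helper names, retry counts, and User-Agent string are assumptions, and API visibility is only a proxy for the registry index being updated, so a short extra delay may still be needed.

```javascript
// Build the crates.io API URL for one specific crate version.
function crateVersionUrl(name, version) {
  return `https://crates.io/api/v1/crates/${encodeURIComponent(name)}/${encodeURIComponent(version)}`;
}

// Hypothetical helper: resolve to true once the version is visible,
// or false after `attempts` unsuccessful checks.
// `fetchFn` is injectable so the logic can be exercised without network access.
async function waitForCrate(name, version, { fetchFn = fetch, attempts = 10, delayMs = 5000 } = {}) {
  for (let i = 0; i < attempts; i++) {
    const res = await fetchFn(crateVersionUrl(name, version), {
      // crates.io asks API clients to identify themselves; this value is a placeholder.
      headers: { 'User-Agent': 'release-script (placeholder contact)' },
    });
    if (res.ok) return true;
    // Sleep between attempts, but not after the last one.
    if (i < attempts - 1) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false;
}

// Example usage (performs real network requests, so it is left commented out):
// waitForCrate('pdf-cos', '0.1.0').then((ok) => process.exit(ok ? 0 : 1));
```

Polling trades a little complexity for a wait that ends as soon as the crate is available and fails loudly if it never appears, whereas a fixed sleep is simpler but can be either too short or needlessly long.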

.gitignore

Lines changed: 1 addition & 1 deletion
@@ -43,4 +43,4 @@ examples/output/
  *.log

  mission/
- logs/
+ logs/.playwright-mcp/

Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ version = "0.1.0"
  edition = "2021"
  rust-version = "1.85"
  license = "Apache-2.0"
- repository = "https://github.com/opendataloader-project/edgeparse"
+ repository = "https://github.com/raphaelmansuy/edgeparse"
  description = "EdgeParse — High-performance PDF-to-structured-data extraction engine"

  [workspace.dependencies]

README.md

Lines changed: 29 additions & 8 deletions
@@ -4,10 +4,13 @@

  [![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](LICENSE)
  [![Rust](https://img.shields.io/badge/Rust-1.85%2B-orange.svg)](https://www.rust-lang.org/)
+ [![crates.io](https://img.shields.io/crates/v/edgeparse-cli.svg)](https://crates.io/crates/edgeparse-cli)
+ [![PyPI](https://img.shields.io/pypi/v/edgeparse.svg)](https://pypi.org/project/edgeparse/)
+ [![npm](https://img.shields.io/npm/v/edgeparse.svg)](https://www.npmjs.com/package/edgeparse)

  EdgeParse converts any digital PDF into Markdown, JSON (with bounding boxes), HTML, or plain text — deterministically, without a JVM, without a GPU, without OCR models, and with **best-in-class accuracy** among non-OCR tools on the 200-document benchmark suite included in this repository.

- Available as a **Rust library**, **CLI binary**, **Python package** (`edgeparse`), and **Node.js package** (`@edgeparse/pdf`).
+ Available as a **Rust library**, **CLI binary**, **Python package** (`edgeparse`), and **Node.js package** (`edgeparse`).

  ---

@@ -110,7 +113,7 @@ md = edgeparse.convert(
  ### Node.js

  ```js
- import { convert } from '@edgeparse/pdf';
+ import { convert } from 'edgeparse';

  // Convert to Markdown (returns a string)
  const md = convert('report.pdf', { format: 'markdown' });
@@ -131,7 +134,24 @@ const result = convert('report.pdf', {

  ## Installation

- ### Rust CLI (from source)
+ ### CLI (from crates.io)
+
+ ```bash
+ cargo install edgeparse-cli
+ ```
+
+ ### Rust library
+
+ Add to `Cargo.toml`:
+
+ ```toml
+ [dependencies]
+ edgeparse-core = "0.1"
+ ```
+
+ Docs: [docs.rs/edgeparse-core](https://docs.rs/edgeparse-core) · [docs.rs/edgeparse-cli](https://docs.rs/edgeparse-cli)
+
+ ### CLI (from source)

  Requires [Rust 1.85+](https://rustup.rs/).

@@ -159,7 +179,7 @@ maturin develop --release
  ### Node.js

  ```bash
- npm install @edgeparse/pdf
+ npm install edgeparse
  ```

  Requires Node.js 18+. Pre-built native addons for macOS (arm64, x64), Linux (x64, arm64), and Windows (x64).
@@ -302,12 +322,12 @@ edgeparse *.pdf --format json --output-dir out/ --pages "1-3"

  ## Node.js SDK

- **Package:** `@edgeparse/pdf` · **Requires:** Node.js 18+ · **Source:** [`sdks/node/`](sdks/node/)
+ **Package:** `edgeparse` · **Requires:** Node.js 18+ · **Source:** [`sdks/node/`](sdks/node/)

  ### `convert()`

  ```ts
- import { convert } from '@edgeparse/pdf';
+ import { convert } from 'edgeparse';

  function convert(inputPath: string, options?: ConvertOptions): string
  ```
@@ -328,8 +348,8 @@ interface ConvertOptions {
  ### CLI (Node.js package)

  ```bash
- npx @edgeparse/pdf report.pdf -f markdown -o output.md
- npx @edgeparse/pdf report.pdf --format json --pages "1-5"
+ npx edgeparse report.pdf -f markdown -o output.md
+ npx edgeparse report.pdf --format json --pages "1-5"
  ```

  ---
@@ -501,6 +521,7 @@ Technical documentation lives in [`docs/`](docs/):
  | [docs/04-pdf-extraction.md](docs/04-pdf-extraction.md) | PDF loader, chunk parser, font/CMap decoding |
  | [docs/05-output-formats.md](docs/05-output-formats.md) | JSON schema, Markdown renderer, HTML/text/CSV output |
  | [docs/06-sdk-integration.md](docs/06-sdk-integration.md) | CLI flag reference, Python SDK API, Node.js SDK API, Batch API |
+ | [docs/07-cicd-publishing.md](docs/07-cicd-publishing.md) | CI/CD publishing pipeline — how it works and how to configure it |

  ---

crates/edgeparse-cli/Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@ description = "EdgeParse CLI — convert PDFs to Markdown, JSON, HTML"
  readme = "README.md"
  keywords = ["pdf", "cli", "markdown", "extraction"]
  categories = ["command-line-utilities", "text-processing"]
- documentation = "https://docs.rs/edgeparse"
+ documentation = "https://docs.rs/edgeparse-cli"

  [[bin]]
  name = "edgeparse"

crates/pdf-cos/Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -35,7 +35,7 @@ keywords = [
  ]
  categories = ["text-processing"]
  license = "MIT"
- repository = "https://github.com/edgeparse-project/edgeparse"
+ repository = "https://github.com/raphaelmansuy/edgeparse"
  exclude = [".cargo_vcs_info.json", ".cargo-ok", "Cargo.toml.orig"]

  [badges.travis-ci]

docs/06-sdk-integration.md

Lines changed: 10 additions & 10 deletions
@@ -366,7 +366,7 @@ IMAGE_OUTPUTS = ("off", "embedded", "external")

  ## 3. Node.js SDK

- **Package name:** `@edgeparse/pdf`
+ **Package name:** `edgeparse`
  **Requires:** Node.js ≥ 18
  **Source (TypeScript wrapper):** [`sdks/node/src/`](../sdks/node/src/)
  **Source (Rust addon):** [`crates/edgeparse-node/src/lib.rs`](../crates/edgeparse-node/src/lib.rs)
@@ -376,7 +376,7 @@ IMAGE_OUTPUTS = ("off", "embedded", "external")

  ```
  sdks/node/
- ├── package.json # @edgeparse/pdf, optionalDependencies per platform
+ ├── package.json # edgeparse, optionalDependencies per platform
  ├── tsconfig.json
  ├── src/
  │ ├── index.ts # convert(), version() — public API
@@ -392,18 +392,18 @@ sdks/node/
  └── convert.test.ts # vitest tests
  ```

- Platform packages (`@edgeparse/pdf-{platform}`) are loaded at runtime by
+ Platform packages (`edgeparse-{platform}`) are loaded at runtime by
  `loadNative()` in `index.ts` using `process.platform`/`process.arch` as the
  lookup key.

  ### Installation

  ```bash
- npm install @edgeparse/pdf
+ npm install edgeparse
  # or
- yarn add @edgeparse/pdf
+ yarn add edgeparse
  # or
- pnpm add @edgeparse/pdf
+ pnpm add edgeparse
  ```

  The correct platform native addon is automatically selected via
@@ -414,7 +414,7 @@ The correct platform native addon is automatically selected via
  Defined in [`sdks/node/src/index.ts`](../sdks/node/src/index.ts):

  ```ts
- import { convert } from '@edgeparse/pdf';
+ import { convert } from 'edgeparse';

  function convert(inputPath: string, options?: ConvertOptions): string
  ```
@@ -463,7 +463,7 @@ n.convert(inputPath, options ? {
  ### `version()`

  ```ts
- import { version } from '@edgeparse/pdf';
+ import { version } from 'edgeparse';

  function version(): string
  ```
@@ -473,7 +473,7 @@ Returns the edgeparse version string from the native addon.
  ### Example usage

  ```ts
- import { convert } from '@edgeparse/pdf';
+ import { convert } from 'edgeparse';

  // Markdown (default format)
  const md = convert('report.pdf');
@@ -499,7 +499,7 @@ const secure = convert('secure.pdf', { password: 'hunter2' });
  **Entry point:** `edgeparse` binary (registered in `package.json` `bin.edgeparse`)

  ```bash
- npx @edgeparse/pdf [options] <input.pdf>
+ npx edgeparse [options] <input.pdf>
  # or after install:
  edgeparse [options] <input.pdf>
  ```
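
Note on the platform-package hunk above: the docs say `loadNative()` picks a platform package (`edgeparse-{platform}`) using `process.platform`/`process.arch` as the lookup key. A minimal sketch of that kind of lookup follows, assuming a key of the form `platform-arch`. The table contents and the name `resolveNativePackage` are hypothetical; the real `loadNative()` in `sdks/node/src/index.ts` may differ.

```javascript
// Hypothetical lookup table: keys are `${process.platform}-${process.arch}`;
// values are illustrative package names following the `edgeparse-{platform}` pattern.
const NATIVE_PACKAGES = {
  'darwin-arm64': 'edgeparse-darwin-arm64',
  'darwin-x64': 'edgeparse-darwin-x64',
  'linux-x64': 'edgeparse-linux-x64',
  'linux-arm64': 'edgeparse-linux-arm64',
  'win32-x64': 'edgeparse-win32-x64',
};

// Map the running platform to its native package name, failing loudly
// when no prebuilt addon is listed for it.
function resolveNativePackage(platform = process.platform, arch = process.arch) {
  const key = `${platform}-${arch}`;
  const name = NATIVE_PACKAGES[key];
  if (!name) {
    throw new Error(`No prebuilt native addon for ${key}`);
  }
  return name;
}

console.log(resolveNativePackage('linux', 'x64')); // prints "edgeparse-linux-x64"
```

A loader built on this would then `require()` the resolved package; throwing a descriptive error for unlisted platforms gives users a clearer message than a generic module-not-found failure.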
