Merge origin/main into codex/nutrient-skill-router
Resolve the PR conflict by adopting the Python plugin structure merged in #3 while preserving the broader document-processing guidance, modular references, Codex metadata, and expanded capability coverage from this branch. Add a lightweight validation workflow so packaging drift and unresolved markers are caught automatically in future PRs.
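The unresolved-marker check mentioned above could look something like the following minimal sketch. The validation workflow itself is not shown in this diff, so the function name `find_conflict_markers` and the choice to flag only `<<<<<<<` and `>>>>>>>` lines are assumptions for illustration, not the repository's actual implementation.

```python
def find_conflict_markers(text: str) -> list[int]:
    """Return 1-based line numbers that look like unresolved merge markers."""
    # Assumption: only the opening and closing markers are flagged, because a
    # bare "=======" line can also be a legitimate Markdown heading underline.
    prefixes = ("<<<<<<<", ">>>>>>>")
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if line.startswith(prefixes)
    ]


sample = "intro\n<<<<<<< HEAD\nours\n=======\ntheirs\n>>>>>>> origin/main\n"
flagged = find_conflict_markers(sample)
```

A CI step could run a check like this over every tracked text file and fail the build when the returned list is non-empty.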
- `scripts/*.py` are single-operation scripts only.
216
+
- Multi-step workflows are generated at runtime in a temporary script from `assets/templates/custom-workflow-template.py`.
217
+
- Do not commit runtime pipeline scripts.
218
+
- Use `references/` for HTML/URL generation, compliance outputs, and other workflows that are easier to express as direct API payloads or temporary pipelines.
219
+
231
220
## Documentation
232
221
233
222
- **[SKILL.md](nutrient-document-processing/SKILL.md)** — Agent instructions with setup and operation examples
234
-
- **[REFERENCE.md](nutrient-document-processing/references/REFERENCE.md)** — Complete API reference with all endpoints, parameters, and error codes
223
+
- **[Reference Index](nutrient-document-processing/references/REFERENCE.md)** — Modular cookbook for generation, conversion, extraction, security, compliance, and workflow sequencing
224
+
- **[Testing Guide](nutrient-document-processing/tests/testing-guide.md)** — Manual test procedures
225
+
- **[Custom Workflow Template](nutrient-document-processing/assets/templates/custom-workflow-template.py)** — Runtime pipeline starting point
226
+
- **[Codex App Metadata](nutrient-document-processing/agents/openai.yaml)** — Optional manifest for Codex App packaging
235
227
- **[API Playground](https://dashboard.nutrient.io/processor-api/playground/)** — Interactive API testing
236
228
- **[Official API Docs](https://www.nutrient.io/guides/dws-processor/)** — Nutrient documentation
237
229
@@ -241,4 +233,4 @@ Built by [Nutrient](https://www.nutrient.io/) (formerly PSPDFKit) — document S
description: Use when tasks involve generating PDFs from HTML or URLs, converting Office/images/PDFs, assembling or splitting PDFs, OCRing and extracting content, redacting, watermarking, signing, filling, or producing compliance outputs like PDF/A, PDF/UA, and linearized PDFs with Nutrient DWS. Triggers include convert to PDF, OCR this scan, extract tables, merge these PDFs, redact PII, sign this PDF, make this PDF/A, or linearize for web delivery. Prefer the Nutrient MCP server when it is already configured, otherwise call the API directly.
3
+
description: >-
4
+
Process documents with Nutrient DWS. Use when the user wants to generate PDFs from HTML or URLs,
5
+
convert Office/images/PDFs, assemble or split packets, OCR scans, extract text/tables/key-value
6
+
pairs, redact PII, watermark, sign, fill forms, optimize PDFs, or produce compliance outputs like
7
+
PDF/A or PDF/UA. Triggers include convert to PDF, merge these PDFs, OCR this scan, extract tables,
8
+
redact PII, sign this PDF, make this PDF/A, or linearize for web delivery.
compatibility: "Requires Python 3.10+, uv, and internet. Works with Claude Code, Codex CLI, Gemini CLI, OpenCode, Cursor, Windsurf, GitHub Copilot, Amp, or any Agent Skills-compatible product."
Use Nutrient DWS for managed document workflows where fidelity, compliance, or multi-step processing matters more than local-tool convenience.
11
22
12
-
## Setup assumptions
23
+
## Setup
24
+
- Get a Nutrient DWS API key at <https://dashboard.nutrient.io/sign_up/?product=processor>.
13
25
- Direct API calls use `Authorization: Bearer $NUTRIENT_API_KEY`.
26
+
```bash
27
+
export NUTRIENT_API_KEY="nutr_sk_..."
28
+
```
14
29
- MCP setups commonly use `@nutrient-sdk/dws-mcp-server` with `NUTRIENT_DWS_API_KEY`.
15
-
- Open `references/request-basics.md` first when authentication or payload shape is the blocker.
30
+
- Scripts live in `scripts/` relative to this SKILL.md. Use the directory containing this SKILL.md as the working directory:
31
+
```bash
32
+
cd <directory containing this SKILL.md> && uv run scripts/<script>.py --help
33
+
```
34
+
- Page ranges use `start:end` with 0-based indexes and end-exclusive semantics. Negative indexes count from the end.
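As a rough illustration of the `start:end` rule above, the sketch below resolves a range string into concrete page indexes. The helper name `resolve_page_range` is hypothetical. Treating an omitted bound as "from the first page" or "to the last page", and clamping out-of-range values, are assumptions borrowed from Python slicing rather than documented behavior of the scripts.

```python
def resolve_page_range(spec: str, page_count: int) -> list[int]:
    """Expand a 0-based, end-exclusive `start:end` range into page indexes."""
    start_text, separator, end_text = spec.partition(":")
    if not separator:
        raise ValueError(f"expected 'start:end', got {spec!r}")
    # Assumption: an empty bound means "open ended", as in Python slices.
    start = int(start_text) if start_text.strip() else None
    end = int(end_text) if end_text.strip() else None
    # slice.indices() resolves negative indexes against the page count and
    # clamps out-of-range values (clamping is an assumption of this sketch).
    begin, stop, _ = slice(start, end).indices(page_count)
    return list(range(begin, stop))


first_two = resolve_page_range("0:2", 5)      # pages 0 and 1
without_ends = resolve_page_range("1:-1", 5)  # drops the first and last page
last_two = resolve_page_range("-2:", 5)       # relies on the open-end assumption
```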
16
35
17
36
## When to use
18
37
- Generate PDFs from HTML templates, uploaded assets, or remote URLs.
@@ -23,38 +42,49 @@ Use Nutrient DWS for managed document workflows where fidelity, compliance, or m
23
42
- Check credits before large, batch, or AI-heavy runs.
24
43
25
44
## Tool preference
26
-
1. Prefer the Nutrient MCP server when it is already configured. It handles file I/O and reduces multipart-request boilerplate.
27
-
2. Fall back to direct API calls when MCP is unavailable or the workflow is easier to express as an explicit payload.
28
-
3. Use local PDF utilities only for lightweight inspection. Use Nutrient when output fidelity or compliance matters.
45
+
1. Prefer `scripts/*.py` for covered single-operation workflows.
46
+
2. Use `assets/templates/custom-workflow-template.py` for multi-step jobs that should still run through the Python client.
47
+
3. Use the modular `references/` docs and direct API payloads for capabilities that do not yet have a dedicated helper script, especially HTML/URL generation and compliance tuning.
48
+
4. Use local PDF utilities only for lightweight inspection. Use Nutrient when output fidelity or compliance matters.
29
49
30
-
## Request model
31
-
- Most workflows use `POST https://api.nutrient.io/build`.
32
-
- Use multipart requests when uploading local files. Use JSON requests when all inputs are remote URLs.
33
-
- `parts` describes source files, HTML inputs, remote URLs, page ranges, and passwords.
34
-
- `actions` applies ordered transformations such as OCR, redaction, watermarking, signing, flattening, or rotation.
35
-
- `output` selects the final format and delivery options such as `pdf`, `text`, `docx`, `png`, `pdfa`, `pdfua`, or optimized PDF output.
36
-
- Dedicated endpoints also exist for some tools such as PDF/UA auto-tagging, but `/build` is the default mental model.
50
+
## Single-operation scripts
51
+
- `convert.py` -> convert between `pdf`, `pdfa`, `pdfua`, `docx`, `xlsx`, `pptx`, `png`, `jpeg`, `webp`, `html`, and `markdown`
52
+
- `merge.py` -> merge multiple files into one PDF
53
+
- `split.py` -> split one PDF into multiple PDFs by page ranges
54
+
- `add-pages.py` -> append blank pages
55
+
- `delete-pages.py` -> remove specific pages
56
+
- `duplicate-pages.py` -> reorder or duplicate pages into a new PDF
When the user asks for multiple operations in one run:
72
+
1. Copy `assets/templates/custom-workflow-template.py` to a temporary location such as `/tmp/ndp-workflow-<task>.py`.
73
+
2. Implement the combined workflow in that temporary script.
74
+
3. Run it with `uv run /tmp/ndp-workflow-<task>.py ...`.
75
+
4. Return generated output files.
76
+
5. Delete the temporary script unless the user explicitly asks to keep it.
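The five steps above can be sketched as a small wrapper. This is only an illustration under stated assumptions: the function `run_temporary_workflow` and its `runner` and `keep` parameters are hypothetical, and the temporary path follows the `ndp-workflow-<task>.py` pattern from step 1 but uses the platform's temp directory rather than a hard-coded `/tmp`.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def run_temporary_workflow(
    template: Path,
    task: str,
    args: list[str],
    runner: tuple[str, ...] = ("uv", "run"),
    keep: bool = False,
) -> int:
    """Copy the template to a temp script, run it, then clean it up."""
    # Step 1: copy the template to a temporary location.
    script = Path(tempfile.gettempdir()) / f"ndp-workflow-{task}.py"
    shutil.copyfile(template, script)
    try:
        # Step 2 (implementing the combined workflow) would edit `script` here.
        # Step 3: run the temporary script; output files are left in place.
        completed = subprocess.run([*runner, str(script), *args], check=False)
        return completed.returncode
    finally:
        # Step 5: delete the script unless the user asked to keep it.
        if not keep:
            script.unlink(missing_ok=True)
```

Putting the cleanup in `finally` means the temporary script is removed even when the run fails, which keeps runtime pipelines from accumulating or being committed by accident.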
47
77
48
-
## Workflow
49
-
1. Identify the source type and the required final artifact.
50
-
2. Decide whether the job is generation, conversion, extraction, security/compliance, or a chained workflow.
51
-
3. Express the full pipeline in one payload when the ordering is clear and the artifact should stay in-memory on the server.
52
-
4. Save outputs with stable suffixes such as `-ocr`, `-redacted`, `-pdfa`, `-pdfua`, or `-linearized`.
78
+
## PDF Requirements
79
+
- `split.py` requires a multi-page PDF and cannot extract ranges from a single-page document.
80
+
- `delete-pages.py` must retain at least one page and cannot delete the entire document.
81
+
- `sign.py` only accepts local file paths for the main PDF.
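The first two requirements lend themselves to a pre-flight check before a script is invoked. The helpers below are hypothetical and not part of `scripts/`. They assume the caller already knows the page count and that pages to delete are given as 0-based indexes.

```python
def can_split(page_count: int) -> bool:
    """A split needs more than one page to produce multiple outputs."""
    return page_count > 1


def can_delete(page_count: int, pages_to_delete: set[int]) -> bool:
    """Deletion must leave at least one page in the document."""
    # Assumption: indexes are 0-based; out-of-range indexes are ignored here.
    valid = {page for page in pages_to_delete if 0 <= page < page_count}
    return page_count - len(valid) >= 1


single_page_split_ok = can_split(1)
delete_all_ok = can_delete(2, {0, 1})
delete_some_ok = can_delete(3, {0, 1})
```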
53
82
54
83
## Decision rules
84
+
- Prefer a helper script when one already covers the requested operation cleanly.
55
85
- If you control the source markup, prefer HTML generation over browser print workflows.
56
86
- Use remote `file.url` inputs when the source already lives at a stable URL and you want to avoid local uploads.
57
-
- Use `output.type` for conversion and finalization targets. Use `actions` for transformations.
87
+
- Use `output.type` for conversion and finalization targets. Use `actions` for transformations when building direct API payloads.
58
88
- OCR before text extraction, key-value extraction, or semantic redaction on scans.
59
89
- Prefer preset or regex redaction when the target is explicit. Use AI redaction only for contextual or natural-language requests.
60
90
- Use the PDF manipulation reference for merge, split, rotate, flatten, and page-range workflows instead of inferring those payloads from conversion examples.
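To make the `file.url`, `actions`, and `output.type` rules above concrete, here is a minimal sketch that assembles a JSON request for an all-remote input without sending it. The endpoint and the `Authorization: Bearer` header come from this diff. The helper name `build_request`, the `"type": "ocr"` and `"language"` keys inside the action, and the example URL are assumptions to verify against the API reference.

```python
import json
import os

BUILD_URL = "https://api.nutrient.io/build"  # endpoint quoted in this diff


def build_request(source_url: str, target: str = "pdfa") -> tuple[dict[str, str], str]:
    """Return headers and a JSON body for a remote-URL input."""
    api_key = os.environ.get("NUTRIENT_API_KEY", "")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    instructions = {
        # Remote input: no local upload, so a plain JSON request is enough.
        "parts": [{"file": {"url": source_url}}],
        # Assumption: the action shape below is illustrative only. OCR comes
        # before any extraction or finalization step on scanned input.
        "actions": [{"type": "ocr", "language": "english"}],
        # The conversion or finalization target goes in output.type.
        "output": {"type": target},
    }
    return headers, json.dumps(instructions)


headers, body = build_request("https://example.com/scan.pdf")
```

The returned pair can be handed to any HTTP client as a `POST` to `BUILD_URL`. Local files would need a multipart request instead.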
@@ -68,6 +98,7 @@ curl -X POST https://api.nutrient.io/build \
68
98
- Do not flatten forms or annotations until the user confirms the artifact no longer needs to stay editable.
69
99
- Do not sign, archive, or linearize intermediate working files. Keep those as final-delivery steps.
70
100
- Do not promise PDF/A or PDF/UA compliance without a validation step when the requirement is contractual.
101
+
- Do not commit temporary workflow scripts under `scripts/`.
71
102
72
103
## Reference map
73
104
Read only what you need:
@@ -80,9 +111,9 @@ Read only what you need:
80
111
- `references/compliance-and-optimization.md` -> PDF/A, PDF/UA, optimization, and linearization
81
112
- `references/workflow-recipes.md` -> end-to-end sequencing patterns for common business document workflows