Nutrient DWS TypeScript Client

A Node.js TypeScript client library for the Nutrient Document Web Services (DWS) API. This library provides a type-safe and ergonomic interface for document processing operations including conversion, merging, compression, watermarking, and text extraction.

Note: This package is published as @nutrient-sdk/dws-client-typescript on npm. It provides full TypeScript support and is designed specifically for Node.js environments.

Features

📄 Powerful document processing: Convert, OCR, edit, compress, watermark, redact, and digitally sign documents
🤖 LLM friendly: Built-in support for popular Coding Agents (Claude Code, GitHub Copilot, JetBrains Junie, Cursor, Windsurf) and documentation on Context7
🔄 100% mapping with DWS Processor API: Complete coverage of all Nutrient DWS Processor API capabilities
🛠️ Convenient functions with sane defaults: Simple interfaces for common operations with smart default settings
⛓️ Chainable operations: Build complex document workflows with intuitive method chaining
🔐 Flexible authentication and security: Support for API keys and async token providers with secure handling
✅ Highly tested: Comprehensive test suite ensuring reliability and stability
🔒 Type-safe: Full TypeScript support with comprehensive type definitions
📦 Multiple module formats: ESM and CommonJS builds

Installation

npm install @nutrient-sdk/dws-client-typescript

or

yarn add @nutrient-sdk/dws-client-typescript

Migration Guides

v2.0.0: See docs/MIGRATION.md for URL input changes and sign() restrictions.

Integration with Coding Agents

This package has built-in support for popular coding agents like Claude Code, GitHub Copilot, Cursor, and Windsurf. It exposes scripts that inject rules instructing the coding agent on how to use the package. This helps keep the coding agent from hallucinating documentation and lets it make full use of the features offered by the Nutrient DWS TypeScript Client.

# Adding code rule to Claude Code
npx dws-add-claude-code-rule

# Adding code rule to GitHub Copilot
npx dws-add-github-copilot-rule

# Adding code rule to Junie (JetBrains)
npx dws-add-junie-rule

# Adding code rule to Cursor
npx dws-add-cursor-rule

# Adding code rule to Windsurf
npx dws-add-windsurf-rule

The documentation for the Nutrient DWS TypeScript Client is also available on Context7.

Quick Start

import { NutrientClient } from '@nutrient-sdk/dws-client-typescript';

const client = new NutrientClient({
  apiKey: 'nutr_sk_your_secret_key'
});

Framework Quickstarts

Framework wiring examples are available in examples/src/:

framework_openai_agents.mjs
framework_langchain.mjs
framework_crewai_scope.md (CrewAI scope note for TypeScript users)

Syntax-check commands:

node --check examples/src/framework_openai_agents.mjs
node --check examples/src/framework_langchain.mjs

Working with URLs

Most methods accept URLs directly. The URL is passed to the server, which fetches the content—this avoids SSRF vulnerabilities since the client never fetches URLs itself.

// Pass the URL as a string
const fromString = await client.convert('https://example.com/document.pdf', 'docx');

// Or as an object (useful for TypeScript type narrowing)
const fromObject = await client.convert({ type: 'url', url: 'https://example.com/document.pdf' }, 'docx');

// URLs also work with the workflow builder
const fromWorkflow = await client.workflow()
  .addFilePart('https://example.com/document.pdf')
  .outputPdf()
  .execute();

Exception: The sign() method only accepts local files (file paths, Buffers, streams) because the underlying API endpoint doesn't support URL inputs. For signing remote files, fetch the content first:

// Fetch the remote file and pass its bytes for signing
const response = await fetch('https://example.com/document.pdf');
if (!response.ok) {
  throw new Error(`Failed to fetch document: ${response.status}`);
}
const buffer = Buffer.from(await response.arrayBuffer());
const result = await client.sign(buffer, { /* signature options */ });
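
Because sign() rejects URL inputs, code that accepts arbitrary input strings may want to detect remote URLs before calling it. The following is a minimal sketch, assuming that only http and https URLs count as remote. isRemoteUrl is a hypothetical helper and is not part of the package:

```typescript
// Hypothetical helper (not part of the package): returns true when the
// input string looks like an http(s) URL rather than a local file path.
function isRemoteUrl(input: string): boolean {
  return /^https?:\/\//i.test(input.trim());
}
```

When isRemoteUrl returns true, fetch the bytes first and pass the resulting Buffer to sign(); otherwise the string can be passed through as a local file path.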

Direct Methods

The client provides numerous methods for document processing:

// Convert a document
const pdfResult = await client.convert('document.docx', 'pdf');

// Extract text
const textResult = await client.extractText('document.pdf');

// Add a watermark
const watermarkedDoc = await client.watermarkText('document.pdf', 'CONFIDENTIAL');

// Merge multiple documents
const mergedPdf = await client.merge(['doc1.pdf', 'doc2.pdf', 'doc3.pdf']);

For a complete list of available methods with examples, see the Methods Documentation.

Data Extraction (`/extraction/parse`)

client.parse() exposes Nutrient's Data Extraction API. It's designed for content-extraction workflows where you need to feed document content into a downstream pipeline rather than render or transform the document itself:

RAG / search indexing / content migration — pull a clean Markdown representation of a document for chunking, embedding, and indexing in a vector store or search engine.
Form and invoice extraction — pull structured fields (key/value pairs, tables, semantic regions) out of business documents with bounding boxes and confidence scores attached to every element.
Layout-aware document understanding — get a typed, page-anchored element list (paragraphs with semantic roles, tables with cell spans, formulas in LaTeX, pictures, handwriting) suitable for building document-comprehension tooling, including agentic workflows.

The endpoint accepts PDFs, Office documents (Word, Excel, PowerPoint), and images. Unlike sign(), it is not restricted to PDFs.

Choosing an output format

| Format | Best for | Shape |
| --- | --- | --- |
| `markdown` | RAG, search indexing, content migration: anywhere structured text beats spatial data | `response.output.markdown`: a single Markdown string |
| `spatial` (default) | Form/invoice extraction, layout reconstruction, flows that need per-element confidence | `response.output.elements`: flat array of typed elements |

Setup — separate Extract API key

Data Extraction is a separate product from the DWS Processor with its own credit pool and its own API key. Pass both keys when constructing the client:

const client = new NutrientClient({
  apiKey: process.env.NUTRIENT_API_KEY!,          // Processor key
  extractApiKey: process.env.NUTRIENT_EXTRACT_API_KEY!, // Data Extraction key
});

extractApiKey is consulted only by parse(), parseToMarkdown(), and parseElements(). Every other method on the client (convert, sign, ocr, merge, …) keeps using apiKey. If you omit extractApiKey, the parse methods fall back to apiKey — that fallback only works on tenants whose single DWS key authorises both products.
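
The key-selection rule described above can be pictured with a small stand-alone function. This is only an illustration of the documented behavior, not the client's internal code; KeyConfig and keyForMethod are hypothetical names:

```typescript
// Hypothetical illustration of the documented key-selection rule.
// The real client resolves keys internally; these names are assumptions.
interface KeyConfig {
  apiKey: string;
  extractApiKey?: string;
}

const PARSE_METHODS: string[] = ['parse', 'parseToMarkdown', 'parseElements'];

function keyForMethod(method: string, config: KeyConfig): string {
  // Only the parse methods consult extractApiKey; they fall back to apiKey.
  if (PARSE_METHODS.indexOf(method) !== -1) {
    return config.extractApiKey ?? config.apiKey;
  }
  // Every other method keeps using the Processor key.
  return config.apiKey;
}
```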

Quick start

import { NutrientClient } from '@nutrient-sdk/dws-client-typescript';

const client = new NutrientClient({
  apiKey: process.env.NUTRIENT_API_KEY!,
  extractApiKey: process.env.NUTRIENT_EXTRACT_API_KEY!,
});

// Spatial elements (default) — paragraphs, tables, key-value regions, etc.
const result = await client.parse('contract.pdf', { mode: 'understand' });
if (result.output.elements !== undefined) {
  for (const el of result.output.elements) {
    if (el.type === 'table') console.log(`${el.rowCount}x${el.columnCount} table`);
  }
}

// Whole-document Markdown from a born-digital PDF.
const mdResult = await client.parse('report.pdf', { mode: 'text' });
if (mdResult.output.markdown !== undefined) {
  console.log(mdResult.output.markdown);
}

Modes — when to use which

| Mode | Credits / page | When to use |
| --- | --- | --- |
| `text` | 1 | Born-digital documents only. No OCR, no AI. Fastest and cheapest path to Markdown. |
| `structure` | 1.5 | OCR-based segmentation with bounding boxes. Handles scanned documents, images, and any input that requires OCR. |
| `understand` | 9 | Full pipeline with AI augmentation on top of OCR. Most accurate for tables, multi-column layouts, formulas, and forms. |
| `agentic` | 18 | Builds on `understand` and adds a vision-language model. Best for image descriptions and complex visual layouts. |
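
To compare modes before running a large batch, the per-page rates above can be turned into a rough estimate. This sketch assumes billing is strictly linear in the page count with no rounding or minimum charge, which the table does not confirm. ExtractionMode and estimateCredits are hypothetical names:

```typescript
type ExtractionMode = 'text' | 'structure' | 'understand' | 'agentic';

// Credits per page, copied from the modes table.
const CREDITS_PER_PAGE: Record<ExtractionMode, number> = {
  text: 1,
  structure: 1.5,
  understand: 9,
  agentic: 18,
};

// Hypothetical helper: estimates extraction credits for a page count,
// assuming cost scales linearly with pages (an assumption).
function estimateCredits(mode: ExtractionMode, pages: number): number {
  if (!Number.isInteger(pages) || pages < 0) {
    throw new RangeError('pages must be a non-negative integer');
  }
  return CREDITS_PER_PAGE[mode] * pages;
}
```

Under that assumption, a 40-page document costs about 360 credits in understand mode and 40 credits in text mode.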

Recipes

RAG ingestion — PDF → Markdown → chunks → embeddings → vector store:

const result = await client.parse('whitepaper.pdf', { mode: 'text' });
const markdown = result.output.markdown!;
// Then: chunk on headings, embed, push to your vector store.

For born-digital PDFs, mode: 'text' is the cheapest path (1 credit/page). For scanned PDFs or images, switch to mode: 'structure' so OCR runs.

Or use the convenience wrapper:

const markdown = await client.parseToMarkdown('whitepaper.pdf');
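
The "chunk on headings" step in the recipe above might look like the following sketch. It splits on ATX headings (lines starting with one to six # characters) and does not account for # lines inside fenced code blocks. chunkByHeading and MarkdownChunk are hypothetical names, not part of the package:

```typescript
interface MarkdownChunk {
  heading: string; // heading text, or '' for content before the first heading
  body: string;
}

// Hypothetical helper: splits Markdown into one chunk per ATX heading.
function chunkByHeading(markdown: string): MarkdownChunk[] {
  const chunks: MarkdownChunk[] = [];
  let heading = '';
  let lines: string[] = [];

  // Emit the chunk collected so far, skipping a fully empty preamble.
  const flush = (): void => {
    const body = lines.join('\n').trim();
    if (heading !== '' || body !== '') {
      chunks.push({ heading, body });
    }
    lines = [];
  };

  for (const line of markdown.split('\n')) {
    const match = /^#{1,6}\s+(.*)$/.exec(line);
    if (match) {
      flush();
      heading = (match[1] ?? '').trim();
    } else {
      lines.push(line);
    }
  }
  flush();
  return chunks;
}
```

Each chunk can then be embedded separately, with the heading kept as metadata for retrieval.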

Form/invoice extraction — PDF → spatial elements → structured object:

const result = await client.parse('invoice.pdf', { mode: 'understand' });
const elements = result.output.elements!;

// Pull key/value pairs from form regions.
const fields: Record<string, unknown> = {};
for (const el of elements) {
  if (el.type === 'keyValueRegion') {
    for (const pair of el.pairs) {
      if (pair.key && pair.value) {
        fields[String(pair.key.value)] = pair.value.value;
      }
    }
  }
}

// Walk tables — each cell carries row/col indices and span counts.
for (const el of elements) {
  if (el.type === 'table') {
    console.log(`Table: ${el.rowCount}×${el.columnCount}`);
    for (const cell of el.cells) {
      console.log(`  [${cell.row}][${cell.column}] ${cell.text}`);
    }
  }
}

For complex documents that mix dense images with text, step up to mode: 'agentic' so the VLM produces image descriptions and semantic classifications (18 credits/page).

Or use the convenience wrapper to skip output-format discrimination entirely:

const elements = await client.parseElements('invoice.pdf', 'understand');
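
For downstream use it is often handy to turn a table element's flat cell list into a two-dimensional grid. The sketch below works on structural stand-ins for the fields used in the recipe above (rowCount, columnCount, cells, row, column, text). It assumes row and column indices are zero-based and writes spanned cells only at their anchor position. Both are assumptions, and tableToGrid is a hypothetical helper:

```typescript
// Structural stand-ins for the fields used in the recipe above; the real
// types live under the extractComponents namespace.
interface CellLike {
  row: number;
  column: number;
  text: string;
}
interface TableLike {
  rowCount: number;
  columnCount: number;
  cells: CellLike[];
}

// Hypothetical helper: lays cells out as a rowCount x columnCount grid.
// Assumes zero-based indices; cells outside the grid are ignored.
function tableToGrid(table: TableLike): string[][] {
  const grid: string[][] = [];
  for (let r = 0; r < table.rowCount; r++) {
    const row: string[] = [];
    for (let c = 0; c < table.columnCount; c++) {
      row.push('');
    }
    grid.push(row);
  }
  for (const cell of table.cells) {
    const row = grid[cell.row];
    if (row !== undefined && cell.column >= 0 && cell.column < table.columnCount) {
      row[cell.column] = cell.text;
    }
  }
  return grid;
}
```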

Billing — extraction credits vs processor credits

/extraction/parse is billed against extraction credits, a separate billing bucket from the processor API credits consumed by convert, ocr, sign, merge, and every other endpoint on this client. The two buckets never debit each other.

Extraction-credit accounting is returned per request:

const result = await client.parse('document.pdf', { mode: 'structure' });
const usage = result.usage?.data_extraction_credits;
console.log(`Cost: ${usage?.cost} extraction credits`);
console.log(`Remaining: ${usage?.remainingCredits} extraction credits`);
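
When processing a batch, the per-request figures can be summed to track total spend. This sketch uses a structural stand-in for the usage fields shown above and treats a missing usage block as zero cost, which is an assumption. totalExtractionCost is a hypothetical helper:

```typescript
// Structural stand-in for the usage fields shown above.
interface ParseResultLike {
  usage?: {
    data_extraction_credits?: { cost?: number; remainingCredits?: number };
  };
}

// Hypothetical helper: sums the reported cost across several parse results.
// Results without usage data contribute 0 (an assumption).
function totalExtractionCost(results: ParseResultLike[]): number {
  let total = 0;
  for (const result of results) {
    total += result.usage?.data_extraction_credits?.cost ?? 0;
  }
  return total;
}
```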

The hand-composed types (ExtractionCredits, ParseOptions, ParseInstructions, ParseResponse, ParseResponseSpatial, ParseResponseMarkdown, ParseOutputOptions) are exported from the package root. The spec primitives — Mode, Element and the six element subtypes, Bounds, PageRef, Word, TableCell, KeyValuePair, KeyValueEntity, Metrics, Usage, Configuration, ParseErrorResponse, etc. — live under the extractComponents namespace:

import type { extractComponents } from '@nutrient-sdk/dws-client-typescript';

type ParagraphElement = extractComponents['schemas']['ParagraphElement'];
type TableElement = extractComponents['schemas']['TableElement'];

This mirrors how the Processor types are exposed via the existing components namespace.

Workflow System

The client also provides a fluent builder with staged interfaces for creating document processing workflows:

const result = await client
  .workflow()
  .addFilePart('document.pdf')
  .addFilePart('appendix.pdf')
  .applyAction(BuildActions.watermarkText('CONFIDENTIAL', {
    opacity: 0.5,
    fontSize: 48
  }))
  .outputPdf({ 
    optimize: { 
      mrcCompression: true,
      imageOptimizationQuality: 2 
    } 
  })
  .execute();

The workflow system follows a staged approach:

Add document parts (files, HTML, pages)
Apply actions (optional)
Set output format
Execute or perform a dry run
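
To show how staged interfaces can enforce this order at compile time, here is a toy model. It is not the library's implementation, and every name in it (ToyWorkflow, addPart, output, describe, the stage interfaces) is hypothetical. Each stage's type exposes only the methods that are valid at that point:

```typescript
// Toy model of a staged builder; not the library's actual implementation.
interface PartsStage {
  addPart(name: string): PartsAddedStage;
}
interface PartsAddedStage {
  addPart(name: string): PartsAddedStage;
  applyAction(action: string): ActionsStage;
  output(format: string): ReadyStage;
}
interface ActionsStage {
  applyAction(action: string): ActionsStage;
  output(format: string): ReadyStage;
}
interface ReadyStage {
  describe(): string;
}

class ToyWorkflow implements PartsStage, PartsAddedStage, ActionsStage, ReadyStage {
  private parts: string[] = [];
  private actions: string[] = [];
  private format = '';

  addPart(name: string): PartsAddedStage {
    this.parts.push(name);
    return this;
  }
  applyAction(action: string): ActionsStage {
    this.actions.push(action);
    return this;
  }
  output(format: string): ReadyStage {
    this.format = format;
    return this;
  }
  describe(): string {
    const actions = this.actions.length > 0 ? this.actions.join(',') : 'no actions';
    return `${this.parts.join('+')} | ${actions} | ${this.format}`;
  }
}

// The factory returns the narrowest stage, so toyWorkflow().output('pdf')
// fails to compile: a part must be added first.
function toyWorkflow(): PartsStage {
  return new ToyWorkflow();
}
```

One object implements every stage, but callers only ever see the interface for the current stage, so out-of-order calls are caught by the type checker rather than at runtime.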

For detailed information about the workflow system, including examples and best practices, see the Workflow Documentation.

Error Handling

The library provides a comprehensive error hierarchy:

import { 
  NutrientError,
  ValidationError,
  APIError,
  AuthenticationError,
  NetworkError
} from '@nutrient-sdk/dws-client-typescript';

try {
  const result = await client.convert('file.docx', 'pdf');
} catch (error) {
  if (error instanceof ValidationError) {
    // Invalid input parameters
    console.error('Invalid input:', error.message, error.details);
  } else if (error instanceof AuthenticationError) {
    // Authentication failed
    console.error('Auth error:', error.message, error.statusCode);
  } else if (error instanceof APIError) {
    // API returned an error
    console.error('API error:', error.message, error.statusCode, error.details);
  } else if (error instanceof NetworkError) {
    // Network request failed
    console.error('Network error:', error.message, error.details);
  }
}

Testing

The library includes comprehensive unit and integration tests:

# Run all tests
npm test

# Run with coverage report
npm test -- --coverage

# Run only unit tests
npm run test:unit

# Run integration tests (requires API key)
NUTRIENT_API_KEY=your_key npm run test:integration

The library maintains high test coverage across all API methods, including:

Unit tests for all public methods
Integration tests for real API interactions

Contributing

We welcome contributions to improve the library! Please follow our development standards to ensure code quality and maintainability.

Quick start for contributors:

Clone and setup the repository
Make changes following atomic commit practices
Use conventional commits for clear change history
Include appropriate tests for new features

For detailed contribution guidelines, see the Contributing Guide.

Project Structure

src/
├── __tests__/   # Test files
├── builders/    # Builder classes
├── generated/   # Generated code
├── types/       # TypeScript interfaces and types
├── build.ts     # Build utilities
├── client.ts    # Main NutrientClient class
├── errors.ts    # Error classes
├── http.ts      # HTTP layer
├── inputs.ts    # Input handling
├── workflow.ts  # WorkflowBuilder class
└── index.ts     # Public exports

CI/CD

This project uses GitHub Actions for continuous integration and deployment:

CI: Runs linting, type checking, and tests on every push and PR
Integration Tests: Tests against the real Nutrient API
Scheduled Integration Tests: Daily API compatibility check
Security: Automated security scanning

For security reasons, API keys are stored as GitHub Secrets, and integration tests run only on trusted sources.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Support

For issues and feature requests, please use the GitHub issue tracker.

For questions about the Nutrient DWS Processor API, refer to the official documentation.
