This document provides detailed information about the workflow system in the Nutrient DWS TypeScript Client.
The Nutrient DWS TypeScript Client uses a fluent builder pattern with staged interfaces to create document processing workflows. This architecture provides several benefits:
- Type Safety: The staged interface ensures that methods are only available at appropriate stages
- Readability: Method chaining creates readable, declarative code
- Discoverability: IDE auto-completion guides you through the workflow stages
- Flexibility: Complex workflows can be built with simple, composable pieces
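The workflow builder follows a staged approach:
- Add parts: add files, HTML content, new pages, or existing documents
- Apply actions (optional): apply actions such as OCR, watermarks, or redactions
- Set output: choose the output format
- Execute: run the workflow or perform a dry run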
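The staged idea can be illustrated in plain TypeScript. The sketch below is not the client's implementation: every interface and class name in it is hypothetical, and it only shows how narrowing the return type at each step hides methods that do not belong to the current stage.

```typescript
// Hypothetical stage interfaces; the real client's types may differ.
interface InitialStage {
  addFilePart(path: string): PartsStage;
}
interface PartsStage {
  addFilePart(path: string): PartsStage;
  applyAction(action: string): ActionsStage;
  outputPdf(): OutputStage;
}
interface ActionsStage {
  applyAction(action: string): ActionsStage;
  outputPdf(): OutputStage;
}
interface OutputStage {
  execute(): string[];
}

// One class implements every stage; the declared return types decide
// which methods the caller can see after each call.
class SketchBuilder implements InitialStage, PartsStage, ActionsStage, OutputStage {
  private readonly steps: string[] = [];

  addFilePart(path: string): PartsStage {
    this.steps.push('part:' + path);
    return this;
  }
  applyAction(action: string): ActionsStage {
    this.steps.push('action:' + action);
    return this;
  }
  outputPdf(): OutputStage {
    this.steps.push('output:pdf');
    return this;
  }
  execute(): string[] {
    // Return a copy of the recorded steps instead of calling a service.
    return this.steps.slice();
  }
}

function createSketchWorkflow(): InitialStage {
  return new SketchBuilder();
}

const recordedSteps = createSketchWorkflow()
  .addFilePart('document.pdf')
  .applyAction('watermark')
  .outputPdf()
  .execute();
```

With these types, calling `createSketchWorkflow().outputPdf()` or chaining `.addFilePart(...)` after `.outputPdf()` is rejected by the compiler, which is the kind of guidance the staged interfaces are meant to provide.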
There are several ways to create a workflow:
// Create a workflow from a client
const workflow = client.workflow();
// Override the client timeout
const workflowWithTimeout = client.workflow(60000);
// Create a workflow without a client
const standaloneWorkflow = new StagedWorkflowBuilder({
  apiKey: "your-api-key",
});
In this stage, you add document parts to the workflow:
const workflow = client.workflow()
  .addFilePart('document.pdf')
  .addFilePart('appendix.pdf');
Available methods:
addFilePart(file, options?, actions?) adds a file part to the workflow.
Parameters:
- file: FileInput - The file to add to the workflow. Can be a local file path, Buffer, or URL.
- options?: object - Additional options for the file part.
- actions?: BuildAction[] - Actions to apply to the file part.
Returns: WorkflowWithPartsStage - The workflow builder instance for method chaining.
Example:
// Add a PDF file from a local path
workflow.addFilePart('/path/to/document.pdf');
// Add a file with options and actions
workflow.addFilePart(
  '/path/to/document.pdf',
  { pages: { start: 1, end: 3 } },
  [BuildActions.watermarkText('CONFIDENTIAL')]
);
addHtmlPart(html, assets?, options?, actions?) adds an HTML part to the workflow.
Parameters:
- html: FileInput - The HTML content to add. Can be a file path, Buffer, or URL.
- assets?: FileInput[] - Optional array of assets (CSS, images, etc.) to include with the HTML. Only local files or Buffers are supported (not URLs).
- options?: object - Additional options for the HTML part.
- actions?: BuildAction[] - Actions to apply to the HTML part.
Returns: WorkflowWithPartsStage - The workflow builder instance for method chaining.
Example:
// Add HTML content from a file
workflow.addHtmlPart('/path/to/content.html');
// Add HTML with assets and options
workflow.addHtmlPart(
  '/path/to/content.html',
  ['/path/to/style.css', '/path/to/image.png'],
  { layout: { size: 'A4' } }
);
addNewPage(options?, actions?) adds a new blank page to the workflow.
Parameters:
- options?: object - Additional options for the new page, such as page size, orientation, etc.
- actions?: BuildAction[] - Actions to apply to the new page.
Returns: WorkflowWithPartsStage - The workflow builder instance for method chaining.
Example:
// Add a simple blank page
workflow.addNewPage();
// Add a new page with specific options
workflow.addNewPage(
  { layout: { size: 'A4', orientation: 'portrait' } }
);
addDocumentPart(documentId, options?, actions?) adds a document part to the workflow by referencing an existing document by ID.
Parameters:
- documentId: string - The ID of the document to add to the workflow.
- options?: object - Additional options for the document part.
  - options.layer?: string - Optional layer name to select a specific layer from the document.
- actions?: BuildAction[] - Actions to apply to the document part.
Returns: WorkflowWithPartsStage - The workflow builder instance for method chaining.
Example:
// Add a document by ID
workflow.addDocumentPart('doc_12345abcde');
// Add a document with a specific layer and options
workflow.addDocumentPart(
  'doc_12345abcde',
  {
    layer: 'content',
    pages: { start: 0, end: 3 }
  }
);
In this stage, you can apply actions to the document:
workflow.applyAction(BuildActions.watermarkText('CONFIDENTIAL', {
  opacity: 0.5,
  fontSize: 48
}));
Available methods:
applyAction(action) applies a single action to the workflow.
Parameters:
- action: BuildAction - The action to apply to the workflow.
Returns: WorkflowWithActionsStage - The workflow builder instance for method chaining.
Example:
// Apply a watermark action
workflow.applyAction(
  BuildActions.watermarkText('CONFIDENTIAL', {
    opacity: 0.3,
    rotation: 45
  })
);
// Apply an OCR action
workflow.applyAction(BuildActions.ocr('eng'));
applyActions(actions) applies multiple actions to the workflow.
Parameters:
- actions: BuildAction[] - An array of actions to apply to the workflow.
Returns: WorkflowWithActionsStage - The workflow builder instance for method chaining.
Example:
// Apply multiple actions to the workflow
workflow.applyActions([
  BuildActions.watermarkText('DRAFT', { opacity: 0.5 }),
  BuildActions.ocr('eng'),
  BuildActions.flatten()
]);
BuildActions.ocr(language) creates an OCR (Optical Character Recognition) action to extract text from images or scanned documents.
Parameters:
- language: string | string[] - Language(s) for OCR. Can be a single language or an array of languages.
Example:
// Basic OCR with English language
workflow.applyAction(BuildActions.ocr('english'));
// OCR with multiple languages
workflow.applyAction(BuildActions.ocr(['english', 'french', 'german']));
// OCR with options (via object syntax)
workflow.applyAction(BuildActions.ocr({
  language: 'english',
  enhanceResolution: true
}));
BuildActions.rotate(rotateBy) creates an action to rotate pages in the document.
Parameters:
- rotateBy: 90 | 180 | 270 - Rotation angle in degrees (must be 90, 180, or 270).
Example:
// Rotate pages by 90 degrees
workflow.applyAction(BuildActions.rotate(90));
// Rotate pages by 180 degrees
workflow.applyAction(BuildActions.rotate(180));
BuildActions.flatten(annotationIds?) creates an action to flatten annotations into the document content, making them non-interactive but permanently visible.
Parameters:
- annotationIds?: (string | number)[] - Optional array of annotation IDs to flatten. If not specified, all annotations will be flattened.
Example:
// Flatten all annotations
workflow.applyAction(BuildActions.flatten());
// Flatten specific annotations
workflow.applyAction(BuildActions.flatten(['annotation1', 'annotation2']));
BuildActions.watermarkText(text, options?) creates an action to add a text watermark to the document.
Parameters:
- text: string - Watermark text content.
- options?: object - Watermark options:
  - width: Width dimension of the watermark (value and unit, e.g. {value: 100, unit: '%'})
  - height: Height dimension of the watermark (value and unit)
  - top, right, bottom, left: Position of the watermark (value and unit)
  - rotation: Rotation of the watermark in counterclockwise degrees (default: 0)
  - opacity: Watermark opacity (0 is fully transparent, 1 is fully opaque)
  - fontFamily: Font family for the text (e.g. 'Helvetica')
  - fontSize: Size of the text in points
  - fontColor: Foreground color of the text (e.g. '#ffffff')
  - fontStyle: Text style array ('bold', 'italic', or both)
Example:
// Simple text watermark
workflow.applyAction(BuildActions.watermarkText('CONFIDENTIAL'));
// Customized text watermark
workflow.applyAction(BuildActions.watermarkText('DRAFT', {
  opacity: 0.5,
  rotation: 45,
  fontSize: 36,
  fontColor: '#FF0000',
  fontStyle: ['bold', 'italic']
}));
BuildActions.watermarkImage(image, options?) creates an action to add an image watermark to the document.
Parameters:
- image: FileInput - Watermark image (file path, Buffer, or URL).
- options?: object - Watermark options:
  - width: Width dimension of the watermark (value and unit, e.g. {value: 100, unit: '%'})
  - height: Height dimension of the watermark (value and unit)
  - top, right, bottom, left: Position of the watermark (value and unit)
  - rotation: Rotation of the watermark in counterclockwise degrees (default: 0)
  - opacity: Watermark opacity (0 is fully transparent, 1 is fully opaque)
Example:
// Simple image watermark
workflow.applyAction(BuildActions.watermarkImage('/path/to/logo.png'));
// Customized image watermark
workflow.applyAction(BuildActions.watermarkImage('/path/to/logo.png', {
  opacity: 0.3,
  width: { value: 50, unit: '%' },
  height: { value: 50, unit: '%' },
  top: { value: 10, unit: 'px' },
  left: { value: 10, unit: 'px' },
  rotation: 0
}));
BuildActions.applyInstantJson(file) creates an action to apply annotations from an Instant JSON file to the document.
Parameters:
- file: FileInput - Instant JSON file input (file path, Buffer, or URL).
Example:
// Apply annotations from Instant JSON file
workflow.applyAction(BuildActions.applyInstantJson('/path/to/annotations.json'));
BuildActions.applyXfdf(file, options?) creates an action to apply annotations from an XFDF file to the document.
Parameters:
- file: FileInput - XFDF file input (file path, Buffer, or URL).
- options?: object - Apply XFDF options:
  - ignorePageRotation?: boolean - If true, ignores page rotation when applying XFDF data (default: false)
  - richTextEnabled?: boolean - If true, plain text annotations will be converted to rich text annotations. If false, all text annotations will be plain text annotations (default: true)
Example:
// Apply annotations from XFDF file with default options
workflow.applyAction(BuildActions.applyXfdf('/path/to/annotations.xfdf'));
// Apply annotations with specific options
workflow.applyAction(BuildActions.applyXfdf('/path/to/annotations.xfdf', {
  ignorePageRotation: true,
  richTextEnabled: false
}));
BuildActions.createRedactionsText(text, options?, strategyOptions?) creates an action to add redaction annotations based on text search.
Parameters:
- text: string - Text to search and redact.
- options?: object - Redaction options:
  - content?: object - Visual aspects of the redaction annotation (background color, overlay text, etc.)
- strategyOptions?: object - Redaction strategy options:
  - includeAnnotations?: boolean - If true, redaction annotations are created on top of annotations whose content matches the provided text (default: true)
  - caseSensitive?: boolean - If true, the search will be case sensitive (default: false)
  - start?: number - The index of the page from where to start the search (default: 0)
  - limit?: number - Starting from start, the number of pages to search (default: to the end of the document)
Example:
// Create redactions for all occurrences of "Confidential"
workflow.applyAction(BuildActions.createRedactionsText('Confidential'));
// Create redactions with custom appearance and search options
workflow.applyAction(BuildActions.createRedactionsText('Confidential',
  {
    content: {
      backgroundColor: '#000000',
      overlayText: 'REDACTED',
      textColor: '#FFFFFF'
    }
  },
  {
    caseSensitive: true,
    start: 2,
    limit: 5
  }
));
BuildActions.createRedactionsRegex(regex, options?, strategyOptions?) creates an action to add redaction annotations based on regex pattern matching.
Parameters:
- regex: string - Regex pattern to search and redact.
- options?: object - Redaction options:
  - content?: object - Visual aspects of the redaction annotation (background color, overlay text, etc.)
- strategyOptions?: object - Redaction strategy options:
  - includeAnnotations?: boolean - If true, redaction annotations are created on top of annotations whose content matches the provided regex (default: true)
  - caseSensitive?: boolean - If true, the search will be case sensitive (default: true)
  - start?: number - The index of the page from where to start the search (default: 0)
  - limit?: number - Starting from start, the number of pages to search (default: to the end of the document)
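Because applied redactions are permanent, it can be worth previewing a pattern locally before using it. The sketch below uses JavaScript's built-in RegExp and a hypothetical helper, previewMatches, that is not part of the client. It assumes the service's regex dialect agrees with JavaScript for the pattern being tested, which may not hold for every pattern.

```typescript
// Hypothetical helper: list what a pattern matches using JavaScript's RegExp.
// Assumption: the service interprets this pattern the same way JavaScript does.
function previewMatches(pattern: string, text: string, caseSensitive = true): string[] {
  const re = new RegExp(pattern, caseSensitive ? 'g' : 'gi');
  const found: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = re.exec(text)) !== null) {
    found.push(match[0]);
    // Step forward manually on empty matches to avoid an infinite loop.
    if (match[0] === '') re.lastIndex++;
  }
  return found;
}

const emailPattern = '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}';
const sampleText = 'Contact alice@example.com or BOB@EXAMPLE.ORG for details.';
const emailMatches = previewMatches(emailPattern, sampleText);

// The caseSensitive flag mirrors the strategy option of the same name.
const strictMatches = previewMatches('confidential', 'Confidential and confidential', true);
const relaxedMatches = previewMatches('confidential', 'Confidential and confidential', false);
```

Here emailMatches contains both addresses, strictMatches contains only the lowercase occurrence, and relaxedMatches contains both occurrences.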
Example:
// Create redactions for email addresses
workflow.applyAction(BuildActions.createRedactionsRegex('[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}'));
// Create redactions with custom appearance and search options
workflow.applyAction(BuildActions.createRedactionsRegex('[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}',
  {
    content: {
      backgroundColor: '#FF0000',
      overlayText: 'EMAIL REDACTED'
    }
  },
  {
    caseSensitive: false,
    start: 0,
    limit: 10
  }
));
BuildActions.createRedactionsPreset(preset, options?, strategyOptions?) creates an action to add redaction annotations based on a preset pattern.
Parameters:
- preset: string - Preset pattern to search and redact (e.g. 'email-address', 'credit-card-number', 'social-security-number', etc.)
- options?: object - Redaction options:
  - content?: object - Visual aspects of the redaction annotation (background color, overlay text, etc.)
- strategyOptions?: object - Redaction strategy options:
  - includeAnnotations?: boolean - If true, redaction annotations are created on top of annotations whose content matches the provided preset (default: true)
  - start?: number - The index of the page from where to start the search (default: 0)
  - limit?: number - Starting from start, the number of pages to search (default: to the end of the document)
Example:
// Create redactions for email addresses using preset
workflow.applyAction(BuildActions.createRedactionsPreset('email-address'));
// Create redactions for credit card numbers with custom appearance
workflow.applyAction(BuildActions.createRedactionsPreset('credit-card-number',
  {
    content: {
      backgroundColor: '#000000',
      overlayText: 'FINANCIAL DATA'
    }
  },
  {
    start: 0,
    limit: 5
  }
));
BuildActions.applyRedactions() creates an action to apply previously created redaction annotations, permanently removing the redacted content.
Example:
// First create redactions
workflow.applyAction(BuildActions.createRedactionsPreset('email-address'));
// Then apply them
workflow.applyAction(BuildActions.applyRedactions());
In this stage, you specify the desired output format:
workflow.outputPdf({
  optimize: {
    mrcCompression: true,
    imageOptimizationQuality: 2
  }
});
Available methods:
outputPdf(options?) sets the output format to PDF.
Parameters:
- options?: object - Additional options for PDF output, such as compression, encryption, etc.
  - options.metadata?: object - Document metadata properties like title, author.
  - options.labels?: array - Custom labels to add to the document for organization and categorization.
  - options.userPassword?: string - Password required to open the document. When set, the PDF will be encrypted.
  - options.ownerPassword?: string - Password required to modify the document. Provides additional security beyond the user password.
  - options.userPermissions?: array - Array of permissions granted to users who open the document with the user password. Options include: "printing", "modification", "content-copying", "annotation", "form-filling", etc.
  - options.optimize?: object - PDF optimization settings to reduce file size and improve performance.
    - options.optimize.mrcCompression?: boolean - When true, applies Mixed Raster Content compression to reduce file size.
    - options.optimize.imageOptimizationQuality?: number - Controls the quality of image optimization (1-5, where 1 is highest quality).
Returns: WorkflowWithOutputStage<'pdf'> - The workflow builder instance for method chaining.
Example:
// Set output format to PDF with default options
workflow.outputPdf();
// Set output format to PDF with specific options
workflow.outputPdf({
  userPassword: 'secret',
  userPermissions: ["printing"],
  metadata: {
    title: 'Important Document',
    author: 'Document System'
  },
  optimize: {
    mrcCompression: true,
    imageOptimizationQuality: 3
  }
});
outputPdfA(options?) sets the output format to PDF/A (archival PDF).
Parameters:
- options?: object - Additional options for PDF/A output.
  - options.conformance?: string - The PDF/A conformance level to target. Options include 'pdfa-1b', 'pdfa-1a', 'pdfa-2b', 'pdfa-2a', 'pdfa-3b', 'pdfa-3a'. Different levels have different requirements for long-term archiving.
  - options.vectorization?: boolean - When true, attempts to convert raster content to vector graphics where possible, improving quality and reducing file size.
  - options.rasterization?: boolean - When true, converts vector graphics to raster images, which can help with compatibility in some cases.
  - options.metadata?: object - Document metadata properties like title, author.
  - options.labels?: array - Custom labels to add to the document for organization and categorization.
  - options.userPassword?: string - Password required to open the document. When set, the PDF will be encrypted.
  - options.ownerPassword?: string - Password required to modify the document. Provides additional security beyond the user password.
  - options.userPermissions?: array - Array of permissions granted to users who open the document with the user password. Options include: "printing", "modification", "content-copying", "annotation", "form-filling", etc.
  - options.optimize?: object - PDF optimization settings to reduce file size and improve performance.
    - options.optimize.mrcCompression?: boolean - When true, applies Mixed Raster Content compression to reduce file size.
    - options.optimize.imageOptimizationQuality?: number - Controls the quality of image optimization (1-5, where 1 is highest quality).
Returns: WorkflowWithOutputStage<'pdfa'> - The workflow builder instance for method chaining.
Example:
// Set output format to PDF/A with default options
workflow.outputPdfA();
// Set output format to PDF/A with specific options
workflow.outputPdfA({
  conformance: 'pdfa-2b',
  vectorization: true,
  metadata: {
    title: 'Archive Document',
    author: 'Document System'
  },
  optimize: {
    mrcCompression: true
  }
});
outputPdfUA(options?) sets the output format to PDF/UA (Universal Accessibility).
Parameters:
- options?: object - Additional options for PDF/UA output.
  - options.metadata?: object - Document metadata properties like title, author.
  - options.labels?: array - Custom labels to add to the document for organization and categorization.
  - options.userPassword?: string - Password required to open the document. When set, the PDF will be encrypted.
  - options.ownerPassword?: string - Password required to modify the document. Provides additional security beyond the user password.
  - options.userPermissions?: array - Array of permissions granted to users who open the document with the user password. Options include: "printing", "modification", "content-copying", "annotation", "form-filling", etc.
  - options.optimize?: object - PDF optimization settings to reduce file size and improve performance.
    - options.optimize.mrcCompression?: boolean - When true, applies Mixed Raster Content compression to reduce file size.
    - options.optimize.imageOptimizationQuality?: number - Controls the quality of image optimization (1-5, where 1 is highest quality).
Returns: WorkflowWithOutputStage<'pdfua'> - The workflow builder instance for method chaining.
Example:
// Set output format to PDF/UA with default options
workflow.outputPdfUA();
// Set output format to PDF/UA with specific options
workflow.outputPdfUA({
  metadata: {
    title: 'Accessible Document',
    author: 'Document System'
  },
  optimize: {
    mrcCompression: true,
    imageOptimizationQuality: 3
  }
});
outputImage(format, options?) sets the output format to an image format (PNG, JPEG, WEBP).
Parameters:
- format: 'png' | 'jpeg' | 'jpg' | 'webp' - The image format to output.
  - PNG: Lossless compression, supports transparency, best for graphics and screenshots
  - JPEG/JPG: Lossy compression, smaller file size, best for photographs
  - WEBP: Modern format with both lossy and lossless compression, good for web use
- options?: object - Additional options for image output, such as resolution, quality, etc. Note: At least one of options.width, options.height, or options.dpi must be specified.
  - options.pages?: object - Specifies which pages to convert to images. If omitted, all pages are converted.
    - options.pages.start?: number - The first page to convert (0-based index).
    - options.pages.end?: number - The last page to convert (0-based index).
  - options.width?: number - The width of the output image in pixels. If specified without height, aspect ratio is maintained.
  - options.height?: number - The height of the output image in pixels. If specified without width, aspect ratio is maintained.
  - options.dpi?: number - The resolution in dots per inch. Higher values create larger, more detailed images. Common values: 72 (web), 150 (standard), 300 (print quality), 600 (high quality).
Returns: WorkflowWithOutputStage<format> - The workflow builder instance for method chaining.
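To get a feel for how the dpi option affects image size, the arithmetic can be sketched as follows. PDF page sizes are measured in points, and one point is 1/72 inch, so the pixel count is roughly the page size in inches multiplied by the DPI. The helper pixelsAtDpi is hypothetical and not part of the client, and the sketch assumes the whole page is rendered at exactly the requested resolution; the service's actual output dimensions may differ.

```typescript
// PDF page sizes are measured in points; one point is 1/72 inch.
const POINTS_PER_INCH = 72;

// Hypothetical helper: approximate pixel size of a page rendered at a given DPI.
function pixelsAtDpi(
  widthPt: number,
  heightPt: number,
  dpi: number
): { width: number; height: number } {
  return {
    width: Math.round((widthPt / POINTS_PER_INCH) * dpi),
    height: Math.round((heightPt / POINTS_PER_INCH) * dpi),
  };
}

// US Letter is 612 x 792 points (8.5 x 11 inches).
const letterAt300 = pixelsAtDpi(612, 792, 300); // 2550 x 3300 pixels
const letterAt72 = pixelsAtDpi(612, 792, 72);   // 612 x 792 pixels
```

Under these assumptions, going from 72 to 300 DPI multiplies the pixel count by about 17, which is worth keeping in mind when converting many pages.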
Example:
// Set output format to PNG with dpi specified
workflow.outputImage('png', { dpi: 300 });
// Set output format to JPEG with specific options
workflow.outputImage('jpeg', {
  dpi: 300,
  pages: { start: 1, end: 3 }
});
// Set output format to WEBP with specific dimensions
workflow.outputImage('webp', {
  width: 1200,
  height: 800,
  dpi: 150
});
outputOffice(format) sets the output format to an Office document format (DOCX, XLSX, PPTX).
Parameters:
- format: 'docx' | 'xlsx' | 'pptx' - The Office format to output ('docx' for Word, 'xlsx' for Excel, or 'pptx' for PowerPoint).
Returns: WorkflowWithOutputStage<format> - The workflow builder instance for method chaining.
Example:
// Set output format to Word document (DOCX)
workflow.outputOffice('docx');
// Set output format to Excel spreadsheet (XLSX)
workflow.outputOffice('xlsx');
// Set output format to PowerPoint presentation (PPTX)
workflow.outputOffice('pptx');
outputHtml(layout) sets the output format to HTML.
Parameters:
- layout: 'page' | 'reflow' - The layout type to use for conversion to HTML:
  - 'page' layout keeps the original structure of the document, segmented by page.
  - 'reflow' layout converts the document into a continuous flow of text, without page breaks.
Returns: WorkflowWithOutputStage<'html'> - The workflow builder instance for method chaining.
Example:
// Set output format to HTML
workflow.outputHtml('page');
outputMarkdown() sets the output format to Markdown.
Returns: WorkflowWithOutputStage<'markdown'> - The workflow builder instance for method chaining.
Example:
// Set output format to Markdown with default options
workflow.outputMarkdown();
outputJson(options?) sets the output format to JSON content.
Parameters:
- options?: object - Additional options for JSON output.
  - options.plainText?: boolean - When true, extracts plain text content from the document and includes it in the JSON output. This provides the raw text without structural information.
  - options.structuredText?: boolean - When true, extracts text with structural information (paragraphs, headings, etc.) and includes it in the JSON output.
  - options.keyValuePairs?: boolean - When true, attempts to identify and extract key-value pairs from the document (like form fields, labeled data, etc.) and includes them in the JSON output.
  - options.tables?: boolean - When true, attempts to identify and extract tabular data from the document and includes it in the JSON output as structured table objects.
  - options.language?: string | string[] - Specifies the language(s) of the document content for better text extraction. Can be a single language code or an array of language codes for multi-language documents. Examples: "english", "french", "german", or ["english", "spanish"].
Returns: WorkflowWithOutputStage<'json-content'> - The workflow builder instance for method chaining.
Example:
// Set output format to JSON with default options
workflow.outputJson();
// Set output format to JSON with specific options
workflow.outputJson({
  plainText: true,
  structuredText: true,
  keyValuePairs: true,
  tables: true,
  language: "english"
});
// Set output format to JSON with multiple languages
workflow.outputJson({
  plainText: true,
  tables: true,
  language: ["english", "french", "german"]
});
In this final stage, you execute the workflow or perform a dry run:
const result = await workflow.execute();
Available methods:
execute(options?) executes the workflow and returns the result.
Parameters:
- options?: WorkflowExecuteOptions - Options for workflow execution.
  - options.onProgress?: (current: number, total: number) => void - Callback for progress updates.
Returns: Promise<TypedWorkflowResult<TOutput>> - A promise that resolves to the workflow result.
Example:
// Execute the workflow with default options
const result = await workflow.execute();
// Execute with progress tracking
const result = await workflow.execute({
  onProgress: (current, total) => {
    console.log(`Processing step ${current} of ${total}`);
  }
});
dryRun() performs a dry run of the workflow without generating the final output. This is useful for validating the workflow configuration and estimating processing time.
Returns: Promise<WorkflowDryRunResult> - A promise that resolves to the dry run result, containing validation information and estimated processing time.
Example:
// Perform a dry run with default options
const dryRunResult = await workflow
  .addFilePart('/path/to/document.pdf')
  .outputPdf()
  .dryRun();
Convert a document to PDF:
const result = await client
  .workflow()
  .addFilePart('document.docx')
  .outputPdf()
  .execute();
Merge documents and add a watermark:
const result = await client
  .workflow()
  .addFilePart('document1.pdf')
  .addFilePart('document2.pdf')
  .applyAction(BuildActions.watermarkText('CONFIDENTIAL', {
    opacity: 0.5,
    fontSize: 48
  }))
  .outputPdf()
  .execute();
Run OCR on a scanned document:
const result = await client
  .workflow()
  .addFilePart('scanned-document.pdf')
  .applyAction(BuildActions.ocr({
    language: 'english',
    enhanceResolution: true
  }))
  .outputPdf()
  .execute();
Convert HTML to PDF with a custom layout:
const result = await client
  .workflow()
  .addHtmlPart('index.html', undefined, {
    layout: {
      size: 'A4',
      margin: {
        top: 50,
        bottom: 50,
        left: 50,
        right: 50
      }
    }
  })
  .outputPdf()
  .execute();
Combine several parts, actions, and progress tracking:
const result = await client
  .workflow()
  .addFilePart('document.pdf', { pages: { start: 0, end: 5 } })
  .addFilePart('appendix.pdf')
  .applyActions([
    BuildActions.ocr({ language: 'english' }),
    BuildActions.watermarkText('CONFIDENTIAL'),
    // Create the redactions first, then apply them
    BuildActions.createRedactionsPreset('email-address'),
    BuildActions.applyRedactions()
  ])
  .outputPdfA({
    conformance: 'pdfa-2b',
    optimize: {
      mrcCompression: true
    }
  })
  .execute({
    onProgress: (current, total) => {
      console.log(`Processing step ${current} of ${total}`);
    }
  });
For more complex scenarios where you need to build workflows dynamically, you can use the staged workflow builder:
// Create a staged workflow
const workflow = client.workflow();
// Add parts
workflow.addFilePart('document.pdf');
// Conditionally add more parts
if (includeAppendix) {
  workflow.addFilePart('appendix.pdf');
}
// Conditionally apply actions
if (needsWatermark) {
  (workflow as WorkflowWithPartsStage).applyAction(BuildActions.watermarkText('CONFIDENTIAL'));
}
// Set output format based on user preference
if (outputFormat === 'pdf') {
  (workflow as WorkflowWithActionsStage).outputPdf();
} else if (outputFormat === 'docx') {
  (workflow as WorkflowWithActionsStage).outputOffice('docx');
} else {
  // Image output needs at least one of width, height, or dpi
  (workflow as WorkflowWithActionsStage).outputImage('png', { dpi: 300 });
}
// Execute the workflow
const result = await (workflow as WorkflowWithOutputStage).execute();
Workflows provide detailed error information:
try {
  const result = await client
    .workflow()
    .addFilePart('document.pdf')
    .outputPdf()
    .execute();
  if (!result.success) {
    // Handle workflow errors
    result.errors?.forEach(error => {
      console.error(`Step ${error.step}: ${error.error.message}`);
    });
  }
} catch (error) {
  // Handle unexpected errors
  console.error('Workflow execution failed:', error);
}
The result of a workflow execution includes:
interface WorkflowResult {
  // Overall success status
  success: boolean;
  // Output data (if successful)
  output?: {
    // For File output
    mimeType: string;
    filename: string;
    // For Binary File (PDF, Image, Office)
    buffer: Buffer;
    // For Text File (HTML, Markdown)
    content: string;
    // For JSON output:
    data?: any;
  };
  // Error information (if failed)
  errors?: Array<{
    step: string;
    error: {
      message: string;
      code: string;
      details?: any;
    };
  }>;
}
For optimal performance with workflows:
- Minimize the number of parts: Combine related files when possible
- Use appropriate output formats: Choose formats based on your needs
- Consider dry runs: Use dryRun() to estimate resource usage
- Monitor progress: Use the onProgress callback for long-running workflows
- Handle large files: For very large files, consider splitting into smaller workflows
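One way to split work into smaller workflows is to batch the input files first. The sketch below is only an illustration: the chunk helper is hypothetical and not part of the client, and the right batch size depends on your files and limits.

```typescript
// Hypothetical helper: split a list of inputs into batches of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (Math.floor(size) !== size || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

const inputFiles = ['a.pdf', 'b.pdf', 'c.pdf', 'd.pdf', 'e.pdf'];
// Three batches: two files, two files, one file.
const fileBatches = chunk(inputFiles, 2);
```

Each batch could then be handed to its own workflow, for example by calling client.workflow() once per batch and adding each file with addFilePart before choosing an output and executing.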