54 commits
- `174699b` cleanup jsdoc description (xenova, Apr 23, 2026)
- `3a1c923` Export GenerationConfig (xenova, Apr 23, 2026)
- `ccef1d6` set pipelines modules (xenova, Apr 23, 2026)
- `25cdaf1` formatting (xenova, Apr 23, 2026)
- `59152c3` Update tokenization_utils.js (xenova, Apr 23, 2026)
- `34a5293` set module (xenova, Apr 23, 2026)
- `ea32d94` Custom jsdoc to markdown conversion (xenova, Apr 23, 2026)
- `6e7647f` document NoBadWordsLogitsProcessor (xenova, Apr 24, 2026)
- `ee47f0d` document StoppingCriteriaList (xenova, Apr 24, 2026)
- `9096f55` Update audio.js (xenova, Apr 24, 2026)
- `cd3df7c` Update random.js (xenova, Apr 24, 2026)
- `d051275` unprivate _int32 (xenova, Apr 24, 2026)
- `8e3584e` Update streamers.js (xenova, Apr 24, 2026)
- `f8a8c42` Update image.js (xenova, Apr 24, 2026)
- `84abd06` Update tensor.js (xenova, Apr 24, 2026)
- `94a51f9` Update _toctree.yml (xenova, Apr 24, 2026)
- `9de1380` use load_image/load_audio where possible (xenova, Apr 24, 2026)
- `e1e4493` Update random.js (xenova, Apr 24, 2026)
- `d4ea25a` update model ids (xenova, Apr 24, 2026)
- `2fa2e74` Document all AutoModelXXX classes (xenova, Apr 24, 2026)
- `bd5e68d` update jsdoc (xenova, Apr 24, 2026)
- `0559d25` bump jinja.js (xenova, Apr 24, 2026)
- `ac20e29` Update tensor.js (xenova, Apr 24, 2026)
- `7aa3953` Update random.js (xenova, Apr 24, 2026)
- `1b388c2` replace example jsdoc with markdown (xenova, Apr 24, 2026)
- `d9a03c3` automatic skills generation (xenova, Apr 24, 2026)
- `6c691b4` improve docs (xenova, Apr 24, 2026)
- `f587689` add ai contents (xenova, Apr 24, 2026)
- `295bafe` improve (xenova, Apr 25, 2026)
- `2e48589` document video.js (xenova, Apr 25, 2026)
- `92e8128` use permute instead of transpose where necessary (xenova, Apr 25, 2026)
- `e4a5455` link to video docs (xenova, Apr 25, 2026)
- `400ea6c` deduplicate tokenization docs (xenova, Apr 25, 2026)
- `ccf3229` update docs (xenova, Apr 25, 2026)
- `a5ac9f0` improve docs parsing (xenova, Apr 25, 2026)
- `d4f4556` add video to docs (xenova, Apr 25, 2026)
- `c5fe49b` improvements (xenova, Apr 25, 2026)
- `6bf631a` Update zero-shot-audio-classification.js (xenova, Apr 25, 2026)
- `f44cea3` Update TASKS.md (xenova, Apr 25, 2026)
- `8451da9` update generator (xenova, Apr 25, 2026)
- `7a2ee02` improve rendering of callable objects (xenova, Apr 25, 2026)
- `05c29da` use js script instead of py for readme generation (xenova, Apr 25, 2026)
- `b74f3b8` update (xenova, Apr 25, 2026)
- `fdc5fbf` cleanup (xenova, Apr 25, 2026)
- `69937d3` use single docs generation command (xenova, Apr 25, 2026)
- `03b4044` Update pipelines.md (xenova, Apr 25, 2026)
- `0b8cb77` fix typos (xenova, Apr 25, 2026)
- `9bbfee8` fix links (xenova, Apr 25, 2026)
- `c7fee9d` formatting (xenova, Apr 25, 2026)
- `4b70566` cleanup (xenova, Apr 25, 2026)
- `5cb33e3` Update AGENTS.md (xenova, Apr 25, 2026)
- `c610f76` consistency checks (xenova, Apr 25, 2026)
- `6a682fc` fix commands (xenova, Apr 25, 2026)
- `8b90895` fix escaping (xenova, Apr 25, 2026)
56 changes: 56 additions & 0 deletions .ai/AGENTS.md
@@ -0,0 +1,56 @@
# Agent guidelines for transformers.js

This file governs AI-assisted contributions to the `huggingface/transformers.js`
repository. Agentic users must read and follow it before proposing changes.

## Available skills

- [`transformers-js`](skills/transformers-js/SKILL.md) — how to use the library
itself. Load this skill when working on code that calls `@huggingface/transformers`.

## Contributing

Before opening a pull request:

- **Check for existing work.** Search open PRs and issues (`gh pr list`, `gh issue list`)
for the area you're touching. Don't duplicate someone else's in-progress work.
- **Coordinate on the issue first.** If an issue exists, comment on it before drafting a
PR. If approval from the issue author or a maintainer is unclear, stop and ask.
- **No low-value busywork.** Reformatting, renaming, and cosmetic-only PRs that don't
fix a reported problem or implement a requested feature are discouraged.
- **Run the full test suite locally.** `pnpm test` at the package root must pass before
you submit. Don't skip hooks with `--no-verify`.
- **Accountability.** AI-assisted patches are the responsibility of the human submitter.
Review the diff yourself before pushing.

## Local setup

```bash
pnpm install
pnpm --filter @huggingface/transformers test
pnpm --filter @huggingface/transformers typegen
pnpm --filter @huggingface/transformers docs-generate
```

## Documentation generation

Run the full documentation generator after any JSDoc, docs snippet, task metadata,
or generated skill content change:

```bash
pnpm --filter @huggingface/transformers docs-generate
```

`docs-generate` runs [`docs/scripts/generate-all.js`](../packages/transformers/docs/scripts/generate-all.js),
which generates:

- `packages/transformers/docs/source/api/**/*.md` from JSDoc comments in
`packages/transformers/src/**/*.js`.
- Generated sections in `.ai/skills/transformers-js/SKILL.md`.
- `.ai/skills/transformers-js/references/TASKS.md`.

Do not edit generated API pages or generated skill reference files by hand. Update
the source JSDoc, docs snippets, or generator modules instead.

The generator also validates generated API pages against `docs/source/_toctree.yml`
and checks local Markdown links and anchors under `docs/source/`.
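For intuition about the anchor check, a link validator needs to turn Markdown headings into the slugs that `#anchor` links point at. The helper below is a hypothetical GitHub-style slugger, not the generator's actual implementation:

```javascript
// Hypothetical GitHub-style heading slugger (illustrative only; the real
// generator under docs/scripts/ may apply different rules).
function headingToAnchor(heading) {
  return heading
    .toLowerCase()
    .replace(/[^\w\s-]/g, "") // drop punctuation
    .trim()
    .replace(/\s+/g, "-"); // collapse whitespace into hyphens
}
```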
160 changes: 160 additions & 0 deletions .ai/skills/transformers-js/SKILL.md
@@ -0,0 +1,160 @@
---
name: transformers-js
description: Run state-of-the-art machine learning models directly in JavaScript. `@huggingface/transformers` supports text, vision, audio, and multimodal tasks in browsers and Node.js / Bun / Deno, with WebGPU or WASM execution.
license: Apache-2.0
metadata:
author: huggingface
repository: https://github.com/huggingface/transformers.js
compatibility: Node.js 18+ (or equivalent Bun / Deno), or a modern browser with ES modules. WebGPU requires runtime and hardware support; WASM is the fallback. Model downloads from the Hugging Face Hub require network access unless you ship models locally.
---

# transformers.js

ML inference for JavaScript, without a Python server. Supports text, vision, audio,
and multimodal tasks through a single `pipeline()` entry point.

## Install

```bash
npm install @huggingface/transformers
```

## Quick start

```javascript
import { pipeline } from "@huggingface/transformers";

const classifier = await pipeline("sentiment-analysis");
const output = await classifier("I love transformers!");
// [{ label: "POSITIVE", score: 0.9998 }]
```

`pipeline(task, model?, options?)` is the one function you need 90% of the time.
Passing no `model` uses the default for that task.

## Supported tasks

<!-- @generated:start id=task-list -->
- [`text-classification`](references/TASKS.md#text-classification) _(alias: `sentiment-analysis`)_ — default model: `Xenova/distilbert-base-uncased-finetuned-sst-2-english`
- [`token-classification`](references/TASKS.md#token-classification) _(alias: `ner`)_ — default model: `Xenova/bert-base-multilingual-cased-ner-hrl`
- [`question-answering`](references/TASKS.md#question-answering) — default model: `Xenova/distilbert-base-cased-distilled-squad`
- [`fill-mask`](references/TASKS.md#fill-mask) — default model: `onnx-community/ettin-encoder-32m-ONNX`
- [`summarization`](references/TASKS.md#summarization) — default model: `Xenova/distilbart-cnn-6-6`
- [`translation`](references/TASKS.md#translation) — default model: `Xenova/t5-small`
- [`text2text-generation`](references/TASKS.md#text2text-generation) — default model: `Xenova/flan-t5-small`
- [`text-generation`](references/TASKS.md#text-generation) — default model: `onnx-community/Qwen3-0.6B-ONNX`
- [`zero-shot-classification`](references/TASKS.md#zero-shot-classification) — default model: `Xenova/distilbert-base-uncased-mnli`
- [`audio-classification`](references/TASKS.md#audio-classification) — default model: `Xenova/wav2vec2-base-superb-ks`
- [`zero-shot-audio-classification`](references/TASKS.md#zero-shot-audio-classification) — default model: `Xenova/clap-htsat-unfused`
- [`automatic-speech-recognition`](references/TASKS.md#automatic-speech-recognition) _(alias: `asr`)_ — default model: `Xenova/whisper-tiny.en`
- [`text-to-audio`](references/TASKS.md#text-to-audio) _(alias: `text-to-speech`)_ — default model: `onnx-community/Supertonic-TTS-ONNX`
- [`image-to-text`](references/TASKS.md#image-to-text) — default model: `Xenova/vit-gpt2-image-captioning`
- [`image-classification`](references/TASKS.md#image-classification) — default model: `Xenova/vit-base-patch16-224`
- [`image-segmentation`](references/TASKS.md#image-segmentation) — default model: `Xenova/detr-resnet-50-panoptic`
- [`background-removal`](references/TASKS.md#background-removal) — default model: `Xenova/modnet`
- [`zero-shot-image-classification`](references/TASKS.md#zero-shot-image-classification) — default model: `Xenova/clip-vit-base-patch32`
- [`object-detection`](references/TASKS.md#object-detection) — default model: `Xenova/detr-resnet-50`
- [`zero-shot-object-detection`](references/TASKS.md#zero-shot-object-detection) — default model: `Xenova/owlvit-base-patch32`
- [`document-question-answering`](references/TASKS.md#document-question-answering) — default model: `Xenova/donut-base-finetuned-docvqa`
- [`image-to-image`](references/TASKS.md#image-to-image) — default model: `Xenova/swin2SR-classical-sr-x2-64`
- [`depth-estimation`](references/TASKS.md#depth-estimation) — default model: `onnx-community/depth-anything-v2-small`
- [`feature-extraction`](references/TASKS.md#feature-extraction) _(alias: `embeddings`)_ — default model: `onnx-community/all-MiniLM-L6-v2-ONNX`
- [`image-feature-extraction`](references/TASKS.md#image-feature-extraction) — default model: `onnx-community/dinov3-vits16-pretrain-lvd1689m-ONNX`
<!-- @generated:end id=task-list -->

For full recipes — every task, grouped by modality, with runnable code —
see [`references/TASKS.md`](references/TASKS.md).

## Choosing a model

Browse models compatible with transformers.js on the Hub:
<https://huggingface.co/models?library=transformers.js>

Filter by task with the `pipeline_tag` parameter, e.g.
<https://huggingface.co/models?library=transformers.js&pipeline_tag=text-generation>.
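The filtered URLs above can also be built programmatically. This small helper is illustrative only and not part of the library:

```javascript
// Build a Hub search URL for transformers.js-compatible models,
// optionally filtered by task (hypothetical helper, not a library API).
function hubSearchUrl(task) {
  const url = new URL("https://huggingface.co/models");
  url.searchParams.set("library", "transformers.js");
  if (task) url.searchParams.set("pipeline_tag", task);
  return url.toString();
}
```

For example, `hubSearchUrl("text-generation")` produces the text-generation filter URL shown above.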

### Quantization

Most pipelines accept a `dtype` option. Smaller dtypes download and run faster
at the cost of some accuracy:

| `dtype` | Size | Use when |
|----------|-----------|----------------------------------------------------|
| `fp32` | Largest | Maximum accuracy, Node.js with lots of RAM |
| `fp16` | ~50% of fp32 | GPU / WebGPU inference |
| `q8` | ~25% of fp32 | Good default for browsers |
| `q4` | ~12% of fp32 | Tight memory budgets, large language models |

```javascript
const pipe = await pipeline("text-generation", "onnx-community/Qwen3-0.6B-ONNX", {
dtype: "q4",
});
```

### Device

Default is CPU/WASM. Pass `device: "webgpu"` to run on the GPU when available:

```javascript
const pipe = await pipeline("sentiment-analysis", null, { device: "webgpu" });
```

## Memory management

Pipelines hold onto model weights and backend sessions. **Always call
`pipe.dispose()`** when you're done with one — especially in long-running
servers, before loading a replacement, or on component unmount.

```javascript
const pipe = await pipeline("sentiment-analysis");
try {
const result = await pipe("Great!");
} finally {
await pipe.dispose();
}
```
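If you create short-lived pipelines in several places, a small wrapper makes the dispose-on-all-paths pattern above reusable. This helper is a sketch, not a library API:

```javascript
// Run `fn` against a freshly created pipeline and guarantee dispose()
// runs even if `fn` throws (hypothetical helper, not part of the library).
async function withPipeline(factory, fn) {
  const pipe = await factory();
  try {
    return await fn(pipe);
  } finally {
    await pipe.dispose();
  }
}
```

Usage: `await withPipeline(() => pipeline("sentiment-analysis"), (p) => p("Great!"));`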

## Configuration

The [`env`](https://huggingface.co/docs/transformers.js/api/env) export lets
you control model sources, caching, logging, and the fetch function.

```javascript
import { env, LogLevel } from "@huggingface/transformers";

env.allowRemoteModels = true;
env.useFSCache = true; // Node.js: cache downloaded models on disk
env.useBrowserCache = true; // Browser: cache via the Cache API
env.logLevel = LogLevel.WARNING;
```

See [`references/CONFIGURATION.md`](references/CONFIGURATION.md) for the full
set of environment options, cache management, and private / gated models.

## Pipeline options

Every pipeline accepts a `progress_callback` for download progress plus options
controlling device, dtype, and caching. Task-specific call options (e.g.
`top_k`, `max_new_tokens`, streaming, chat templates) live with each task in
[`references/TASKS.md`](references/TASKS.md). Common options and their types:
[`references/PIPELINE_OPTIONS.md`](references/PIPELINE_OPTIONS.md).
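As a sketch of how `progress_callback` is typically consumed, the formatter below assumes download events carry `status`, `file`, and `progress` fields; treat that shape as indicative and confirm it against the PIPELINE_OPTIONS reference:

```javascript
// Turn a download progress event into a log line, or null for other events.
// Assumes the { status, file, progress } event shape (an assumption here,
// not a documented contract).
function formatProgress(event) {
  if (event.status !== "progress") return null;
  return `${event.file}: ${event.progress.toFixed(1)}%`;
}
```

Wire it in as `progress_callback: (e) => { const line = formatProgress(e); if (line) console.log(line); }`.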

## Things to never do

- **Don't reuse a disposed pipeline.** Create a new one with `pipeline(...)` after `dispose()`.
- **Don't recreate pipelines inside hot loops.** Create once, call many times.
- **Don't block startup on model downloads.** Show progress via `progress_callback`.
- **Don't fabricate model IDs.** Confirm a model exists on the Hub and has ONNX files
(look for an `onnx/` directory in the repo) before suggesting it to a user.
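One way to honor "create once, call many times" is to memoize pipeline creation per task/model pair. The helper below is a hypothetical sketch; the `factory` argument stands in for `pipeline` so the pattern stays library-agnostic:

```javascript
// Cache the *promise* per task/model key so concurrent callers share one
// load and the model is never downloaded twice (hypothetical helper).
const pipelineCache = new Map();

function getPipeline(task, model, factory) {
  const key = `${task}:${model}`;
  if (!pipelineCache.has(key)) {
    pipelineCache.set(key, factory(task, model));
  }
  return pipelineCache.get(key);
}
```

Usage: `const pipe = await getPipeline("text-generation", "onnx-community/Qwen3-0.6B-ONNX", (t, m) => pipeline(t, m));`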

## Reference documentation

- Official site: <https://huggingface.co/docs/transformers.js>
- API reference: <https://huggingface.co/docs/transformers.js/api/pipelines>
- Examples repo: <https://github.com/huggingface/transformers.js-examples>

This skill's local references:

- [`TASKS.md`](references/TASKS.md) — recipes for every task, grouped by modality _(generated)_
- [`CONFIGURATION.md`](references/CONFIGURATION.md) — `env` options, caching, model inspection
- [`PIPELINE_OPTIONS.md`](references/PIPELINE_OPTIONS.md) — common pipeline options, dtype, device, generation parameters
166 changes: 166 additions & 0 deletions .ai/skills/transformers-js/references/CONFIGURATION.md
@@ -0,0 +1,166 @@
# Configuration

All global configuration lives on the `env` object exported from the package.
Mutating fields on `env` changes library-wide behavior at runtime.

```javascript
import { env, LogLevel } from "@huggingface/transformers";
```

## Quick examples

<!-- @generated:start id=examples:env -->
**Example:** Load models from your own server and disable remote downloads.

```javascript
import { env } from '@huggingface/transformers';
env.allowRemoteModels = false;
env.localModelPath = '/path/to/local/models/';
```

**Example:** Point the filesystem cache at a custom directory (Node.js).

```javascript
import { env } from '@huggingface/transformers';
env.cacheDir = '/path/to/cache/directory/';
```
<!-- @generated:end id=examples:env -->

## All options

<!-- @generated:start id=typedef:TransformersEnvironment -->
| Option | Type | Description |
|--------|------|-------------|
| `version` | `string` | This version of Transformers.js. |
| `backends` | `object` | Expose environment variables of different backends, allowing users to set these variables if they want to. |
| `logLevel` | `number` | The logging level. Use LogLevel enum values. Defaults to LogLevel.ERROR. |
| `allowRemoteModels` | `boolean` | Whether to allow loading of remote files, defaults to `true`. If set to `false`, it will have the same effect as setting `local_files_only=true` when loading pipelines, models, tokenizers, processors, etc. |
| `remoteHost` | `string` | Host URL to load models from. Defaults to the Hugging Face Hub. |
| `remotePathTemplate` | `string` | Path template to fill in and append to `remoteHost` when loading models. |
| `allowLocalModels` | `boolean` | Whether to allow loading of local files, defaults to `false` if running in-browser, and `true` otherwise. If set to `false`, it will skip the local file check and try to load the model from the remote host. |
| `localModelPath` | `string` | Path to load local models from. Defaults to `/models/`. |
| `useFS` | `boolean` | Whether to use the file system to load files. By default, it is `true` if available. |
| `useBrowserCache` | `boolean` | Whether to use Cache API to cache models. By default, it is `true` if available. |
| `useFSCache` | `boolean` | Whether to use the file system to cache files. By default, it is `true` if available. |
| `cacheDir` | `string\|null` | The directory to use for caching files with the file system. By default, it is `./.cache`. |
| `useCustomCache` | `boolean` | Whether to use a custom cache system (defined by `customCache`), defaults to `false`. |
| `customCache` | `CacheInterface\|null` | The custom cache to use. Defaults to `null`. Note: this must be an object which implements the `match` and `put` functions of the Web Cache API. For more information, see https://developer.mozilla.org/en-US/docs/Web/API/Cache. |
| `useWasmCache` | `boolean` | Whether to pre-load and cache WASM binaries and the WASM factory (.mjs) for ONNX Runtime. Defaults to `true` when cache is available. This can improve performance and enables offline usage by avoiding repeated downloads. |
| `cacheKey` | `string` | The cache key to use for storing models and WASM binaries. Defaults to 'transformers-cache'. |
| `experimental_useCrossOriginStorage` | `boolean` | Whether to use the Cross-Origin Storage API to cache model files across origins, allowing different sites to share the same cached model weights. Defaults to `false`. Requires the Cross-Origin Storage Chrome extension: https://chromewebstore.google.com/detail/cross-origin-storage/denpnpcgjgikjpoglpjefakmdcbmlgih. The `experimental_` prefix indicates that the underlying browser API is not yet standardised and may change or be removed without a major version bump. For more information, see https://github.com/WICG/cross-origin-storage. |
| `fetch` | `(input: string \| URL, init?: any) => Promise<any>` | The fetch function to use. Defaults to `fetch`. |
<!-- @generated:end id=typedef:TransformersEnvironment -->

## Log levels

Pass one of these values to `env.logLevel`:

| Value              | Numeric          |
|--------------------|------------------|
| `LogLevel.DEBUG`   | `10`             |
| `LogLevel.INFO`    | `20`             |
| `LogLevel.WARNING` | `30`             |
| `LogLevel.ERROR`   | `40` (default)   |
| `LogLevel.NONE`    | `50`             |

Higher numbers suppress more output. Setting `env.logLevel` also propagates to
ONNX Runtime, so you get matching verbosity from the inference backend.

## Common patterns

**Development (fast iteration with remote models):**

```javascript
env.allowRemoteModels = true;
env.useFSCache = true;
env.logLevel = LogLevel.INFO;
```

**Production Node.js server (air-gapped, local only):**

```javascript
env.allowRemoteModels = false;
env.allowLocalModels = true;
env.localModelPath = "/opt/models";
```

**Browser app with a CDN mirror:**

```javascript
env.remoteHost = "https://cdn.example.com";
env.useBrowserCache = true;
```

## Custom fetch (private / gated models, retries, etc.)

Override `env.fetch` to inject auth headers, retry logic, or abort signals:

```javascript
env.fetch = (url, init) =>
fetch(url, {
...init,
headers: { ...init?.headers, Authorization: `Bearer ${process.env.HF_TOKEN}` },
});
```

## Custom cache backends

```javascript
env.useCustomCache = true;

const store = new Map(); // in-memory example; real backends persist entries
env.customCache = {
  async match(key) { return store.get(key); }, // Response or undefined
  async put(key, response) { store.set(key, response); },
};
```

The cache must implement the Web Cache API's `match` and `put` methods.

## Inspecting models before loading

`ModelRegistry` reports which files a model needs, whether they're cached
locally, which dtypes the model ships with, and can clear caches selectively.
Useful for pre-flight UI and disk management.

```javascript
import { ModelRegistry, pipeline } from "@huggingface/transformers";

const task = "feature-extraction";
const modelId = "onnx-community/all-MiniLM-L6-v2-ONNX";

const cached = await ModelRegistry.is_pipeline_cached(task, modelId);
if (!cached) {
const files = await ModelRegistry.get_pipeline_files(task, modelId);
// ask the user to confirm before `pipeline(...)` downloads these.
}

const dtypes = await ModelRegistry.get_available_dtypes(modelId);
const dtype = ["q4", "q8", "fp16", "fp32"].find((d) => dtypes.includes(d));
const pipe = await pipeline(task, modelId, { dtype });
```

<!-- @generated:start id=class:ModelRegistry -->
Static class for cache and file management operations.

**Methods**

- `get_files(modelId, [options])` → `Promise<string[]>` — Get all files (model, tokenizer, processor) needed for a model.
- `get_pipeline_files(task, modelId, [options])` → `Promise<string[]>` — Get all files needed for a specific pipeline task.
- `get_model_files(modelId, [options])` → `Promise<string[]>` — Get model files needed for a specific model.
- `get_tokenizer_files(modelId)` → `Promise<string[]>` — Get tokenizer files needed for a specific model.
- `get_processor_files(modelId)` → `Promise<string[]>` — Get processor files needed for a specific model.
- `get_available_dtypes(modelId, [options])` → `Promise<string[]>` — Detects which quantization levels (dtypes) are available for a model by checking which ONNX files exist on the hub or locally.
- `is_cached(modelId, [options])` → `Promise<boolean>` — Quickly checks if a model is fully cached by verifying `config.json` is present, then confirming all required files are cached.
- `is_cached_files(modelId, [options])` → `Promise<CacheCheckResult>` — Checks if all files for a given model are already cached, with per-file detail.
- `is_pipeline_cached(task, modelId, [options])` → `Promise<boolean>` — Quickly checks if all files for a specific pipeline task are cached by verifying `config.json` is present, then confirming all required files are cached.
- `is_pipeline_cached_files(task, modelId, [options])` → `Promise<CacheCheckResult>` — Checks if all files for a specific pipeline task are already cached, with per-file detail.
- `get_file_metadata(path_or_repo_id, filename, [options])` → `Promise<{exists: boolean, size?: number, contentType?: string, fromCache?: boolean}>` — Get metadata for a specific file without downloading it.
- `clear_cache(modelId, [options])` → `Promise<CacheClearResult>` — Clears all cached files for a given model.
- `clear_pipeline_cache(task, modelId, [options])` → `Promise<CacheClearResult>` — Clears all cached files for a specific pipeline task.
<!-- @generated:end id=class:ModelRegistry -->

To reclaim disk space for a specific model or task:

```javascript
await ModelRegistry.clear_cache(modelId);
await ModelRegistry.clear_pipeline_cache(task, modelId);
```