Auto device on Linux x64 fails hard when CUDA shared library is unavailable instead of falling back #1642

@alexewerlof

System Info

  • OS: Linux x64
  • CPU: AMD Ryzen (no NVIDIA CUDA runtime installed)
  • Node: LTS 24.14.0
  • @huggingface/transformers: 4.0.1 (or exact resolved version)

Environment/Platform

  • Website/web-app
  • Browser extension
  • Server-side (e.g., Node.js, Deno, Bun)
  • Desktop app (e.g., Electron)
  • Other (e.g., VSCode extension)

Description

Summary

When running in Node.js on Linux x64 with device set to 'auto', Transformers.js prioritizes 'cuda' and throws a hard error if the CUDA shared libraries cannot be loaded. This prevents fallback to the remaining providers (such as 'webgpu' or 'cpu').

Observed error

    Error: OrtSessionOptionsAppendExecutionProvider_Cuda: Failed to load shared library
        at new OnnxruntimeSessionHandler (node_modules/onnxruntime-node/lib/backend)
        at Immediate.<anonymous> (node_modules/onnxruntime-node/lib/backend.ts:147:)
        at process.processImmediate (node:internal/timers:504:21)

On Linux x64, CUDA is added to supported devices first in onnx.js:

    switch (process.platform) {
        case 'win32': // Windows x64 and Windows arm64
            supportedDevices.push('dml');
            break;
        case 'linux': // Linux x64 and Linux arm64
            if (process.arch === 'x64') {
                supportedDevices.push('cuda');
            }
            break;
        case 'darwin': // MacOS x64 and MacOS arm64
            supportedDevices.push('coreml');
            break;
    }

    supportedDevices.push('webgpu');
    supportedDevices.push('cpu');
    defaultDevices = ['cpu'];

Proposed fix

In Node, when device is 'auto' and session creation fails specifically with a CUDA shared-library load error, retry once with executionProviders filtered to exclude 'cuda'.

Even better: on failure, 'auto' should retry down the device list in order, from the best-performing preferred provider to the safe fallback option ('cpu').

'auto' resolves to that list, which is passed directly to the session options in session.js.
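A minimal sketch of the proposed fallback loop. The helper name and signature here are hypothetical, not the actual Transformers.js or onnxruntime-node internals; it only illustrates the retry-down-the-list behavior:

```javascript
// Hypothetical helper, not the real library API: tries each device in
// preference order and falls through to the next one when the provider's
// shared library fails to load.
async function createSessionWithFallback(modelPath, devices, createSession) {
  let lastError;
  for (const device of devices) {
    try {
      // Most-preferred remaining provider first (e.g. 'cuda' → 'webgpu' → 'cpu').
      return { device, session: await createSession(modelPath, device) };
    } catch (err) {
      // Only swallow provider shared-library load failures; rethrow anything else.
      if (!/Failed to load shared library/.test(String(err))) throw err;
      lastError = err;
    }
  }
  throw lastError ?? new Error('No execution provider could be initialized');
}
```

On the machine described above, passing ['cuda', 'webgpu', 'cpu'] would skip the failing 'cuda' entry and continue with 'webgpu' or 'cpu' instead of crashing.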

Reproduction

Run this snippet on a machine without the CUDA runtime installed (for example, an AMD APU):

    import { pipeline } from '@huggingface/transformers'

    await pipeline('text-generation', 'onnx-community/LFM2-1.2B-ONNX', {
      device: 'auto',
      dtype: 'q4',
    })
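Until a fix lands, one workaround is to avoid 'auto' entirely and pin a provider known to work on the machine, using the existing device option, so the CUDA path is never attempted (config-only change; not a fix for the fallback bug itself):

```javascript
import { pipeline } from '@huggingface/transformers'

// Workaround: pin the device so the CUDA provider is never tried.
await pipeline('text-generation', 'onnx-community/LFM2-1.2B-ONNX', {
  device: 'cpu', // or 'webgpu' where available
  dtype: 'q4',
})
```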

Expected behavior

With device set to 'auto', the unavailable CUDA provider should be skipped and session creation should continue with the remaining providers (for example, webgpu or cpu).

Actual behavior

Session creation fails immediately on CUDA provider shared-library error.

Labels

    bug (Something isn't working)