Add hardware-accelerated decoding and encoding support #445

Merged: IanButterworth merged 6 commits into master from ib/hwaccel on Mar 16, 2026

@IanButterworth (Member) commented Mar 13, 2026

Made possible now that we have JuliaPackaging/Yggdrasil#13288

Developed with Claude


Decoding (openvideo / load):

  • Add hwaccel kwarg to VideoReader/openvideo for GPU-accelerated decoding (VideoToolbox, CUDA, VAAPI, QSV)
  • Frames auto-transferred from HW surface to CPU; identical to SW decoding
  • Lazy SwsTransform rebuild when actual HW download format differs from initial guess
  • Store swscale/sws_color options on VideoReader so rebuild preserves them
  • Add available_hw_devices() public API

Encoding (open_video_out / save):

  • Add hwaccel kwarg to VideoWriter/open_video_out/save for HW encoding
  • Auto-select HW encoder when hwaccel is set and codec_name is nothing
  • Create hw_device_ctx for codecs needing HW_DEVICE_CTX (NVENC, QSV)
  • Add available_hw_encoders() public API

Tests:

  • test/hwaccel.jl: decode + encode tests
  • util/bench_hwaccel.jl: SW vs HW throughput benchmark

Docs:

  • docs/src/reading.md: HW decoding section
  • docs/src/writing.md: HW encoding section

On my M2 Mac:

% julia --project util/bench_hwaccel.jl     
Platform HW device: :videotoolbox
HW encoders:        ["h264_videotoolbox", "hevc_videotoolbox", "prores_videotoolbox"]

Resolution  Mode                  Enc (s)   Enc FPS   File MB   Dec (s)   Dec FPS   
------------------------------------------------------------------------------------
640×480     SW encode/decode      0.165     728.1     0.01      0.153     783.4     
            HW enc → SW dec       0.158     759.0     0.02      0.039     3097.6    
            SW enc → HW dec       -         -         -         0.198     604.7     
            HW enc → HW dec       -         -         -         0.231     518.5     

1280×720    SW encode/decode      0.519     231.4     0.02      0.18      666.4     
            HW enc → SW dec       0.404     297.3     0.04      0.108     1109.2    
            SW enc → HW dec       -         -         -         0.409     293.2     
            HW enc → HW dec       -         -         -         0.505     237.6     

1920×1080   SW encode/decode      1.13      106.2     0.02      0.417     287.7     
            HW enc → SW dec       0.851     141.0     0.07      0.241     498.0     
            SW enc → HW dec       -         -         -         0.768     156.2     
            HW enc → HW dec       -         -         -         1.02      117.7 

     Testing Running tests...
┌ Info: Available HW device types (compile-time)
│   hw_devs =
│    1-element Vector{Symbol}:
└     :videotoolbox
┌ Info: Functional HW device types (runtime)
│   working_hw_devs =
│    1-element Vector{Symbol}:
└     :videotoolbox
┌ Info: Available HW encoders
│   hw_encoders =
│    3-element Vector{String}:
│     "h264_videotoolbox"
│     "hevc_videotoolbox"
└     "prores_videotoolbox"
Test Summary:                                                | Pass  Total     Time
VideoIO                                                      | 1239   1239  1m56.4s
  Pointer utilities                                          |   10     10     0.1s
  AVPtr                                                      |   30     30     0.1s
  Reading of various example file formats                    |  140    140    25.9s
  Reading monochrome videos                                  |    5      5     0.7s
  Reading RGB video as monochrome                            |    2      2     2.1s
  IO reading of various example file formats                 |   20     20    34.1s
  Reading video metadata                                     |    9      9     0.2s
  Encoding video across all supported colortypes             |    4      4     0.4s
  Encoding RGBA images (#440)                                |    2      2     0.1s
  Encoding with bare Gray type (#336)                        |    2      2     0.2s
  Missing file extension error (#370)                        |    1      1     0.0s
  Simultaneous encoding and muxing                           |  864    864     2.4s
  Monochrome rescaling                                       |    8      8     0.3s
  Encoding monochrome videos                                 |   12     12     1.1s
  Encoding video with integer frame rates                    |   44     44     0.8s
  Encoding video with rational frame rates                   |    4      4     0.2s
  Encoding video with float frame rates                      |    4      4     0.2s
  Encoding with colorspace and framerate fixes               |   16     16     0.6s
  Video encode/decode accuracy (read, encode, read, compare) |   15     15     5.7s
  Friendly error messages (#360)                             |    1      1     0.0s
  eof correctness for file streams (#320)                    |    4      4     0.1s
  Hardware acceleration                                      |   42     42     2.3s
     Testing VideoIO tests passed 
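
The encode timings in the benchmark table above can be turned into speedup ratios with a few lines of Julia. This is only a sketch for reading the numbers: the timings are copied from the M2 run, and the `speedup` helper is a name introduced here for illustration, not part of VideoIO.

```julia
# Encode times in seconds, copied from the benchmark table (M2 Mac run).
resolutions = ("640×480", "1280×720", "1920×1080")
sw_enc = Dict("640×480" => 0.165, "1280×720" => 0.519, "1920×1080" => 1.13)
hw_enc = Dict("640×480" => 0.158, "1280×720" => 0.404, "1920×1080" => 0.851)

# Hypothetical helper: how many times faster the HW path is than the SW path.
speedup(sw_seconds, hw_seconds) = sw_seconds / hw_seconds

for res in resolutions
    ratio = round(speedup(sw_enc[res], hw_enc[res]); digits = 2)
    println(res, ": HW encode speedup over SW = ", ratio)
end
```

The same helper applies to the decode columns by swapping in the `Dec (s)` values.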

Co-Authored-By: Claude <81847+Claude@users.noreply.github.com>
…devices in tests

- Add `hwaccel_available(sym::Symbol) -> Bool` to src/avio.jl: probes whether
  a device type is actually usable at runtime (not just compile-time supported)
  by attempting av_hwdevice_ctx_create and releasing the context on success
- Export hwaccel_available
- Improve error messages in VideoReader and VideoWriter hw device ctx creation
  to hint at hwaccel_available
- Rewrite test/hwaccel.jl: compute working_hw_devs = filter(hwaccel_available, hw_devs)
  at the top; remove try/catch from all decode tests (device is confirmed operational);
  use working_hw_devs in all loops; add standalone hwaccel_available testset
- Fixes CI failures on Ubuntu where :vaapi/:drm appear in available_hw_devices()
  (compile-time support) but no GPU hardware is present at runtime
- Docs: add hwaccel_available @docs block and usage examples to reading.md and
  writing.md alongside available_hw_devices / available_hw_encoders
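
The compile-time versus runtime distinction described in this commit can be sketched without a GPU. The block below is a minimal illustration, not the code in src/avio.jl: `pick_hw_device` is a hypothetical helper, and the probe is passed in as a function so the example runs without VideoIO installed.

```julia
# Hypothetical helper: return the first device type that passes the runtime
# probe, or `nothing` so the caller can fall back to software decoding.
function pick_hw_device(candidates::AbstractVector{Symbol}, probe::Function)
    working = filter(probe, candidates)
    return isempty(working) ? nothing : first(working)
end

# Stand-in for the Ubuntu CI case: :vaapi and :drm are compiled in,
# but the probe fails because no GPU is present.
compiled_in = [:vaapi, :drm]
no_gpu_probe = _ -> false
dev = pick_hw_device(compiled_in, no_gpu_probe)
println(dev === nothing ? "falling back to software" : "using $dev")

# With VideoIO loaded, the call would presumably look like this
# (function names from this PR; exact keyword usage is an assumption):
#   dev = pick_hw_device(VideoIO.available_hw_devices(), VideoIO.hwaccel_available)
#   reader = dev === nothing ? VideoIO.openvideo(path) :
#                              VideoIO.openvideo(path; hwaccel = dev)
```

Passing the probe as an argument keeps the selection logic testable on machines without any hardware device.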

codecov Bot commented Mar 16, 2026

Codecov Report

❌ Patch coverage is 90.41096% with 14 lines in your changes missing coverage. Please review.
✅ Project coverage is 79.12%. Comparing base (738548b) to head (bf527ff).
⚠️ Report is 2 commits behind head on master.

Files with missing lines | Patch % | Lines
src/encoding.jl          | 80.76%  | 10 Missing ⚠️
src/avio.jl              | 95.74%  | 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #445      +/-   ##
==========================================
+ Coverage   77.87%   79.12%   +1.24%     
==========================================
  Files          10       10              
  Lines        1329     1466     +137     
==========================================
+ Hits         1035     1160     +125     
- Misses        294      306      +12     

@IanButterworth IanButterworth marked this pull request as ready for review March 16, 2026 13:44
- available_hw_encoders: iterate all hw config indices instead of only index 0
- _vio_rebuild_sws_for_hw!: read dst colorspace from dstframe fields instead
  of re-deriving defaults; persist color_range/color_primaries on dstframe
  at SwsTransform construction so rebuild has accurate values
- decode: update r.input_pix_fmt for non-SwsTransform hw paths (AVFramePtr,
  GrayTransform) not just SwsTransform, fixing stale format after lazy HW
  format correction
- retrieve_raw / retrieve_raw!: call fill_graph_input! before buffer
  allocation / size check so lazy HW format correction fires first and the
  buffer is sized for the real pixel format
- encoding: add _vio_is_rgb_pix_fmt helper; when hwaccel is set and
  avcodec_find_best_pix_fmt_of_list auto-selects a packed-RGB format
  (e.g. BGRA), re-select using NV12 as source so VideoToolbox HEVC/ProRes
  encoders don't reject the stream at avcodec_open2
- test/hwaccel.jl: add transcode=false raw interface test, explicit
  target_colorspace_details test, bump explicit-codec test to 128x128,
  use .mov for ProRes, pass hwaccel=dev to explicit-codec test, remove
  misused @test_broken annotations
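
Why the retrieve_raw ordering matters can be seen from frame sizes alone. The sketch below is illustrative only: the format symbols, the `BITS_PER_PIXEL` table and `frame_bytes` are names made up here (VideoIO itself works with FFmpeg's AV_PIX_FMT_* constants), row padding is ignored, and the guessed/actual pair is a hypothetical example rather than a case taken from this PR.

```julia
# Bits per pixel for a few FFmpeg pixel formats, assuming tightly packed
# rows with no alignment padding.
const BITS_PER_PIXEL = Dict(:yuv420p => 12, :nv12 => 12, :p010le => 24, :bgra => 32)

# Bytes needed for one unpadded frame of the given format.
frame_bytes(fmt::Symbol, width::Integer, height::Integer) =
    width * height * BITS_PER_PIXEL[fmt] ÷ 8

# Hypothetical case: the initial guess is 8-bit 4:2:0, but the HW download
# turns out to be 10-bit (stored as 16-bit samples).
guessed = frame_bytes(:yuv420p, 1920, 1080)
actual  = frame_bytes(:p010le, 1920, 1080)

println("guessed: ", guessed, " bytes, actual: ", actual, " bytes")
println("buffer from guess too small: ", actual > guessed)
```

A buffer allocated before the format correction would be sized from `guessed`, which is why the correction has to fire first.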
@IanButterworth IanButterworth merged commit 6d2f1dd into master Mar 16, 2026
18 checks passed
@IanButterworth IanButterworth deleted the ib/hwaccel branch March 16, 2026 21:03