Skip to content

[UNMAINTAINED] feat(vnc): add VNC server with H.264#1447

Closed
mcuelenaere wants to merge 21 commits into
jetkvm:devfrom
mcuelenaere:claude/quirky-shaw-b8e9bf
Closed

[UNMAINTAINED] feat(vnc): add VNC server with H.264#1447
mcuelenaere wants to merge 21 commits into
jetkvm:devfrom
mcuelenaere:claude/quirky-shaw-b8e9bf

Conversation

@mcuelenaere
Copy link
Copy Markdown
Contributor

@mcuelenaere mcuelenaere commented May 6, 2026

Closes #117

Summary

Adds a VNC server that an H.264-capable VNC client (TigerVNC ≥ 1.13.0 built with FFmpeg, or PiKVM's mdevaev/tigervnc build) can connect to as an alternative to the WebRTC web UI. Frames are shipped using RFB pseudo-encoding 50 (rfbproto PR #39, TigerVNC PR #1194) — the JetKVM device emits H.264 NALs, and the server forwards them to the client unchanged. No software decode on the device.

Why this approach

The "obvious" path — implement a normal RFB server that ships raw RGB pixels — is not viable here. JetKVM's video pipeline only emits already-encoded H.264/H.265 NALs from an opaque native binary; there is no path to raw frames or JPEG snapshots. Software-decoding H.264 just to re-encode as Tight/Raw would burn most of the CPU budget on the embedded ARM SoC.

PiKVM solved the same problem with RFB pseudo-encoding 50 ("Open H.264 Encoding"), which carries an Annex-B H.264 bytestream straight to the client. JetKVM already produces exactly the right bytes. Compatible with TigerVNC ≥ 1.13.0 built with FFmpeg.

RDP was considered (and the user asked) but rejected: the RDP stack (TPKT/X.224 → MCS → TLS/CredSSP/NLA → RDPGFX for H.264) is roughly an order of magnitude more code than RFB, and there is no mature pure-Go RDP server.

What's in the PR

Layered bottom-up across 9 commits, each independently reviewable and not breaking the build:

  1. video: refcount the capture pipeline — replaces the WebRTC-specific VideoStart/VideoStop with named-consumer refcount (acquireVideoStream / releaseVideoStream). Behaviour-preserving prerequisite.
  2. internal/rfb: protocol primitives — pure-Go RFB 3.8 server library (handshake, VNCAuth via crypto/des, message decoders, Raw + OpenH264 + DesktopSize encoding writers). 38 unit tests, no JetKVM imports.
  3. internal/rfb: keysym to HID Usage ID mapping — full ~150-entry table: ASCII printable, F1–F24, modifiers, navigation, full keypad, lock keys, Latin-1.
  4. internal/rfb: placeholder framebuffer for non-H.264 clients — "Please use TigerVNC ≥ 1.13.0 with H.264 support" rendered via golang.org/x/image/font/basicfont. Pinned to v0.18.0 for Go 1.24 compatibility.
  5. vnc: scaffold (config, JSON-RPC, logger) — adds VncEnabled / VncPort / VncPassword config fields, JSON-RPC getters/setters, vncLogger, and stub Start/StopVNCServer.
  6. vnc: server, frame fan-out, HID forwarding — fills out the scaffold: TCP listener, per-client bounded channels (cap=2, non-blocking sends drop on full so slow clients can't backpressure WebRTC), SPS/PPS caching for late-joiners, IDR-frame filtering, HID forwarding (PointerEventrpcAbsMouseReport + wheel translation; KeyEvent → keysym lookup → rpcKeypressReport). resolveCodec forces H.264 when VNC is enabled so a Mac/Windows browser can't renegotiate H.265 mid-VNC-session.
  7. ui: VNC settings panel/settings/vnc route with toggle, port, password. Localization keys added to all 14 language files (English values only — translations are a follow-up).
  8. vnc: drop VncAllowOverWAN — removed an opaque non-loopback-bind guard that surfaced as a generic "internal error" in the frontend. Operators are responsible for tunneling VNCAuth via SSH/Tailscale on untrusted networks; godoc on VncPassword says so.
  9. vnc: pin H.264 codec when VNC starts the capture pipeline — fixes a subtle bug uncovered by on-device testing: when VNC was the first consumer of the capture pipeline (no browser open), VideoSetCodecType was never called, so the codec stayed at whatever the previous WebRTC session left it. With H.265 sessions on Mac/Windows browsers, VNC then shipped H.265 NALs over OpenH264 (which is H.264-only) and clients silently failed to decode. acquireVideoStreamWithCodec(consumer, codecType) now sets the codec only when the caller is the first consumer, preserving the "must not be called mid-stream" invariant.

Tests

  • 38 unit tests in internal/rfb covering handshake byte sequences, VNCAuth (DES), message round-trips, Raw + OpenH264 + DesktopSize encoding writers, keysym→HID lookup, and placeholder rendering.
  • 11 unit tests in the top-level kvm package for the H.264 Annex-B splitter, mouse-coord scaling, and wheel-button rising-edge detection.
  • End-to-end on-device: deployed and verified handshake, auth, and HID input forwarding work; codec-pinning bug surfaced and fixed under real load.

Known limitations / follow-ups

  • Up-to-GOP-length blank screen on connect. No ForceKeyFrame RPC exists in the native binary. New clients wait for the next natural IDR (typically ≤ 1–2 s). Documented; the right fix is a native-binary change.
  • No JPEG fallback for non-H.264 clients. They get the placeholder Raw rect; mouse/keyboard still work. JPEG fallback would require native-binary support for a JPEG memsink.
  • VeNCrypt / TLS not yet implemented. VNCAuth is the only auth mechanism; users on untrusted networks should tunnel via SSH or Tailscale, or use LocalLoopbackOnly. The repo already has TLS infrastructure (web_tls.go) that VeNCrypt can reuse later.
  • Clipboard, mobile-VNC quirks, PiKVM-style hotkey sequences — out of scope; nice follow-ups.
  • Coordination with feat: add RTSP server, Chromecast WebRTC casting, and multi-session support #1329. That PR adds RTSP and a multi-session WebRTC registry. It also has H.264 codec coordination needs; if it merges first, the small splitAnnexB helper duplicated here can be consolidated. Per its review status I chose not to base on its branch — the conflict is one line in native.go either way.

Logging

The default log level is WARN. To see VNC diagnostics (server listening, client connections, advertised encodings):

JETKVM_LOG_INFO=vnc /userdata/jetkvm/bin/jetkvm_app
# or full trace:
JETKVM_LOG_TRACE=vnc

Set persistently via a systemd unit override at /etc/systemd/system/jetkvm.service.d/log.conf.

Checklist

  • Ran make test_e2e locally and passed
  • Linked to issue(s) above by issue number (Closes #117)
  • One problem per PR (no unrelated changes)
  • Lints pass; CI green — running on push
  • Tricky parts are commented in code (codec coordination, frame buffer aliasing, IDR-wait, SPS/PPS cache priming)

🤖 Generated with Claude Code


Note

High Risk
High risk because it introduces a new TCP-exposed remote-access service with authentication handling and changes capture-pipeline start/stop/codec coordination shared with WebRTC, which can impact availability and video behavior.

Overview
Adds a configurable VNC server (RFB 3.8) that listens on a user-set port, supports None or classic VNCAuth, and streams video using OpenH264 pseudo-encoding 50 (with a rendered Raw placeholder for non-H.264 clients), including resolution-change signaling and an extended mouse-buttons negotiation path.

Integrates VNC into the runtime: new config fields + JSON-RPC getVNCConfig/setVNCConfig, new vnc logger, VNC start on boot when enabled, and per-frame fan-out from the native video callback with SPS/PPS caching, IDR resync logic, and HID input forwarding.

Refactors capture-pipeline lifecycle to a named-consumer refcount (acquireVideoStreamWithCodec/releaseVideoStream) used by both WebRTC and VNC, updates HDMI sleep behavior to depend on active consumers, and forces H.264 codec selection whenever VNC is enabled to avoid H.265/VNC decoder desync.

Adds a new /settings/vnc UI route (with localization keys) to manage enable/port/password/keymap, and introduces golang.org/x/image dependency for placeholder rendering.

Reviewed by Cursor Bugbot for commit 83ca5c2. Bugbot is set up for automated code reviews on this repo. Configure here.

mcuelenaere and others added 9 commits May 1, 2026 12:33
Replace the WebRTC-specific VideoStart/VideoStop calls with a named-
consumer refcount. The first acquire starts the native pipeline and
pauses the HDMI sleep ticker; the last release stops it. WebRTC is
the only consumer today, so behavior is unchanged.

This sets up the infrastructure for additional consumers (RTSP, VNC)
to share the capture pipeline without each duplicating its own
start/stop logic or fighting over the sleep timer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pure-Go RFB 3.8 protocol library, no JetKVM dependencies. Includes
the version handshake, security negotiation, VNCAuth (DES challenge),
ClientInit/ServerInit, all client-message decoders, and Raw +
OpenH264 (pseudo-encoding 50) + DesktopSize encoding writers.

OpenH264 follows the wire format from rfbproto PR jetkvm#39 and TigerVNC
PR jetkvm#1194 — payload is U32 length, U32 flags, then an Annex-B H.264
bytestream. Compatible with TigerVNC ≥ 1.13.0 built with FFmpeg.

Tests cover the handshake byte sequence, VNCAuth match against a
DES-derived reference, message decoder round-trips, and encoding
writers byte-for-byte.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Translates X11 keysyms (carried by RFB KeyEvent) to USB HID Usage IDs
(consumed by JetKVM's USB gadget). Covers ASCII printable, F1–F24, all
modifiers, navigation cluster, full keypad, lock keys, Print/Pause/Menu.

Shifted ASCII keysyms ('!', '@', etc.) map to the same scancode as
their unshifted counterpart; clients send Shift_L/R presses
separately, and the host applies the modifier itself. This matches
TigerVNC and PiKVM behaviour.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Renders a "Please use TigerVNC >= 1.13.0 with H.264 support" message
on a dark blue background, sized to the framebuffer dimensions. Used
as a Raw rectangle (encoding 0) when a client connects but doesn't
advertise OpenH264 (encoding 50) in its SetEncodings list.

Uses x/image/font/basicfont (Face7x13) — adds golang.org/x/image to
go.mod, pinned to v0.18.0 for Go 1.24 compatibility.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add VncEnabled, VncPort (default 5900), VncPassword, VncAllowOverWAN
fields to Config plus JSON-RPC getters/setters and a vncLogger. The
new vnc.go exposes Start/StopVNCServer entry points that subsequent
commits fill in; today they just log.

The VncPassword setter accepts a "********" sentinel meaning "leave
unchanged", so the frontend can submit the form without re-typing
the password and never round-trips the plaintext value back from
the server.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fills out the VNC scaffold from the previous commit into a working
server. Clients connect, authenticate (VNCAuth, or none if no
password is set), receive H.264 frames as RFB encoding-50 rectangles,
and drive the host's USB HID keyboard and mouse.

Wiring:
- native.go's frame callback now also calls vncServer.WriteFrame(),
  which fans the encoded frame out to per-client bounded channels
  (cap 2; non-blocking sends drop on full so slow clients can't
  back-pressure WebRTC).
- OnVideoStateChange notifies VNC clients of resolution changes;
  the dispatcher emits a DesktopSize pseudo-encoding update plus
  the encoding-50 reset-decoder bit on the next outgoing rect.
- main.go starts the VNC server when config.VncEnabled.
- webrtc.go's resolveCodec forces H.264 when VNC is enabled, so a
  Chrome session can't renegotiate to H.265 mid-stream and desync
  any TigerVNC clients (encoding 50 is H.264-only).

Protocol fidelity:
- Cached SPS/PPS are prepended to the first encoding-50 rectangle a
  new client sees, so its decoder is primed before the next IDR
  arrives. We can't ask the encoder for an immediate keyframe, so
  non-IDR frames are dropped at the dispatcher until the natural
  IDR arrives (typically <= 1-2 s).
- Clients that don't advertise OpenH264 (encoding 50) get a Raw
  rectangle with the "please use TigerVNC >= 1.13.0" placeholder.
- VncAllowOverWAN gates non-loopback binding so a casually-deployed
  device doesn't expose VNCAuth on the public internet.

splitAnnexB() is duplicated from PR jetkvm#1329's rtsp.go; if that PR
lands first, follow up with a consolidation commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a /settings/vnc route with toggle, port, password, and
allow-over-WAN switch, wired to the JSON-RPC getVNCConfig /
setVNCConfig methods registered in the previous commit. Includes a
help banner pointing users to TigerVNC >= 1.13.0 with H.264 support
for video.

Localization keys are added to all 14 language files; only the
English values are localized, the rest hold the same English text
and need translation in a follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The flag's failure mode was opaque: if a user enabled VNC without
also flipping VncAllowOverWAN, the JSON-RPC call returned an error
that surfaced as a generic "internal error" toast in the frontend
with no actionable hint. Two settings to enable one feature is also
a poor experience.

Drop the field everywhere. VNCAuth's well-known weaknesses are now
documented in the VncPassword godoc and the help banner on the
settings page already points users to TigerVNC + tunneling for
untrusted networks.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When VNC was the first consumer of the video stream (no browser open
when the first VNC client connected), acquireVideoStream() called
VideoStart() without first calling VideoSetCodecType, so the codec
stayed at whatever the previous WebRTC session had left it. If a
browser had negotiated H.265 before disconnecting, VNC then shipped
H.265 NALs over OpenH264 (encoding 50), which the spec defines as
H.264-only — clients silently fail to decode.

Split acquireVideoStream into a thin wrapper plus
acquireVideoStreamWithCodec(consumer, codecType). VNC always passes
0 (H.264). WebRTC passes the codec it negotiated. VideoSetCodecType
runs only when the caller is the first consumer (i.e. about to call
VideoStart), preserving the "must not be called mid-stream"
invariant.

Also log the client's full SetEncodings list (info-level) when a
client connects, so it is obvious from the logs whether TigerVNC is
actually advertising encoding 50.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@CLAassistant
Copy link
Copy Markdown

CLAassistant commented May 6, 2026

CLA assistant check
All committers have signed the CLA.

@mcuelenaere mcuelenaere changed the title feat(vnc): add VNC server with H.264 (RFB encoding 50) feat(vnc): add VNC server with H.264 May 6, 2026
Comment thread vnc_conn.go
Comment thread ui/src/routes/devices.$id.settings.vnc.tsx
Comment thread internal/rfb/messages.go
mcuelenaere and others added 2 commits May 6, 2026 20:53
If a browser tab negotiated H.265 before VNC was toggled on, the
capture pipeline keeps emitting H.265 NALs. resolveCodec only
forces H.264 for *new* WebRTC sessions; the in-flight one is not
disturbed. So when a TigerVNC client subsequently connected, it
received H.265 NALs marked as OpenH264 (encoding 50), which the
spec defines as H.264-only — the client's decoder failed silently
and the user kept seeing the (already-rendered) placeholder.

After enabling VNC via JSON-RPC, check the running codec; if it is
not H.264, cycle the pipeline (VideoStop → SetCodecType(0) →
VideoStart). The active WebRTC session will see frames briefly
disappear and then need to refresh — the warning log says so.

Also adds trace-level diagnostics so the user can see what is
happening with `JETKVM_LOG_TRACE=vnc`:

  - codec value at the moment a VNC client connects
  - dropped non-IDR frames during the keyframe wait
  - rect emissions (OpenH264 size/flags or Raw placeholder)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three issues from Cursor Bugbot's review on commit 1d41410, plus the
golangci-lint failures from the same CI run.

1. Deadlock in writeUpdate (HIGH severity).

   Conn.LockWrite acquired writeMu, then Conn.Flush at the end
   re-acquired the same non-reentrant mutex. Every VNC client hung on
   its first FramebufferUpdate; this is most likely the actual cause
   of the placeholder-forever symptom seen on-device.

   The intended pattern was a multi-call composite write with a
   private flushLocked variant — but in practice we have a strict
   single-writer model: serveClient drives the handshake on its own
   goroutine, then the dispatcher goroutine is the only writer
   afterward. The reader goroutine never writes. So the mutex adds
   no safety, only a deadlock. Strip writeMu, LockWrite, UnlockWrite,
   and flushLocked. Document the single-writer requirement on Conn.

2. SetEncodings allocated on the wire-supplied count (LOW severity).

   make([]EncodingType, count) was allocating up to 65 535 * 4 bytes
   regardless of the maxEncodings cap (256). Allocate based on the
   capped n; the loop still drains the full count from the wire so
   framing for the next message stays aligned.

3. Save button hidden when VNC was disabled in the UI (MEDIUM
   severity).

   The button lived inside the {settings.enabled && (...)} block, so
   unchecking the toggle hid the button — the disabled state could
   not be persisted. Move the button outside the conditional.

Linter follow-ups (golangci-lint v2.4 in CI):

- goimports re-aligned const blocks in keysym.go and placeholder.go.
- Removed the unused acquireVideoStream wrapper now that all callers
  pass an explicit codec.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread vnc_conn.go
Comment thread native.go
The kvm-package-level vnc_h264.go and vnc_h264_test.go were the only
tests in package kvm, and that package transitively imports
internal/native — which links libjknative.a, only built for ARM.
Running 'go test ./...' on the CI's x86_64 Linux host therefore
failed at link time as soon as a test file made package kvm
participate in the test build.

Move the H.264 Annex-B splitter (and its tests) into internal/rfb,
where it logically belongs: the encoding-50 path is the only
consumer, and the rfb package has no platform-specific imports so
it tests cleanly on any host. Export it as rfb.SplitAnnexB.

Drop the trivial scaleCoord/rising helpers' tests with this; the
functions are five lines each and exercised end-to-end on the
device.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@mcuelenaere mcuelenaere force-pushed the claude/quirky-shaw-b8e9bf branch from c88a6e1 to a816f53 Compare May 6, 2026 19:11
Comment thread vnc.go
Comment thread vnc.go
Two more bugs from Cursor Bugbot's review on a816f53.

1. Right-click became middle-click on the host.

   RFB (RFC 6143 §7.5.5) and USB HID disagree on the order of bits 1
   and 2 in the mouse-button byte:

       RFB layout:  bit 0 = left, bit 1 = MIDDLE, bit 2 = RIGHT
       USB HID:     bit 0 = left, bit 1 = RIGHT,  bit 2 = MIDDLE

   handlePointerEvent was passing the RFB mask straight to
   rpcAbsMouseReport with only the wheel pseudo-bits stripped, so a
   VNC client's right-click reached the host as a middle-click and
   vice versa.

   Add rfb.PointerMaskToHIDButtons that does the bit 1<->2 swap and
   drops wheel bits, with a unit test for the obvious combinations.

2. Data race on the vncServer global.

   StartVNCServer / StopVNCServer wrote vncServer under vncServerMu,
   but the OnVideoStateChange and OnVideoFrameReceived callbacks in
   native.go read it without taking the lock. Even though the
   nil-receiver methods were safe, the read itself is undefined
   behaviour under Go's memory model and the race detector would
   flag it on every frame.

   Convert the global to atomic.Pointer[VNCServer]: hot-path readers
   use Load() (no locking, no allocations); the start/stop pair
   keeps a serialising mutex so a config change can't half-create or
   half-destroy the server, and uses Store / Swap to publish the
   pointer atomically.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread vnc_conn.go
mcuelenaere and others added 4 commits May 7, 2026 13:55
Logs the raw keysym, down flag, mapped HID code, and whether the
keysym was in our table — under TRC level so it only fires with
JETKVM_LOG_TRACE=vnc.

Motivated by an on-device report from a macOS TightVNC user where
left Cmd reached the host as Option, left Option was not received,
and Caps Lock didn't toggle. Mac VNC clients vary widely in how
they map modifiers to X11 keysyms; without seeing the actual
keysyms on the wire it is impossible to tell whether the client
mapping is wrong, our table is wrong, or macOS itself is
intercepting the key.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mirror what we did for KeyEvent: log the raw RFB button-mask, the
pointer position, and the resulting HID button byte at trace level
on every PointerEvent. Lets a user grep the device log to discover
exactly what their client sends when they click the side buttons
on a Logitech-style mouse.

Mouse buttons 4 and 5 (back/forward) have no canonical RFB
representation — the 8-bit button-mask reserves bits 0..2 for
primary buttons and 3..6 for wheel emulation. Some clients send
side buttons on bit 7 or via the Extended PointerEvent pseudo-
encoding (not implemented here). Wire up the actual bits once we
see what a real client emits.

The JetKVM USB gadget descriptor already advertises buttons 1..5
(Usage Maximum 5), so the host side will accept buttons 4 and 5 as
soon as we forward them.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…logging)

User reported visual artifacts during scrolling on the VNC stream
that don't appear in the WebRTC stream. Most likely cause: my
non-blocking per-client frame channel (cap 2) drops H.264 frames
under brief TCP backpressure when the encoder produces large P-frames.
Each dropped P-frame breaks the decoder's prediction chain and stays
broken until the next IDR (1–2 s later).

WebRTC doesn't show this because pion's RTP path runs over UDP and
doesn't queue at the application layer the same way.

Three changes, all aimed at making the TCP path keep up:

1. SetNoDelay(true) on accepted connections. The frame is sent as a
   composite of small writes (FBU header + rect header + length +
   flags + payload); Nagle would coalesce them with up to 40 ms of
   added latency, which is enough to overflow my channel during
   high-motion frames.

2. Per-client frame buffer cap raised from 2 to 8 — enough headroom
   for ~130 ms of TCP write stalls at 60 fps without dropping.

3. A WARN log every ~60th drop so the user can confirm the channel
   is overflowing if artifacts persist, plus the cumulative dropped
   count on disconnect.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Firing on every FramebufferUpdate is one log line per frame at 60
fps per VNC client; with JETKVM_LOG_TRACE=vnc enabled, that floods
the device log so badly that the actually-useful traces (key event,
pointer event, dropped frames) get pushed off.

The information was only ever useful for the original "is VNC even
sending rects?" diagnosis, which is no longer in question.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread vnc_conn.go
mcuelenaere and others added 2 commits May 7, 2026 14:26
Trace logs from a user on macOS TightVNC showed the client emits
non-standard keysyms for Cmd / Option:

  Mac key       keysym                  meaning on Linux
  ----------    ----------------------  --------------------
  Left Cmd      Alt_L            (e9)   Left Alt
  Right Cmd     Super_L          (eb)   Left Super
  Left Option   Mode_switch      (7e)   AltGr / unmapped
  Right Option  ISO_Level3_Shift (03)   AltGr (works)

So in the default Linux-convention table:
- Left Cmd reaches the host as Left Option,
- Right Cmd reaches the host as Left Cmd,
- Left Option doesn't reach the host at all.

Globally remapping these keysyms would break Linux/Windows clients
that send Alt_L for the actual Alt key. Add a per-server keymap
config option instead:

  vnc_keymap = "default"  (or empty)  -- current behaviour
  vnc_keymap = "macos"                -- TightVNC-on-Mac overrides

The macos overrides are applied before the default table lookup,
covering only the three keysyms we have evidence for. Other Mac VNC
clients (Apple Screen Sharing, Chicken, TigerVNC's Mac viewer) may
need additional entries — extend the override map as those traces
come in.

UI: dropdown on the VNC settings panel; localization keys added in
English to all 14 locale files (translations are a follow-up).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
TigerVNC's pseudo-encoding -316 widens the PointerEvent button-mask
from 7 bits to 15 bits, which is what carries Logitech-style back /
forward side buttons (X buttons 8 and 9 → HID buttons 4 and 5).

Source of truth for the wire format: TigerVNC's CMsgWriter.cxx /
SMsgReader.cxx / SMsgWriter.cxx. The extension works as follows:

1. Client advertises -316 in SetEncodings.
2. Server confirms by sending a fake FramebufferUpdate with a single
   0x0 rectangle at (0,0) encoded as -316.
3. The PointerEvent message gains a marker bit at bit 7 of the byte
   that follows the message type. Marker = 0 keeps the legacy 6-byte
   format; marker = 1 selects the extended 7-byte form, where the
   last byte holds RFB mask bits 7..14.

Server-side changes:

  - rfb.PointerEventMessage.ButtonMask widened to uint16; parser
    reads the trailing high byte when SetExtendedMouseButtons(true)
    has been called and the marker is set. Without negotiation, bit 7
    is treated as reserved (ignored), matching TigerVNC's writer
    which always clears it on legacy frames.
  - rfb.PointerMaskToHIDButtons routes RFB bit 7 (back) → HID bit 3
    and RFB bit 8 (forward) → HID bit 4. The JetKVM USB gadget
    descriptor already advertises Buttons 1..5, so the host sees
    them as soon as we forward.
  - vnc_conn detects the encoding in SetEncodings, queues an
    announce rect on the next FramebufferUpdate, and flips the
    rfb.Conn into extended mode so subsequent PointerEvents are
    parsed in the new format.

Tests cover the legacy/extended parser switch, marker-bit
boundary handling, the announce rect's wire bytes, and the
back/forward → HID translation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread internal/rfb/conn.go
// wire format. Set by SetExtendedMouseButtons after the server
// has sent the announce rectangle. Single-writer model also
// applies to this flag.
extendedMouseButtons bool
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Data race on extendedMouseButtons between goroutines

Medium Severity

The extendedMouseButtons field on rfb.Conn is a plain bool written by the dispatch goroutine via SetExtendedMouseButtons(true) and read concurrently by the read goroutine in ReadClientMessage. This is a data race under Go's memory model — no synchronization (mutex, atomic, or channel) ensures the reader sees the updated value. If the reader sees a stale false, it will parse an extended 7-byte PointerEvent as the legacy 6-byte format, consuming one byte too few and permanently misaligning all subsequent message parsing on that connection.

Additional Locations (2)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit ba5cd0a. Configure here.

Comment thread vnc_conn.go
mcuelenaere and others added 2 commits May 7, 2026 15:07
Three small tweaks aimed at the latency / artifact gap a user is
seeing between VNC and WebRTC. None of them is a silver bullet —
RFB is TCP-only and that's the architectural ceiling — but each one
removes a low-grade source of pain:

  - rfb.NewConn now sizes its bufio.Writer at 64 KB instead of the
    default 4 KB. With NoDelay enabled the dispatcher's flush
    boundary becomes the kernel's TCP write boundary, so a 50 KB
    P-frame previously turned into ~12 syscalls / TCP segments.
    Most P-frames now go out in one Write.

  - serveClient bumps SO_SNDBUF to 4 MB on the accepted TCP conn so
    a bursty I-frame doesn't immediately backpressure the dispatcher
    against a default ~64 KB kernel buffer.

  - The first-frame SPS/PPS prepend now only fires when the IDR
    *doesn't* already include parameter sets. Most modern H.264
    encoders emit SPS+PPS+IDR in a single access unit, so the
    previous unconditional prepend put SPS/PPS on the wire twice.
    Most decoders tolerate it but it's still an unusual stream
    shape — easier to debug if we don't generate it. Includes a
    helper frameHasParameterSets that walks the NALs once.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When my non-blocking channel send drops an H.264 P-frame, the
client's decoder loses the prediction chain and produces blocky
artifacts until the next IDR — typically ~1 s of GOP. The user's
recording confirmed this: smear/blocks during scroll that heal
exactly at GOP boundaries.

Today, after a drop the dispatcher kept feeding the client every
subsequent P-frame, all of which the decoder was already incapable
of decoding. That extends the visible glitch unnecessarily and
keeps the TCP send buffer full while doing nothing useful.

Convert waitingForIDR from a stateMu-guarded bool into an
atomic.Bool so WriteFrame can flip it from the producer goroutine
on every drop. Combined with the existing dispatcher logic that
already drops non-IDR frames in this state, the effect is:

  - First drop -> resync mode on for the connection.
  - WriteFrame stops queuing P-frames into a doomed channel
    (also relieves TCP backpressure faster).
  - The dispatcher pulls and discards anything still queued until
    the next IDR arrives.
  - When the IDR arrives, we reset c.primed (so SPS/PPS get
    re-prepended if the IDR doesn't already include them), set
    needsResetCtx so the rect carries OpenH264FlagResetContext,
    and the decoder re-initialises cleanly on a single keyframe.

This doesn't reduce the drop count — TCP backpressure under WiFi /
Tailscale is what it is — but each drop now produces a brief freeze
followed by a clean keyframe, instead of a full GOP of corruption.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 2 total unresolved issues (including 1 from previous review).

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 83ca5c2. Configure here.

Comment thread vnc_conn.go
// client regardless of whether we prepended or not — once a
// decodable frame has reached the client, the subsequent
// stream is self-contained.
c.primed = true
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Data race on primed field between goroutines

Medium Severity

The primed field is read (line 308) and written (line 330) in writeUpdate without holding stateMu, but WriteFrame sets c.primed = false under stateMu from the producer goroutine. This is a data race between the dispatcher and producer goroutines. The field needs to be read/written under the same lock in both locations, or moved into the existing stateMu-guarded block at the top of writeUpdate.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 83ca5c2. Configure here.

@quentinmit
Copy link
Copy Markdown

The "obvious" path — implement a normal RFB server that ships raw RGB pixels — is not viable here. JetKVM's video pipeline only emits already-encoded H.264/H.265 NALs from an opaque native binary; there is no path to raw frames or JPEG snapshots.

This doesn't seem correct to me. The native binary is open-source and in this repo, and it captures raw YUV data:

fmt.fmt.pix_mp.pixelformat = V4L2_PIX_FMT_YUYV;

That YUV data is currently being given to the RK_MPI video encoder, but it could be processed directly by the VNC code. (IDK if VNC supports YUV data, so you might need to ask for RGB data instead.)

You could probably even selectively feed portions of a frame to the hardware H.264 or JPEG encoders after calculating damage.

Comment thread vnc_conn.go
Comment on lines +65 to +71
// waitingForIDR signals that the client's H.264 decoder needs an
// IDR keyframe to (re)synchronise — set on initial connect and
// also flipped back to true by WriteFrame whenever a drop occurs,
// so the dispatcher can skip every subsequent P-frame instead of
// shoveling broken references at the decoder for the rest of the
// GOP. Cleared once the IDR has been emitted with the OpenH264
// reset-decoder flag.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without closing the feedback loop to the encoder, this is going to cause significant latency to build up if the client isn't able to request frames at the same rate as the encoder is producing them, until eventually the queue is full and it (re)starts with a new I-frame.

The update request should be plumbed all the way through to the encoder loop so it only encodes the next frame when there is a client that needs it:

if (RK_MPI_VENC_SendFrame(VENC_CHANNEL, &stFrame, 2000) != RK_SUCCESS)

mcuelenaere added a commit to mcuelenaere/kvm that referenced this pull request May 12, 2026
…eVideo

Introduces acquireVideoStream / releaseVideoStream / videoStreamHasConsumers
helpers in video.go that gate the native encoder's VideoStart / VideoStop
and the HDMI sleep ticker on a set of named consumers. The encoder runs
iff at least one consumer holds it; the 0→1 transition runs VideoStart so
the first delivered frame is an IDR.

Each WebRTC Session gets a stable id (uuid in newSession) and its own
consumer key "webrtc:<id>". The slot is acquired when ICE reaches
Connected and released when it reaches Closed. pauseVideo / resumeVideo
JSON-RPC notifications toggle the calling session's own slot, handled
inline in onRPCMessage so the dispatcher's session reference is in scope
(the generic reflection-based dispatch doesn't pass *Session).

This data model fixes by construction the two race classes Cursor Bugbot
flagged on earlier drafts of this PR:

- Handover-while-paused: the new session's own acquire is what starts
  the encoder; no special-case "restart if the outgoing session was
  paused" code at the handover site.
- Stale-session pause: a pauseVideo from the soon-to-close session
  releases only its own slot, not the new session's; no need for a
  session != currentSession gate.

The 1s handover overlap can briefly deliver frames to a paused session's
track if the other session is keeping the encoder alive; bounded by the
existing peer-connection close delay and acceptable for a feature aimed
at sustained idle bandwidth saving.

The refcount helpers are byte-identical to those in PR jetkvm#1447 (commit
dec25e1), so that PR will merge cleanly on top.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@mcuelenaere
Copy link
Copy Markdown
Contributor Author

I've decided to no longer pursue this feature. My original reasons for adding VNC support, was because I wanted a better client experience. After realizing that this was going to become very difficult by going the VNC route, I've decide to instead write a custom client, leveraging WebRTC as protocol.

This solves all my needs and thus I an no longer interested in adding VNC support.

Having said that, there is still value in this work so I won't close this PR (unless the maintainers want me to). Someone can start from this and maybe implement the suggested (delta) JPEG encoding to

  1. add support for a wider range of VNC clients (only TigerVNC compiled with ffmpeg supports H264)
  2. potentially improve latency (although I very much doubt that will fix the issue)

@mcuelenaere mcuelenaere changed the title feat(vnc): add VNC server with H.264 [UNMAINTAINED] feat(vnc): add VNC server with H.264 May 12, 2026
mcuelenaere added a commit to mcuelenaere/kvm that referenced this pull request May 26, 2026
…eVideo

Introduces acquireVideoStream / releaseVideoStream / videoStreamHasConsumers
helpers in video.go that gate the native encoder's VideoStart / VideoStop
and the HDMI sleep ticker on a set of named consumers. The encoder runs
iff at least one consumer holds it; the 0→1 transition runs VideoStart so
the first delivered frame is an IDR.

Each WebRTC Session gets a stable id (uuid in newSession) and its own
consumer key "webrtc:<id>". The slot is acquired when ICE reaches
Connected and released when it reaches Closed. pauseVideo / resumeVideo
JSON-RPC notifications toggle the calling session's own slot, handled
inline in onRPCMessage so the dispatcher's session reference is in scope
(the generic reflection-based dispatch doesn't pass *Session).

This data model fixes by construction the two race classes Cursor Bugbot
flagged on earlier drafts of this PR:

- Handover-while-paused: the new session's own acquire is what starts
  the encoder; no special-case "restart if the outgoing session was
  paused" code at the handover site.
- Stale-session pause: a pauseVideo from the soon-to-close session
  releases only its own slot, not the new session's; no need for a
  session != currentSession gate.

The 1s handover overlap can briefly deliver frames to a paused session's
track if the other session is keeping the encoder alive; bounded by the
existing peer-connection close delay and acceptable for a feature aimed
at sustained idle bandwidth saving.

The refcount helpers are byte-identical to those in PR jetkvm#1447 (commit
dec25e1), so that PR will merge cleanly on top.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
mcuelenaere added a commit to mcuelenaere/kvm that referenced this pull request May 29, 2026
…eVideo

Introduces acquireVideoStream / releaseVideoStream / videoStreamHasConsumers
helpers in video.go that gate the native encoder's VideoStart / VideoStop
and the HDMI sleep ticker on a set of named consumers. The encoder runs
iff at least one consumer holds it; the 0→1 transition runs VideoStart so
the first delivered frame is an IDR.

Each WebRTC Session gets a stable id (uuid in newSession) and its own
consumer key "webrtc:<id>". The slot is acquired when ICE reaches
Connected and released when it reaches Closed. pauseVideo / resumeVideo
JSON-RPC notifications toggle the calling session's own slot, handled
inline in onRPCMessage so the dispatcher's session reference is in scope
(the generic reflection-based dispatch doesn't pass *Session).

This data model fixes by construction the two race classes Cursor Bugbot
flagged on earlier drafts of this PR:

- Handover-while-paused: the new session's own acquire is what starts
  the encoder; no special-case "restart if the outgoing session was
  paused" code at the handover site.
- Stale-session pause: a pauseVideo from the soon-to-close session
  releases only its own slot, not the new session's; no need for a
  session != currentSession gate.

The 1s handover overlap can briefly deliver frames to a paused session's
track if the other session is keeping the encoder alive; bounded by the
existing peer-connection close delay and acceptable for a feature aimed
at sustained idle bandwidth saving.

The refcount helpers are byte-identical to those in PR jetkvm#1447 (commit
dec25e1), so that PR will merge cleanly on top.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add VNC Support as an Alternative to the Web Client

4 participants