Skip to content

Fix/registry and keep alive config phase#76

Closed
rayanbzd wants to merge 14 commits into
Quozul:masterfrom
rayanbzd:fix/registry-and-keep-alive-config-phase
Closed

Fix/registry and keep alive config phase#76
rayanbzd wants to merge 14 commits into
Quozul:masterfrom
rayanbzd:fix/registry-and-keep-alive-config-phase

Conversation

@rayanbzd
Copy link
Copy Markdown
Contributor

Fix Velocity + PacketEvents interop issues during configuration phase

Context

We run PicoLimbo as a backend behind a Velocity proxy with PacketEvents 2.12.1. The proxy intentionally holds connecting players in the CONFIGURATION state for an extended period (typically more than 30 seconds) to run a custom registration / authentication dialog flow before transitioning the player to the actual gameplay server.

In this setup we hit a cluster of bugs that prevent PicoLimbo from working as a Velocity backend for modern Minecraft clients (1.21.11+ / protocol 26.1.x). This PR fixes all of them.

The root causes are independent but they interact:

  • PacketEvents fails to parse the minecraft:timeline registry that PicoLimbo emits.
  • PicoLimbo doesn't implement the select_known_packs handshake, so it sends every registry to every client.
  • PicoLimbo sends a minecraft:core pack version string that doesn't match what 26.1.x clients have locally.
  • PicoLimbo never sends keep_alive while the client is in CONFIGURATION, so Velocity's backend read-timeout (30 s by default) tears down the connection during the dialog flow.

Bug 1 — IllegalStateException from PacketEvents on minecraft:timeline

Symptom (Velocity logs):

java.lang.IllegalStateException: Error while reading registry minecraft:timeline

Reproduces on every connection from a 26.1.x client passing through Velocity + PacketEvents 2.12.x.

Root cause:

PicoLimbo's json_to_nbt converter downcasts integers to the smallest-fitting NBT type (e.g. 0Byte, 1000Short, 24000Short) and downcasts floats to Double when precision can't round-trip to Float. Mojang's vanilla server emits the canonical NBT types: Int (or Long on overflow) for integers, Float (32-bit) for floats, always.

The MC 26.1 minecraft:timeline registry codec on the client (and the corresponding decoder in PacketEvents 2.12.x) expects strictly canonical tags. When a Short(24000) arrives where the codec expects Int(24000), the decoder throws. Cross-reference: retrooper/packetevents#1482.

Fix:

  • New NbtOptions::numeric_widening flag in pico_nbt and a new json_to_nbt_with_options(json, options) entry point.
  • When numeric_widening is enabled: integer JSON numbers always become Value::Int (or Long on overflow) — never downcast Byte/Short. Floats always become Value::Float (32-bit), even at the cost of f64→f32 precision. JSON booleans remain Value::Byte(0/1) (vanilla behavior). Integer arrays become IntArray / LongArray, never ByteArray.
  • pico_registries is the only call site that loads registry JSON; it now passes numeric_widening(true).
  • Pinning regression test against the real V26_1 timeline/day data report.

Default behavior of json_to_nbt (without the flag) is unchanged — any other use of the converter is unaffected.


Bug 2 — Wrong minecraft:core pack version in ClientBoundKnownPacksPacket

Symptom:

A 26.1.x vanilla client never accepts the pack PicoLimbo offers. The client's select_known_packs response is an empty list.

Root cause:

ClientBoundKnownPacksPacket::new(protocol_version.humanize())humanize() returns the protocol's minor name, which for V26_1 is "26.1". The client compares the pack version string strictly: a client running 26.1.2 has a locally-stored minecraft:core@26.1.2 pack, not minecraft:core@26.1, so the offer is rejected and the client tells us it has no packs in common.

Fix:

New known_pack_version(version) helper that maps protocol versions whose Mojang release tag differs from humanize() to the right literal. Today only V26_1 → "26.1.2"; future protocols whose release tag matches humanize() fall through and don't need an arm.

Note: this only optimizes for 26.1.2 clients. Clients on 26.1.0 / 26.1.1 (same protocol number, different point release) will reject the offer and the server falls back to sending full registry data — which is correct, and the fallback path is now safe thanks to Bug 1's fix.


Bug 3 — select_known_packs handshake not implemented

Symptom:

On LoginAcknowledged, PicoLimbo dumps the entire configuration burst (brand + select_known_packs offer + tags + every registry + FinishConfiguration) in one batch, regardless of how the client responds. Even when the client accepts the vanilla pack we offer, we still ship every registry — wasted bandwidth, and on Velocity + PacketEvents this is precisely the path that triggers Bug 1.

Root cause:

There is no PacketIn decoder for select_known_packs, no ServerBoundSelectKnownPacks variant in PacketRegistry, and no handler. The server-bound side of the handshake is dropped on the floor.

Fix:

  • New ServerBoundKnownPacksPacket (in minecraft_packets::configuration) deriving PacketIn.
  • New ServerBoundSelectKnownPacks variant in PacketRegistry, registered as state = "configuration", bound = "serverbound", name = "minecraft:select_known_packs".
  • KnownPack struct fields made pub so the server-bound handler can compare them.
  • send_configuration_packets is split into two functions:
    • send_configuration_packets: for protocols >= 1.20.5, sends only brand + select_known_packs and stops. For pre-1.20.5 (no handshake), still emits the full burst as before.
    • send_post_known_packs_configuration_packets: sends tags + registry data (unless skipped) + FinishConfiguration. Called either directly (pre-1.20.5) or from the new handler (post-1.20.5).
  • New impl PacketHandler for ServerBoundKnownPacksPacket: when the client's response contains minecraft:core@<expected_version>, registry data is skipped — matching Mojang vanilla / Paper behavior.

Pre-1.20.5 protocols are unaffected.


Bug 4 — No keep_alive in CONFIGURATION state → Velocity read-timeout

Symptom (Velocity logs):

[connected player] read timed out

…fires after 30 seconds whenever the proxy holds the player in CONFIGURATION longer than that — which is exactly what our register/auth dialog does.

Root cause:

  1. PacketRegistry only declares ClientBoundKeepAlive in state = "play". The CONFIGURATION variant of the packet doesn't exist on the server side.
  2. The keep-alive ticker (ControllableInterval) is only armed when send_play_packets runs, i.e. at the end of AcknowledgeConfiguration. While the client is held in CONFIGURATION, the ticker is idle and the backend connection sits silent on the network.
  3. Even if the ticker were armed, send_keep_alive early-returns unless the client is in State::Play.

Velocity's default backend read-timeout is 30 seconds. Any CONFIGURATION hold longer than that → silent backend → connection torn down.

Fix:

  • New ConfigurationClientBoundKeepAlive variant in PacketRegistry, reusing the existing ClientBoundKeepAlivePacket struct (the wire format is identical; only the per-state protocol ID differs and the PacketReport macro picks the right one per version).
  • send_keep_alive now dispatches on state: emits ConfigurationClientBoundKeepAlive in State::Configuration, ClientBoundKeepAlive in State::Play, no-op elsewhere.
  • The ticker is armed in the LoginAcknowledged handler (i.e. as soon as the client enters CONFIGURATION), not delayed until PLAY.
  • Keep-alive period lowered from 15 s to 10 s — gives a 3× safety margin under Velocity's default 30 s read-timeout instead of a 2× margin.

Why these are in one PR

Bug 1 by itself doesn't fix the Velocity backend scenario: even with canonical NBT, the client still gets timed out at 30 s if the proxy holds it in CONFIGURATION (Bug 4). Bugs 2 and 3 don't fix the crash from Bug 1 — they just avoid the code path that triggers it for vanilla clients. The four are entangled enough that they should land together.

Concrete coverage:

  • Vanilla 1.21.11+ client connecting directly to PicoLimbo → needs Bug 1.
  • Vanilla 1.21.11+ client connecting via Velocity + PacketEvents → needs Bug 1, or Bugs 2 + 3 (handshake skips the path).
  • Modded / non-vanilla client (or one that rejects the pack offer) → only Bug 1 fixes it.
  • Any client held in CONFIGURATION by a proxy for >30 s → needs Bug 4.

Tests

  • pico_nbt: 32 unit tests, including 14 new ones covering scalar and array widening for numeric_widening. Default code path is preserved.
  • pico_registries: new integration test registry_widening.rs loads the real V26_1 timeline/day.json from the data report and asserts canonical Int / Float at multiple paths (period_ticks, time_markers.wake_up_from_sleep, time_markers.day.ticks, show_in_commands, moon_angle.ease.cubic_bezier[0]).
  • pico_limbo: refactored configuration tests into three (initial burst, vanilla-rejected post-handshake, vanilla-accepted post-handshake) and tightened the LoginAcknowledged test to assert the keep-alive ticker is armed.
  • cargo test --workspace: all green.
  • cargo clippy --workspace --all-targets -- -D warnings: clean.

Compatibility

  • No breaking changes to public API.
  • pico_nbt::json_to_nbt keeps its previous signature and behavior; the new functionality is opt-in via json_to_nbt_with_options(json, options).
  • Pre-1.20.5 protocols are not affected by the handshake change — the full configuration burst is still sent in one go.
  • Pre-1.21.x clients are not affected by the timeline registry change (timeline didn't exist).

End-to-end verification

Built statically against x86_64-unknown-linux-musl and tested against the live setup: Velocity 3.5.x + PacketEvents 2.12.x + a 1.21.11+ client + a configuration dialog holding the player ~60 seconds.

Before this PR:

  • IllegalStateException: Error while reading registry minecraft:timeline on every connection.
  • After locally patching the parse crash, [connected player] read timed out at the 30 s mark.

After this PR:

  • No parse crashes.
  • No read timeouts.
  • For 26.1.2 clients, the handshake correctly skips registry data; for older 26.1.x clients, the full registry path is taken and works.

Commits

14 commits, grouped by bug:

Bug 1 — NBT canonical types (5 commits):

  • feat(nbt): add NbtOptions::numeric_widening flag
  • feat(nbt): respect numeric_widening in json_to_nbt
  • feat(nbt): re-export json_to_nbt_with_options
  • fix(registries): emit canonical Int/Long NBT tags when loading registry JSON
  • test(registries): assert timeline registry emits canonical Int+Float NBT tags

Bug 2 — pack version (1 commit):

  • fix(login): send correct minecraft:core pack version per protocol

Bug 3 — KnownPacks handshake (4 commits):

  • feat(packets): add ServerBoundKnownPacksPacket
  • feat(packets): make KnownPack fields public
  • feat(login): split configuration burst around the KnownPacks handshake
  • style(login): backtick KnownPacks in doc comment for clippy

Bug 4 — keep-alive in CONFIG (4 commits):

  • feat(packet-registry): add ConfigurationClientBoundKeepAlive variant
  • refactor(keep-alive): lower interval from 15s to 10s
  • feat(keep-alive): emit ConfigurationClientBoundKeepAlive in CONFIG state
  • feat(keep-alive): start ticker when entering CONFIGURATION state

Bazhard added 14 commits May 11, 2026 00:42
Adds json_to_nbt_with_options(json, options) and reroutes json_to_nbt
through it with default options. When numeric_widening is enabled:
- integer JSON numbers become Int (or Long on overflow), not the
  smallest-fitting Byte/Short
- float JSON numbers become Float (32-bit) unconditionally, matching
  what Mojang vanilla emits for registry data
- booleans stay Byte
- integer arrays become IntArray/LongArray, never ByteArray

This is the foundation for emitting registry data that strict
codecs (PacketEvents) can decode.
…ry JSON

Routes registry JSON loading through json_to_nbt_with_options with
numeric_widening enabled. The on-wire RegistryDataPacket bytes now use
the same NBT tag types Mojang vanilla emits — Int instead of downcast
Byte/Short for tick counts and similar fields, Float instead of Double
for ease curves and similar floats.

Fixes the PacketEvents 2.12.x crash on minecraft:timeline registry sync
in MC 1.21.11+. Cross-references retrooper/packetevents#1482.
…NBT tags

Pins the regression: loading data/generated/V26_1/data/minecraft/timeline/day.json
must produce period_ticks: Int(24000), time_markers.wake_up_from_sleep: Int(0),
day.ticks: Int(1000), show_in_commands: Byte(1) (boolean), and
moon_angle.ease.cubic_bezier[0]: Float(~0.362) — matching what Mojang vanilla
and NanoLimbo emit on the wire.
ProtocolVersion::humanize() returns the minor name (e.g. "26.1"), but
the Mojang select_known_packs handshake requires the exact release tag
(e.g. "26.1.2"). Sending the wrong string causes the client to reject
the pack and forces the server to send every registry instead of letting
the client use its bundled vanilla data — defeating the whole handshake.

Adds a known_pack_version() helper with per-version overrides. Only
V26_1 needs one today; older protocols where humanize() already matches
the release tag fall through to humanize().
Mirrors ClientBoundKnownPacksPacket but derives PacketIn instead of
PacketOut, so the server can decode the client's select_known_packs
response and decide what configuration packets to send next.
Consumers of PacketIn-decoded KnownPack instances need to inspect
namespace, id, and version. Matches the field visibility of other
serverbound packets in this crate.
Before this commit, LoginAcknowledged handler dumped the entire
configuration burst (brand + KnownPacks + tags + every registry +
FinishConfiguration) in one go for protocols >= 1.20.5. The client's
select_known_packs response was ignored — every client received every
registry whether or not it already had the vanilla pack locally.

After this commit:
- LoginAcknowledged sends only brand + KnownPacks and stops.
- ServerBoundKnownPacksPacket is registered as a configuration-state
  serverbound packet and dispatched to a new handler.
- The handler decides what to send next based on whether the client
  accepted minecraft:core at the expected version. If yes, only tags +
  FinishConfiguration. If no, tags + every registry + FinishConfiguration.

For protocols < 1.20.5 (no KnownPacks handshake), behavior is
unchanged — the full burst is sent from LoginAcknowledged.

This matches Mojang vanilla / Paper behavior and avoids shipping every
registry to clients that already have them, which was the underlying
cause of the PacketEvents timeline crash for non-vanilla-pack-accepting
clients.
Re-uses the existing ClientBoundKeepAlivePacket struct; the wire format
(long id) is identical between CONFIGURATION and PLAY. Only the
protocol-id-in-state differs (id=4 in CONFIG vs id=44 in PLAY on V26_1),
and the PacketReport macro pulls that per-version from the data report
files.
Gives a 3x margin under Velocity's default 30s read-timeout, and matches
the period the CONFIGURATION-state keep-alive will use.
Without this, the ControllableInterval stays Disabled until the player
acknowledges FinishConfiguration, leaving Velocity's read-timeout (30s
by default) free to kill the backend connection during long CONFIGURATION
holds — breaking flows that gate the player on an async dialog (e.g.
registration) for longer than the timeout.

const NAMELESS_ROOT: u8 = 1 << 0;
const DYNAMIC_LISTS: u8 = 1 << 1;
const NUMERIC_WIDENING: u8 = 1 << 2;
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have any documentation about this numeric widening or have you reverse-engineered Mojang's code?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverse-engineered by comparing wire output with NanoLimbo (which works with PacketEvents) and inspecting how Mojang's NbtOps emits canonical NBT for any value passed through their Codec API.

/// whose release tag differs from `humanize()` needs an arm here.
fn known_pack_version(version: ProtocolVersion) -> &'static str {
match version {
ProtocolVersion::V26_1 => "26.1.2",
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what happens if I join PicoLimbo in 26.1 or 26.1.1 ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, this shouldn't have stayed in the PR.
I added this temporarily while diagnosing the timeline issue, to confirm whether the client was rejecting our select_known_packs offer.

protocol_version.humanize() returns "26.1" but in my tests i found that the client compares the pack version strictly against its locally-stored minecraft:core@26.1.2 (or .1 / .0), so the offer never matches and the client always responds with an empty list, which forces the full registry send path. Hardcoding V26_1 -> "26.1.2" was just a quick way to confirm that hypothesis on the wire, not a real fix.

let period = Duration::from_secs(15);
// 10 s gives a 3x safety margin under Velocity's default 30 s
// read-timeout, and matches the period used in CONFIGURATION state.
let period = Duration::from_secs(10);
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The vanilla interval is 15 seconds, I'd prefer to match vanilla's behavior. Or perhaps the period is different during the configuration phase?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, vanilla is 15 seconds. I went with 10 seconds because most proxies (e.g. Velocity) have a default read timeout of 30 seconds, so a single miss could lead to a read timeout in the vast majority of default setups. 10 seconds was just an idea to keep some margin.
Side note: PaperMC sends this keep-alive every second, lol.
Would it be interesting to make this value configurable in server.toml, or not?

bound = "clientbound",
name = "minecraft:keep_alive"
)]
ConfigurationClientBoundKeepAlive(ClientBoundKeepAlivePacket),
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The packet id for this packet may need to be added for older versions, I think it is missing prior to 1.21

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cross-checked the data reports: minecraft:keep_alive clientbound in CONFIGURATION state is present from V1_20_2 onwards (id=3 in V1_20_2/V1_20_3, id=4 from V1_20_5+). Pre-V1_20_2 versions don't have CONFIGURATION state at all, and that's already gated by supports_configuration_state() / is_after_inclusive(V1_20_5) checks in send_configuration_packets.

@Quozul
Copy link
Copy Markdown
Owner

Quozul commented May 12, 2026

Superceded by #77 #78 and #79, I'll close this PR.

@Quozul Quozul closed this May 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants