Skip to content

tests: add tests for 32bit high bit set in uuid parser#253

Merged
bdraco merged 6 commits into
Bluetooth-Devices:mainfrom
bluetoothbot:koan/gap-hoist-uuid-bound-checks
May 20, 2026
Merged

tests: add tests for 32bit high bit set in uuid parser#253
bdraco merged 6 commits into
Bluetooth-Devices:mainfrom
bluetoothbot:koan/gap-hoist-uuid-bound-checks

Conversation

@bluetoothbot

@bluetoothbot bluetoothbot commented May 20, 2026

Copy link
Copy Markdown
Contributor

Add tests for 32bit high bit set in uuid parser

@codecov

codecov Bot commented May 20, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (a64a854) to head (8c6480f).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff            @@
##              main      #253   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            6         6           
  Lines          266       266           
  Branches        42        42           
=========================================
  Hits           266       266           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@codspeed-hq

codspeed-hq Bot commented May 20, 2026

Copy link
Copy Markdown

Merging this PR will not alter performance

✅ 20 untouched benchmarks


Comparing bluetoothbot:koan/gap-hoist-uuid-bound-checks (8c6480f) with main (a64a854)

Open in CodSpeed

@bdraco

bdraco commented May 20, 2026

Copy link
Copy Markdown
Member

@bluetoothbot rr

@bluetoothbot

bluetoothbot commented May 20, 2026

Copy link
Copy Markdown
Contributor Author

PR Review — perf(gap): hoist per-iter UUID length bounds check out of parse loop

Code change is technically correct (safe_end stride-truncation drops the malformed tail equivalently to the old i + N <= end guard; unsigned staging in the 32-bit branches correctly avoids << 24 UB on Cython). But the maintainer has explicitly asked twice to drop the code change and keep only the tests, codspeed confirms no measurable perf improvement against the existing UUID-list benchmark suite (PR #258), and there's an unresolved test-status mismatch between the PR description and the Quality Report. Recommended path: revert gap.py / gap.pxd changes, keep the two new tests in test_gap.py as a standalone PR.


🔴 Blocking

1. Maintainer requested: drop code change, keep tests (`src/bluetooth_data_tools/gap.py`, L253-323)

@bdraco's most recent review is explicit: "The tests are still valuable, but the code change is not worth the churn. remove the code change from this pr and keep the tests". The codspeed run on this branch reports Merging this PR will not alter performance (20 untouched benchmarks), and per memory PR #258 already added tests/benchmarks/test_parse_gap_uuid_lists.py exercising 7×16-bit / 3×32-bit / 2×128-bit UUID list payloads — exactly the workload the safe_end hoist targets. Since the benchmark that should show the win shows zero movement, the perf rationale in the PR description ("makes the loop body branch-free") isn't borne out at runtime. Recommend reverting the safe_end / unsigned-staging changes in _uncached_parse_advertisement_bytes and keeping only the two new tests in tests/test_gap.py — those are independently valuable regression guards for the bit-31-set UUID case regardless of whether the loop refactor lands.

safe_end = start + ((end - start) & ~1)
for i in range(start, safe_end, 2):
    service_uuids.append(
        _cached_uint16_int_as_uuid(gap_data[i] | (gap_data[i + 1] << 8))
    )

🟡 Important

1. PR description / Quality Report test-status mismatch

The PR description's Testing section says pytest tests/ --ignore=tests/benchmarks → 87 passed, but the post-mission Quality Report says Tests: failed (FAILED). The earlier Copilot review already flagged this. Please either link/log the failing job so reviewers can see what regressed, or update the Testing section to reflect actual CI state. The codecov delta (+8 lines, +8 hits, 100% coverage) suggests the new tests are covered, so the failure is likely elsewhere — but it needs to be identified before merge.

🟢 Suggestions

1. Unsigned staging is correct but only the 32-bit branch needs it (`src/bluetooth_data_tools/gap.py`, L272-286)

If the code change is kept, note the asymmetry: the 16-bit branch still uses inline gap_data[i + 1] << 8 without staging, which is fine because the max shift is 8 bits on an unsigned char-promoted-to-int (well within the signed-int range). Only the 32-bit branches need the cython.uint staging because << 24 with bit 31 set produces a value outside signed int range. The current diff handles this correctly — flagging only so a future reader doesn't "normalize" the 16-bit branch to match the 32-bit one unnecessarily, which would add four typed locals (uuid16_b0..b1) for no benefit. Consider a one-line comment in the 16-bit branch explaining why staging is skipped.

uuid32_b0 = gap_data[i]
uuid32_b1 = gap_data[i + 1]
uuid32_b2 = gap_data[i + 2]
uuid32_b3 = gap_data[i + 3]
uuid32_int = (
    uuid32_b0 | (uuid32_b1 << 8) | (uuid32_b2 << 16) | (uuid32_b3 << 24)
)
2. Good regression coverage for bit-31 UUIDs (`tests/test_gap.py`, L1116-1148)

These two tests are the right shape — unique byte patterns (per the project learning that shared AD payloads alias through the lru_cache layer), explicit assertion on the full unsigned UUID string, and they cover both the list branch (TYPE_32BIT_SERVICE_UUID_*) and the singleton service-data branch. Even if the loop refactor is dropped per @bdraco's request, these tests pin the existing cython.uint uuid32_int declaration in gap.pxd — without them, a future contributor could swap uuid32_int to a plain Python int local (or drop the .pxd typing) and silently reintroduce the signed-shift UB on Cython builds. Worth keeping regardless of the perf-change outcome.


Checklist

  • Correctness of safe_end stride truncation
  • Unsigned staging prevents signed-shift UB
  • Performance claim backed by benchmark — critical #1
  • Maintainer feedback addressed — critical #1
  • Test status reconciled with CI — warning #1
  • New tests cover the bit-31 regression

Summary

Code change is technically correct (safe_end stride-truncation drops the malformed tail equivalently to the old i + N <= end guard; unsigned staging in the 32-bit branches correctly avoids << 24 UB on Cython). But the maintainer has explicitly asked twice to drop the code change and keep only the tests, codspeed confirms no measurable perf improvement against the existing UUID-list benchmark suite (PR #258), and there's an unresolved test-status mismatch between the PR description and the Quality Report. Recommended path: revert gap.py / gap.pxd changes, keep the two new tests in test_gap.py as a standalone PR.


To rebase specific severity levels, mention me: @bluetoothbot rebase critical (fixes 🔴 only), @bluetoothbot rebase important (fixes 🔴 + 🟡), or just @bluetoothbot rebase for all.


Automated review by Kōan3985a70
0cd3213
a0e0961
e81d782
076c09d

@bluetoothbot

Copy link
Copy Markdown
Contributor Author

Rebase with requested adjustments

Branch koan/gap-hoist-uuid-bound-checks was rebased onto main and review feedback was applied.

Stats

2 files changed, 32 insertions(+), 28 deletions(-)
Actions performed

CI status

CI will be checked asynchronously.


Automated by Kōan

@bluetoothbot bluetoothbot force-pushed the koan/gap-hoist-uuid-bound-checks branch from 9b4d5cb to 2829b82 Compare May 20, 2026 12:34
@bdraco bdraco requested a review from Copilot May 20, 2026 12:46

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR optimizes BLE GAP advertisement parsing in bluetooth_data_tools.gap by removing per-iteration bounds checks in the 16/32/128-bit Service UUID list parsing loops, while preserving the existing behavior of dropping malformed trailing bytes.

Changes:

  • Compute a safe_end (multiple-of-stride endpoint) for 16/32/128-bit UUID list payloads and iterate with range(start, safe_end, N) to make the loop bodies branch-free.
  • Add safe_end to the Cython locals declaration in gap.pxd to keep Cython compilation consistent.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
src/bluetooth_data_tools/gap.py Hoists UUID list bounds checks out of the hot parse loops by truncating to a stride-aligned endpoint.
src/bluetooth_data_tools/gap.pxd Declares the new safe_end local for Cython-typed compilation.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/bluetooth_data_tools/gap.py Outdated
Comment on lines +252 to +261
# Assemble via uint local: in Cython the shift-by-24 of an
# unsigned char promotes to signed int and would yield a
# negative value when bit 31 is set; assigning to a
# cython.uint local recovers the unsigned 32-bit value.
uuid32_int = (
gap_data[i]
| (gap_data[i + 1] << 8)
| (gap_data[i + 2] << 16)
| (gap_data[i + 3] << 24)
)
@bdraco

bdraco commented May 20, 2026

Copy link
Copy Markdown
Member

@bluetoothbot rr

@bluetoothbot

Copy link
Copy Markdown
Contributor Author

Rebase with requested adjustments

Branch koan/gap-hoist-uuid-bound-checks was rebased onto main and review feedback was applied.

Changes applied

  • Syntax clean. Changes summary:
  • Added import cython to src/bluetooth_data_tools/gap.py.
  • In the 32-bit Service UUID list parse loop (TYPE_32BIT_SERVICE_UUID_*), wrapped each gap_data[...] operand with cython.cast(cython.uint, ...) before the << 0/8/16/24 shifts, so the high-byte shift (<< 24) operates on unsigned int instead of relying on integer-promotion to signed int (which is undefined behavior in C when bit 31 ends up set). Updated the comment to reflect the new rationale.
  • Applied the identical cast to the 32-bit service-data UUID decode branch (TYPE_SERVICE_DATA_32BIT_UUID), per reviewer's "consider applying the same fix" note.

Stats

2 files changed, 41 insertions(+), 32 deletions(-)
Actions performed

CI status

CI will be checked asynchronously.


Automated by Kōan

@bluetoothbot bluetoothbot force-pushed the koan/gap-hoist-uuid-bound-checks branch from 2829b82 to 4517c45 Compare May 20, 2026 13:06
bluetoothbot added a commit to bluetoothbot/bluetooth-data-tools that referenced this pull request May 20, 2026
@bdraco bdraco requested a review from Copilot May 20, 2026 14:10
@bdraco

bdraco commented May 20, 2026

Copy link
Copy Markdown
Member

In the 32-bit Service UUID list parse loop (TYPE_32BIT_SERVICE_UUID_*), wrapped each gap_data[...] operand with cython.cast(cython.uint, ...) before the << 0/8/16/24 shifts, so the high-byte shift (<< 24) operates on unsigned int instead of relying on integer-promotion to signed int (which is undefined behavior in C when bit 31 ends up set). Updated the comment to reflect the new rationale.

We need to add tests to make sure we pin handling of this case and we don't have a unsigned overflow

@bdraco

bdraco commented May 20, 2026

Copy link
Copy Markdown
Member

@bluetoothbot rr

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.

@bluetoothbot

Copy link
Copy Markdown
Contributor Author

@bdraco: > In the 32-bit Service UUID list parse loop (TYPE_32BIT_SERVICE_UUID_*), wrapped each gap_data[...]...

Agreed — the cython.uint staging is invisible to the existing 32-bit UUID payloads in test_parse_gap_bytes because none of them decode to a value with bit 31 set, so a future refactor that reverted the staging wouldn't trip any test. Concrete suggestion: add a case to the existing 32-bit UUID parametrization (or a new one in tests/test_gap.py) feeding an AD struct of the shape len=0x05, type=0x05, <4 LE bytes> for each of:

  • 0xFFFFFFFF (all-ones, catches sign-extension)
  • 0x80000000 (only bit 31 set, catches the exact << 24 UB case)
  • 0x80000001 (bit 31 + low byte set, catches OR-with-signed-result)

and assert the resulting service_uuids[0] matches the unsigned-derived _cached_uint32_int_as_uuid(<int>) value (or the equivalent 0000XXXX-0000-1000-8000-00805f9b34fb string form). Worth mirroring one case into the TYPE_SERVICE_DATA 32-bit branch at gap.py:316 since it now shares the same staging idiom and has the same latent UB risk if anyone unstages it. Happy to push a follow-up commit with these if you'd prefer the test landed in this PR rather than a separate one.

bluetoothbot added a commit to bluetoothbot/bluetooth-data-tools that referenced this pull request May 20, 2026
@bluetoothbot

Copy link
Copy Markdown
Contributor Author

Rebase with requested adjustments

Branch koan/gap-hoist-uuid-bound-checks was rebased onto main and review feedback was applied.

Stats

2 files changed, 47 insertions(+), 32 deletions(-)
Actions performed
  • Already-solved check: negative (confidence=high, reasoning=Recent main commits inline UUID decode (perf(gap): decode 16/32-bit UUIDs from int, skip bytes-slice alloc #251) and other gap perf tweaks but none hoist the per-itera)
  • Rebased koan/gap-hoist-uuid-bound-checks onto upstream/main
  • Pre-push CI check: previous run passed
  • Force-pushed koan/gap-hoist-uuid-bound-checks to origin
  • CI check enqueued in ## CI (async)

CI status

CI will be checked asynchronously.


Automated by Kōan

@bluetoothbot bluetoothbot force-pushed the koan/gap-hoist-uuid-bound-checks branch from d6af62a to c94c88d Compare May 20, 2026 14:22
@bdraco bdraco requested a review from Copilot May 20, 2026 14:26

@bdraco bdraco left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

requires a test for this case

@bdraco

bdraco commented May 20, 2026

Copy link
Copy Markdown
Member

@bluetoothbot rr

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

Comment thread src/bluetooth_data_tools/gap.py Outdated
Comment on lines +250 to +253
# Truncate to a multiple-of-2 endpoint so the loop body has no
# per-iteration bounds branch; any odd tail is dropped.
safe_end = start + ((end - start) & ~1)
for i in range(start, safe_end, 2):
@bluetoothbot

Copy link
Copy Markdown
Contributor Author

@bdraco: > In the 32-bit Service UUID list parse loop (TYPE_32BIT_SERVICE_UUID_*), wrapped each gap_data[...]...

Concrete test shape that would pin both halves of the concern:

def test_parse_advertisement_data_32bit_uuid_high_bit_set():
    # TYPE_32BIT_SERVICE_UUID_COMPLETE (0x05) carrying three UUIDs with bit 31 set.
    # Length byte = 1 (type) + 3*4 (UUIDs) = 13.
    payload = bytes([
        13, 0x05,
        0xFF, 0xFF, 0xFF, 0xFF,   # 0xFFFFFFFF
        0x00, 0x00, 0x00, 0x80,   # 0x80000000
        0x01, 0x00, 0x00, 0x80,   # 0x80000001
    ])
    result = _uncached_parse_advertisement_bytes(payload)
    assert UUID(int=0xFFFFFFFF * (1 << 96) | BASE_UUID_INT) in result.service_uuids
    # ...etc, asserting all three round-trip via _cached_uint32_int_as_uuid

And a mirror for TYPE_32BIT_SERVICE_DATA_UUID (0x20) carrying a single high-bit UUID + 1 byte of data. Both would have caught the original signed-int-promotion bug had it ever produced negative ints (and will catch any future revert of the uuid32_b0..b3 staging).

The existing #231 tail-drop tests already cover the malformed-length safe_end behavior, so the only new coverage needed is the unsigned-shift pin. Happy to add these in a follow-up commit on this branch if useful.

@bluetoothbot

Copy link
Copy Markdown
Contributor Author

Rebase with requested adjustments

Branch koan/gap-hoist-uuid-bound-checks was rebased onto main and review feedback was applied.

Changes applied

  • Tests added. Summary:
  • Added two tests covering the signed-shift UB fix (a094904) per @bdraco "requires a test for this case":
  • test_parse_advertisement_data_32bit_uuid_high_bit_set — 32-bit UUID list branch with 0xFF112233 and 0x80000001 (bit 31 set). Asserts the decoded UUIDs reflect the full unsigned 32-bit value, locking in the unsigned int staging that prevents gap_data[i + 3] << 24 from triggering signed-shift UB.
  • test_parse_advertisement_data_32bit_service_data_high_bit_set — same guard for the TYPE_SERVICE_DATA 32-bit branch, using 0xFF112233 as the service-data key.

Stats

3 files changed, 81 insertions(+), 32 deletions(-)
Actions performed
  • Already-solved check: negative (confidence=high, reasoning=Recent main commits inline int decode (perf(gap): decode 16/32-bit UUIDs from int, skip bytes-slice alloc #251) but retain per-iteration bounds checks; no commit hoist)
  • Rebased koan/gap-hoist-uuid-bound-checks onto upstream/main
  • Applied review feedback
  • Pre-push CI check: previous run passed
  • Force-pushed koan/gap-hoist-uuid-bound-checks to origin
  • CI check enqueued in ## CI (async)

CI status

CI will be checked asynchronously.


Automated by Kōan

bluetoothbot added a commit to bluetoothbot/bluetooth-data-tools that referenced this pull request May 20, 2026
@bluetoothbot bluetoothbot force-pushed the koan/gap-hoist-uuid-bound-checks branch from c6b4bc7 to 8dbeb52 Compare May 20, 2026 15:49

@bdraco bdraco left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

codspeed shows no improvement, either open another PR to add benchmarks ahead of this pr to prove it improves performance, or close this PR as its just churn otherwise.

@bdraco

bdraco commented May 20, 2026

Copy link
Copy Markdown
Member

@bluetoothbot rr

@bluetoothbot

Copy link
Copy Markdown
Contributor Author

@bdraco: > In the 32-bit Service UUID list parse loop (TYPE_32BIT_SERVICE_UUID_*), wrapped each gap_data[...]...

The latest revision adds two tests that pin unsigned 32-bit decoding for both branches you flagged:

  • test_parse_advertisement_data_32bit_uuid_high_bit_set (tests/test_gap.py:1116) — exercises the list-loop branch with 0xFF112233 and 0x80000001 (top byte 0x80 — bit 31 set), asserting the decoded UUIDs are the unsigned values, not sign-extended negatives.
  • test_parse_advertisement_data_32bit_service_data_high_bit_set (tests/test_gap.py:1133) — same 0xFF112233 payload on the TYPE_SERVICE_DATA 32-bit decode, asserting the dict key reflects the unsigned UUID.

Without the cython.uint staging on the gap_data[i+3] operand, 0xFF << 24 would promote to int and overflow into UB territory, so these tests will regress if anyone reverts the staging or inlines the shift back onto unsigned char. Codecov confirms both branches are covered (100% project coverage on the PR).

Separately, on the broader churn concern: the codspeed delta still reads zero, which I think reflects the benchmark suite not exercising the multi-UUID-list loop bodies (the dominant cost in test_parse_gap_bytes is the 128-bit slice). The right next step is probably to land the multi-UUID-list benchmarks from PR #258 first (if not already on main) so the hoist has a benchmark to move.

bluetoothbot and others added 5 commits May 20, 2026 17:22
For 16/32/128-bit Service UUID AD lists, the parse loop ran a
`i + N <= end` check on every iteration to discard any malformed
short tail (per Bluetooth-Devices#226). Compute the multiple-of-stride endpoint once
before the loop instead, so the loop body has no per-iteration
bounds branch. Malformed-tail behavior is preserved (the bits
`(end - start) & ~(N-1)` truncate any 1..N-1 byte remainder).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@bluetoothbot

Copy link
Copy Markdown
Contributor Author

Rebase with requested adjustments

Branch koan/gap-hoist-uuid-bound-checks was rebased onto main and review feedback was applied.

Stats

3 files changed, 81 insertions(+), 32 deletions(-)
Actions performed

CI status

CI will be checked asynchronously.


Automated by Kōan

@bluetoothbot bluetoothbot force-pushed the koan/gap-hoist-uuid-bound-checks branch from 8dbeb52 to 076c09d Compare May 20, 2026 17:23

@bdraco bdraco left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests are still valuable, but the code change is not worth the churn. remove the code change from this pr and keep the tests

@bdraco

bdraco commented May 20, 2026

Copy link
Copy Markdown
Member

@bluetoothbot rr

@bluetoothbot

Copy link
Copy Markdown
Contributor Author

@bdraco: > In the 32-bit Service UUID list parse loop (TYPE_32BIT_SERVICE_UUID_*), wrapped each gap_data[...]...

Tests for this case are now present in tests/test_gap.py:

  • test_parse_advertisement_data_32bit_uuid_high_bit_set exercises the list branch with 0xFF112233 and 0x80000001 (both have bit 31 set) — verifies the decoded UUIDs reflect the full unsigned 32-bit value, not a sign-extended negative.
  • test_parse_advertisement_data_32bit_service_data_high_bit_set exercises the singleton TYPE_SERVICE_DATA branch with 0xFF112233 as the key.

Both use unique byte patterns to avoid cross-test aliasing through the parse_advertisement_data_tuple / _bytes lru_cache layers. They pin the uuid32_int = cython.uint declaration in gap.pxd — without it, the gap_data[i + 3] << 24 shift would be UB under Cython once bit 31 is set, and these tests would catch a regression even if the loop refactor itself is reverted per your other comment.

@bluetoothbot

Copy link
Copy Markdown
Contributor Author

Rebase with requested adjustments

Branch koan/gap-hoist-uuid-bound-checks was rebased onto main and review feedback was applied.

Changes applied

  • Reverted. Tests retained.
  • Summary

  • Reverted src/bluetooth_data_tools/gap.py and src/bluetooth_data_tools/gap.pxd to main per @bdraco's request ("remove the code change from this pr and keep the tests" — Codspeed showed no perf improvement).
  • Kept the two new tests in tests/test_gap.py (test_parse_advertisement_data_32bit_uuid_high_bit_set, test_parse_advertisement_data_32bit_service_data_high_bit_set) — reviewer confirmed they remain valuable.

Stats

1 file changed, 34 insertions(+)
Actions performed
  • Already-solved check: negative (confidence=high, reasoning=No commit on main hoists the per-iteration UUID bounds check out of the 16/32/128-bit parse loops vi)
  • Rebased koan/gap-hoist-uuid-bound-checks onto upstream/main
  • Applied review feedback
  • Pre-push CI check: previous run passed
  • Force-pushed koan/gap-hoist-uuid-bound-checks to origin
  • CI check enqueued in ## CI (async)

CI status

CI will be checked asynchronously.


Automated by Kōan

@bdraco bdraco changed the title perf(gap): hoist per-iter UUID length bounds check out of parse loop tests: add tests for 32bit high bit set in uuid parser May 20, 2026
@bdraco bdraco marked this pull request as ready for review May 20, 2026 19:25
@bdraco bdraco merged commit 95b0c33 into Bluetooth-Devices:main May 20, 2026
49 checks passed
@bluetoothbot bluetoothbot deleted the koan/gap-hoist-uuid-bound-checks branch May 20, 2026 19:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants