Skip to content

Add small key type tests#812

Open
ksapru wants to merge 6 commits intoNVIDIA:devfrom
ksapru:add-small-key-type-tests
Open

Add small key type tests#812
ksapru wants to merge 6 commits intoNVIDIA:devfrom
ksapru:add-small-key-type-tests

Conversation

@ksapru
Copy link
Copy Markdown
Contributor

@ksapru ksapru commented May 1, 2026

Extends the static_set unit-test type matrices to cover sub-4-byte key types (uint8_t and uint16_t), as requested in the good first issue.

Changes:

  • Added uint8_t and uint16_t to the TEMPLATE_TEST_CASE_SIG type axes in all 8 templated static_set test files
  • Added a compile-time num_keys guard (sizeof(Key) == 1 ? 100 : 400) to keep key sequences within each type's representable range, leaving room for the 0xFF / 0xFFFF empty sentinel
  • Adjusted hardcoded gold_capacity values in unique_sequence_test and retrieve_all_test to be conditional on num_keys
  • Parameterized expected sum values in for_each_test to match the reduced key count for uint8_t

The underlying implementation (packed_cas via cuda::atomic_ref) already supports 1B and 2B atomic operations — these tests verify that correctness holds end-to-end for small key types.

Addresses #805

@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented May 1, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@ksapru ksapru marked this pull request as ready for review May 1, 2026 20:56
@sleeepyjack
Copy link
Copy Markdown
Collaborator

/ok to test 7b0beb1

@NVIDIA NVIDIA deleted a comment from copy-pr-bot Bot May 1, 2026
@sleeepyjack
Copy link
Copy Markdown
Collaborator

/ok to test 323adf8

@ksapru ksapru force-pushed the add-small-key-type-tests branch from 323adf8 to e77f232 Compare May 2, 2026 14:33
@ksapru
Copy link
Copy Markdown
Contributor Author

ksapru commented May 2, 2026

/ok to test 281721b

@sleeepyjack
Copy link
Copy Markdown
Collaborator

/ok to test 281721b

@ksapru ksapru force-pushed the add-small-key-type-tests branch from 281721b to 390844b Compare May 3, 2026 12:45
@sleeepyjack
Copy link
Copy Markdown
Collaborator

/ok to test 61993aa

@ksapru ksapru force-pushed the add-small-key-type-tests branch from 61993aa to 22ad7ae Compare May 3, 2026 17:01
@sleeepyjack
Copy link
Copy Markdown
Collaborator

/ok to test 22ad7ae

@sleeepyjack
Copy link
Copy Markdown
Collaborator

/ok to test 6dc9527

@sleeepyjack
Copy link
Copy Markdown
Collaborator

CI passes. Nice work! I'll give it a review on Monday or Tuesday depending on my workload. In the meantime, we can prepare follow up PRs: (1) replicate the same test extensions to all data structures, also sparsely testing combinations of differenty sized key/value types (e.g. 1B keys and 4B payloads), (2) go over the docs, readme, and asserts to make sure they reflect the new type constraints.

@ksapru
Copy link
Copy Markdown
Contributor Author

ksapru commented May 4, 2026

Thanks @sleeepyjack. Is there an easier way to reproduce CI locally in the future?

cuda::proclaim_return_type<Key>(build_fn{}));

set.insert_async(keys_begin, keys_begin + num_keys);
CUCO_CUDA_TRY(cudaDeviceSynchronize());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: why is this needed?

Copy link
Copy Markdown
Collaborator

@sleeepyjack sleeepyjack May 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It shouldn't be required since it is running on the default stream IIUC. It's probably a red herring trying to fix open bug #804.

Comment on lines +53 to +56
template <>
struct packed<sizeof(uint8_t)> {
using type = uint8_t; ///< Packed type as `uint8_t` if the size of the object is 1
};
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We pack paired data (e.g., a 32-bit key and 32-bit value) into a single 64-bit unit for CAS, used only in map scenarios. Therefore, the smallest CAS we need is 8 bits (1B), and the smallest packed type is 2B, making the 1B overload unnecessary.

Suggested change
template <>
struct packed<sizeof(uint8_t)> {
using type = uint8_t; ///< Packed type as `uint8_t` if the size of the object is 1
};

Comment on lines +61 to +64
template <>
struct packed<sizeof(uint16_t)> {
using type = uint16_t; ///< Packed type as `uint16_t` if the size of the object is 2
};
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we add a 2B key-value pair in the map unit test to exercise this code path? One test case is sufficient.

@PointKernel
Copy link
Copy Markdown
Member

Thanks @sleeepyjack. Is there an easier way to reproduce CI locally in the future?

@ksapru All the CI environment is shared via devcontainers. You could use them to repro the issue.

@PointKernel PointKernel added the type: feature request New feature request label May 4, 2026
@sleeepyjack
Copy link
Copy Markdown
Collaborator

sleeepyjack commented May 5, 2026

Thanks @sleeepyjack. Is there an easier way to reproduce CI locally in the future?

@ksapru , yes, @PointKernel 's suggestion is the right way to go. Note that if CI fails, the associated test/build log will tell you at the end how to spin up the corresponding devcontainer environment to reproduce the bug. If you don't have access to a GPU system, you can still use the base images of the devcontainers (e.g., this image) directly to at least compile the code locally by running the ci/build.sh script.

cuda::proclaim_return_type<Key>(build_fn{}));

set.insert_async(keys_begin, keys_begin + num_keys);
CUCO_CUDA_TRY(cudaDeviceSynchronize());
Copy link
Copy Markdown
Collaborator

@sleeepyjack sleeepyjack May 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It shouldn't be required since it is running on the default stream IIUC. It's probably a red herring trying to fix open bug #804.

Comment on lines +80 to +85
(uint8_t, cuco::test::probe_sequence::double_hashing, 1),
(uint8_t, cuco::test::probe_sequence::linear_probing, 1),
(uint16_t, cuco::test::probe_sequence::double_hashing, 1),
(uint16_t, cuco::test::probe_sequence::double_hashing, 2),
(uint16_t, cuco::test::probe_sequence::linear_probing, 1),
(uint16_t, cuco::test::probe_sequence::linear_probing, 2),
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if using singed integers (as in the rest of the tests) would make sense for consistency. Plus, it would probably also catch some unintended overflow/wrap-around errors. Just a suggestion though - no hard requirement.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

type: feature request New feature request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants