[cccl.c] Introduce minimal compilation support for host freestanding mode by shwina · Pull Request #8437 · NVIDIA/cccl

shwina · 2026-04-15T12:28:01Z

Description

This PR introduces the minimal infrastructure for compiling and testing CCCL in "host freestanding" mode. The infrastructure being introduced suffices to compile the following code using the included JIT compiler:

std::string source = R"(
__global__ void device(int* ptr) {
  *ptr = 42;
}
void host(int* ptr) {
  device<<<1, 1>>>(ptr);
}
)";

A CI job has been added which builds the host JIT infrastructure for a specific combination of arch/compiler/CUDA. Eventually, we will want the entire matrix to build with host JIT, but this is sufficient during initial iteration.

Note that an important consideration/blocker is that nvfatbin (a dependency for host JIT) is available only on CUDA 12.4+.
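
The version gate above can be sketched as a compile-time check. This is only an illustration: it assumes the usual `CUDA_VERSION` encoding from `cuda.h` (1000 * major + 10 * minor, so 12.4 is 12040), and `cuda_version_code` and `has_nvfatbin` are hypothetical names that do not appear in this PR.

```cpp
// Hypothetical helper: encode a toolkit version the way CUDA_VERSION does
// (1000 * major + 10 * minor).
constexpr int cuda_version_code(int major, int minor)
{
  return 1000 * major + 10 * minor;
}

// Hypothetical helper: per the PR description, nvfatbin is available only
// on CUDA 12.4 and newer.
constexpr bool has_nvfatbin(int major, int minor)
{
  return cuda_version_code(major, minor) >= cuda_version_code(12, 4);
}

static_assert(!has_nvfatbin(12, 3), "12.3 predates nvfatbin");
static_assert(has_nvfatbin(12, 4), "12.4 is the first supported release");
static_assert(has_nvfatbin(13, 0), "13.x is supported");
```

In a real build the same comparison would more likely live in the build system or behind `#if CUDA_VERSION >= 12040`; the constexpr form is used here only so the check is self-contained.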

Checklist

New or existing tests cover these changes.
The documentation is up to date with these changes.

copy-pr-bot · 2026-04-15T12:28:06Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

shwina · 2026-04-15T12:28:16Z

/ok to test 238c422

shwina · 2026-04-15T12:31:18Z

/ok to test 238c422

shwina · 2026-04-15T13:46:46Z

/ok to test 3979052

shwina · 2026-04-15T13:51:38Z

/ok to test 3979052

shwina · 2026-04-15T13:58:44Z

/ok to test github.com//pull/8437/commits/dd0aed4fefc65b68c378a6c0b38762866578090d

shwina · 2026-04-15T13:58:53Z

/ok to test dd0aed4

bernhardmgruber · 2026-04-15T15:32:54Z

Q: Why do we need this header? Why is <cuda/std/execution> not enough?

shwina · 2026-04-15T16:25:34Z

/ok to test d9e65a1

shwina · 2026-04-15T18:20:46Z

/ok to test 3a9db84

shwina · 2026-04-15T19:06:17Z

/ok to test efd5bdf

shwina · 2026-04-15T21:10:09Z

/ok to test 8969631

shwina · 2026-04-16T10:40:53Z

Things left to be done here:

Enable ClangJIT builds on gcc for all CUDA versions that support them (>=12.4)
Enable ClangJIT builds on MSVC, once rapidsai/devcontainers#689 ("Add nvfatbin to CUDA installation script") is merged
Make "with ClangJIT" the default CI build configuration (instead of separate jobs)

shwina · 2026-04-17T21:12:41Z

/ok to test 7136dc0

shwina · 2026-04-20T12:26:00Z

/ok to test 079740b

shwina · 2026-04-20T17:05:55Z

Drop codegen altogether

shwina · 2026-04-20T17:17:33Z

Rename "clangjit" to something else like "hostjit".

shwina · 2026-04-20T17:24:49Z

+    # Eventually we will want building with ClangJIT to be the
+    # default, and will do it across the entire matrix. Currently
+    # blocked on libnvfatbin availability on Windows containers, and for CUDA <12.4.
+    - {jobs: ['test'], project: 'cccl_c_parallel_clangjit', ctk: '13.X', cxx: ['gcc13'], gpu: 'rtx2080'}


Add a corresponding entry in project_files_and_dependencies.yml.

Ensure that libcudacxx and c_parallel_internal are dependencies of it.

shwina · 2026-04-20T17:26:48Z

+endif()
+
+# Link against LLVM/Clang/LLD
+target_link_libraries(


It might be worth installing libclang/LLVM in the devcontainer if build times are too long.

shwina · 2026-04-20T17:27:02Z

We shouldn't need any libcudacxx changes.

miscco · 2026-04-21T10:36:37Z

+#ifndef _HOSTJIT_CLIMITS
+#define _HOSTJIT_CLIMITS
+
+#include "limits.h"


We should always avoid relative includes. Please add the proper include path and use <limits.h>. This applies throughout.

miscco · 2026-04-21T10:36:49Z

+
+

Suggested change

miscco · 2026-04-21T10:37:10Z

+#include "cstddef"
+#define EXIT_SUCCESS 0
+#define EXIT_FAILURE 1
+#define RAND_MAX 2147483647
+extern "C" {
+void* malloc(size_t); void* calloc(size_t, size_t);
+void* realloc(void*, size_t); void free(void*);
+void abort(void); void exit(int); void _Exit(int);
+}


Suggested change

#include "cstddef"
#define EXIT_SUCCESS 0
#define EXIT_FAILURE 1
#define RAND_MAX 2147483647
extern "C" {
void* malloc(size_t); void* calloc(size_t, size_t);
void* realloc(void*, size_t); void free(void*);
void abort(void); void exit(int); void _Exit(int);
}

miscco · 2026-04-21T10:38:11Z

+// resolve to libcudacxx/include/cuda/std/limits, which cascades through
+// numeric_limits, bit_cast, popcount, etc. — incompatible with freestanding.
+//
+// This stub (found first on -internal-isystem) stops that cascade, providing


@gevtushenko why do we need those? the cuda::std ones should work perfectly fine

@miscco you can try dropping it to check, I don't remember

We can't drop the limits or utility stubs. That makes clang include the libcudacxx versions of those headers, which are not host freestanding. (I tried empirically and got runtime errors).

miscco · 2026-04-21T10:40:02Z

+#define HUGE_VALL        __builtin_huge_vall()
+#define INFINITY         __builtin_inff()
+#define NAN              __builtin_nanf("")
+#define MATH_ERRNO       1


We already define a bunch of those internally in fp_classify.h and cmath

We should see whether we can just reuse those or move them to a different header
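
For context, the macros in the hunk above rely on compiler builtins rather than on `<math.h>`. A minimal sketch of that idea follows, assuming a GCC- or Clang-compatible compiler that provides `__builtin_inff`, `__builtin_nanf`, `__builtin_huge_vall`, `__builtin_isinf` and `__builtin_isnan`. The `SKETCH_` macro names are hypothetical and chosen so they cannot clash with a real `<math.h>`.

```cpp
// Hypothetical macro names; the PR's stub defines INFINITY, NAN, HUGE_VALL.
#define SKETCH_HUGE_VALL __builtin_huge_vall()
#define SKETCH_INFINITY  __builtin_inff()
#define SKETCH_NAN       __builtin_nanf("")

// The builtins are usable in constant expressions on GCC and Clang,
// so the values can be checked without including any library header.
constexpr bool sketch_infinity_is_inf  = __builtin_isinf(SKETCH_INFINITY) != 0;
constexpr bool sketch_huge_vall_is_inf = __builtin_isinf(SKETCH_HUGE_VALL) != 0;
constexpr bool sketch_nan_is_nan       = __builtin_isnan(SKETCH_NAN) != 0;

static_assert(sketch_infinity_is_inf, "INFINITY-style macro yields an infinity");
static_assert(sketch_huge_vall_is_inf, "HUGE_VALL-style macro yields an infinity");
static_assert(sketch_nan_is_nan, "NAN-style macro yields a NaN");
```

Whether the existing definitions in fp_classify.h and cmath can be reused instead is exactly the open question raised in this comment; the sketch does not answer it.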

miscco · 2026-04-21T10:40:42Z

+
+namespace std {
+
+template <typename _Tp> struct remove_reference       { using type = _Tp; };


Why do we need this when we have a fully working remove_reference at home?

@miscco please try using it. The more things you can drop from this PR the better. The only criterion is that the test keeps passing.

Let's keep it as is and file an issue once this is merged.

miscco · 2026-04-21T10:41:21Z

+template <typename _Tp>
+__host__ __device__ constexpr _Tp&&
+forward(remove_reference_t<_Tp>& __t) noexcept { return static_cast<_Tp&&>(__t); }
+
+template <typename _Tp>
+__host__ __device__ constexpr _Tp&&
+forward(remove_reference_t<_Tp>&& __t) noexcept { return static_cast<_Tp&&>(__t); }
+
+template <typename _Tp>
+__host__ __device__ constexpr remove_reference_t<_Tp>&&
+move(_Tp&& __t) noexcept { return static_cast<remove_reference_t<_Tp>&&>(__t); }


We should just include the necessary headers, or are those required to be in namespace std?
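
To make the stubs under discussion concrete, here is a minimal sketch of `remove_reference`, `forward` and `move` written without any standard library include. It is plain C++ (C++14 or later), so the `__host__ __device__` annotations from the diff are omitted. The namespace `freestanding_sketch` and the helpers `category`, `relay`, `relay_lvalue` and `moved_category` are hypothetical and exist only to check the behavior; nothing here is the PR's actual header.

```cpp
// Hypothetical namespace, so the sketch cannot collide with a real `std`.
namespace freestanding_sketch
{
// Primary template plus the two reference specializations.
template <typename T> struct remove_reference      { using type = T; };
template <typename T> struct remove_reference<T&>  { using type = T; };
template <typename T> struct remove_reference<T&&> { using type = T; };
template <typename T>
using remove_reference_t = typename remove_reference<T>::type;

// Minimal is_same, used only by the checks below.
template <typename T, typename U> struct is_same       { static constexpr bool value = false; };
template <typename T>             struct is_same<T, T> { static constexpr bool value = true; };

template <typename T>
constexpr T&& forward(remove_reference_t<T>& t) noexcept { return static_cast<T&&>(t); }

template <typename T>
constexpr T&& forward(remove_reference_t<T>&& t) noexcept { return static_cast<T&&>(t); }

template <typename T>
constexpr remove_reference_t<T>&& move(T&& t) noexcept
{
  return static_cast<remove_reference_t<T>&&>(t);
}

// Report which value category reaches the callee.
constexpr int category(int&)  { return 1; } // lvalue
constexpr int category(int&&) { return 2; } // rvalue

template <typename T>
constexpr int relay(T&& value) { return category(freestanding_sketch::forward<T>(value)); }

constexpr int relay_lvalue()   { int x = 0; return relay(x); }
constexpr int moved_category() { int x = 0; return category(freestanding_sketch::move(x)); }
} // namespace freestanding_sketch

static_assert(freestanding_sketch::is_same<freestanding_sketch::remove_reference_t<int&>, int>::value, "");
static_assert(freestanding_sketch::relay(42) == 2, "rvalues stay rvalues");
static_assert(freestanding_sketch::relay_lvalue() == 1, "lvalues stay lvalues");
static_assert(freestanding_sketch::moved_category() == 2, "move yields an rvalue");
```

The sketch does not settle whether these definitions must live in namespace `std` for the JIT-compiled code, which is the question asked above.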

github-actions · 2026-04-23T12:12:51Z

🥳 CI Workflow Results

🟩 Finished in 1h 23m: Pass: 100%/480 | Total: 4d 14h | Max: 1h 23m | Hits: 94%/569464

See results here.

NaderAlAwar · 2026-04-24T15:08:57Z

+  std::unordered_map<std::string, std::string> macro_definitions; // key=macro name, value=macro value (empty for flag
+                                                                  // macros)
+  int sm_version         = 70;
+  int optimization_level = 2;


Suggestion: use 3 instead

NaderAlAwar · 2026-04-24T15:13:49Z

+  std::vector<std::string> device_bitcode_files; // Paths to .bc files to link into device code
+  std::unordered_map<std::string, std::string> macro_definitions; // key=macro name, value=macro value (empty for flag
+                                                                  // macros)
+  int sm_version         = 70;


Suggestion: should probably be 75

Absolutely! We do not support anything < sm75 officially.
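
As an illustration of the two suggestions, the sketch below mirrors the fields quoted in the diff with the reviewed defaults (sm_75 and optimization level 3). The struct name `CompilerConfigSketch` and the helpers `render_macro_flag` and `render_arch` are hypothetical; the PR's real type and the way it turns these fields into compiler arguments are not shown in this thread.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical mirror of the configuration fields quoted in the diff,
// using the defaults suggested in review.
struct CompilerConfigSketch
{
  std::vector<std::string> device_bitcode_files; // paths to .bc files to link into device code
  std::unordered_map<std::string, std::string> macro_definitions; // name -> value (empty for flag macros)
  int sm_version         = 75; // nothing below sm_75 is officially supported
  int optimization_level = 3;
};

// Hypothetical helper: render one macro definition as a -D argument.
// An empty value denotes a flag macro, which gets no "=value" suffix.
std::string render_macro_flag(const std::string& name, const std::string& value)
{
  return value.empty() ? "-D" + name : "-D" + name + "=" + value;
}

// Hypothetical helper: render the architecture name, e.g. "sm_75".
std::string render_arch(const CompilerConfigSketch& config)
{
  return "sm_" + std::to_string(config.sm_version);
}
```

Keeping the defaults in the member initializers means a default-constructed configuration already targets a supported architecture.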

NaderAlAwar · 2026-04-24T15:15:59Z

+    return 1;
+  }
+
+  int* d_ptr = nullptr;


Suggestion: use pointer_t

…mode (NVIDIA#8437)

* HostJIT minimal infra
* Fix?
* Apply stylistic changes

---------

Co-authored-by: Ashwin Srinath <shwina@users.noreply.github.com>

github-project-automation Bot added this to CCCL Apr 15, 2026

github-project-automation Bot moved this to Todo in CCCL Apr 15, 2026

cccl-authenticator-app Bot moved this from Todo to In Progress in CCCL Apr 15, 2026

shwina force-pushed the clangjit-minimal-infra branch from 238c422 to 3979052 Compare April 15, 2026 13:36

bernhardmgruber reviewed Apr 15, 2026

shwina force-pushed the clangjit-minimal-infra branch from 8969631 to 7136dc0 Compare April 17, 2026 21:11

shwina changed the title ~~[cccl.c] Introduce host freestanding mode and minimal compilation support for it~~ [cccl.c] Introduce minimal compilation support for host freestanding mode Apr 17, 2026

shwina marked this pull request as ready for review April 20, 2026 14:57

shwina requested review from a team as code owners April 20, 2026 14:57

shwina requested a review from a team as a code owner April 20, 2026 14:57

shwina requested review from alliepiper, bernhardmgruber and robertmaynard April 20, 2026 14:57

cccl-authenticator-app Bot moved this from In Progress to In Review in CCCL Apr 20, 2026

shwina commented Apr 20, 2026 on c/parallel/src/clangjit/include/clangjit/codegen/bitcode.hpp (outdated): Drop codegen altogether

shwina force-pushed the clangjit-minimal-infra branch from 079740b to b3e8366 Compare April 20, 2026 17:25

shwina force-pushed the clangjit-minimal-infra branch from b3e8366 to 0aca07f Compare April 21, 2026 10:12

miscco reviewed Apr 21, 2026

shwina force-pushed the clangjit-minimal-infra branch from ba50f21 to 55f9365 Compare April 21, 2026 15:38

miscco approved these changes Apr 23, 2026

shwina added 3 commits April 23, 2026 06:46

2e0fd67 HostJIT minimal infra
faefe83 Fix?
597e259 Apply stylistic changes

shwina force-pushed the clangjit-minimal-infra branch from 55f9365 to 597e259 Compare April 23, 2026 10:46

shwina enabled auto-merge (squash) April 23, 2026 10:47

NaderAlAwar reviewed Apr 24, 2026

NaderAlAwar approved these changes Apr 24, 2026

shwina merged commit cb93dee into NVIDIA:main Apr 24, 2026
552 checks passed

github-project-automation Bot moved this from In Review to Done in CCCL Apr 24, 2026

