Commit a56e2d3

Rollup merge of #151071 - gen-openmp-metadata, r=nnethercote
Generate openmp metadata

LLVM has an openmp-opt pass, which is part of the default O3 pipeline. The pass bails out if the module has no module flag called "openmp", so let's generate it when people enable our experimental offload feature. OpenMP is a superset of the offload feature, so the two share optimizations. In follow-up PRs I'll start verifying that LLVM optimizes Rust the way we want it to.

r? compiler
2 parents 3d087e6 + 5c85d52 commit a56e2d3

2 files changed

Lines changed: 6 additions & 0 deletions

compiler/rustc_codegen_llvm/src/builder/gpu_offload.rs

Lines changed: 4 additions & 0 deletions
@@ -55,6 +55,10 @@ impl<'ll> OffloadGlobals<'ll> {
         let init_ty = cx.type_func(&[], cx.type_void());
         let init_rtls = declare_offload_fn(cx, "__tgt_init_all_rtls", init_ty);
 
+        // We want LLVM's openmp-opt pass to pick up and optimize this module, since it covers both
+        // openmp and offload optimizations.
+        llvm::add_module_flag_u32(cx.llmod(), llvm::ModuleFlagMergeBehavior::Max, "openmp", 51);
+
         OffloadGlobals {
             launcher_fn,
             launcher_ty,
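
The flag above is added with the Max merge behavior. As a rough illustration of what that implies when two modules carrying the same flag are linked, here is a minimal standalone sketch. It is not rustc or LLVM code: `ModuleFlag` and `merge_max` are hypothetical names invented for this example, and the model only covers u32-valued flags.

```rust
/// Hypothetical model of a u32-valued module flag; not rustc's or LLVM's actual type.
#[derive(Debug, Clone, PartialEq)]
struct ModuleFlag {
    key: String,
    value: u32,
}

/// Merge two flags under "Max" semantics: for the same key, the larger value wins.
/// Returns None when the keys differ, since there is then nothing to merge.
#[allow(dead_code)]
fn merge_max(a: &ModuleFlag, b: &ModuleFlag) -> Option<ModuleFlag> {
    if a.key != b.key {
        return None;
    }
    Some(ModuleFlag {
        key: a.key.clone(),
        value: a.value.max(b.value),
    })
}
```

Under this model, merging an "openmp" flag of 51 with one of 50 keeps 51, so linking against a module that carries a lower value would not lower the value set here.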

tests/codegen-llvm/gpu_offload/gpu_host.rs

Lines changed: 2 additions & 0 deletions
@@ -104,3 +104,5 @@ pub fn _kernel_1(x: &mut [f32; 256]) {
 // CHECK-NEXT: call void @__tgt_unregister_lib(ptr nonnull %EmptyDesc)
 // CHECK-NEXT: ret void
 // CHECK-NEXT: }
+
+// CHECK: !{i32 7, !"openmp", i32 51}
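
To make the new CHECK line easier to read: the leading `i32 7` is the numeric ID of the Max merge behavior in LLVM's module flag encoding, followed by the key and the value. The sketch below shows how the three fields map onto the textual IR. It is an illustration only; `MergeBehavior` and `render_flag` are hypothetical names, not part of rustc, and the IDs are taken from the LLVM Language Reference.

```rust
/// Module flag merge behaviors with the numeric IDs listed in the LLVM Language Reference.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy)]
#[repr(u32)]
enum MergeBehavior {
    Error = 1,
    Warning = 2,
    Require = 3,
    Override = 4,
    Append = 5,
    AppendUnique = 6,
    Max = 7,
    Min = 8,
}

/// Hypothetical helper: render a u32 module flag the way it appears in textual LLVM IR.
#[allow(dead_code)]
fn render_flag(behavior: MergeBehavior, key: &str, value: u32) -> String {
    // Doubled braces are literal braces in a Rust format string.
    format!("!{{i32 {}, !\"{}\", i32 {}}}", behavior as u32, key, value)
}
```

Calling `render_flag(MergeBehavior::Max, "openmp", 51)` yields the same metadata node the test checks for. The value 51 appears to correspond to OpenMP 5.1 in the versioning scheme Clang uses for this flag, though the commit itself does not say so.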
