Commit 43dc78c

Rollup merge of #157560 - scottmcm:mul_nuw_nsw_in_memcpy, r=saethlin
In `copy_nonoverlapping`, use `mul nuw nsw` to compute the byte size

Seems like we might as well?

Adding these flags means the optimizer can tell the limited range on the count of items -- like how we use these flags (#136575) when calculating `size_of_val` for a slice.

Today we use a wrapping multiplication, which means that `copy_nonoverlapping::<u32>(src, dst, 0x40000000_00000001)` appears like 4 bytes -- a perfectly reasonable size! -- once it gets to the `memcpy` call.

If I'm understanding <https://doc.rust-lang.org/std/ptr/fn.copy_nonoverlapping.html#safety> properly, this is just exploiting existing UB, since `src` and `dst` must each be inside an allocation, and those allocations can be at most `isize::MAX` bytes. (Plus, fundamentally, to be non-overlapping there's not enough space in the address space to be bigger than `isize::MAX`.)

cc @RalfJung to make sure this is ok, as requested last time he found out I was newly exploiting some UB in codegen 🙃

r? codegen
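To make the wrapping example in the message concrete, here is a minimal sketch of the arithmetic in plain Rust. The helper names `wrapping_byte_size` and `checked_byte_size` are hypothetical and are not part of the compiler; `u64` is used so the result does not depend on the target's pointer width.

```rust
/// Byte size computed with a wrapping multiply, which is how the message
/// describes the behavior before this change. (Hypothetical helper.)
pub fn wrapping_byte_size(count: u64, elem_size: u64) -> u64 {
    count.wrapping_mul(elem_size)
}

/// Byte size that reports overflow instead of silently wrapping.
/// (Hypothetical helper, shown only for contrast.)
pub fn checked_byte_size(count: u64, elem_size: u64) -> Option<u64> {
    count.checked_mul(elem_size)
}
```

With `count = 0x4000_0000_0000_0001` and `elem_size = 4` (the size of `u32`), the true product is 2^64 + 4, so `wrapping_byte_size` yields 4 while `checked_byte_size` yields `None`.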
2 parents 59015fa + 4af2ca0 commit 43dc78c

2 files changed

Lines changed: 18 additions & 1 deletion

compiler/rustc_codegen_ssa/src/mir/statement.rs

Lines changed: 1 addition & 1 deletion
@@ -105,7 +105,7 @@ impl<'a, 'tcx, Bx: BuilderMethods<'a, 'tcx>> FunctionCx<'a, 'tcx, Bx> {
                     .layout_of(bx.typing_env().as_query_input(pointee))
                     .expect("expected pointee to have a layout");
                 let elem_size = pointee_layout.layout.size().bytes();
-                let bytes = bx.mul(count, bx.const_usize(elem_size));
+                let bytes = bx.unchecked_sumul(count, bx.const_usize(elem_size));
 
                 let align = pointee_layout.layout.align.abi;
                 let dst = dst_val.immediate();
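As a rough model of what the `nuw nsw` flags on the new multiply promise, the sketch below checks both conditions in plain Rust. The function `mul_nuw_nsw` is a hypothetical illustration, not compiler code, and `None` merely stands in for LLVM's poison result when either flag is violated.

```rust
/// Models an LLVM `mul nuw nsw i64`: returns `None` (standing in for poison)
/// if the product wraps as unsigned or overflows as signed.
/// (Hypothetical helper for illustration only.)
pub fn mul_nuw_nsw(count: u64, elem_size: u64) -> Option<u64> {
    // nuw: the product must not wrap when the operands are read as unsigned.
    let product = count.checked_mul(elem_size)?;
    // nsw: the same bit patterns read as signed must not overflow either.
    (count as i64).checked_mul(elem_size as i64)?;
    Some(product)
}
```

Under this model `mul_nuw_nsw(0x4000_0000_0000_0001, 4)` is `None` because the unsigned product wraps, and `mul_nuw_nsw(0x4000_0000_0000_0000, 2)` is also `None` because 2^63 does not fit in a signed 64-bit value, even though it fits as unsigned.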
Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+//@ compile-flags: -Copt-level=3 -C no-prepopulate-passes
+//@ only-64bit (so I don't need to worry about usize)
+
+#![crate_type = "lib"]
+#![feature(core_intrinsics)]
+
+// This deals in a count of elements, not bytes, so we need to multiply.
+// Ensure we preserve UB from a count too high to be valid.
+use std::intrinsics::copy_nonoverlapping;
+
+// CHECK-LABEL: @copy_u16(
+#[no_mangle]
+pub unsafe fn copy_u16(src: *const u16, dst: *mut u16, count: usize) {
+    // CHECK: [[BYTES:%.+]] = mul nuw nsw i64 %count, 2
+    // CHECK: call void @llvm.memcpy.p0.p0.i64(ptr align 2 %dst, ptr align 2 %src, i64 [[BYTES]], i1 false)
+    copy_nonoverlapping(src, dst, count)
+}
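For readers less familiar with the API, the following sketch shows the "count of elements, not bytes" convention using the stable `std::ptr::copy_nonoverlapping` rather than the intrinsic exercised by the test. The wrapper `copy_prefix_u16` is a hypothetical helper; its bounds assertion is an assumption added here so the call stays within the documented safety requirements.

```rust
use std::ptr;

/// Copies the first `count` elements (not bytes) of `src` into `dst`.
/// (Hypothetical safe wrapper for illustration only.)
pub fn copy_prefix_u16(src: &[u16], dst: &mut [u16], count: usize) {
    // Keep both ranges in bounds so the unsafe call below is sound.
    assert!(count <= src.len() && count <= dst.len());
    // SAFETY: both pointers are valid for `count` elements per the assert,
    // and a shared slice cannot overlap an exclusive one.
    unsafe { ptr::copy_nonoverlapping(src.as_ptr(), dst.as_mut_ptr(), count) }
}
```

Calling it with `count = 3` moves three `u16` values, which is 6 bytes at the `memcpy` level; that element-to-byte scaling is the multiplication the test checks for.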