1 parent da00388 commit da552e1
1 file changed
problems/p32/p32.mojo
@@ -72,16 +72,16 @@ def two_way_conflict_kernel(
     Each bank serves 2 threads, doubling access time.
     """
 
-    # Shared memory buffer - stride-2 access pattern creates conflicts
+    # Sized to 2*TPB so stride-2 writes don't alias (threads i and i+TPB/2).
     var shared_buf = stack_allocation[
         dtype=dtype, address_space=AddressSpace.SHARED
-    ](row_major[TPB]())
+    ](row_major[2 * TPB]())
 
     var global_i = block_dim.x * block_idx.x + thread_idx.x
     var local_i = thread_idx.x
 
     # CONFLICT: stride-2 access creates 2-way bank conflicts
-    var conflict_index = (local_i * 2) % TPB
+    var conflict_index = local_i * 2
 
     # Load with bank conflicts
     if global_i < size:
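
The aliasing that the new comment describes can be checked on the host with a small model. The sketch below is not part of the commit and is written in Python rather than Mojo so it runs without a GPU. `TPB = 64`, a 32-thread warp, and 32 banks each one element wide are assumptions made for illustration, and every function name is hypothetical. It compares the old index `(local_i * 2) % TPB` with the new index `local_i * 2`.

```python
# Host-side model only. All constants below are assumed values for
# illustration; they are not taken from p32.mojo.
TPB = 64         # hypothetical threads per block
WARP_SIZE = 32   # assumed warp width
NUM_BANKS = 32   # assumed bank count, one buffer element per bank slot


def old_index(local_i, tpb):
    """Index used before the commit: stride 2, wrapped into a TPB-sized buffer."""
    return (local_i * 2) % tpb


def new_index(local_i, tpb):
    """Index used after the commit: stride 2 into a 2*TPB-sized buffer."""
    return local_i * 2


def aliased_pairs(index_fn, tpb=TPB):
    """Return (first_thread, later_thread) pairs that map to the same element."""
    first_owner = {}
    pairs = []
    for i in range(tpb):
        idx = index_fn(i, tpb)
        if idx in first_owner:
            pairs.append((first_owner[idx], i))
        else:
            first_owner[idx] = i
    return pairs


def max_conflict_degree(index_fn, tpb=TPB):
    """Largest number of distinct elements in one bank touched by the first warp."""
    per_bank = {}
    for i in range(WARP_SIZE):
        idx = index_fn(i, tpb)
        per_bank.setdefault(idx % NUM_BANKS, set()).add(idx)
    return max(len(elements) for elements in per_bank.values())


if __name__ == "__main__":
    print("old aliased pairs:", len(aliased_pairs(old_index)))
    print("new aliased pairs:", len(aliased_pairs(new_index)))
    print("old conflict degree:", max_conflict_degree(old_index))
    print("new conflict degree:", max_conflict_degree(new_index))
```

Under these assumptions the old index makes thread `i` and thread `i + TPB/2` share one element, while the new index gives every thread its own element and stays below `2 * TPB`. The first warp sees a 2-way conflict in both versions, which suggests the change removes the aliasing without removing the conflict the kernel is meant to demonstrate.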