Merged
8 changes: 6 additions & 2 deletions book/i18n/ko/src/puzzle_27/puzzle_27.md
@@ -48,10 +48,12 @@ GPU thread block (128 threads, 4 or 2 warps, hardware
 # Complex block-wide reduction (traditional approach - from Puzzle 12):
 shared_memory[local_i] = my_value
 barrier()
-for stride in range(64, 0, -1):
+stride = 64
+while stride > 0:
     if local_i < stride:
         shared_memory[local_i] += shared_memory[local_i + stride]
     barrier()
+    stride //= 2
 if local_i == 0:
     output[block_idx.x] = shared_memory[0]

@@ -81,10 +83,12 @@ if local_i == 0:
 shared_memory[local_i] = my_value
 barrier()
 # Tree reduction with stride-based indexing...
-for stride in range(64, 0, -1):
+stride = 64
+while stride > 0:
     if local_i < stride:
         shared_memory[local_i] += shared_memory[local_i + stride]
     barrier()
+    stride //= 2
 ```

 ### **The intermediate step: Warp programming (Puzzle 24)**
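The loop header removed in this change, `for stride in range(64, 0, -1)`, counts the stride down by one each iteration, whereas a tree reduction needs the stride to halve. A minimal plain-Python sketch of the difference follows; `halving_strides` is a hypothetical helper written for this illustration and is not part of the book's code.

```python
def halving_strides(start):
    """Collect the strides visited by `while stride > 0: ...; stride //= 2`."""
    strides = []
    stride = start
    while stride > 0:
        strides.append(stride)
        stride //= 2
    return strides


# Strides produced by the loop header that this change removes.
old_strides = list(range(64, 0, -1))
# Strides produced by the replacement while-loop.
new_strides = halving_strides(64)

print(len(old_strides), old_strides[:4])  # 64 [64, 63, 62, 61]
print(len(new_strides), new_strides)      # 7 [64, 32, 16, 8, 4, 2, 1]
```

The old header runs 64 passes with strides that are not powers of two, while the new loop runs the 7 halving passes expected for 128 values.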
8 changes: 6 additions & 2 deletions book/src/puzzle_27/puzzle_27.md
@@ -46,10 +46,12 @@ Learn the complete parallel programming toolkit from `gpu.primitives.block`:
 # Complex block-wide reduction (traditional approach - from Puzzle 12):
 shared_memory[local_i] = my_value
 barrier()
-for stride in range(64, 0, -1):
+stride = 64
+while stride > 0:
     if local_i < stride:
         shared_memory[local_i] += shared_memory[local_i + stride]
     barrier()
+    stride //= 2
 if local_i == 0:
     output[block_idx.x] = shared_memory[0]

@@ -79,10 +81,12 @@ Complex but educational - explicit shared memory, barriers, and tree reduction:
 shared_memory[local_i] = my_value
 barrier()
 # Tree reduction with stride-based indexing...
-for stride in range(64, 0, -1):
+stride = 64
+while stride > 0:
     if local_i < stride:
         shared_memory[local_i] += shared_memory[local_i + stride]
     barrier()
+    stride //= 2
 ```

 ### **The intermediate step: Warp programming (Puzzle 24)**
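The effect of the stride schedule on the final sum can be checked with a small CPU simulation. This is only a sketch under stated assumptions: `simulate_block_sum` is a hypothetical helper, a Python list stands in for the block's shared memory, and the inner loop stands in for the 128 threads running between two `barrier()` calls.

```python
def simulate_block_sum(values, strides):
    """Run the stride-based reduction from the diff on a plain list."""
    shared_memory = list(values)  # shared_memory[local_i] = my_value
    for stride in strides:
        # Each "thread" with local_i < stride adds its partner's cell.
        # Reads touch indices >= stride and writes touch indices < stride,
        # so running the threads one after another gives the same result
        # as running them in parallel between two barriers.
        for local_i in range(stride):
            shared_memory[local_i] += shared_memory[local_i + stride]
    return shared_memory[0]  # the cell that thread 0 writes to the output


values = list(range(1, 129))  # one value per thread, 128 threads
expected = sum(values)

halving = [64, 32, 16, 8, 4, 2, 1]     # strides of the new while-loop
decrementing = list(range(64, 0, -1))  # strides of the removed for-loop

print(simulate_block_sum(values, halving) == expected)       # True
print(simulate_block_sum(values, decrementing) == expected)  # False
```

With halving strides every input is added exactly once. With the decrementing strides some cells are folded into the total several times, so the result overshoots the true sum.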