Commit fd46aff

[docs] update tma description (#2154)
* update tma description
* update
* add lint
* refine the statement
* add T.tma_copy description in Instruction Reference
1 parent 7d8e983 commit fd46aff

1 file changed

Lines changed: 4 additions & 4 deletions

docs/programming_guides/instructions.md

@@ -4,7 +4,7 @@ This page summarizes the core TileLang “instructions” available at the DSL
 level, how they map to hardware concepts, and how to use them correctly.
 
 ## Quick Categories
-- Data movement: `T.copy`, `T.async_copy`, `T.c2d_im2col`, staging Global ↔ Shared ↔ Fragment
+- Data movement: `T.copy`, `T.async_copy`, `T.tma_copy`, `T.c2d_im2col`, staging Global ↔ Shared ↔ Fragment
 - Compute primitives: `T.gemm`/`T.gemm_sp`, elementwise math (`T.exp`, `T.max`),
 reductions (`T.reduce_sum`, `T.cumsum`, warp reducers)
 - Control helpers: `T.clear`/`T.fill`, `T.reshape`/`T.view`
@@ -33,14 +33,13 @@ Semantics
 - Safety: the LegalizeSafeMemoryAccess pass inserts boundary guards when an
 access may be out‑of‑bounds and drops them when proven safe.
 
-### `T.copy` vs `T.async_copy`
+### Lowering `T.copy` to variants of copy mechanisms
 
 TileLang supports both synchronous and explicitly-asynchronous copies.
 
 `T.copy(src, dst, ...)` (synchronous semantics)
 - Intended default for most TileLang programs.
-- The compiler is free to lower it to different mechanisms (SIMT copy, `ldmatrix`,
-TMA, `cp.async`, etc.) depending on target/hints, but the observable semantics
+- The compiler is free to lower it to different mechanisms (synchronous SIMT copy `ld.global`, warp-level copy `ldmatrix`, async copy via TMA `cp.async.bulk`, old async copy `cp.async`, etc.) depending on target/hints, but the observable semantics
 are *synchronous*: after the statement, it is safe to use `dst`.
 - If `T.copy` lowers to `cp.async`, TileLang will still preserve synchronous
 semantics by emitting the required `commit`/`wait` (and any required
@@ -145,6 +144,7 @@ signatures, behaviors, constraints, and examples, refer to API Reference
 Data movement
 - `T.copy(src, dst, ...)`: Move tiles between Global/Shared/Fragment.
 - `T.async_copy(src, dst, ...)`: Explicit async global→shared copy via `cp.async`.
+- `T.tma_copy(src, dst, ...)`: Explicit async global→shared copy via `cp.async.bulk`
 - `T.transpose(src, dst)`: Transpose a 2D shared buffer: `dst[j, i] = src[i, j]`.
 - `T.c2d_im2col(img, col, ...)`: 2D im2col transform for conv.

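Note on the synchronous-semantics wording in the second hunk: the claim that `T.copy` stays safe to read even when lowered to `cp.async` rests on the compiler emitting `commit`/`wait` around the copy. The following is a minimal pure-Python toy model of that bookkeeping, not TileLang or CUDA code. Every name in it (`AsyncCopyQueue`, `issue`, `commit`, `wait`, `sync_copy`) is hypothetical and exists only for illustration. The model delays every copy until the wait, which is the worst case a consumer has to assume; real hardware may complete a copy earlier.

```python
from collections import deque


class AsyncCopyQueue:
    """Toy model of commit/wait bookkeeping for asynchronous copies.

    Hypothetical illustration only; this is not a TileLang or CUDA API.
    """

    def __init__(self):
        self._open = []         # copies issued since the last commit
        self._groups = deque()  # committed groups that have not landed yet

    def issue(self, src, dst):
        # An async copy only records the request; dst is not updated yet.
        self._open.append((src, dst))

    def commit(self):
        # Close the current batch so it can be waited on as one group.
        self._groups.append(self._open)
        self._open = []

    def wait(self, pending=0):
        # Retire the oldest groups until at most `pending` remain in flight.
        while len(self._groups) > pending:
            for src, dst in self._groups.popleft():
                dst[:] = src


def sync_copy(queue, src, dst):
    """Copy with synchronous semantics, built from the async primitives."""
    queue.issue(src, dst)
    queue.commit()
    queue.wait(0)  # once this returns, dst is safe to read


queue = AsyncCopyQueue()
tile = [1, 2, 3, 4]

# Explicit async style: the caller owns the commit/wait.
staged = [0] * 4
queue.issue(tile, staged)
queue.commit()
before_wait = list(staged)  # snapshot taken before the wait
queue.wait(0)
after_wait = list(staged)   # snapshot taken after the wait

# Synchronous style: commit/wait are hidden inside the copy.
direct = [0] * 4
sync_copy(queue, tile, direct)
```

In this model, `sync_copy` plays the role the documentation assigns to `T.copy`, while the explicit `issue`/`commit`/`wait` sequence corresponds to the style where the program manages synchronization itself, as with `T.async_copy`.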