Commit 4a6f9d9

yushangdi and claude committed
Add CUDA graph kernel annotations tutorial
This tutorial demonstrates how to use CUDA graph kernel annotations for semantic profiling traces with custom visualization lanes.

Features:
- End-to-end workflow from graph capture to visualization
- Transformer block example with annotated regions
- Post-processing to merge annotations into profiler traces
- Custom stream assignments for semantic organization
- Version checking for cuda-bindings compatibility
- Clear error messages with upgrade instructions

The tutorial includes:
- mark_kernels() context manager usage
- Graph capture with enable_annotations=True
- Profiling and trace post-processing
- Before/after comparison
- Troubleshooting guide

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
1 parent cdc645a commit 4a6f9d9

3 files changed

Lines changed: 565 additions & 0 deletions
