feat(hslm): gradient checkpointing for memory efficiency #562

Merged: gHashTag merged 1 commit into main from feat/317-gradient-checkpointing on Apr 30, 2026
@gHashTag (Owner)

Summary

Adds gradient checkpointing, which reduces stored activations from O(L) to O(L/checkpoint_every), where L is the number of layers.

New file

  • src/b2t/gradient_checkpointing.zig — 170 LOC

Memory savings

  • Full: store all L layer activations (e.g., 9 layers × 243 dims × 4 B = 8,748 B ≈ 8.7 KB per batch)
  • Checkpoint every 2: store ~L/2 activations and recompute the others during the backward pass
  • Tradeoff: ~30% more compute for ~50% less activation memory
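
The arithmetic above can be reproduced with a short Python sketch. It is illustrative only and is not taken from the Zig source; the helper names and the assumption that checkpoints fall on layers 0, N, 2N, ... are hypothetical.

```python
import math

def activation_bytes(layers: int, dim: int, bytes_per_value: int = 4) -> int:
    """Bytes needed to keep one activation vector per stored layer."""
    return layers * dim * bytes_per_value

def checkpointed_layers(layers: int, every_n: int) -> int:
    """Layers kept when layers 0, n, 2n, ... are checkpointed (assumed placement)."""
    return math.ceil(layers / every_n)

# Figures from the example: 9 layers, 243 dims, 4 bytes per value.
full = activation_bytes(9, 243)
kept = activation_bytes(checkpointed_layers(9, 2), 243)
saved_fraction = 1 - kept / full

print(f"full={full} B, checkpointed={kept} B, saved={saved_fraction:.0%}")
```

Under this placement assumption, 5 of the 9 layers are stored, so the saving is a little under half, which is consistent with the "~50%" estimate.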

Features

  • CheckpointStore: save activations every N layers, FIFO eviction when full
  • MemoryBudget: compute checkpoint interval from total RAM budget
  • memoryUsedMB(), savedMemoryMB(): memory tracking
  • Configurable: checkpoint frequency, max checkpoints
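
To make the CheckpointStore behavior concrete, here is a minimal Python sketch of the idea: save every N-th layer, evict the oldest checkpoint when the store is full, and track memory used versus memory avoided. It is not the Zig implementation; the method names, the 4-byte value size, and the choice to count evicted checkpoints as saved memory are assumptions.

```python
from collections import deque

class CheckpointStore:
    """Keeps activations for every `every_n`-th layer with FIFO eviction (sketch)."""

    def __init__(self, every_n: int, max_checkpoints: int, bytes_per_value: int = 4):
        if every_n < 1 or max_checkpoints < 1:
            raise ValueError("every_n and max_checkpoints must be >= 1")
        self.every_n = every_n
        self.bytes_per_value = bytes_per_value
        # deque(maxlen=...) drops the oldest entry on append once it is full.
        self.checkpoints = deque(maxlen=max_checkpoints)
        # Values that full storage would hold but this store does not (assumed metric).
        self.skipped_values = 0

    def save(self, layer: int, activation: list[float]) -> bool:
        """Store the activation if `layer` is a checkpoint layer; return whether it was stored."""
        if layer % self.every_n != 0:
            self.skipped_values += len(activation)
            return False
        if len(self.checkpoints) == self.checkpoints.maxlen:
            # The oldest checkpoint is about to be evicted; count it as no longer stored.
            self.skipped_values += len(self.checkpoints[0][1])
        self.checkpoints.append((layer, list(activation)))
        return True

    def memory_used_mb(self) -> float:
        stored = sum(len(a) for _, a in self.checkpoints)
        return stored * self.bytes_per_value / (1024 * 1024)

    def saved_memory_mb(self) -> float:
        return self.skipped_values * self.bytes_per_value / (1024 * 1024)

store = CheckpointStore(every_n=2, max_checkpoints=3)
for layer in range(9):
    store.save(layer, [0.0] * 243)
kept_layers = [layer for layer, _ in store.checkpoints]
```

In this run layers 0, 2, 4, 6 and 8 are checkpoint layers, and the limit of three means only the most recent three remain in the store.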

Tests (5)

  • Save and count checkpoints
  • Skip non-checkpoint layers (every_n=2)
  • FIFO eviction when max reached
  • Memory tracking accuracy
  • Budget-based interval recommendation
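
The budget-based interval recommendation could work along these lines. This Python sketch is an assumption about the approach, not the MemoryBudget code from the PR: it picks the smallest interval N for which ceil(L/N) checkpoints fit in the budget, and the function name and parameters are hypothetical.

```python
import math

def recommend_interval(num_layers: int, bytes_per_layer: int, budget_bytes: int) -> int:
    """Smallest checkpoint interval whose stored activations fit in `budget_bytes`."""
    if num_layers < 1 or bytes_per_layer < 1:
        raise ValueError("num_layers and bytes_per_layer must be >= 1")
    max_layers = budget_bytes // bytes_per_layer  # checkpoints that fit in the budget
    if max_layers < 1:
        raise ValueError("budget cannot hold even one checkpoint")
    if max_layers >= num_layers:
        return 1  # everything fits, so no recomputation is needed
    # ceil(L / N) <= max_layers holds exactly when N >= ceil(L / max_layers).
    return math.ceil(num_layers / max_layers)

# 243 dims × 4 B = 972 B per layer, matching the example figures.
interval_tight = recommend_interval(9, 972, 3000)
interval_mid = recommend_interval(9, 972, 5000)
interval_loose = recommend_interval(9, 972, 9000)
```

A smaller budget yields a larger interval, trading more recomputation in the backward pass for fewer stored activations.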

Closes #317

- Add src/b2t/gradient_checkpointing.zig
- CheckpointStore: save activations every N layers, evict oldest
- MemoryBudget: compute checkpoint interval from available RAM
- Memory tracking: MB used, MB saved vs full activation storage
- 5 tests: save/count, skip non-checkpoint layers, eviction,
  memory tracking, budget interval

gHashTag merged commit f71cfda into main on Apr 30, 2026
9 of 19 checks passed
gHashTag deleted the feat/317-gradient-checkpointing branch on April 30, 2026 at 00:46