fix: README table shows physical RAM, not virtual allocation #81

Merged
solderzzc merged 1 commit into main from fix/readme-physical-ram on Apr 24, 2026
Conversation

@solderzzc
Member

The compact benchmark table was showing GPU virtual allocation (28–61 GB) next to speeds. That is misleading, because virtual allocation includes SSD-backed pages and does not represent actual RAM usage. The table now shows peak physical RAM (12.5–16.8 GB), which is what users actually care about.

Copilot AI review requested due to automatic review settings April 24, 2026 21:26
@solderzzc solderzzc merged commit 0212b14 into main Apr 24, 2026
2 checks passed
Contributor

Copilot AI left a comment

Pull request overview

Updates the DeepSeek-V4-Flash benchmark section in the README to report a more user-meaningful memory metric (peak physical RAM) instead of GPU virtual allocation, aligning the table with real RAM pressure during long-context runs.

Changes:

  • Replaces DeepSeek-V4-Flash table memory figures with peak physical RAM values.
  • Updates the table footnote to explain sampling methodology (0.5s polling during prefill + generation) and SSD streaming context.
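
The polling approach described in the footnote can be illustrated with a minimal sketch. This is not the project's actual measurement code: `PeakSampler` and the `read_bytes` callable are hypothetical names, and the reader is left abstract so the sketch stays standard-library only. In practice it might wrap something like the process's resident set size.

```python
import threading
import time
from typing import Callable


class PeakSampler:
    """Polls a memory reader at a fixed interval and keeps the maximum seen.

    `read_bytes` is a hypothetical callable returning current physical
    memory use in bytes (an assumption, not an API from the repository).
    """

    def __init__(self, read_bytes: Callable[[], int], interval_s: float = 0.5):
        self._read_bytes = read_bytes
        self._interval_s = interval_s
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self.peak_bytes = 0

    def _run(self) -> None:
        while True:
            self.peak_bytes = max(self.peak_bytes, self._read_bytes())
            # Event.wait returns True as soon as stop is requested.
            if self._stop.wait(self._interval_s):
                break

    def __enter__(self) -> "PeakSampler":
        self._thread.start()
        return self

    def __exit__(self, *exc) -> bool:
        self._stop.set()
        self._thread.join()
        # Take one last sample so a spike just before exit is not missed.
        self.peak_bytes = max(self.peak_bytes, self._read_bytes())
        return False
```

Wrapping the prefill and generation phases in `with PeakSampler(reader) as s:` would leave the peak in `s.peak_bytes`. Note that any polling scheme can miss spikes shorter than the interval, so a value sampled this way is a lower bound on the true peak.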

Comment thread README.md
Comment on lines +88 to 91
> Values shown as `generation speed · peak physical RAM used` (sampled every 0.5s during prefill + generation). The 126 GB model streams the rest from NVMe SSD.

**Key takeaways:**
- 🏆 **SSD + TurboQuant dominates at long context** — 4.16 tok/s at 40K vs 0.32 tok/s for plain SSD Stream (**13× faster**), with 33% lower GPU allocation (40.6 GB vs 60.5 GB).
Copilot AI Apr 24, 2026

The DeepSeek key takeaway still refers to the old metric ("33% lower GPU allocation (40.6 GB vs 60.5 GB)") even though the table and note were updated to report peak physical RAM. This is now inconsistent/misleading; please update this takeaway to either compare the new RAM numbers or clearly label GPU_Alloc (virtual) as a separate metric if you still want to mention it.
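
As a quick check, the figures quoted in the takeaway are internally consistent, but only for the old GPU allocation metric. The sketch below recomputes them from the numbers in the quoted line; the variable names are illustrative, and no physical-RAM comparison is attempted because the per-configuration values are not shown in this thread.

```python
# Numbers copied from the quoted takeaway line in README.md.
turboquant_tok_s = 4.16   # SSD + TurboQuant at 40K context
plain_ssd_tok_s = 0.32    # plain SSD Stream at 40K context
turboquant_alloc_gb = 40.6  # GPU virtual allocation (old metric)
plain_ssd_alloc_gb = 60.5   # GPU virtual allocation (old metric)

# Speedup of TurboQuant over plain SSD streaming.
speedup = round(turboquant_tok_s / plain_ssd_tok_s, 1)

# Relative reduction in GPU virtual allocation, as a percentage.
alloc_reduction_pct = round(
    100 * (plain_ssd_alloc_gb - turboquant_alloc_gb) / plain_ssd_alloc_gb, 1
)

print(f"speedup: {speedup}x")                        # speedup: 13.0x
print(f"alloc reduction: {alloc_reduction_pct}%")    # alloc reduction: 32.9%
```

The 13× and 33% claims therefore hold for virtual allocation. If the takeaway is switched to peak physical RAM, the percentage would need recomputing from the new table values.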


2 participants