perf: optimize guided decoding with xgrammar upgrade, batched API, and async D2H overlap #4605

Open
windreamer wants to merge 6 commits into InternLM:main from windreamer:feat/guided-decoding-optimization

fix: move FillMask vector allocs into rank-0 block, use PRIVATE for i…

84a90a2