Pull requests: Luce-Org/lucebox-hub
Pull requests list
docs(hip): perf diagnosis + 4-tier optimization plan (rocprofv3 evidence on gfx1100/1151/1201)
#156
opened May 11, 2026 by
Kaden-Schutt
2 of 4 tasks
WIP: configs/profiles + configs/backends — declarative deployment configuration
#155
opened May 11, 2026 by
dusterbloom
Contributor
Draft
2 of 4 tasks
feat(dflash): linear native MTP integrated decode CLI (stacked on #153)
#154
opened May 11, 2026 by
javierpazo
Contributor
feat(dflash): native Qwen3.6 MTP (NextN) runtime + contract test
#153
opened May 11, 2026 by
javierpazo
Contributor
Preserve exact tool-call text during prompt replay
#151
opened May 11, 2026 by
howard0su
Contributor
docs(pflash): clarify drafter dtype in operator notes benchmark
#149
opened May 10, 2026 by
javierpazo
Contributor
feat(dflash): add NVFP4 per-tensor scale2 support
#146
opened May 10, 2026 by
phazei
Contributor
feat(dflash): accept FP16 safetensors drafter alongside BF16
#142
opened May 9, 2026 by
javierpazo
Contributor
fix(dflash): derive n_target_layers fallback in gguf_draft_loader
#138
opened May 9, 2026 by
javierpazo
Contributor
chore(dflash): enforce sm_89 user override and keep BSA enabled
#137
opened May 9, 2026 by
javierpazo
Contributor
feat(dflash): native multi-request scheduler with batched target step
#135
opened May 9, 2026 by
javierpazo
Contributor
Gemma4 support: pFlash + DFlash + chunked prefill, daemon mode, server routing
#131
opened May 8, 2026 by
dusterbloom
Contributor
5 of 6 tasks
Add HIP/ROCm support for Strix Halo (gfx1151)
#119
opened May 7, 2026 by
smpurkis
3 of 4 tasks
feat(dflash): support Qwen3.6-27B-DFlash draft (SWA layers) — 106 t/s on RTX 4090
#94
opened May 4, 2026 by
Quitetall
Contributor
perf(pflash): add SM75 target-resident TTFT path
#72
opened May 1, 2026 by
weicj
Contributor
dflash: split target/draft StepGraphs to fix ggml_gallocr realloc per spec-decode step (issue #55)
#62
opened Apr 29, 2026 by
dusterbloom
Contributor
4 of 5 tasks
fix(dflash): auto-detect GPU arch to prevent sm_120a on consumer Blackwell
#48
opened Apr 27, 2026 by
easel
Contributor
2 tasks
feat(dflash): MoE 35B-A3B support + DDTree CUDA graph reuse
#39
opened Apr 27, 2026 by
dusterbloom
Contributor
4 of 5 tasks