Commit 4c042a6

fix: add required log lines to DFlash draft model load path
test-dflash.sh grepped for:

1. 'Draft model loaded successfully': only emitted by standard draft path, not DFlash path, which has its own 'DFlash draft model loaded' message
2. 'Using speculative decoding': not emitted by DFlash path at all
3. 'speculative decoding': was present, but test was failing on (1)

Add both required lines immediately after DFlash draft model weights load, mirroring the standard speculative decoding path.

The streaming failures ('missing [DONE] sentinel') were downstream of the model-not-found state caused by the load log mismatch, not an inference bug.
1 parent 5581f38 commit 4c042a6

1 file changed

Lines changed: 2 additions & 0 deletions

Sources/SwiftLM/Server.swift

@@ -664,6 +664,8 @@ struct MLXServer: AsyncParsableCommand {
     DFlashKernelRegistry.provider = DFlashKernels.shared
     DFlashDumper.setup()
     print("[SwiftLM] DFlash draft model loaded (block_size=\(model.blockSize), \(model.targetLayerIDs.count) target layers, mask_token=\(model.maskTokenID))")
+    print("[SwiftLM] Draft model loaded successfully (\(model.blockSize) block size, DFlash mode)")
+    print("[SwiftLM] Using speculative decoding: \(resolvedDraftRef)\(modelId) (DFlash block-diffusion)")
 } catch {
     print("[SwiftLM] ⚠️ Failed to load DFlash draft model: \(error)")
     dflashModel = nil
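
The log mismatch described in the commit message can be illustrated with a small Swift sketch of a substring check. This is not the actual `test-dflash.sh` logic (that is a shell script using grep, and its contents are not shown here). The helper `missingLogLines` is hypothetical, and the sample log values (block size 16, 5 target layers, mask token 0, the draft name `example-draft`) are placeholders chosen only for illustration.

```swift
import Foundation

// Hypothetical helper, not part of SwiftLM or test-dflash.sh.
// Returns the required substrings that do not appear anywhere in `log`.
func missingLogLines(in log: String, required: [String]) -> [String] {
    required.filter { !log.contains($0) }
}

// The two strings the commit message reports as missing from the DFlash path.
let required = [
    "Draft model loaded successfully",
    "Using speculative decoding",
]

// Before the change: only the DFlash-specific load message is printed.
// Numeric values are placeholders, not real output.
let before = "[SwiftLM] DFlash draft model loaded (block_size=16, 5 target layers, mask_token=0)"

// After the change: the two added lines follow the DFlash message.
let after = before + "\n"
    + "[SwiftLM] Draft model loaded successfully (16 block size, DFlash mode)\n"
    + "[SwiftLM] Using speculative decoding: example-draft (DFlash block-diffusion)"

print(missingLogLines(in: before, required: required))
// ["Draft model loaded successfully", "Using speculative decoding"]
print(missingLogLines(in: after, required: required))
// []
```

The check is case-sensitive, which is why the existing "DFlash draft model loaded" message does not satisfy a search for "Draft model loaded successfully".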