Commit 5581f38

fix: add 'Using speculative decoding' log line for CI test assertions
Both test-speculative.sh and test-dflash.sh grep for 'Using speculative decoding' in the server log to confirm the speculative path was activated. This string was never emitted: the tests were checking a log line that didn't exist, causing speculative-decoding and dflash-speculative-decoding CI jobs to always fail on Test 1.

Fix: emit the exact expected log line:
- Standard spec: after draft model is loaded successfully
- DFlash spec: at generation dispatch in Server.swift

Server log now contains all strings the tests grep for:
✅ 'Draft model loaded successfully'
✅ 'Using speculative decoding'
✅ 'speculative decoding' (for test-speculative-eval.sh)

1 parent b7dcd53 commit 5581f38

1 file changed

Lines changed: 2 additions & 0 deletions

Sources/SwiftLM/Server.swift

@@ -616,6 +616,7 @@ struct MLXServer: AsyncParsableCommand {
     }
     draftModelRef = await draftContainer.extractDraftModel()
     print("[SwiftLM] Draft model loaded successfully (\(numDraftTokensConfig) tokens/round)")
+    print("[SwiftLM] Using speculative decoding: \(draftModelPath)\(modelId) (\(numDraftTokensConfig) draft tokens/round)")
 } else {
     draftModelRef = nil
 }
@@ -1418,6 +1419,7 @@ func handleChatCompletion(
 // to DFlashTargetModel, we use DFlashRuntime.generate instead of the standard path.
 if let dflashDraft = dflashModel, let targetModel = dflashTargetModel {
     print("[SwiftLM] ⚡ DFlash block-diffusion speculative decoding active")
+    print("[SwiftLM] Using speculative decoding: DFlash block-diffusion mode active")
     fflush(stdout)
     // Convert DFlashEvent stream to Generation stream with proper streaming detokenizer
     let dflashTokenizer = await container.tokenizer
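
The log check described in the commit message can be sketched roughly as follows. This is a hypothetical illustration only: the real test-speculative.sh and test-dflash.sh are not shown in this commit, so the temporary log file, the sample log lines (including the value 4), and the `missing` counter are all assumptions made for the example. Only the three marker strings come from the commit message.

```shell
#!/bin/sh
# Hypothetical sketch; not the actual CI test scripts.

# Stand-in for the server log. The lines mimic the print statements in the
# diff above; the token count (4) is a made-up value.
log_file="$(mktemp)"
cat > "$log_file" <<'EOF'
[SwiftLM] Draft model loaded successfully (4 tokens/round)
[SwiftLM] Using speculative decoding: DFlash block-diffusion mode active
EOF

missing=0
# The three substrings the commit message says the tests grep for.
for marker in \
  "Draft model loaded successfully" \
  "Using speculative decoding" \
  "speculative decoding"
do
  # -F treats the marker as a fixed string; -q suppresses normal output.
  if grep -qF -- "$marker" "$log_file"; then
    echo "found: $marker"
  else
    echo "missing: $marker"
    missing=$((missing + 1))
  fi
done

rm -f "$log_file"
echo "missing markers: $missing"
```

A check of this shape fails whenever the server never prints the exact substring, which matches the failure mode the commit message describes for Test 1.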
