High results fluctuation in repeating experiments with same setup #550

@Rachelcoll

I am benchmarking two methods with the following command:

CUDA_VISIBLE_DEVICES=0,1 python3 SpecForge/benchmarks/bench_eagle3.py --model-path Qwen/Qwen3-8B --speculative-draft-model-path path/to/model --port 30000 --trust-remote-code --mem-fraction-static 0.8 --tp-size 1 --attention-backend fa3 --config-list 1,6,10,32 --benchmark-list mtbench --dtype bfloat16

The results fluctuate a lot between runs of the same setup:

- Run 1: baselineA reaches 106 TPS, baselineB reaches 100 TPS.
- Run 2: baselineA reaches 93 TPS, baselineB reaches 107 TPS.
- Run 3 and all later runs: both baselines reach 93 TPS.

I don't think randomness in the experiment is the cause, and the GPU and parameters are identical for every run. Has anyone faced the same issue, or does anyone know possible reasons?
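To put a number on the run-to-run spread, here is a minimal sketch that summarizes the TPS values reported above. It is not part of SpecForge. The `summarize` helper and the `runs` dictionary are hypothetical names used only for illustration, and only the first three runs of each baseline are included.

```python
from statistics import mean, stdev

# TPS values reported in this issue (runs 1-3 for each baseline).
runs = {
    "baselineA": [106, 93, 93],
    "baselineB": [100, 107, 93],
}


def summarize(values):
    """Return mean, sample standard deviation, and coefficient of variation (%)."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 in the denominator)
    return {"mean": m, "stdev": s, "cv_percent": 100 * s / m}


summary = {name: summarize(values) for name, values in runs.items()}

for name, stats in summary.items():
    print(
        f"{name}: mean={stats['mean']:.1f} TPS, "
        f"stdev={stats['stdev']:.1f} TPS, cv={stats['cv_percent']:.1f}%"
    )
```

If the gap between the two baseline means is smaller than the spread within each baseline, three runs are probably not enough to rank the two methods, and more repetitions per configuration would be needed before drawing a conclusion.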
