2nd GPU for PFlash?

I happen to have both an RTX 3080 (10GB) and an RTX 3090.
You mentioned that a [drafter on a 2nd GPU would prevent loading & unloading of the draft model](https://www.lucebox.com/blog/pflash).
How exactly can I set this up?
Also, could other small LLMs be used, like `Qwen Coder 1.5B` or `Qwen3.5 4b`?