Parallel scan performance gap between Vulkan and CUDA #17

@qiao-bo

Currently we support warp-based parallel scan for Vulkan and CUDA. Let's use this issue to track some performance data:

ENV: RTX 3080, driver 510, CUDA 11.6.

| Number of elements | Vulkan (ms) | CUDA (ms) |
| --- | --- | --- |
| 131072 | 0.348 | 0.160 |
| 65536 | 0.308 | 0.111 |
| 32768 | 0.311 | 0.114 |
| 16384 | 0.232 | 0.082 |
| 8192 | 0.222 | 0.075 |
| 4096 | 0.183 | 0.075 |
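
For readers unfamiliar with the term, here is a rough CPU-side sketch of what a warp-based inclusive scan computes. It is not the kernel benchmarked above: the name `warp_inclusive_scan`, the 32-lane warp size, and the Hillis-Steele shuffle-up pattern are assumptions made for illustration, and the real Vulkan and CUDA implementations may be organized differently.

```cpp
#include <array>

// Assumption: 32 lanes per warp, as on NVIDIA GPUs such as the RTX 3080.
constexpr int kWarpSize = 32;

// Hypothetical helper emulating one warp's inclusive prefix sum on the CPU.
// Each pass stands in for a "shuffle up by `offset` lanes, then add" step.
std::array<int, kWarpSize> warp_inclusive_scan(std::array<int, kWarpSize> lanes) {
    for (int offset = 1; offset < kWarpSize; offset *= 2) {
        // Snapshot the lane values so every lane reads the values from
        // before this pass, as lanes executing in lockstep would.
        const std::array<int, kWarpSize> prev = lanes;
        for (int lane = offset; lane < kWarpSize; ++lane) {
            lanes[lane] = prev[lane] + prev[lane - offset];
        }
    }
    return lanes;
}
```

After log2(32) = 5 passes, lane `i` holds the sum of inputs `0..i`. On a GPU, larger inputs like those in the table would additionally need the per-warp totals combined across warps and blocks, which this sketch leaves out.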
