Supernet training is too slow #18

@Spark001

Description

Thanks for your MXNet implementation of SPOS ^_^. However, I found that supernet training was too slow when I trained my own network. I profiled the training procedure and found the following problems.

First, imperative mode is much slower than hybrid mode. I then tried training on more GPUs, but got no speedup. Instead, GPU utilization dropped dramatically as the number of GPUs increased. My guess is that in imperative mode the computation on the different GPUs runs serially rather than in parallel. Have you ever encountered these problems?

Furthermore, can anything be improved to accelerate training? Could we use imperative mode when sampling a subnet, then switch to hybrid mode when training that subnet?
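
To make the second question concrete, here is a rough sketch of the control flow I have in mind. It is plain Python and does not call any MXNet API: `toy_compile` is a hypothetical stand-in for "build the subnet for this path and hybridize it", and `SubnetCache` and `sample_arch` are names made up for illustration, not part of this repository.

```python
import random
from typing import Callable, Dict, Tuple

Arch = Tuple[int, ...]  # one candidate index per layer


def sample_arch(num_layers: int, num_choices: int, rng: random.Random) -> Arch:
    """Uniformly sample one candidate index per layer (ordinary imperative Python)."""
    return tuple(rng.randrange(num_choices) for _ in range(num_layers))


class SubnetCache:
    """Compile each sampled path at most once and reuse the result afterwards."""

    def __init__(self, compile_subnet: Callable[[Arch], Callable[[float], float]]):
        self._compile = compile_subnet
        self._cache: Dict[Arch, Callable[[float], float]] = {}
        self.compile_count = 0

    def get(self, arch: Arch) -> Callable[[float], float]:
        if arch not in self._cache:
            # First time this path is seen: pay the compilation cost once.
            self._cache[arch] = self._compile(arch)
            self.compile_count += 1
        return self._cache[arch]


def toy_compile(arch: Arch) -> Callable[[float], float]:
    """Hypothetical placeholder for building and hybridizing the subnet for `arch`."""
    scale = sum(arch) + 1
    return lambda x: scale * x


rng = random.Random(0)
cache = SubnetCache(toy_compile)
for step in range(100):
    arch = sample_arch(num_layers=2, num_choices=2, rng=rng)  # imperative sampling
    subnet = cache.get(arch)                                  # compiled path, reused
    _ = subnet(1.0)                                           # stand-in for a training step

# With 2 layers and 2 choices there are only 2**2 = 4 paths to compile.
print(cache.compile_count)
```

With L layers and C candidates per layer there are C**L possible paths, so a cache like this would only pay off when the search space is small or paths are revisited often. Otherwise each step would compile a new graph, which may cost more than it saves.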

Looking forward to your reply!
