pfork performance tips? #4

@timholy

Hi Amit,

I'm finally getting around to trying the approach in your gist. Many thanks for that example; it makes the approach very easy to use.

Currently, for small-ish datasets (all I have room for on my laptop), using pfork with 4 workers is slower than just doing it the regular (single-threaded) way. I tried profiling, but unfortunately the profiler doesn't seem to work with pfork. Given the cautions about I/O, I'm a little unsure of the best approach for figuring out where the time is going: is it the fork itself, or some other aspect?

The overall task takes a single 3D array as input, and each worker should work on a separate chunk of a 2D output (this is a 3D image-rendering problem). I'm allocating each output chunk with your anon_map, and just using a plain array (optionally mmapped to a file) as the input. The running time for a small data set in single-threaded mode is 0.2 s, whereas the pfork version with 4 workers is approximately 4 times slower.
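
For concreteness, the setup above can be sketched roughly as follows. This is only an illustration under stated assumptions: `render_chunk!` and `chunk_ranges` are hypothetical names, the maximum-intensity projection is a stand-in for the real rendering, the output is one ordinary array rather than per-chunk anon_map allocations, and no pfork calls are shown. It only shows how the output is split into chunks and how the single-process baseline can be timed.

```julia
# Hypothetical stand-in for the rendering kernel: a maximum-intensity
# projection of the 3D input along its third dimension, restricted to
# the output columns in `cols`.
function render_chunk!(out::AbstractMatrix, vol::AbstractArray{<:Any,3}, cols::UnitRange{Int})
    for j in cols, i in axes(vol, 1)
        out[i, j] = maximum(view(vol, i, j, :))
    end
    return out
end

# Split 1:n into `nchunks` contiguous ranges of nearly equal length.
function chunk_ranges(n::Int, nchunks::Int)
    edges = [round(Int, k * n / nchunks) for k in 0:nchunks]
    return [(edges[k] + 1):edges[k + 1] for k in 1:nchunks]
end

vol = rand(Float32, 64, 64, 32)          # small 3D input (assumed size)
out = zeros(Float32, 64, 64)             # 2D output
chunks = chunk_ranges(size(vol, 2), 4)   # one chunk per intended worker

render_chunk!(out, vol, chunks[1])       # warm-up call so compilation is not timed
t_serial = @elapsed for cols in chunks
    render_chunk!(out, vol, cols)
end
println("serial time for all chunks: ", t_serial, " s")
```

Timing the same loop body per chunk inside each worker, and comparing the sum against the wall-clock time of the whole parallel call, would be one way to separate compute time from fork and setup overhead.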
