Commit 2a49f3e

Merge pull request #62 from JuliaFolds2/doc_index_chunks
document index_chunks in example
2 parents 23ed671 + 1f698b5 commit 2a49f3e

2 files changed

Lines changed: 47 additions & 0 deletions

CHANGELOG.md

Lines changed: 4 additions & 0 deletions
@@ -1,6 +1,10 @@
 ChunkSplitters.jl Changelog
 ===========================
 
+Version 3.1.3-DEV
+-------------
+- ![INFO][badge-info] Improve documentation of multithreading by adding `index_chunks` examples.
+
 Version 3.1.2
 -------------
 - ![ENHANCEMENT][badge-enhancement] Return a single chunk if `minsize > length(collection)`.

docs/src/multithreading.md

Lines changed: 43 additions & 0 deletions
@@ -32,6 +32,27 @@ julia> @btime parallel_sum(x -> log(x)^7, $x; n=Threads.nthreads());
   321.083 μs (44 allocations: 3.42 KiB)
 ```
 
+Equivalently, we can use `index_chunks` to iterate over the *index ranges* (instead of the elements) and manually create views:
+
+```julia-repl
+julia> using ChunkSplitters: index_chunks
+
+julia> using Base.Threads: nthreads, @spawn
+
+julia> function parallel_sum(f, x; n=nthreads())
+           tasks = map(index_chunks(x; n=n)) do inds
+               @spawn sum(f, @view(x[inds]))
+           end
+           return sum(fetch, tasks)
+       end
+parallel_sum (generic function with 1 method)
+
+julia> x = rand(10^5);
+
+julia> parallel_sum(identity, x) ≈ sum(identity, x) # true
+true
+```
+
 Note that by chunking `x` we can readily control how many tasks we will use for the parallelisation. One reason why this is useful is that we can reduce the (large) overhead that we would have to pay if we would simply spawn `length(x)` tasks:
 
 ```julia-repl
@@ -82,6 +103,28 @@ julia> @btime parallel_sum(x -> log(x)^7, $x; n=Threads.nthreads());
   319.000 μs (35 allocations: 3.42 KiB)
 ```
 
+Alternatively, we can use `index_chunks` instead of `enumerate(chunks(...))`. Since `index_chunks` directly provides index ranges, it is a natural fit for this use case:
+
+```julia-repl
+julia> using ChunkSplitters: index_chunks
+
+julia> using Base.Threads: nthreads, @threads
+
+julia> function parallel_sum(f, x; n=nthreads())
+           psums = Vector{eltype(x)}(undef, n)
+           @threads for (i, inds) in enumerate(index_chunks(x; n=n))
+               psums[i] = sum(f, @view(x[inds]))
+           end
+           return sum(psums)
+       end
+parallel_sum (generic function with 1 method)
+
+julia> x = rand(10^5);
+
+julia> parallel_sum(identity, x) ≈ sum(identity, x) # true
+true
+```
+
 However, the fact that this works is that we actively support it. In general, `@threads` isn't compatible with `enumerate`:
 
 ```julia-repl

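Note (not part of this commit): a minimal sketch that puts the two documented patterns side by side and checks that they agree. The names `parallel_sum_spawn` and `parallel_sum_threads` are hypothetical, chosen only to keep both variants in one script. Sizing `psums` by `length(index_chunks(...))` rather than by `n` is an assumption made here, so that no slot stays uninitialised should the number of chunks ever differ from `n`.

```julia
using ChunkSplitters: index_chunks
using Base.Threads: nthreads, @spawn, @threads

# Task-based variant: one spawned task per index range.
function parallel_sum_spawn(f, x; n=nthreads())
    tasks = map(index_chunks(x; n=n)) do inds
        @spawn sum(f, @view(x[inds]))
    end
    return sum(fetch, tasks)
end

# @threads variant: each iteration writes its partial sum to its own slot.
function parallel_sum_threads(f, x; n=nthreads())
    ranges = index_chunks(x; n=n)
    # Assumption: size the buffer by the actual number of chunks, not by n.
    psums = Vector{eltype(x)}(undef, length(ranges))
    @threads for (i, inds) in enumerate(ranges)
        psums[i] = sum(f, @view(x[inds]))
    end
    return sum(psums)
end

x = rand(10^5)
@assert parallel_sum_spawn(identity, x) ≈ sum(x)
@assert parallel_sum_threads(identity, x) ≈ sum(x)
# The index ranges partition the indices of x, so their lengths add up to length(x).
@assert sum(length, index_chunks(x; n=4)) == length(x)
```

Both variants take views over the same index ranges, so they should differ only in how the work is scheduled and how the partial sums are collected.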