
Commit a5c78c5

Sebastien Loisel and claude committed

Fix docs: CUDA requires NCCL+CUDSS_jll; MPI.Init() after using statements

- CUDA extension requires: using CUDA, NCCL, CUDSS_jll
- MPI.Init() should be called after all using statements, not before
- Fixed all occurrences across guide.md, examples.md, index.md, installation.md

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1 parent 1533ddc commit a5c78c5

4 files changed: 27 additions & 47 deletions
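Taken together, the new-side lines of this diff prescribe a single pattern. A minimal, self-contained sketch of it follows; the `VectorMPI` constructor and the `io0()` helper are lifted from the examples in this diff, and it is assumed here that `io0()` returns an IO stream that prints only on rank 0:

```julia
# Corrected ordering per this commit: all `using` statements first,
# MPI.Init() last, before any LinearAlgebraMPI calls.
using MPI
using LinearAlgebraMPI
using SparseArrays
using LinearAlgebra
MPI.Init()

v = VectorMPI(randn(100))            # data is distributed across ranks automatically
println(io0(), "initialization OK")  # io0(): assumed to print once, from rank 0
```

Run under MPI as shown in installation.md below, e.g. `mpiexec -n 2 julia --project script.jl`.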


docs/src/examples.md

Lines changed: 17 additions & 34 deletions
@@ -8,9 +8,8 @@ This page provides detailed examples of using LinearAlgebraMPI.jl for various di
 
 ```julia
 using MPI
-MPI.Init()
-
 using LinearAlgebraMPI
+MPI.Init()
 using SparseArrays
 using LinearAlgebra
 
@@ -45,9 +44,8 @@ println(io0(), "Multiplication error: $err")
 
 ```julia
 using MPI
-MPI.Init()
-
 using LinearAlgebraMPI
+MPI.Init()
 using SparseArrays
 using LinearAlgebra
 
@@ -79,9 +77,8 @@ println(io0(), "Result size: $(size(Cdist))")
 
 ```julia
 using MPI
-MPI.Init()
-
 using LinearAlgebraMPI
+MPI.Init()
 using SparseArrays
 using LinearAlgebra
 
@@ -121,9 +118,8 @@ println(io0(), "Complex matrix operations completed")
 
 ```julia
 using MPI
-MPI.Init()
-
 using LinearAlgebraMPI
+MPI.Init()
 using SparseArrays
 using LinearAlgebra
 
@@ -167,9 +163,8 @@ The `transpose` function creates a lazy wrapper without transposing the data. Th
 
 ```julia
 using MPI
-MPI.Init()
-
 using LinearAlgebraMPI
+MPI.Init()
 using SparseArrays
 using LinearAlgebra
 
@@ -199,9 +194,8 @@ println(io0(), "Lazy transpose multiplication completed")
 
 ```julia
 using MPI
-MPI.Init()
-
 using LinearAlgebraMPI
+MPI.Init()
 using SparseArrays
 using LinearAlgebra
 
@@ -238,9 +232,8 @@ println(io0(), "transpose(A) * B error: $err")
 
 ```julia
 using MPI
-MPI.Init()
-
 using LinearAlgebraMPI
+MPI.Init()
 using SparseArrays
 using LinearAlgebra
 
@@ -274,9 +267,8 @@ println(io0(), "Scalar multiplication errors: $err1, $err2")
 
 ```julia
 using MPI
-MPI.Init()
-
 using LinearAlgebraMPI
+MPI.Init()
 using SparseArrays
 using LinearAlgebra
 
@@ -313,9 +305,8 @@ Here's an example of using LinearAlgebraMPI.jl for power iteration to find the d
 
 ```julia
 using MPI
-MPI.Init()
-
 using LinearAlgebraMPI
+MPI.Init()
 using SparseArrays
 using LinearAlgebra
 
@@ -351,9 +342,8 @@ LinearAlgebraMPI provides distributed sparse direct solvers using the multifront
 
 ```julia
 using MPI
-MPI.Init()
-
 using LinearAlgebraMPI
+MPI.Init()
 using SparseArrays
 using LinearAlgebra
 
@@ -391,9 +381,8 @@ println(io0(), "LDLT solve residual: $residual")
 
 ```julia
 using MPI
-MPI.Init()
-
 using LinearAlgebraMPI
+MPI.Init()
 using SparseArrays
 using LinearAlgebra
 
@@ -426,9 +415,8 @@ LDLT uses Bunch-Kaufman pivoting to handle symmetric indefinite matrices:
 
 ```julia
 using MPI
-MPI.Init()
-
 using LinearAlgebraMPI
+MPI.Init()
 using SparseArrays
 using LinearAlgebra
 
@@ -459,9 +447,8 @@ For sequences of matrices with the same sparsity pattern, the symbolic factoriza
 
 ```julia
 using MPI
-MPI.Init()
-
 using LinearAlgebraMPI
+MPI.Init()
 using SparseArrays
 using LinearAlgebra
 
@@ -501,9 +488,8 @@ println(io0(), "F2 residual: ", norm(A2 * x2_full - ones(n), Inf))
 
 ```julia
 using MPI
-MPI.Init()
-
 using LinearAlgebraMPI
+MPI.Init()
 using SparseArrays
 
 n = 100
@@ -543,9 +529,8 @@ Row-wise operations are local - no MPI communication is needed since rows are al
 
 ```julia
 using MPI
-MPI.Init()
-
 using LinearAlgebraMPI
+MPI.Init()
 using LinearAlgebra
 
 # Create a deterministic dense matrix (same on all ranks)
@@ -569,9 +554,8 @@ Column-wise operations require MPI communication to gather each full column:
 
 ```julia
 using MPI
-MPI.Init()
-
 using LinearAlgebraMPI
+MPI.Init()
 using LinearAlgebra
 
 # Create a deterministic dense matrix
@@ -594,9 +578,8 @@ The standard Julia pattern `vcat(f.(eachrow(A))...)` doesn't work with distribut
 
 ```julia
 using MPI
-MPI.Init()
-
 using LinearAlgebraMPI
+MPI.Init()
 using LinearAlgebra
 
 # Standard Julia pattern (for comparison):

docs/src/guide.md

Lines changed: 7 additions & 7 deletions
@@ -32,10 +32,9 @@ In Julia, `SparseMatrixCSR{T,Ti}` is a type alias for `Transpose{T, SparseMatrix
 
 ```julia
 using MPI
-MPI.Init()
-
 using LinearAlgebraMPI
 using SparseArrays
+MPI.Init()
 
 # Create from native types (data is distributed automatically)
 v = VectorMPI(randn(100))
@@ -371,14 +370,14 @@ Load the GPU package **before** MPI for proper detection:
 # For Metal (macOS)
 using Metal
 using MPI
-MPI.Init()
 using LinearAlgebraMPI
+MPI.Init()
 
 # For CUDA (Linux/Windows)
-using CUDA
+using CUDA, NCCL, CUDSS_jll
 using MPI
-MPI.Init()
 using LinearAlgebraMPI
+MPI.Init()
 ```
 
 ### Converting Between CPU and GPU
@@ -460,9 +459,10 @@ For multi-GPU distributed sparse direct solves, LinearAlgebraMPI provides `CuDSS
 ### Basic Usage
 
 ```julia
-using CUDA, MPI
-MPI.Init()
+using CUDA, NCCL, CUDSS_jll
+using MPI
 using LinearAlgebraMPI
+MPI.Init()
 
 # Each MPI rank should use a different GPU
 CUDA.device!(MPI.Comm_rank(MPI.COMM_WORLD) % length(CUDA.devices()))
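Combining the two CUDA hunks in this file, the full multi-GPU startup sequence reads roughly as the sketch below, assembled from the new-side lines above (the device-selection line is verbatim from the last hunk):

```julia
# CUDA path per this commit: the extension activates only when CUDA, NCCL,
# and CUDSS_jll are all loaded; MPI.Init() still comes after every `using`.
using CUDA, NCCL, CUDSS_jll
using MPI
using LinearAlgebraMPI
MPI.Init()

# Each MPI rank should use a different GPU (verbatim from the hunk above)
CUDA.device!(MPI.Comm_rank(MPI.COMM_WORLD) % length(CUDA.devices()))
```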

docs/src/index.md

Lines changed: 1 addition & 2 deletions
@@ -28,9 +28,8 @@ LinearAlgebraMPI.jl provides distributed matrix and vector types for parallel co
 
 ```julia
 using MPI
-MPI.Init()
-
 using LinearAlgebraMPI
+MPI.Init()
 using SparseArrays
 
 # Create distributed sparse matrix

docs/src/installation.md

Lines changed: 2 additions & 4 deletions
@@ -46,9 +46,8 @@ mpiexec -n 2 julia --project test/runtests.jl
 ```julia
 # CORRECT
 using MPI
-MPI.Init()
-
 using LinearAlgebraMPI
+MPI.Init()
 # Now you can use the package
 ```
 
@@ -58,9 +57,8 @@ Create a script file (e.g., `my_program.jl`):
 
 ```julia
 using MPI
-MPI.Init()
-
 using LinearAlgebraMPI
+MPI.Init()
 using SparseArrays
 
 # Create distributed matrix
