
Commit 3c95930

Merge remote-tracking branch 'upstream/develop' into develop
2 parents 0e0b8e5 + abbfece commit 3c95930

19 files changed

Lines changed: 309 additions & 17 deletions

CMakeLists.txt

Lines changed: 3 additions & 3 deletions
@@ -38,7 +38,7 @@ endif()
 if( POLICY CMP0048 )
 cmake_policy( SET CMP0048 NEW )

-project( SuperBuild.clSPARSE VERSION 0.7.2.0 )
+project( SuperBuild.clSPARSE VERSION 0.8.0.0 )
 else( )
 project( SuperBuild.clSPARSE )

@@ -48,11 +48,11 @@ else( )
 endif( )

 if( NOT DEFINED SuperBuild.clSPARSE_VERSION_MINOR )
-set( SuperBuild.clSPARSE_VERSION_MINOR 7 )
+set( SuperBuild.clSPARSE_VERSION_MINOR 8 )
 endif( )

 if( NOT DEFINED SuperBuild.clSPARSE_VERSION_PATCH )
-set( SuperBuild.clSPARSE_VERSION_PATCH 2 )
+set( SuperBuild.clSPARSE_VERSION_PATCH 0 )
 endif( )

 if( NOT DEFINED SuperBuild.clSPARSE_VERSION_TWEAK )

README.md

Lines changed: 13 additions & 9 deletions
@@ -4,17 +4,25 @@ Pre-built binaries are available on our [releases page](https://github.com/clMat
 | Build branch | master | develop |
 |-----|-----|-----|
 | GCC/Clang x64 | [![Build Status](https://travis-ci.org/clMathLibraries/clSPARSE.svg?branch=master)](https://travis-ci.org/clMathLibraries/clSPARSE/branches) | [![Build Status](https://travis-ci.org/clMathLibraries/clSPARSE.svg?branch=develop)](https://travis-ci.org/clMathLibraries/clSPARSE/branches) |
-| Visual Studio x64 | |[![Build status](https://ci.appveyor.com/api/projects/status/93518qe0efy6n7fy/branch/develop?svg=true)](https://ci.appveyor.com/project/kknox/clsparse-otonj/branch/develop) |
+| Visual Studio x64 |[![Build status](https://ci.appveyor.com/api/projects/status/93518qe0efy6n7fy/branch/master?svg=true)](https://ci.appveyor.com/project/kknox/clsparse-otonj/branch/master) |[![Build status](https://ci.appveyor.com/api/projects/status/93518qe0efy6n7fy/branch/develop?svg=true)](https://ci.appveyor.com/project/kknox/clsparse-otonj/branch/develop) |

 # clSPARSE
-an OpenCL© library implementing Sparse linear algebra. This project is a result of
+an OpenCL™ library implementing Sparse linear algebra routines. This project is a result of
 a collaboration between [AMD Inc.](http://www.amd.com/) and
 [Vratis Ltd.](http://www.vratis.com/).

-## Introduction to clSPARSE
-At this time, clSPARSE provides these fundamental sparse operations for OpenCL:
+### What's new in clSPARSE **v0.8**
+- New single precision SpM-SpM (SpGEMM) function
+- Optimizations to the sparse matrix conversion routines
+- [API documentation](http://clmathlibraries.github.io/clSPARSE/) available
+- SpM-dV routines now provide [higher precision accuracy] (https://github.com/clMathLibraries/clSPARSE/wiki/Precision)
+- Various bug fixes integrated
+
+
+## clSPARSE features
 - Sparse Matrix - dense Vector multiply (SpM-dV)
 - Sparse Matrix - dense Matrix multiply (SpM-dM)
+- Sparse Matrix - Sparse Matrix multiply Sparse Matrix Multiply(SpGEMM) - Single Precision
 - Iterative conjugate gradient solver (CG)
 - Iterative biconjugate gradient stabilized solver (BiCGStab)
 - Dense to CSR conversions (& converse)
@@ -28,10 +36,6 @@ compared to the older clMath libraries. OpenCL state is not explicitly passed
 through the API, which enables the library to be forward compatible when users are
 ready to switch from OpenCL 1.2 to OpenCL 2.0 <sup>[1](#opencl-2)</sup>

-The API’s are designed such that users are in control of where input and output
-buffers live, and they maintain control of when data transfers to/from device
-memory happen, so that there are no performance surprises.
-
 ### Google Groups
 Two mailing lists have been created for the clMath projects:

@@ -58,7 +62,7 @@ script for clSPARSE also builds the samples as an external project, to demonstra
 how an application would find and link to clSPARSE with cmake.

 ### clSPARSE library documentation
-**API documentation** is not yet available, but the samples above give an excellent
+**API documentation** is now available http://clmathlibraries.github.io/clSPARSE/ . The included samples will give an excellent
 starting point to basic library operations.

 ### Contributing code

docs/Doxyfile

Lines changed: 1 addition & 1 deletion
@@ -38,7 +38,7 @@ PROJECT_NAME = clSPARSE
 # could be handy for archiving the generated documentation or if some version
 # control system is used.

-PROJECT_NUMBER = v0.6.2.0
+PROJECT_NUMBER = v0.8.0.0

 # Using the PROJECT_BRIEF tag one can provide an optional one line description
 # for a project that appears at the top of each page and should give viewer a
Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+Coo2Csr,
+,
+OpenCL runtime,1642.5 (VM)
+OpenCL Device,w9100
+,
+Matrix Name,Gi-Elements/s
+cant,0.825084
+consph,0.934099
+cop20k_A,0.731746
+mac_econ_fwd500,0.502334
+mc2depi,0.586327
+pdb1HYS,0.849179
+pwtk,1.14935
+rail4284,1.08582
+rma10,0.696027
+scircuit,0.441488
+shipsec1,2.08336
+webbase_1M,0.773179
Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+Csr2Coo,
+,
+OpenCL runtime,1642.5 (VM)
+OpenCL Device,w9100
+,
+Matrix Name,Gi-Elements/s
+cant,4.22181
+consph,4.92386
+cop20k_A,3.71303
+mac_econ_fwd500,2.16156
+mc2depi,3.24346
+pdb1HYS,4.52208
+pwtk,5.42111
+rail4284,5.30094
+rma10,3.55844
+scircuit,1.93272
+shipsec1,4.31598
+webbase_1M,2.96583
Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+Csr2Dense,
+,
+OpenCL runtime,1642.5 (VM)
+OpenCL Device,w9100
+,
+Matrix Name,Gi-Elements/s
+Dubcova1,0.025644
+hydr1c_A_11,0.0169621
+hydr1c_A_72,0.0170693
+hydr1c_A_76,0.0179919
+Maragal_6,0.0623863
+Na5,0.165182
+psse1,0.00876455
+Reuters911,0.0398191
+Si10H16,0.0777071
+tomography,0.101941
Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+Dense2Csr,
+,
+OpenCL runtime,1642.5 (VM)
+OpenCL Device,w9100
+,
+Matrix Name,Gi-Elements/s
+Dubcova1,3.97644
+hydr1c_A_11,2.69697
+hydr1c_A_72,2.70712
+hydr1c_A_76,2.68411
+Maragal_6,3.93096
+Na5,2.70179
+psse1,3.80035
+Reuters911,3.81953
+Si10H16,4.00379
+tomography,0.0410165
Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
+# Benchmarking
+## Hardware
+w9100
+
+## Environment
+Ubuntu 14.04
+
+clSPARSE v0.8.0.0
+
+[Catalyst FirePro](http://support.amd.com/en-us/download/workstation?os=Linux%20x86_64#catalyst-pro) 14.502.1040
+
+## Tool
+[clsparse-bench](clSPARSE\src\benchmarks\clsparse-bench)
+
+## Methodology
+For each data point, we took 20 samples. Each sample consists of 20 calls
+with a wait afterward. We benchmark with respect to the API, utilizing host timers
+(not pure kernel time with ).
+Outlying samples beyond 1 standard deviation were removed.
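
Note on the outlier rule above: the README states only that samples beyond 1 standard deviation were removed. A minimal sketch of such a filter follows, assuming the deviation is measured from the arithmetic mean and that the population variance (divide by n) is used; neither detail is given in the commit. The names `mean_of`, `variance_of` and `filter_outliers` are hypothetical and are not part of clsparse-bench.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n timing samples. */
static double mean_of(const double *x, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        sum += x[i];
    return sum / (double)n;
}

/* Population variance (divide by n). Assumption: the README does not say
 * whether the population or the sample estimator was used. */
static double variance_of(const double *x, size_t n, double mean)
{
    double acc = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double d = x[i] - mean;
        acc += d * d;
    }
    return acc / (double)n;
}

/* Copies every sample within one standard deviation of the mean into
 * `kept` (capacity n) and returns how many were kept. Squared deviations
 * are compared against the variance, which is the same test as
 * |x - mean| <= stddev without calling sqrt(). */
size_t filter_outliers(const double *samples, size_t n, double *kept)
{
    assert(samples != NULL && kept != NULL && n > 0);

    double mean = mean_of(samples, n);
    double var = variance_of(samples, n, mean);
    size_t count = 0;

    for (size_t i = 0; i < n; ++i) {
        double d = samples[i] - mean;
        if (d * d <= var)
            kept[count++] = samples[i];
    }
    return count;
}
```

With the 20 samples per data point described above, `filter_outliers(times, 20, kept)` would return the number of samples that survive, and the reported figure could then be averaged over `kept`.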
+
+Conversion routines benchmarked as number of Gi-Elements/s converted
+
+SpM-dV routine calculated as Gi-Bytes/s
+```c
+( sizeof( cl_int )*( csrMtx.num_nonzeros + csrMtx.num_rows ) + sizeof( T ) * ( csrMtx.num_nonzeros + csrMtx.num_cols + csrMtx.num_rows ) ) / time_in_ns( );
+```
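
Note on the SpM-dV formula above: it counts index bytes for `num_nonzeros + num_rows` integers and value bytes for `num_nonzeros + num_cols + num_rows` elements, then divides by elapsed nanoseconds. The sketch below evaluates that expression as a standalone function. The `csr_dims` struct is a hypothetical stand-in that carries only the three fields named in the formula, and `int32_t` stands in for `cl_int` (a 32-bit signed integer) so that no OpenCL header is needed. The result is bytes per nanosecond, which equals 10^9 bytes per second; whether the benchmark applies any further binary (Gi) scaling is not stated in the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the CSR matrix descriptor; only the fields
 * that appear in the README formula are included. */
typedef struct {
    size_t num_rows;
    size_t num_cols;
    size_t num_nonzeros;
} csr_dims;

/* Bytes counted by the README formula for one SpM-dV call:
 *   index bytes: sizeof(cl_int) * (nnz + rows)
 *   value bytes: value_size    * (nnz + cols + rows)
 * value_size plays the role of sizeof(T), e.g. sizeof(float). */
double spmdv_bytes(const csr_dims *m, size_t value_size)
{
    assert(m != NULL);
    double index_bytes =
        (double)sizeof(int32_t) * (double)(m->num_nonzeros + m->num_rows);
    double value_bytes =
        (double)value_size *
        (double)(m->num_nonzeros + m->num_cols + m->num_rows);
    return index_bytes + value_bytes;
}

/* Throughput in bytes per nanosecond for a measured elapsed time. */
double spmdv_bytes_per_ns(const csr_dims *m, size_t value_size,
                          double elapsed_ns)
{
    assert(elapsed_ns > 0.0);
    return spmdv_bytes(m, value_size) / elapsed_ns;
}
```

For a 4 x 4 single precision matrix with 8 nonzeros, the formula counts 4 * (8 + 4) + 4 * (8 + 4 + 4) = 112 bytes.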
+SpGEMM routine calculated as Mega-Flops/s
+```c
+(2 * (upper bound of number of nonzeros of result matrix))/ time_in_ms( ) ;
+```
+
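
Note on the SpGEMM formula above: it counts two floating point operations per entry of the upper bound on the result's nonzeros and divides by elapsed milliseconds. The README labels the result Mega-Flops/s, while the SpGemm results file in this commit uses a G-Flop/s column, so the sketch below makes the unit conversion explicit instead of relying on either label. The function names are hypothetical, the upper bound is taken as an input because the commit does not show how it is computed, and the scaling to 10^9 flop/s is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Floating point operations per second, using the README's operation
 * count of 2 * (upper bound on nonzeros of the result matrix).
 * elapsed_ms is converted to seconds by multiplying the count by 1000. */
double spgemm_flops_per_s(size_t nnz_upper_bound, double elapsed_ms)
{
    assert(elapsed_ms > 0.0);
    return 2.0 * (double)nnz_upper_bound * 1000.0 / elapsed_ms;
}

/* Same rate scaled to 10^9 flop/s. Assumption: decimal giga. */
double spgemm_gflops(size_t nnz_upper_bound, double elapsed_ms)
{
    return spgemm_flops_per_s(nnz_upper_bound, elapsed_ms) / 1.0e9;
}
```

An upper bound of 1,000,000 nonzeros processed in 2 ms corresponds to 10^9 flop/s under this counting.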
Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+SpGemm,
+,
+OpenCL runtime,1642.5 (VM)
+OpenCL Device,w9100
+,
+Matrix Name,G-Flop/s
+2cubes_sphere,0.928289
+cage12,1.06262
+cant,3.41712
+consph,3.47913
+cop20k_A,1.79591
+filter3D,1.73384
+hood,4.59367
+m133-b3,0.499885
+mac_econ_fwd500,0.36654
+majorbasis,1.69818
+mario002,0.647947
+mc2depi,0.564147
+offshore,1.15078
+patents_main,0.0800872
+pdb1HYS,3.18338
+poisson3Da,1.02704
+pwtk,4.93038
+rma10,3.59276
+scircuit,0.247432
+shipsec1,1.75098
+webbase_1M,0.278071
Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+SpM-dV,
+,
+OpenCL runtime,1642.5 (VM)
+OpenCL Device,w9100
+,
+Matrix Name,Gi-Bytes/s
+cant,133.421
+consph,187.416
+cop20k_A,92.3342
+mac_econ_fwd500,80.9133
+mc2depi,112.604
+pdb1HYS,144.192
+pwtk,182.1
+rail4284,65.8575
+rma10,109.81
+scircuit,61.9386
+shipsec1,128.57
+webbase_1M,129.624
