Commit 1f708c9

This increases achieved occupancy
This increases achieved occupancy by almost a factor of four. For the "dev_evaluate_gravity" kernel, the occupancy increases from 16.66% to 61.65%.

The grid size was 16, which yielded the following recommendation from profiling: "The grid for this launch is configured to execute only 16 blocks, which is less than the GPU's 64 multiprocessors. This can underutilize some multiprocessors." The grid size is now 256.

The improvement in occupancy was investigated using ncu:

    ncu --kernel-name dev_evaluate_gravity -c 100 --print-summary per-kernel -o dev_evaluate_gravity.prof python scripts/Erwan_1207_imp.py
    ncu -i dev_evaluate_gravity.prof.ncu-rep | less
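The profiler's warning about the grid size can be illustrated with a little arithmetic. The sketch below is not part of the commit: the helper names `sms_with_work` and `max_blocks_per_sm` are hypothetical, and it assumes blocks are spread as evenly as possible over the 64 multiprocessors quoted in the profiler message. It only counts how many multiprocessors can receive a block. The achieved occupancy that ncu reports also depends on register and shared-memory usage per block, which this does not model.

```c
#include <stdio.h>

/* Hypothetical helper: number of multiprocessors that can receive at
   least one block when a grid of `nblocks` blocks is launched on a GPU
   with `num_sms` multiprocessors. A block runs on exactly one
   multiprocessor, so this can never exceed nblocks. */
int sms_with_work(int nblocks, int num_sms)
{
    return nblocks < num_sms ? nblocks : num_sms;
}

/* Hypothetical helper: largest number of blocks assigned to a single
   multiprocessor, assuming an even distribution (ceiling division). */
int max_blocks_per_sm(int nblocks, int num_sms)
{
    return (nblocks + num_sms - 1) / num_sms;
}

/* Print the coverage for one grid size. */
void report(int nblocks, int num_sms)
{
    int busy = sms_with_work(nblocks, num_sms);
    printf("NBLOCKS=%d: %d of %d multiprocessors have work, "
           "at most %d block(s) each\n",
           nblocks, busy, num_sms, max_blocks_per_sm(nblocks, num_sms));
}
```

Calling `report(16, 64)` and `report(256, 64)` shows that the old grid can reach at most 16 of the 64 multiprocessors, while the new grid gives every multiprocessor up to 4 blocks.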
1 parent b726e4f commit 1f708c9

1 file changed

Lines changed: 1 addition & 1 deletion

lib/sapporo_light/sapporo_defs.h

@@ -3,7 +3,7 @@
 
 #define MAXCUDADEVICES 4
 #define NBODIES_MAX 524288
-#define NBLOCKS 16 /* number of block which can be run simultaneously */
+#define NBLOCKS 256 /* number of block which can be run simultaneously */
 
 #ifdef NGB
 #define NTHREADS 256 /* max number of threads which can run per block */
