Commit 8b3f1db

Merge pull request #2 from Ricks-Lab/v1.1.0-CFG_Features
V1.1.0 cfg features
2 parents eff7fa3 + 12ac0da commit 8b3f1db

13 files changed

Lines changed: 23544 additions & 24187 deletions

APPS_REF/REF_RESULTS/ref-result.setiathome_8.00_x86_64-pc-linux-gnu.blc04_2bit_guppi_57976_10365_HIP74315_0035.16799.409.22.45.56.vlar.wu.sah

Lines changed: 0 additions & 570 deletions
This file was deleted.

APPS_REF/REF_RESULTS/ref-result.setiathome_8.00_x86_64-pc-linux-gnu.blc13_2bit_guppi_58405_85972_GJ687_0028.13304.818.22.45.142.vlar.wu.sah

Lines changed: 582 additions & 0 deletions
Large diffs are not rendered by default.

APPS_REF/REF_RESULTS/ref-stderr.setiathome_8.00_x86_64-pc-linux-gnu.blc04_2bit_guppi_57976_10365_HIP74315_0035.16799.409.22.45.56.vlar.wu.txt renamed to APPS_REF/REF_RESULTS/ref-stderr.setiathome_8.00_x86_64-pc-linux-gnu.blc13_2bit_guppi_58405_85972_GJ687_0028.13304.818.22.45.142.vlar.wu.txt

Lines changed: 9 additions & 9 deletions
@@ -1,25 +1,25 @@
 shmget in attach_shmem: Invalid argument
-18:44:07 (55031): Can't set up shared mem: -1. Will run in standalone mode.
+13:10:42 (101437): Can't set up shared mem: -1. Will run in standalone mode.
 setiathome_v8 8.00 Revision: 3290 g++ (GCC) 4.4.7 20120313 (Red Hat 4.4.7-4)
 libboinc: BOINC 7.7.0
 
 Work Unit Info:
 ...............
-WU true angle range is : 0.010899
+WU true angle range is : 0.006654
 Optimal function choices:
 --------------------------------------------------------
 name timing error
 --------------------------------------------------------
 v_BaseLineSmooth (no other)
 v_vGetPowerSpectrumUnrolled2 0.000034 0.00000
-avx_ChirpData_d 0.002838 0.00000
-AK SSE folding 0.000532 0.00000
+avx_ChirpData_d 0.002794 0.00000
+AK SSE folding 0.000539 0.00000
 
-Flopcounter: 23944820171561.273438
+Flopcounter: 26027631425407.781250
 
 Spike count: 0
-Autocorr count: 0
-Pulse count: 6
-Triplet count: 0
+Autocorr count: 1
+Pulse count: 4
+Triplet count: 1
 Gaussian count: 0
-19:36:29 (55031): called boinc_finish(0)
+14:06:17 (101437): called boinc_finish(0)
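
The release notes in this commit say that signal counts and angle range are now written to the summary files. As a rough illustration of how those values could be pulled out of a stderr file like the one above, here is a minimal Python sketch. The function name `parse_stderr_summary`, the regular expressions, and the `SAMPLE` text are assumptions made for this example; they are not taken from the benchMT source.

```python
import re

# Hypothetical patterns, written against the stderr layout shown in the diff.
COUNT_RE = re.compile(
    r"^(Spike|Autocorr|Pulse|Triplet|Gaussian) count:\s*(\d+)\s*$", re.MULTILINE
)
ANGLE_RE = re.compile(r"^WU true angle range is\s*:\s*([0-9.]+)\s*$", re.MULTILINE)


def parse_stderr_summary(text):
    """Return a dict of signal counts plus the angle range (None if absent)."""
    summary = {name: int(value) for name, value in COUNT_RE.findall(text)}
    match = ANGLE_RE.search(text)
    summary["angle_range"] = float(match.group(1)) if match else None
    return summary


# Values copied from the "+" side of the diff above.
SAMPLE = """\
WU true angle range is :  0.006654
Spike count:    0
Autocorr count: 1
Pulse count:    4
Triplet count:  1
Gaussian count: 0
"""

if __name__ == "__main__":
    print(parse_stderr_summary(SAMPLE))
```

Keying the result by signal name makes it straightforward to compare a run against a reference result field by field.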

BenchCFG

Lines changed: 40 additions & 10 deletions
@@ -2,37 +2,67 @@
 ## Blank lines and any part of a line beginning with # are ignored
 #############################################################################
 ##
-## List of applications with desired arguments to replace the
-## default arguments in the bench script.
+## List of applications with desired arguments.
 ##
 ## Format as would be used for executing the application from a
 ## command line, app -arg -arg etc. Multiple instances of the
 ## same app with different (or same) arguments will run that
-## many if the app is in the APPS_[C,G]PU directories.
+## many if the app is in the APPS_[C,G]PU directories, although
+## the --num_repetitions argument is the preferred way of running
+## an entry more than once.
 ##
 ## Needs the full application name with extension. Zero to many
-## arguments are possible.
+## arguments are possible. The -device N option of a GPU app will
+## be ingored, as this command line option is used to manage slot
+## assignment. Specifing physical GPUs can be accomplished with the
+## --max_gpus X and --gpu_devices 0,1 options. The value X must be
+## equal to the number of devices specified.
 ##
 ##
 ##############################################################################
-## Set benchMT command line options - to be implemented
+## Set benchMT command line options
 ##############################################################################
-## TODO
-## In a future the release, the capability of specifying benchMT
-## command line options from BenchCFG file will be added.
-#mode arg argv
+##
+## Command line options can be specified as modes in the BenchCFG file or an
+## alternate CFG file specified on the command line. Options specified on
+## the command line will override those specified with mode in a CFG file.
+##
+##
+#Don't ask confirmation before running jobs
+#mode yes False
+#Specify name of this run
+#mode run_name test
+#Specify path for BOINC
+#mode boinc_home /home/boinc/BOINC/
+#Do not suspend BOINC
+#mode noBS False
+#Display compact run status
+#mode display_compact False
+#Display run status by slots instead of jobs
+#mode display_slots False
+#Specify number of times to run benchmark
+#mode num_repetitions 2
+#Specify max number of threads to load
+#mode max_threads 2
+#Specify max number of GPUs to load
+#mode max_gpus 2
+#Specify GPU devices to use
+#mode gpu_devices 0,1
+#Use standard signal WUs instead of Test WUs
+#mode std_signals True
 ##
 ##
 ##############################################################################
 ## Entries to define benchmark run
 ##############################################################################
+##
 MBv8_8.22r3584_sse2_clAMD_HD5_x86_64-pc-linux-gnu -v 1 -instances_per_device 1 -sbs 2048 -period_iterations_num 1 -tt 500 -spike_fft_thresh 8192 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 -hp -high_perf -no_defaults_scaling
 #MBv8_8.22r3584_sse2_clAMD_HD5_x86_64-pc-linux-gnu -v 1
 
 #MBv8_8.04r3306_sse2_linux64 --nographics
 #MBv8_8.04r3306_sse41_linux64 --nographics
 ##MBv8_8.04r3306_ssse3_linux64 --nographics
-#MBv8_8.22r3711_sse41_x86_64-pc-linux-gnu --nographics
+MBv8_8.22r3711_sse41_x86_64-pc-linux-gnu --nographics
 #MBv8_8.22r3712_avx2_x86_64-pc-linux-gnu --nographics
 #MBv8_8.04r3306_sse42_linux64 --nographics
 #MBv8_8.05r3345_avx_linux64 --nographics
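
The BenchCFG comments above describe three rules: `mode` lines set options, options given on the command line override them, and any `-device N` argument on an app entry is ignored. The Python sketch below shows one way those rules might fit together. It is illustrative only: `parse_bench_cfg`, `strip_device_arg`, `merge_options`, `gpu_options_consistent`, and the sample app names are hypothetical, option values are kept as plain strings, and none of this is taken from the benchMT source.

```python
def strip_device_arg(tokens):
    """Drop every '-device N' pair from an app entry (assumed behaviour)."""
    kept, skip_next = [], False
    for token in tokens:
        if skip_next:            # this token is the N that followed -device
            skip_next = False
            continue
        if token == "-device":
            skip_next = True
            continue
        kept.append(token)
    return kept


def parse_bench_cfg(text):
    """Split CFG text into ({mode_name: value}, [[app, arg, ...], ...])."""
    modes, entries = {}, []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # anything after '#' is ignored
        if not line:
            continue
        tokens = line.split()
        if tokens[0] == "mode":
            if len(tokens) >= 3:              # expects: mode <name> <value>
                modes[tokens[1]] = tokens[2]
        else:
            entries.append(strip_device_arg(tokens))
    return modes, entries


def merge_options(cfg_modes, cli_options):
    """Command line values win; None means 'not given on the command line'."""
    merged = dict(cfg_modes)
    merged.update({k: v for k, v in cli_options.items() if v is not None})
    return merged


def gpu_options_consistent(options):
    """Check that max_gpus equals the number of listed gpu_devices."""
    if "max_gpus" not in options or "gpu_devices" not in options:
        return True
    return int(options["max_gpus"]) == len(options["gpu_devices"].split(","))


# Made-up CFG content; the app names are placeholders.
SAMPLE_CFG = """\
## comment line
mode num_repetitions 2
mode max_gpus 2
mode gpu_devices 0,1
#mode run_name test
some_gpu_app -v 1 -device 0 -sbs 2048
some_cpu_app --nographics
"""

if __name__ == "__main__":
    cfg_modes, app_entries = parse_bench_cfg(SAMPLE_CFG)
    print(merge_options(cfg_modes, {"num_repetitions": "5", "run_name": None}))
    print(app_entries)
```

Treating `None` as "not given" mirrors how an argument parser typically reports an unset option, so the merge step only overrides modes that the user actually supplied on the command line.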

README.md

Lines changed: 17 additions & 10 deletions
@@ -5,17 +5,18 @@
 less than total number of CPU threads can be specified in the command line. This tool will read a
 list of MB apps/args from the benchCFG file and search for the specified MB apps in the APP_CPU
 and APP_GPU directories to validate and determine platform. It will then leverage allocated
-threads, as specified, to run all benchmark jobs storing results in the testData directory. Use
+threads, as specified, to run all benchmark jobs, storing results in the testData directory. Use
 the *--help* option to get a description of valid command line arguments.
 
 By default, a summary list of all jobs will update in the display as the program progresses. If
-there are a large number of jobs, then this display may not be useful and the --display_slots
+there are a large number of jobs, then this display may not be useful and the *--display_slots*
 option can be used to display the status of each slot as the program progresses. In some cases,
-there will be too many slots to display, and the --display_compact option can used to further
+there will be too many slots to display, and the *--display_compact* option can used to further
 optimize the progress display.
 
 You may need to use the *--boinc_home* command option to specify the boinc home directory, which
-is required, since boinccmd is used.
+is required, since boinccmd is used. An alternative BenchCFG file can be specified with the
+command line option *--cfg_file filename*.
 
 All WUs in the WU_test directory will be used in the creation of jobs to be run, unless the
 *--ref_signals* option is used, in which case, WUs in the WU_std_signal will be used. The
@@ -30,13 +31,19 @@
 files for each job run. A run name can be specified with the *--run_name* commane line option. This
 name will be included in the name of the testData subdirectory for the current run.
 
-
-## Development plans
-* GPU multi-threaded implementation. Currently total_gpu_threads = total_gpu_count, a future development opportunity is to implement a max number of threads per GPU
+## New in this Release - V1.1.0
+* Command line options can now be specified in mode lines of the BenchCFG file. Options given on the command line will override modes specified in the CFG file.
+* An alternative CFG file can now be specified as a command line option.
+* Signal counts and Angle Range are now included in the psv and txt summary files.
+* Remove app *-device N* arg if specified, since -device is automatically added based on slot assignment.
+* Added *--gpu_devices x,y* command line option to specify which GPU devices the user would like to include in the benchmark run.
+* Added a lock_file in the working directory to prevent a second occurrence of benchMT from using the same directory.
+* Updated the 15 reference WUs in the *WU_test/safe* directory.
+* Changed *--ref_signals* option to *--std_signals* for clarity.
+
+## Development Plans and Known Limitations
+* Currently, running more than one job at a time on a single GPU is not supported.
 * Consider using opencl instead of lshw to get valid GPU compute platforms, but maybe won't work for cuda apps
-* Read benchMT command line options from mode lines in BenchCFG file
-* Remove -device arg if specified, since -device is automatically added based on slot assignment
-* Need to make a lock_file in the working directory to prevent a second occurrence of benchMT from using the same directory
 * Should consider executing job with time command. This should give total and CPU time metrics
 * Need to figure out how to run a job without a shell
 * Deal with an immediate fail to spawn a process when executing a job
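
The release notes mention a lock_file in the working directory that keeps a second benchMT instance from using the same directory. The commit page does not show how that is implemented, so the following is only a sketch of one common approach: creating the file atomically with `os.O_CREAT | os.O_EXCL`, so that a second attempt fails while the file exists. The file name `benchMT.lock` and the helper names are assumptions made for this example.

```python
import os
import tempfile

LOCK_NAME = "benchMT.lock"  # hypothetical name; the real one is not shown here


def acquire_lock(work_dir):
    """Create the lock file atomically; return its path, or None if it exists."""
    path = os.path.join(work_dir, LOCK_NAME)
    try:
        # O_EXCL makes creation fail when the file is already present.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return None
    with os.fdopen(fd, "w") as handle:
        handle.write(str(os.getpid()))  # record the owner for debugging
    return path


def release_lock(path):
    """Remove the lock file so another run can use the directory."""
    os.remove(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as work_dir:
        lock = acquire_lock(work_dir)
        print("first attempt:", "locked" if lock else "busy")
        print("second attempt:", "locked" if acquire_lock(work_dir) else "busy")
        release_lock(lock)
```

One trade-off of this approach is that a crashed run leaves a stale lock file behind, which then has to be removed by hand or detected through the recorded process ID.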

WU_std_signal/README.md

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+# WU_std_signals - Standard Signal Work Units
+
+This directory contains 4 standard signal Work Units for which there are reference results already provided
+in the *APPS_REF/REF_RESULTS* directory. These WUs are for testing functionality and do not represent the
+complexity and variability of real Work Units. Their processing time is approximately 10x faster than a
+real work units and are meant for testing only.
+
+## Known Issues
+* None of these WUs contain a Gaussian type signal.
+