Commit Graph

376 Commits

Author SHA1 Message Date
Tom Deakin
2738e75b04 Print out array sizes 2016-05-03 11:20:39 +01:00
Tom Deakin
fd121c2467 Use device info to select CUDA device 2016-05-03 11:15:38 +01:00
Tom Deakin
3462e61c16 Check device support float 2016-05-03 11:05:21 +01:00
Tom Deakin
d7c17d72d5 Use device index from CLI in OpenCL 2016-05-03 11:02:33 +01:00
Tom Deakin
77b521f5f0 Use float or double from CLI 2016-05-03 10:52:27 +01:00
Tom Deakin
ac55358964 Implement device info functions 2016-05-03 10:51:16 +01:00
Tom Deakin
72ddd05f61 Add parse arguments code 2016-04-29 18:45:57 +01:00
Tom Deakin
2cb4fe74b1 Use original parseUInt function 2016-04-29 18:38:49 +01:00
Tom Deakin
d557915007 Remove static keyword 2016-04-29 18:36:47 +01:00
Tom Deakin
3c394b9db0 Move device functions outside class 2016-04-29 18:28:21 +01:00
Tom Deakin
1a96b71935 First attempt at parse args 2016-04-29 13:59:31 +01:00
Tom Deakin
00305ba120 Write to std err 2016-04-28 23:37:53 +01:00
Tom Deakin
f5ba77f4bd List CUDA devices function 2016-04-28 23:20:10 +01:00
Tom Deakin
d1f8cd1b48 Implement some CUDA routines for device info 2016-04-28 23:06:06 +01:00
Tom Deakin
a1cab96c57 Define the implementation strings in each implementation header 2016-04-28 17:20:40 +01:00
Tom Deakin
7006871cbe Get version from CMake configured header and only build implementations which have the runtime around 2016-04-28 17:10:14 +01:00
Tom Deakin
b9e70e11ab Add CMakeLists.txt file with CUDA and OCL builds 2016-04-28 16:58:32 +01:00
Tom Deakin
088778977b Add OCL copy functions 2016-04-28 15:11:02 +01:00
Tom Deakin
b514969193 Create OCL device buffers 2016-04-28 15:08:12 +01:00
Tom Deakin
77f6df856c Call kernels in OCL 2016-04-28 15:05:01 +01:00
Tom Deakin
eeaf9358ab Create OCL kernel functors 2016-04-28 15:01:43 +01:00
Tom Deakin
38e1e3b704 Add starts of OpenCL implementation 2016-04-28 12:59:14 +01:00
Tom Deakin
a745ffc724 Add more keywords to CUDA header 2016-04-28 12:07:09 +01:00
Tom Deakin
59fe9738b6 Add a templated run function to make double/float switch easy 2016-04-28 12:03:50 +01:00
Tom Deakin
8d88afdedb Tidy up timing printing to reduce code duplication 2016-04-28 11:57:09 +01:00
Tom Deakin
377b348748 Move implementation string to the common header file 2016-04-28 11:15:25 +01:00
Tom Deakin
daa7f643b9 Print out timing results 2016-04-27 13:18:06 +01:00
Tom Deakin
3d5a49317e Free CUDA buffers in destructor 2016-04-27 12:11:19 +01:00
Tom Deakin
c28e70ae70 Add timers and run multiple times 2016-04-27 12:08:49 +01:00
Tom Deakin
40c787d040 Check buffers fit on CUDA device 2016-04-27 11:52:15 +01:00
Tom Deakin
9aa27cd91d Print out average error on check if there is an error 2016-04-27 11:42:23 +01:00
Tom Deakin
6225ae90a7 Add start of check results function 2016-04-27 11:35:12 +01:00
Tom Deakin
6522d9114a Add new line at end of file 2016-04-27 11:35:04 +01:00
Tom Deakin
9730cd071e Overridden functions should have more keywords 2016-04-27 11:34:42 +01:00
pensun
a8ebdc1438 change the warning, stating the rounding error on float does not apply to AMD devices 2016-04-26 14:21:52 -05:00
pensun
9989852401 Remove CLUMP_SIZE options; update warning message regarding round errors on float that does not apply to HIP version 2016-04-26 14:10:32 -05:00
Tom Deakin
9c673317a7 Store array size in class so can use it for kernel launches 2016-04-26 16:09:51 +01:00
Tom Deakin
319e11011c Add triad kernel 2016-04-26 16:07:32 +01:00
Tom Deakin
7a3a546a6e Add mul CUDA kernel 2016-04-26 16:06:17 +01:00
Tom Deakin
dec0237353 Add mul kernel 2016-04-26 16:03:28 +01:00
Tom Deakin
c22b74ba47 Add read_arrays definition for CUDA 2016-04-26 15:30:37 +01:00
Tom Deakin
8e534daf8b Add methods to copy data between host and device 2016-04-26 15:02:41 +01:00
Tom Deakin
ae679a5775 Fix indentation in Stream.h 2016-04-26 14:50:58 +01:00
Tom Deakin
ee4820b5e4 Create CUDA device buffers 2016-04-26 14:50:22 +01:00
Tom Deakin
03b01e190f Add cuda constructor declaration and error checking function 2016-04-26 14:49:04 +01:00
Tom Deakin
6169bdb7b5 Add some global variables 2016-04-26 14:40:49 +01:00
Tom Deakin
0bf68f9909 Make a copy kernel using the private variables 2016-04-26 14:34:25 +01:00
Tom Deakin
1a259d4fc8 Add a copy kernel 2016-04-26 14:24:04 +01:00
Tom Deakin
2234841b16 Initial commit of new design with classes 2016-04-26 14:08:59 +01:00
pensun
066f667e4a Merge branch 'pull-request-HIP' of https://github.com/sunway513/GPU-STREAM into pull-request-HIP 2016-04-03 06:53:34 -05:00