Commit Graph

207 Commits

Author SHA1 Message Date
Tom Deakin
207fd8f784 Default to power of two array size 2016-05-11 12:04:19 +01:00
Tom Deakin
0f8f191d0e Require number of iterations to be at least 2 2016-05-11 11:55:33 +01:00
Tom Deakin
75ef78495c Add print out of number of iterations 2016-05-11 11:53:51 +01:00
Tom Deakin
3227e5dbf0 Print out data type for float or double 2016-05-11 11:52:17 +01:00
Tom Deakin
5c8b07262b Default to 100 iterations to get over any warm up times 2016-05-11 11:49:44 +01:00
James Price
084d7417b9 [SYCL] Remove unneeded cl_device_info line 2016-05-09 15:20:11 +01:00
James Price
6d913bab4b [SYCL] Actually use device_index to select device 2016-05-08 21:35:24 +01:00
James Price
3b3f6dfc26 [SYCL] Implement device list/selection functionality 2016-05-08 19:22:09 +01:00
James Price
58fa72dee0 Merge branch 'refactor' of github.com:UoB-HPC/GPU-STREAM into refactor 2016-05-06 22:42:24 +01:00
James Price
54834e05f4 [SYCL] Use nd_range instead of range to specify work-group size 2016-05-06 22:41:10 +01:00
James Price
fb8f06e683 [SYCL] Pass -no-serial-memop to compute++ to squelch warning 2016-05-06 22:33:18 +01:00
Matthew Martineau
6e9b85bb26 Fixed deep copy ordering, which was reversed 2016-05-06 21:08:23 +01:00
Matthew Martineau
0f0454ec29 Added CUDA device syncs to force proper timing 2016-05-06 21:05:20 +01:00
Matthew Martineau
894829cb05 Adjusted the Kokkos implementation to fix view initialisation, and store local copies of views for lambda scoping 2016-05-06 21:02:44 +01:00
Matthew Martineau
1a60f130eb Fixed memory management for GPU, now working with OpenMP and CUDA 2016-05-06 13:17:04 +01:00
Matthew Martineau
57189e7ca5 Merge branch 'refactor' of https://github.com/UoB-hpc/gpu-stream into refactor 2016-05-06 10:54:18 +01:00
Matthew Martineau
3b266b8266 Fix for namespace collision with #define RAJA 2016-05-06 10:53:12 +01:00
Matthew Martineau
45381da0b2 Initial commit of in progress developments of RAJA and KOKKOS stream 2016-05-06 10:46:35 +01:00
James Price
d4b3b3533c Update SYCL version to work with ComputeCpp. Still needs proper CMake rules and kernel names need to be fixed for multiple template instantiations. 2016-05-06 00:38:30 +01:00
Matthew Martineau
0a738efa54 Merging in changes from trunk 2016-05-05 17:23:47 +01:00
Matthew Martineau
7c28a6386b Added the Kokkos and RAJA implementations 2016-05-05 17:22:29 +01:00
Tom Deakin
f0afa0c1e4 Add reference OpenMP 3.0 version 2016-05-04 10:41:41 +01:00
Tom Deakin
48cae0cbc1 Make sure CUDA nvcc builds with C++11 2016-05-03 15:20:26 +01:00
Tom Deakin
31819b7778 Add bones of OpenACC in CMake config 2016-05-03 15:07:51 +01:00
Tom Deakin
0b0de4e0c3 Implement the OpenACC device string functions, and device selector 2016-05-03 14:50:09 +01:00
James Price
b45f311e0d Add missing SYCL source files 2016-05-03 14:48:35 +01:00
James Price
40a0a6551d Remove extra -std=c++11 from CMake build 2016-05-03 14:46:08 +01:00
James Price
da4f918788 Add initial SYCL implementation 2016-05-03 14:45:13 +01:00
Tom Deakin
1a38b18954 Add OpenACC version 2016-05-03 14:36:08 +01:00
Tom Deakin
530b2adda2 Add License text to all files 2016-05-03 12:32:03 +01:00
Tom Deakin
95a10511ec Update LICENSE date 2016-05-03 12:28:44 +01:00
Tom Deakin
21c9022a3f Keep C++11 flag explicitely defined in Cmake 2016-05-03 12:24:33 +01:00
Tom Deakin
662fcaf4b5 Add CMake things to gitignore 2016-05-03 12:18:41 +01:00
Tom Deakin
57ea4b8cae Require CMake 3.2 so can check for C++11 nicely 2016-05-03 12:17:33 +01:00
Tom Deakin
1bd27428bd Require CUDA 7 for C++11 support 2016-05-03 12:17:21 +01:00
Tom Deakin
8ce15a28aa Update CMake with better binary name and source location 2016-05-03 11:45:25 +01:00
Tom Deakin
a355acf2ee Move source files to top level directory 2016-05-03 11:43:25 +01:00
Tom Deakin
fcc9588c94 Change cl2.hpp include 2016-05-03 11:41:40 +01:00
Tom Deakin
83516ae352 Update cl2.hpp 2016-05-03 11:41:00 +01:00
Tom Deakin
95f9efb7d9 Remove old version 2016-05-03 11:40:46 +01:00
Tom Deakin
e91c31b44a Tidy up delete of object with correct deconstructors and delete 2016-05-03 11:37:35 +01:00
Tom Deakin
26bb912646 Check OCL device has enough memory for buffers 2016-05-03 11:23:36 +01:00
Tom Deakin
2738e75b04 Print out array sizes 2016-05-03 11:20:39 +01:00
Tom Deakin
fd121c2467 Use device info to select CUDA device 2016-05-03 11:15:38 +01:00
Tom Deakin
3462e61c16 Check device support float 2016-05-03 11:05:21 +01:00
Tom Deakin
d7c17d72d5 Use device index from CLI in OpenCL 2016-05-03 11:02:33 +01:00
Tom Deakin
77b521f5f0 Use float or double from CLI 2016-05-03 10:52:27 +01:00
Tom Deakin
ac55358964 Implement device info functions 2016-05-03 10:51:16 +01:00
Tom Deakin
72ddd05f61 Add parse arguments code 2016-04-29 18:45:57 +01:00
Tom Deakin
2cb4fe74b1 Use original parseUInt function 2016-04-29 18:38:49 +01:00