Commit Graph

632 Commits

Author SHA1 Message Date
Tom Deakin
eae8da57ac Delete commented out C++ flag for OpenACC as no longer needed 2016-05-11 15:57:20 +01:00
Tom Deakin
31cb567e21 Switch data from 1.0, 2.0 and 3.0 to 0.1, 0.2, and 0.3 resp.
Using integers for maths gets unstable past 38 iterations even
in double precision. Using the original values/10 is safe up to
the default 100 iterations.
2016-05-11 15:51:19 +01:00
Tom Deakin
55a858e0c0 Use 2^25 as default size because 2^26 gives too many thread blocks for CUDA 2016-05-11 15:43:52 +01:00
Tom Deakin
4954ef7cf0 Add map clauses to OpenMP 4.5 kernels 2016-05-11 15:17:06 +01:00
Tom Deakin
eb10c716f2 First attempt at OpenMP 4.5 2016-05-11 15:08:08 +01:00
Tom Deakin
e095cb67f8 Remove ugly CMake endif text in parentheses 2016-05-11 13:37:12 +01:00
Tom Deakin
bf29b02d35 Add banners in CMakeLists file so easy to spot build rules for versions 2016-05-11 13:35:24 +01:00
Tom Deakin
8a195b6416 Remove printout of compiler id in CMake 2016-05-11 13:35:12 +01:00
Tom Deakin
e0ca56bd67 Set the C++11 flag when using the Cray compiler 2016-05-11 13:33:01 +01:00
Tom Deakin
8d45e61f6c Check for OpenACC support by checking the various compiler flags 2016-05-11 13:20:15 +01:00
Tom Deakin
1a9225ca95 If building CUDA on Darwin with Xcode 7.3.1, skip because CUDA doesn't work with this version 2016-05-11 12:54:12 +01:00
Tom Deakin
81fa9e1922 Require SYCL array size to be multiple of WGSIZE 2016-05-11 12:23:21 +01:00
Tom Deakin
2462023ed9 Set thread block size in CUDA with a #define, and check that array size is multiple of it 2016-05-11 12:21:29 +01:00
Tom Deakin
207fd8f784 Default to power of two array size 2016-05-11 12:04:19 +01:00
Tom Deakin
0f8f191d0e Require number of iterations to be at least 2 2016-05-11 11:55:33 +01:00
Tom Deakin
75ef78495c Add print out of number of iterations 2016-05-11 11:53:51 +01:00
Tom Deakin
3227e5dbf0 Print out data type for float or double 2016-05-11 11:52:17 +01:00
Tom Deakin
5c8b07262b Default to 100 iterations to get over any warm up times 2016-05-11 11:49:44 +01:00
Tom Deakin
b53fb1b3b2 Merge pull request #7 from sunway513/pull-request-HIP
Pull request hip
2016-05-09 17:03:49 +01:00
James Price
084d7417b9 [SYCL] Remove unneeded cl_device_info line 2016-05-09 15:20:11 +01:00
James Price
6d913bab4b [SYCL] Actually use device_index to select device 2016-05-08 21:35:24 +01:00
James Price
3b3f6dfc26 [SYCL] Implement device list/selection functionality 2016-05-08 19:22:09 +01:00
James Price
58fa72dee0 Merge branch 'refactor' of github.com:UoB-HPC/GPU-STREAM into refactor 2016-05-06 22:42:24 +01:00
James Price
54834e05f4 [SYCL] Use nd_range instead of range to specify work-group size 2016-05-06 22:41:10 +01:00
James Price
fb8f06e683 [SYCL] Pass -no-serial-memop to compute++ to squelch warning 2016-05-06 22:33:18 +01:00
Matthew Martineau
6e9b85bb26 Fixed deep copy ordering, which was reversed 2016-05-06 21:08:23 +01:00
Matthew Martineau
0f0454ec29 Added CUDA device syncs to force proper timing 2016-05-06 21:05:20 +01:00
Matthew Martineau
894829cb05 Adjusted the Kokkos implementation to fix view initialisation, and store local copies of views for lambda scoping 2016-05-06 21:02:44 +01:00
Matthew Martineau
1a60f130eb Fixed memory management for GPU, now working with OpenMP and CUDA 2016-05-06 13:17:04 +01:00
Matthew Martineau
57189e7ca5 Merge branch 'refactor' of https://github.com/UoB-hpc/gpu-stream into refactor 2016-05-06 10:54:18 +01:00
Matthew Martineau
3b266b8266 Fix for namespace collision with #define RAJA 2016-05-06 10:53:12 +01:00
Matthew Martineau
45381da0b2 Initial commit of in progress developments of RAJA and KOKKOS stream 2016-05-06 10:46:35 +01:00
James Price
d4b3b3533c Update SYCL version to work with ComputeCpp
Still needs proper CMake rules and kernel names need to be fixed for
multiple template instantiations.
2016-05-06 00:38:30 +01:00
Matthew Martineau
0a738efa54 Merging in changes from trunk 2016-05-05 17:23:47 +01:00
Matthew Martineau
7c28a6386b Added the Kokkos and RAJA implementations 2016-05-05 17:22:29 +01:00
Tom Deakin
f0afa0c1e4 Add reference OpenMP 3.0 version 2016-05-04 10:41:41 +01:00
Tom Deakin
48cae0cbc1 Make sure CUDA nvcc builds with C++11 2016-05-03 15:20:26 +01:00
Tom Deakin
31819b7778 Add bones of OpenACC in CMake config 2016-05-03 15:07:51 +01:00
Tom Deakin
0b0de4e0c3 Implement the OpenACC device string functions, and device selector 2016-05-03 14:50:09 +01:00
James Price
b45f311e0d Add missing SYCL source files 2016-05-03 14:48:35 +01:00
James Price
40a0a6551d Remove extra -std=c++11 from CMake build 2016-05-03 14:46:08 +01:00
James Price
da4f918788 Add initial SYCL implementation 2016-05-03 14:45:13 +01:00
Tom Deakin
1a38b18954 Add OpenACC version 2016-05-03 14:36:08 +01:00
Tom Deakin
530b2adda2 Add License text to all files 2016-05-03 12:32:03 +01:00
Tom Deakin
95a10511ec Update LICENSE date 2016-05-03 12:28:44 +01:00
Tom Deakin
21c9022a3f Keep C++11 flag explicitly defined in CMake 2016-05-03 12:24:33 +01:00
Tom Deakin
662fcaf4b5 Add CMake things to gitignore 2016-05-03 12:18:41 +01:00
Tom Deakin
57ea4b8cae Require CMake 3.2 so can check for C++11 nicely 2016-05-03 12:17:33 +01:00
Tom Deakin
1bd27428bd Require CUDA 7 for C++11 support 2016-05-03 12:17:21 +01:00
Tom Deakin
8ce15a28aa Update CMake with better binary name and source location 2016-05-03 11:45:25 +01:00
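
Note on the array-size commits (55a858e0c0, 2462023ed9, 81fa9e1922): the log says that 2^26 elements gives too many thread blocks for CUDA and that the array size must be a multiple of the block or work-group size, but it does not give the block size or the limit. The sketch below shows the arithmetic under two assumptions that are not taken from the repository: 1024 threads per block, and the 65535-block x-dimension grid limit of older CUDA devices (compute capability 2.x). The function names are hypothetical and the code only illustrates the check. It is not the project's implementation.

```python
# Both constants are assumptions for illustration; the commit log states neither.
BLOCK_SIZE = 1024        # assumed threads per block
MAX_GRID_DIM_X = 65535   # grid x-dimension limit on compute capability 2.x devices


def blocks_for(array_size, block_size=BLOCK_SIZE):
    """Return the number of thread blocks for a one-element-per-thread launch.

    Raises ValueError when the array size is not a multiple of the block
    size, mirroring the kind of check described in commit 2462023ed9.
    """
    if array_size % block_size != 0:
        raise ValueError(
            f"array size {array_size} is not a multiple of {block_size}"
        )
    return array_size // block_size


def fits_in_grid(array_size, block_size=BLOCK_SIZE, max_blocks=MAX_GRID_DIM_X):
    """Return True when the launch needs no more than max_blocks blocks."""
    return blocks_for(array_size, block_size) <= max_blocks


if __name__ == "__main__":
    for exponent in (25, 26):
        size = 2 ** exponent
        print(exponent, blocks_for(size), fits_in_grid(size))
```

Under these assumptions 2^25 elements needs 32768 blocks, which fits, while 2^26 needs 65536, one more than the limit. That would be consistent with the reason given in commit 55a858e0c0.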