Commit Graph

29 Commits

Author SHA1 Message Date
Tom Deakin
f32cf3bad3 Merge branch 'master' into kernel-dot
Conflicts:
	main.cpp
2016-10-24 13:53:58 +01:00
Tom Deakin
5ae613519d Change the value of scalar, and specify in a #define 2016-10-24 13:19:31 +01:00
James Price
1e94870859 Fix verification of dot kernel 2016-10-24 12:47:01 +01:00
Tom Deakin
28c2660b52 Merge branch 'master' into kernel-dot 2016-10-24 12:21:16 +01:00
Tom Deakin
08fe695d51 Fix typo in main file 2016-10-14 15:04:04 +01:00
Tom Deakin
275bfb2066 Check result of the final reduction 2016-10-14 14:45:28 +01:00
Tom Deakin
04ca357159 Call the Dot kernel and print out results 2016-10-14 14:40:28 +01:00
pensun
a1f9d9ece7 Add support of HIP version of GPU-STREAM.
This commit was tested with HIP developer preview branch.
2016-09-05 23:41:01 -05:00
Tom Deakin
d420032c66 Remove warning about iteration count when using floats as new data values work for 100 iterations 2016-05-11 17:15:43 +01:00
Tom Deakin
31cb567e21 Switch data from 1.0, 2.0 and 3.0 to 0.1, 0.2, and 0.3 resp.
Using integers for maths gets unstable past 38 iterations even
in double precision. Using the original values/10 is safe up to
the default 100 iterations.
2016-05-11 15:51:19 +01:00
Tom Deakin
55a858e0c0 Use 2^25 as default size because 2^26 gives too many thread blocks for CUDA 2016-05-11 15:43:52 +01:00
Tom Deakin
eb10c716f2 First attempt at OpenMP 4.5 2016-05-11 15:08:08 +01:00
Tom Deakin
207fd8f784 Default to power of two array size 2016-05-11 12:04:19 +01:00
Tom Deakin
0f8f191d0e Require number of iterations to be at least 2 2016-05-11 11:55:33 +01:00
Tom Deakin
75ef78495c Add print out of number of iterations 2016-05-11 11:53:51 +01:00
Tom Deakin
3227e5dbf0 Print out data type for float or double 2016-05-11 11:52:17 +01:00
Tom Deakin
5c8b07262b Default to 100 iterations to get over any warm up times 2016-05-11 11:49:44 +01:00
Matthew Martineau
894829cb05 Adjusted the Kokkos implementation to fix view initialisation, and store local copies of views for lambda scoping 2016-05-06 21:02:44 +01:00
Matthew Martineau
57189e7ca5 Merge branch 'refactor' of https://github.com/UoB-hpc/gpu-stream into refactor 2016-05-06 10:54:18 +01:00
Matthew Martineau
3b266b8266 Fix for namespace collision with #define RAJA 2016-05-06 10:53:12 +01:00
James Price
d4b3b3533c Update SYCL version to work with ComputeCpp
Still needs proper CMake rules and kernel names need to be fixed for
multiple template instantiations.
2016-05-06 00:38:30 +01:00
Matthew Martineau
0a738efa54 Merging in changes from trunk 2016-05-05 17:23:47 +01:00
Matthew Martineau
7c28a6386b Added the Kokkos and RAJA implementations 2016-05-05 17:22:29 +01:00
Tom Deakin
f0afa0c1e4 Add reference OpenMP 3.0 version 2016-05-04 10:41:41 +01:00
Tom Deakin
0b0de4e0c3 Implement the OpenACC device string functions, and device selector 2016-05-03 14:50:09 +01:00
James Price
da4f918788 Add initial SYCL implementation 2016-05-03 14:45:13 +01:00
Tom Deakin
1a38b18954 Add OpenACC version 2016-05-03 14:36:08 +01:00
Tom Deakin
530b2adda2 Add License text to all files 2016-05-03 12:32:03 +01:00
Tom Deakin
a355acf2ee Move source files to top level directory 2016-05-03 11:43:25 +01:00
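
Note on commit 55a858e0c0: the message says 2^26 elements gives too many thread blocks for CUDA but does not show the arithmetic. The sketch below is one plausible reading, not the project's code. It assumes one thread per array element, 1024 threads per block, and a grid x-dimension limit of 65535 blocks (the limit on older CUDA devices). None of these values is stated in the commit, and the helper name blocks_needed is hypothetical.

```python
def blocks_needed(array_size: int, block_size: int) -> int:
    """Thread blocks required for one thread per element (ceiling division)."""
    return (array_size + block_size - 1) // block_size


# Assumed values, not taken from the commit message.
BLOCK_SIZE = 1024   # threads per block
MAX_GRID_X = 65535  # grid x-dimension limit on older CUDA devices

for exponent in (25, 26):
    blocks = blocks_needed(2 ** exponent, BLOCK_SIZE)
    print(f"2^{exponent}: {blocks} blocks, fits: {blocks <= MAX_GRID_X}")
```

Under these assumptions 2^26 elements needs 65536 blocks, one more than the limit, while 2^25 needs 32768 and fits comfortably.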