Commit Graph

56 Commits

Author SHA1 Message Date
Tom Deakin
87eb4361b4 Version bump 2017-08-02 16:35:40 +01:00
James Price
6a2da4c862 Implement --triad-only switch 2017-08-02 15:43:56 +01:00
Tom Deakin
5ad8341b39 Merge pull request #35 from psteinb/adding_csv_output
Adding csv output
2017-07-31 15:03:00 +01:00
Peter Steinbach
01d4eea7b7 removed obsolete spaces 2017-07-31 14:52:18 +02:00
Peter Steinbach
f9ffa712cf removed duplicate spaces 2017-07-31 14:46:50 +02:00
Peter Steinbach
df6fff1d2e added missing space for consistency 2017-07-31 14:30:08 +02:00
Peter Steinbach
2dbb693761 renamed nreps to be more consistent with the naming scheme 2017-07-31 14:23:39 +02:00
Peter Steinbach
7ed0308cb7 code formatting fixed 2017-07-31 14:14:52 +02:00
Peter Steinbach
2415bdc7c0 fixed if-clause formatting 2017-07-31 14:00:44 +02:00
Peter Steinbach
7911e6a0ae fixed compilation error due to unpropagated typo fix 2017-07-26 17:28:41 +02:00
Peter Steinbach
add9973b67 fixed typo 2017-07-26 17:21:17 +02:00
Peter Steinbach
99fad100c6 added csv-output-sentinels and output 2017-07-26 14:22:24 +02:00
Peter Steinbach
ee8ab08eaf added csv flag 2017-07-26 14:02:32 +02:00
Peter Steinbach
26279688d1 Merge branch 'master' of https://github.com/UoB-HPC/BabelStream into rocm_hc_support 2017-07-25 17:05:31 +02:00
Tom Deakin
dafc63030f Rename to BabelStream 2017-04-08 12:16:29 +01:00
Tom Deakin
9c08fdd184 Minor version bump 2017-04-06 10:38:48 +01:00
Peter Steinbach
62ea5e3ed6 Merge remote-tracking branch 'upstream/master' into bare_hc
Conflicts:
	CMakeLists.txt
2017-02-27 14:35:11 +01:00
Tom Deakin
cc90cefeeb Minor version bump to signal build system update 2017-02-25 14:14:59 +00:00
Peter Steinbach
c9a45600c8 Merge branch 'master' into bare_hc 2017-01-30 16:06:34 +01:00
Tom Deakin
ec2bf50e75 Version bump 2017-01-30 13:52:45 +00:00
Peter Steinbach
7621f86701 added pure HC gpu stream implementation 2017-01-03 11:43:12 +01:00
Tom Deakin
d0dd48406c Move version string to main removing common dependency 2016-12-09 12:36:25 +00:00
Tom Deakin
e6615944f4 Use a compiler switch to select OpenMP directives (target or parallel for) 2016-12-09 12:24:08 +00:00
James Price
1e976ff150 [SYCL] Fix multiple template specializations 2016-11-18 00:14:46 +00:00
Tom Deakin
d42bcd4675 Merge remote-tracking branch 'origin/init-arrays' into devel 2016-11-04 09:17:54 +00:00
James Price
7f4761ae52 Replace write_arrays with init_arrays
This allows each model to initialise their arrays with a parallel
approach, which yields the first touch required for good performance
on NUMA architectures.
2016-11-02 11:22:01 +00:00
Tom Deakin
644ebc40ef Verify reduction result to 8 decimal places 2016-10-24 16:22:35 +01:00
Tom Deakin
f32cf3bad3 Merge branch 'master' into kernel-dot
Conflicts:
	main.cpp
2016-10-24 13:53:58 +01:00
Tom Deakin
5ae613519d Change the value of scalar, and specify in a #define 2016-10-24 13:19:31 +01:00
James Price
1e94870859 Fix verification of dot kernel 2016-10-24 12:47:01 +01:00
Tom Deakin
28c2660b52 Merge branch 'master' into kernel-dot 2016-10-24 12:21:16 +01:00
Tom Deakin
08fe695d51 Fix typo in main file 2016-10-14 15:04:04 +01:00
Tom Deakin
275bfb2066 Check result of the final reduction 2016-10-14 14:45:28 +01:00
Tom Deakin
04ca357159 Call the Dot kernel and print out results 2016-10-14 14:40:28 +01:00
pensun
a1f9d9ece7 Add support of HIP version of GPU-STREAM.
This commit was tested with HIP developer preview branch.
2016-09-05 23:41:01 -05:00
Tom Deakin
d420032c66 Remove warning about iteration count when using floats as new data values work for 100 iterations 2016-05-11 17:15:43 +01:00
Tom Deakin
31cb567e21 Switch data from 1.0, 2.0 and 3.0 to 0.1, 0.2, and 0.3 resp.
Using integers for maths gets unstable past 38 iterations even
in double precision. Using the original values/10 is safe up to
the default 100 iterations.
2016-05-11 15:51:19 +01:00
Tom Deakin
55a858e0c0 Use 2^25 as default size because 2^26 gives too many thread blocks for CUDA 2016-05-11 15:43:52 +01:00
Tom Deakin
eb10c716f2 First attempt at OpenMP 4.5 2016-05-11 15:08:08 +01:00
Tom Deakin
207fd8f784 Default to power of two array size 2016-05-11 12:04:19 +01:00
Tom Deakin
0f8f191d0e Require number of iterations to be at least 2 2016-05-11 11:55:33 +01:00
Tom Deakin
75ef78495c Add print out of number of iterations 2016-05-11 11:53:51 +01:00
Tom Deakin
3227e5dbf0 Print out data type for float or double 2016-05-11 11:52:17 +01:00
Tom Deakin
5c8b07262b Default to 100 iterations to get over any warm up times 2016-05-11 11:49:44 +01:00
Matthew Martineau
894829cb05 Adjusted the Kokkos implementation to fix view initialisation, and store local copies of views for lambda scoping 2016-05-06 21:02:44 +01:00
Matthew Martineau
57189e7ca5 Merge branch 'refactor' of https://github.com/UoB-hpc/gpu-stream into refactor 2016-05-06 10:54:18 +01:00
Matthew Martineau
3b266b8266 Fix for namespace collision with #define RAJA 2016-05-06 10:53:12 +01:00
James Price
d4b3b3533c Update SYCL version to work with ComputeCpp
Still needs proper CMake rules and kernel names need to be fixed for
multiple template instantiations.
2016-05-06 00:38:30 +01:00
Matthew Martineau
0a738efa54 Merging in changes from trunk 2016-05-05 17:23:47 +01:00
Matthew Martineau
7c28a6386b Added the Kokkos and RAJA implementations 2016-05-05 17:22:29 +01:00