James Price
6a2da4c862
Implement --triad-only switch
2017-08-02 15:43:56 +01:00
Tom Deakin
5ad8341b39
Merge pull request #35 from psteinb/adding_csv_output
Adding csv output
2017-07-31 15:03:00 +01:00
Peter Steinbach
01d4eea7b7
removed obsolete spaces
2017-07-31 14:52:18 +02:00
Peter Steinbach
f9ffa712cf
removed duplicate spaces
2017-07-31 14:46:50 +02:00
Peter Steinbach
df6fff1d2e
added missing space for consistency
2017-07-31 14:30:08 +02:00
Peter Steinbach
2dbb693761
renamed nreps to be more consistent with the naming scheme
2017-07-31 14:23:39 +02:00
Peter Steinbach
7ed0308cb7
code formatting fixed
2017-07-31 14:14:52 +02:00
Peter Steinbach
2415bdc7c0
fixed if-clause formatting
2017-07-31 14:00:44 +02:00
Peter Steinbach
7911e6a0ae
fixed compilation error due to unpropagated typo fix
2017-07-26 17:28:41 +02:00
Peter Steinbach
add9973b67
fixed typo
2017-07-26 17:21:17 +02:00
Peter Steinbach
99fad100c6
added csv-output-sentinels and output
2017-07-26 14:22:24 +02:00
Peter Steinbach
ee8ab08eaf
added csv flag
2017-07-26 14:02:32 +02:00
Peter Steinbach
26279688d1
Merge branch 'master' of https://github.com/UoB-HPC/BabelStream into rocm_hc_support
2017-07-25 17:05:31 +02:00
Tom Deakin
dafc63030f
Rename to BabelStream
2017-04-08 12:16:29 +01:00
Tom Deakin
9c08fdd184
Minor version bump
2017-04-06 10:38:48 +01:00
Peter Steinbach
62ea5e3ed6
Merge remote-tracking branch 'upstream/master' into bare_hc
Conflicts:
CMakeLists.txt
2017-02-27 14:35:11 +01:00
Tom Deakin
cc90cefeeb
Minor version bump to signal build system update
2017-02-25 14:14:59 +00:00
Peter Steinbach
c9a45600c8
Merge branch 'master' into bare_hc
2017-01-30 16:06:34 +01:00
Tom Deakin
ec2bf50e75
Version bump
2017-01-30 13:52:45 +00:00
Peter Steinbach
7621f86701
added pure HC gpu stream implementation
2017-01-03 11:43:12 +01:00
Tom Deakin
d0dd48406c
Move version string to main removing common dependency
2016-12-09 12:36:25 +00:00
Tom Deakin
e6615944f4
Use a compiler switch to select OpenMP directives (target or parallel for)
2016-12-09 12:24:08 +00:00
James Price
1e976ff150
[SYCL] Fix multiple template specializations
2016-11-18 00:14:46 +00:00
Tom Deakin
d42bcd4675
Merge remote-tracking branch 'origin/init-arrays' into devel
2016-11-04 09:17:54 +00:00
James Price
7f4761ae52
Replace write_arrays with init_arrays
This allows each model to initialise their arrays with a parallel
approach, which yields the first touch required for good performance
on NUMA architectures.
2016-11-02 11:22:01 +00:00
Tom Deakin
644ebc40ef
Verify reduction result to 8 decimal places
2016-10-24 16:22:35 +01:00
Tom Deakin
f32cf3bad3
Merge branch 'master' into kernel-dot
Conflicts:
main.cpp
2016-10-24 13:53:58 +01:00
Tom Deakin
5ae613519d
Change the value of scalar, and specify in a #define
2016-10-24 13:19:31 +01:00
James Price
1e94870859
Fix verification of dot kernel
2016-10-24 12:47:01 +01:00
Tom Deakin
28c2660b52
Merge branch 'master' into kernel-dot
2016-10-24 12:21:16 +01:00
Tom Deakin
08fe695d51
Fix typo in main file
2016-10-14 15:04:04 +01:00
Tom Deakin
275bfb2066
Check result of the final reduction
2016-10-14 14:45:28 +01:00
Tom Deakin
04ca357159
Call the Dot kernel and print out results
2016-10-14 14:40:28 +01:00
pensun
a1f9d9ece7
Add support of HIP version of GPU-STREAM.
This commit was tested with HIP developer preview branch.
2016-09-05 23:41:01 -05:00
Tom Deakin
d420032c66
Remove warning about iteration count when using floats as new data values work for 100 iterations
2016-05-11 17:15:43 +01:00
Tom Deakin
31cb567e21
Switch data from 1.0, 2.0 and 3.0 to 0.1, 0.2, and 0.3 resp.
Using integers for maths gets unstable past 38 iterations even
in double precision. Using the original values/10 is safe up to
the default 100 iterations.
2016-05-11 15:51:19 +01:00
Tom Deakin
55a858e0c0
Use 2^25 as default size because 2^26 gives too many thread blocks for CUDA
2016-05-11 15:43:52 +01:00
Tom Deakin
eb10c716f2
First attempt at OpenMP 4.5
2016-05-11 15:08:08 +01:00
Tom Deakin
207fd8f784
Default to power of two array size
2016-05-11 12:04:19 +01:00
Tom Deakin
0f8f191d0e
Require number of iterations to be at least 2
2016-05-11 11:55:33 +01:00
Tom Deakin
75ef78495c
Add print out of number of iterations
2016-05-11 11:53:51 +01:00
Tom Deakin
3227e5dbf0
Print out data type for float or double
2016-05-11 11:52:17 +01:00
Tom Deakin
5c8b07262b
Default to 100 iterations to get over any warm up times
2016-05-11 11:49:44 +01:00
Matthew Martineau
894829cb05
Adjusted the Kokkos implementation to fix view initialisation, and store local copies of views for lambda scoping
2016-05-06 21:02:44 +01:00
Matthew Martineau
57189e7ca5
Merge branch 'refactor' of https://github.com/UoB-hpc/gpu-stream into refactor
2016-05-06 10:54:18 +01:00
Matthew Martineau
3b266b8266
Fix for namespace collision with #define RAJA
2016-05-06 10:53:12 +01:00
James Price
d4b3b3533c
Update SYCL version to work with ComputeCpp
Still needs proper CMake rules and kernel names need to be fixed for
multiple template instantiations.
2016-05-06 00:38:30 +01:00
Matthew Martineau
0a738efa54
Merging in changes from trunk
2016-05-05 17:23:47 +01:00
Matthew Martineau
7c28a6386b
Added the Kokkos and RAJA implementations
2016-05-05 17:22:29 +01:00
Tom Deakin
f0afa0c1e4
Add reference OpenMP 3.0 version
2016-05-04 10:41:41 +01:00