Tom Deakin
2001ab5fb1
Build against a RAJA installation in the CMake build system
2016-05-12 15:40:22 +01:00
Tom Deakin
942188d836
Add copyright header to source files missing it
2016-05-12 12:53:26 +01:00
Tom Deakin
d75084b753
Fix Kokkos CMake so it works
2016-05-12 12:35:47 +01:00
Tom Deakin
2381f059ed
Set KOKKOS_PATH to build Kokkos version
2016-05-12 12:31:16 +01:00
Tom Deakin
88d194b75c
Use a variable to get Kokkos Path
2016-05-12 12:30:35 +01:00
Tom Deakin
f6fca3ac06
Add Kokkos building to CMake config
2016-05-12 12:30:06 +01:00
James Price
3ebad06bd4
[SYCL] Fix detection of CL/sycl.hpp for C++14 versions
2016-05-11 22:22:20 +01:00
James Price
7cd14f480d
[SYCL] Auto-detect presence of CL/sycl.hpp and ComputeCpp
2016-05-11 22:00:04 +01:00
Tom Deakin
d4e74a88e9
Add binary names to gitignore
2016-05-11 17:53:33 +01:00
Tom Deakin
5638cbb283
Check for OpenMP support and build OMP3 version
2016-05-11 17:49:48 +01:00
Tom Deakin
bf9c6fb6cd
Add -fopenacc flag on linking with GCC compiler
2016-05-11 17:21:52 +01:00
Tom Deakin
d420032c66
Remove warning about iteration count when using floats, as the new data values work for 100 iterations
2016-05-11 17:15:43 +01:00
Tom Deakin
9449e08886
Update README
2016-05-11 16:23:14 +01:00
Tom Deakin
494e89d16b
Add placeholder banners for CMake build systems to fix
2016-05-11 16:02:34 +01:00
Tom Deakin
9b2a586e08
Add rule to build OMP4.5 on Cray
2016-05-11 15:57:39 +01:00
Tom Deakin
eae8da57ac
Delete commented out C++ flag for OpenACC as no longer needed
2016-05-11 15:57:20 +01:00
Tom Deakin
31cb567e21
Switch data from 1.0, 2.0 and 3.0 to 0.1, 0.2, and 0.3 resp.
...
Using integers for maths gets unstable past 38 iterations even
in double precision. Using the original values/10 is safe up to
the default 100 iterations.
2016-05-11 15:51:19 +01:00
Tom Deakin
55a858e0c0
Use 2^25 as default size because 2^26 gives too many thread blocks for CUDA
2016-05-11 15:43:52 +01:00
Tom Deakin
4954ef7cf0
Add map clauses to OpenMP 4.5 kernels
2016-05-11 15:17:06 +01:00
Tom Deakin
eb10c716f2
First attempt at OpenMP 4.5
2016-05-11 15:08:08 +01:00
Tom Deakin
e095cb67f8
Remove ugly CMake endif text in parenthesis
2016-05-11 13:37:12 +01:00
Tom Deakin
bf29b02d35
Add banners in CMakeLists file so build rules for each version are easy to spot
2016-05-11 13:35:24 +01:00
Tom Deakin
8a195b6416
Remove printout of compiler id in cmake
2016-05-11 13:35:12 +01:00
Tom Deakin
e0ca56bd67
Set the C++11 flag when using the Cray compiler
2016-05-11 13:33:01 +01:00
Tom Deakin
8d45e61f6c
Check for OpenACC support by checking the various compiler flags
2016-05-11 13:20:15 +01:00
Tom Deakin
1a9225ca95
Skip the CUDA build on Darwin with Xcode 7.3.1 because CUDA doesn't work with this version
2016-05-11 12:54:12 +01:00
Tom Deakin
81fa9e1922
Require SYCL array size to be multiple of WGSIZE
2016-05-11 12:23:21 +01:00
Tom Deakin
2462023ed9
Set thread block size in CUDA with a #define, and check that array size is multiple of it
2016-05-11 12:21:29 +01:00
Tom Deakin
207fd8f784
Default to power of two array size
2016-05-11 12:04:19 +01:00
Tom Deakin
0f8f191d0e
Require number of iterations to be at least 2
2016-05-11 11:55:33 +01:00
Tom Deakin
75ef78495c
Add print out of number of iterations
2016-05-11 11:53:51 +01:00
Tom Deakin
3227e5dbf0
Print out data type for float or double
2016-05-11 11:52:17 +01:00
Tom Deakin
5c8b07262b
Default to 100 iterations to get over any warm up times
2016-05-11 11:49:44 +01:00
Tom Deakin
b53fb1b3b2
Merge pull request #7 from sunway513/pull-request-HIP
...
Pull request hip
2016-05-09 17:03:49 +01:00
James Price
084d7417b9
[SYCL] Remove unneeded cl_device_info line
2016-05-09 15:20:11 +01:00
James Price
6d913bab4b
[SYCL] Actually use device_index to select device
2016-05-08 21:35:24 +01:00
James Price
3b3f6dfc26
[SYCL] Implement device list/selection functionality
2016-05-08 19:22:09 +01:00
James Price
58fa72dee0
Merge branch 'refactor' of github.com:UoB-HPC/GPU-STREAM into refactor
2016-05-06 22:42:24 +01:00
James Price
54834e05f4
[SYCL] Use nd_range instead of range to specify work-group size
2016-05-06 22:41:10 +01:00
James Price
fb8f06e683
[SYCL] Pass -no-serial-memop to compute++ to squelch warning
2016-05-06 22:33:18 +01:00
Matthew Martineau
6e9b85bb26
Fixed deep copy ordering, which was reversed
2016-05-06 21:08:23 +01:00
Matthew Martineau
0f0454ec29
Added CUDA device syncs to force proper timing
2016-05-06 21:05:20 +01:00
Matthew Martineau
894829cb05
Adjusted the Kokkos implementation to fix view initialisation, and store local copies of views for lambda scoping
2016-05-06 21:02:44 +01:00
Matthew Martineau
1a60f130eb
Fixed memory management for GPU, now working with OpenMP and CUDA
2016-05-06 13:17:04 +01:00
Matthew Martineau
57189e7ca5
Merge branch 'refactor' of https://github.com/UoB-hpc/gpu-stream into refactor
2016-05-06 10:54:18 +01:00
Matthew Martineau
3b266b8266
Fix for namespace collision with #define RAJA
2016-05-06 10:53:12 +01:00
Matthew Martineau
45381da0b2
Initial commit of in-progress developments of RAJA and Kokkos stream
2016-05-06 10:46:35 +01:00
James Price
d4b3b3533c
Update SYCL version to work with ComputeCpp
...
Still needs proper CMake rules and kernel names need to be fixed for
multiple template instantiations.
2016-05-06 00:38:30 +01:00
Matthew Martineau
0a738efa54
Merging in changes from trunk
2016-05-05 17:23:47 +01:00
Matthew Martineau
7c28a6386b
Added the Kokkos and RAJA implementations
2016-05-05 17:22:29 +01:00