Tom Deakin
31cb567e21
Switch data from 1.0, 2.0 and 3.0 to 0.1, 0.2, and 0.3 resp.
Using integers for maths gets unstable past 38 iterations even
in double precision. Using the original values/10 is safe up to
the default 100 iterations.
2016-05-11 15:51:19 +01:00
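
The change in magnitude described in this commit can be sketched numerically. This is a rough illustration, not the benchmark's code: it assumes the usual STREAM kernel order (copy, mul, add, triad) applied to a single element, and it assumes the scalar multiplier equals the initial value of c (3.0 before the change, 0.3 after). Neither detail is stated in the commit message, and `run_stream` is a hypothetical helper.

```python
def run_stream(a, b, c, scalar, iterations):
    """Apply the four STREAM kernels to one element, `iterations` times.

    Kernel order and scalar value are assumptions for illustration only.
    """
    for _ in range(iterations):
        c = a               # copy
        b = scalar * c      # mul
        c = a + b           # add
        a = b + scalar * c  # triad
    return a, b, c


# Original starting values, with an assumed scalar of 3.0.
old = run_stream(1.0, 2.0, 3.0, scalar=3.0, iterations=100)
# Starting values from this commit, with an assumed scalar of 0.3.
new = run_stream(0.1, 0.2, 0.3, scalar=0.3, iterations=100)

print("a after 100 iterations (old values):", old[0])
print("a after 100 iterations (new values):", new[0])
```

Under these assumptions each iteration multiplies a by 2s + s^2, where s is the scalar: a factor of 15 for s = 3.0 and 0.69 for s = 0.3. The old values therefore grow rapidly while the new ones shrink. The sketch does not attempt to reproduce the 38-iteration threshold mentioned above.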
Tom Deakin
55a858e0c0
Use 2^25 as default size because 2^26 gives too many thread blocks for CUDA
2016-05-11 15:43:52 +01:00
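
The arithmetic behind this limit can be sketched as follows. The sketch assumes one thread per array element, a thread block size of 1024 (the value is not given in this log), and a grid x-dimension limit of 65535 blocks, which applies to CUDA devices of compute capability 2.x; newer devices allow far more. `blocks_needed` is a hypothetical helper.

```python
MAX_GRID_DIM_X = 65535  # assumed limit (CUDA compute capability 2.x)
TBSIZE = 1024           # assumed threads per block; not stated in the log


def blocks_needed(array_size, block_size=TBSIZE):
    """Return the number of thread blocks for one thread per element."""
    if array_size % block_size != 0:
        raise ValueError("array size must be a multiple of the block size")
    return array_size // block_size


for exponent in (25, 26):
    blocks = blocks_needed(2 ** exponent)
    print(f"2^{exponent}: {blocks} blocks, fits: {blocks <= MAX_GRID_DIM_X}")
```

With these assumed values, 2^25 elements need 32768 blocks, while 2^26 elements need 65536, one more than the assumed limit.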
Tom Deakin
4954ef7cf0
Add map clauses to OpenMP 4.5 kernels
2016-05-11 15:17:06 +01:00
Tom Deakin
eb10c716f2
First attempt at OpenMP 4.5
2016-05-11 15:08:08 +01:00
Tom Deakin
e095cb67f8
Remove ugly CMake endif text in parentheses
2016-05-11 13:37:12 +01:00
Tom Deakin
bf29b02d35
Add banners in CMakeLists file so it is easy to spot build rules for versions
2016-05-11 13:35:24 +01:00
Tom Deakin
8a195b6416
Remove printout of compiler id in cmake
2016-05-11 13:35:12 +01:00
Tom Deakin
e0ca56bd67
Set the C++11 flag when using the Cray compiler
2016-05-11 13:33:01 +01:00
Tom Deakin
8d45e61f6c
Check for OpenACC support by checking the various compiler flags
2016-05-11 13:20:15 +01:00
Tom Deakin
1a9225ca95
If building CUDA on Darwin with Xcode 7.3.1, skip because CUDA doesn't work with this version
2016-05-11 12:54:12 +01:00
Tom Deakin
81fa9e1922
Require SYCL array size to be multiple of WGSIZE
2016-05-11 12:23:21 +01:00
Tom Deakin
2462023ed9
Set thread block size in CUDA with a #define, and check that array size is multiple of it
2016-05-11 12:21:29 +01:00
Tom Deakin
207fd8f784
Default to power of two array size
2016-05-11 12:04:19 +01:00
Tom Deakin
0f8f191d0e
Require number of iterations to be at least 2
2016-05-11 11:55:33 +01:00
Tom Deakin
75ef78495c
Add print out of number of iterations
2016-05-11 11:53:51 +01:00
Tom Deakin
3227e5dbf0
Print out data type for float or double
2016-05-11 11:52:17 +01:00
Tom Deakin
5c8b07262b
Default to 100 iterations to get over any warm up times
2016-05-11 11:49:44 +01:00
James Price
084d7417b9
[SYCL] Remove unneeded cl_device_info line
2016-05-09 15:20:11 +01:00
James Price
6d913bab4b
[SYCL] Actually use device_index to select device
2016-05-08 21:35:24 +01:00
James Price
3b3f6dfc26
[SYCL] Implement device list/selection functionality
2016-05-08 19:22:09 +01:00
James Price
58fa72dee0
Merge branch 'refactor' of github.com:UoB-HPC/GPU-STREAM into refactor
2016-05-06 22:42:24 +01:00
James Price
54834e05f4
[SYCL] Use nd_range instead of range to specify work-group size
2016-05-06 22:41:10 +01:00
James Price
fb8f06e683
[SYCL] Pass -no-serial-memop to compute++ to squelch warning
2016-05-06 22:33:18 +01:00
Matthew Martineau
6e9b85bb26
Fixed deep copy ordering, which was reversed
2016-05-06 21:08:23 +01:00
Matthew Martineau
0f0454ec29
Added CUDA device syncs to force proper timing
2016-05-06 21:05:20 +01:00
Matthew Martineau
894829cb05
Adjusted the Kokkos implementation to fix view initialisation, and store local copies of views for lambda scoping
2016-05-06 21:02:44 +01:00
Matthew Martineau
1a60f130eb
Fixed memory management for GPU, now working with OpenMP and CUDA
2016-05-06 13:17:04 +01:00
Matthew Martineau
57189e7ca5
Merge branch 'refactor' of https://github.com/UoB-hpc/gpu-stream into refactor
2016-05-06 10:54:18 +01:00
Matthew Martineau
3b266b8266
Fix for namespace collision with #define RAJA
2016-05-06 10:53:12 +01:00
Matthew Martineau
45381da0b2
Initial commit of in-progress developments of RAJA and KOKKOS stream
2016-05-06 10:46:35 +01:00
James Price
d4b3b3533c
Update SYCL version to work with ComputeCpp
Still needs proper CMake rules and kernel names need to be fixed for
multiple template instantiations.
2016-05-06 00:38:30 +01:00
Matthew Martineau
0a738efa54
Merging in changes from trunk
2016-05-05 17:23:47 +01:00
Matthew Martineau
7c28a6386b
Added the Kokkos and RAJA implementations
2016-05-05 17:22:29 +01:00
Tom Deakin
f0afa0c1e4
Add reference OpenMP 3.0 version
2016-05-04 10:41:41 +01:00
Tom Deakin
48cae0cbc1
Make sure CUDA nvcc builds with C++11
2016-05-03 15:20:26 +01:00
Tom Deakin
31819b7778
Add bones of OpenACC in CMake config
2016-05-03 15:07:51 +01:00
Tom Deakin
0b0de4e0c3
Implement the OpenACC device string functions, and device selector
2016-05-03 14:50:09 +01:00
James Price
b45f311e0d
Add missing SYCL source files
2016-05-03 14:48:35 +01:00
James Price
40a0a6551d
Remove extra -std=c++11 from CMake build
2016-05-03 14:46:08 +01:00
James Price
da4f918788
Add initial SYCL implementation
2016-05-03 14:45:13 +01:00
Tom Deakin
1a38b18954
Add OpenACC version
2016-05-03 14:36:08 +01:00
Tom Deakin
530b2adda2
Add License text to all files
2016-05-03 12:32:03 +01:00
Tom Deakin
95a10511ec
Update LICENSE date
2016-05-03 12:28:44 +01:00
Tom Deakin
21c9022a3f
Keep C++11 flag explicitly defined in CMake
2016-05-03 12:24:33 +01:00
Tom Deakin
662fcaf4b5
Add CMake things to gitignore
2016-05-03 12:18:41 +01:00
Tom Deakin
57ea4b8cae
Require CMake 3.2 so can check for C++11 nicely
2016-05-03 12:17:33 +01:00
Tom Deakin
1bd27428bd
Require CUDA 7 for C++11 support
2016-05-03 12:17:21 +01:00
Tom Deakin
8ce15a28aa
Update CMake with better binary name and source location
2016-05-03 11:45:25 +01:00
Tom Deakin
a355acf2ee
Move source files to top level directory
2016-05-03 11:43:25 +01:00
Tom Deakin
fcc9588c94
Change cl2.hpp include
2016-05-03 11:41:40 +01:00