Tom Deakin
|
4954ef7cf0
|
Add map clauses to OpenMP 4.5 kernels
|
2016-05-11 15:17:06 +01:00 |
|
Tom Deakin
|
eb10c716f2
|
First attempt at OpenMP 4.5
|
2016-05-11 15:08:08 +01:00 |
|
Tom Deakin
|
e095cb67f8
|
Remove ugly CMake endif text in parentheses
|
2016-05-11 13:37:12 +01:00 |
|
Tom Deakin
|
bf29b02d35
|
Add banners in CMakeLists file so easy to spot build rules for versions
|
2016-05-11 13:35:24 +01:00 |
|
Tom Deakin
|
8a195b6416
|
Remove printout of compiler id in cmake
|
2016-05-11 13:35:12 +01:00 |
|
Tom Deakin
|
e0ca56bd67
|
Set the C++11 flag when using the Cray compiler
|
2016-05-11 13:33:01 +01:00 |
|
Tom Deakin
|
8d45e61f6c
|
Check for OpenACC support by checking the various compiler flags
|
2016-05-11 13:20:15 +01:00 |
|
Tom Deakin
|
1a9225ca95
|
If building CUDA on Darwin with Xcode 7.3.1, skip because CUDA doesn't work with this version
|
2016-05-11 12:54:12 +01:00 |
|
Tom Deakin
|
81fa9e1922
|
Require SYCL array size to be multiple of WGSIZE
|
2016-05-11 12:23:21 +01:00 |
|
Tom Deakin
|
2462023ed9
|
Set thread block size in CUDA with a #define, and check that array size is multiple of it
|
2016-05-11 12:21:29 +01:00 |
|
Tom Deakin
|
207fd8f784
|
Default to power of two array size
|
2016-05-11 12:04:19 +01:00 |
|
Tom Deakin
|
0f8f191d0e
|
Require number of iterations to be at least 2
|
2016-05-11 11:55:33 +01:00 |
|
Tom Deakin
|
75ef78495c
|
Add print out of number of iterations
|
2016-05-11 11:53:51 +01:00 |
|
Tom Deakin
|
3227e5dbf0
|
Print out data type for float or double
|
2016-05-11 11:52:17 +01:00 |
|
Tom Deakin
|
5c8b07262b
|
Default to 100 iterations to get over any warm up times
|
2016-05-11 11:49:44 +01:00 |
|
Tom Deakin
|
b53fb1b3b2
|
Merge pull request #7 from sunway513/pull-request-HIP
Pull request hip
|
2016-05-09 17:03:49 +01:00 |
|
James Price
|
084d7417b9
|
[SYCL] Remove unneeded cl_device_info line
|
2016-05-09 15:20:11 +01:00 |
|
James Price
|
6d913bab4b
|
[SYCL] Actually use device_index to select device
|
2016-05-08 21:35:24 +01:00 |
|
James Price
|
3b3f6dfc26
|
[SYCL] Implement device list/selection functionality
|
2016-05-08 19:22:09 +01:00 |
|
James Price
|
58fa72dee0
|
Merge branch 'refactor' of github.com:UoB-HPC/GPU-STREAM into refactor
|
2016-05-06 22:42:24 +01:00 |
|
James Price
|
54834e05f4
|
[SYCL] Use nd_range instead of range to specify work-group size
|
2016-05-06 22:41:10 +01:00 |
|
James Price
|
fb8f06e683
|
[SYCL] Pass -no-serial-memop to compute++ to squelch warning
|
2016-05-06 22:33:18 +01:00 |
|
Matthew Martineau
|
6e9b85bb26
|
Fixed deep copy ordering, which was reversed
|
2016-05-06 21:08:23 +01:00 |
|
Matthew Martineau
|
0f0454ec29
|
Added CUDA device syncs to force proper timing
|
2016-05-06 21:05:20 +01:00 |
|
Matthew Martineau
|
894829cb05
|
Adjusted the Kokkos implementation to fix view initialisation, and store local copies of views for lambda scoping
|
2016-05-06 21:02:44 +01:00 |
|
Matthew Martineau
|
1a60f130eb
|
Fixed memory management for GPU, now working with OpenMP and CUDA
|
2016-05-06 13:17:04 +01:00 |
|
Matthew Martineau
|
57189e7ca5
|
Merge branch 'refactor' of https://github.com/UoB-hpc/gpu-stream into refactor
|
2016-05-06 10:54:18 +01:00 |
|
Matthew Martineau
|
3b266b8266
|
Fix for namespace collision with #define RAJA
|
2016-05-06 10:53:12 +01:00 |
|
Matthew Martineau
|
45381da0b2
|
Initial commit of in progress developments of RAJA and KOKKOS stream
|
2016-05-06 10:46:35 +01:00 |
|
James Price
|
d4b3b3533c
|
Update SYCL version to work with ComputeCpp
Still needs proper CMake rules and kernel names need to be fixed for
multiple template instantiations.
|
2016-05-06 00:38:30 +01:00 |
|
Matthew Martineau
|
0a738efa54
|
Merging in changes from trunk
|
2016-05-05 17:23:47 +01:00 |
|
Matthew Martineau
|
7c28a6386b
|
Added the Kokkos and RAJA implementations
|
2016-05-05 17:22:29 +01:00 |
|
Tom Deakin
|
f0afa0c1e4
|
Add reference OpenMP 3.0 version
|
2016-05-04 10:41:41 +01:00 |
|
Tom Deakin
|
48cae0cbc1
|
Make sure CUDA nvcc builds with C++11
|
2016-05-03 15:20:26 +01:00 |
|
Tom Deakin
|
31819b7778
|
Add bones of OpenACC in CMake config
|
2016-05-03 15:07:51 +01:00 |
|
Tom Deakin
|
0b0de4e0c3
|
Implement the OpenACC device string functions, and device selector
|
2016-05-03 14:50:09 +01:00 |
|
James Price
|
b45f311e0d
|
Add missing SYCL source files
|
2016-05-03 14:48:35 +01:00 |
|
James Price
|
40a0a6551d
|
Remove extra -std=c++11 from CMake build
|
2016-05-03 14:46:08 +01:00 |
|
James Price
|
da4f918788
|
Add initial SYCL implementation
|
2016-05-03 14:45:13 +01:00 |
|
Tom Deakin
|
1a38b18954
|
Add OpenACC version
|
2016-05-03 14:36:08 +01:00 |
|
Tom Deakin
|
530b2adda2
|
Add License text to all files
|
2016-05-03 12:32:03 +01:00 |
|
Tom Deakin
|
95a10511ec
|
Update LICENSE date
|
2016-05-03 12:28:44 +01:00 |
|
Tom Deakin
|
21c9022a3f
|
Keep C++11 flag explicitly defined in CMake
|
2016-05-03 12:24:33 +01:00 |
|
Tom Deakin
|
662fcaf4b5
|
Add CMake things to gitignore
|
2016-05-03 12:18:41 +01:00 |
|
Tom Deakin
|
57ea4b8cae
|
Require CMake 3.2 so can check for C++11 nicely
|
2016-05-03 12:17:33 +01:00 |
|
Tom Deakin
|
1bd27428bd
|
Require CUDA 7 for C++11 support
|
2016-05-03 12:17:21 +01:00 |
|
Tom Deakin
|
8ce15a28aa
|
Update CMake with better binary name and source location
|
2016-05-03 11:45:25 +01:00 |
|
Tom Deakin
|
a355acf2ee
|
Move source files to top level directory
|
2016-05-03 11:43:25 +01:00 |
|
Tom Deakin
|
fcc9588c94
|
Change cl2.hpp include
|
2016-05-03 11:41:40 +01:00 |
|
Tom Deakin
|
83516ae352
|
Update cl2.hpp
|
2016-05-03 11:41:00 +01:00 |
|