James Price
b09b90f6fc
Merge remote-tracking branch 'origin' into kernel-dot
2016-10-28 21:07:57 +01:00
James Price
cce8e78cae
Merge pull request #12 from Ruyk/master
Updated the SYCL Stream benchmark with latest ComputeCpp CE 0.1.1 Edition
2016-10-28 21:06:34 +01:00
Tom Deakin
4e39572966
Add O3 to Kokkos CPU
2016-10-27 15:41:23 +01:00
Tom Deakin
98962c4aee
Add Kokkos CPU Makefile
2016-10-27 15:19:16 +01:00
James Price
d7c48c5063
Slight tweak to dot config output to fix parsing scripts
2016-10-26 15:47:10 +01:00
James Price
cbf97dc7d9
[SYCL] Automatically determine dot NDRange config
2016-10-26 15:19:14 +01:00
James Price
21556af500
[OCL] Automatically determine dot NDRange config
2016-10-26 15:19:14 +01:00
Tom Deakin
bd0ba4dee9
Merge branch 'master' into kernel-dot
2016-10-26 14:37:15 +01:00
Tom Deakin
188165d729
[Kokkos] Add -O3 flag to Kokkos makefile
2016-10-26 14:37:00 +01:00
James Price
ed630e7dbc
[SYCL] Implement dot kernel
2016-10-25 16:39:23 +01:00
Tom Deakin
e5b67ac969
Version bump
2016-10-25 12:22:01 +01:00
James Price
dfc79eeb4d
Improve performance of CUDA dot implementation
2016-10-24 21:42:39 +01:00
James Price
d5482b74f4
Improve performance of OpenCL dot implementation
2016-10-24 21:26:09 +01:00
Tom Deakin
644ebc40ef
Verify reduction result to 8 decimal places
2016-10-24 16:22:35 +01:00
Tom Deakin
f32cf3bad3
Merge branch 'master' into kernel-dot
Conflicts:
main.cpp
2016-10-24 13:53:58 +01:00
Tom Deakin
963f3abfa0
Version bump
2016-10-24 13:52:03 +01:00
Tom Deakin
0bed614734
[OpenCL] Use global defined scalar value
2016-10-24 13:51:47 +01:00
Tom Deakin
ce5152fefd
[HIP] Use global defined scalar value
2016-10-24 13:45:07 +01:00
Tom Deakin
47128d47c0
[SYCL] Use global defined scalar value
2016-10-24 13:37:43 +01:00
Tom Deakin
b54d94b82d
[RAJA] Use global defined scalar value
2016-10-24 13:30:55 +01:00
Tom Deakin
d1bebf12d9
[Kokkos] Use global defined scalar value
2016-10-24 13:25:54 +01:00
Tom Deakin
ac6158fa31
[OpenACC] Use global defined scalar value
2016-10-24 13:24:11 +01:00
Tom Deakin
b120acaf87
[OMP45] Use global defined scalar value
2016-10-24 13:23:20 +01:00
Tom Deakin
7a81b63fbf
[OMP3] Use global defined scalar value
2016-10-24 13:21:47 +01:00
Tom Deakin
5b1e67f666
[CUDA] Use new value of scalar
2016-10-24 13:19:54 +01:00
Tom Deakin
5ae613519d
Change the value of scalar, and specify in a #define
2016-10-24 13:19:31 +01:00
James Price
cfc1aba2c0
Use WGSIZE=256 for dot for compatibility with AMD
2016-10-24 12:51:01 +01:00
James Price
c9b3d07b84
Fix OpenCL host code for dot kernel
Wrong number of blocks was being copied and summed, and the host sums
vector didn't have the correct size.
2016-10-24 12:49:58 +01:00
James Price
8a8f44b4ce
Fix CUDA host code for dot kernel
Wrong number of blocks was being copied and summed.
2016-10-24 12:47:25 +01:00
James Price
1e94870859
Fix verification of dot kernel
2016-10-24 12:47:01 +01:00
Tom Deakin
28c2660b52
Merge branch 'master' into kernel-dot
2016-10-24 12:21:16 +01:00
Tom Deakin
7408ab0366
Add RAJA dot kernel
2016-10-24 11:34:40 +01:00
James Price
fe41771bd4
Move HIP results into new directory structure
2016-10-21 12:57:31 +01:00
James Price
856a520687
Merge pull request #10 from sunway513/pull-request-HIP
Pull request hip
2016-10-21 12:53:50 +01:00
Tom Deakin
823e12708f
Add dot kernel to Kokkos
2016-10-21 10:58:26 +01:00
Tom Deakin
7203f96f6b
Restructure results directory
2016-10-20 17:35:22 +01:00
Ruyman Reyes
d562283cde
Minor performance tuning for SYCL benchmark
* Pre-compiling kernel binaries when setting up the benchmark,
like the OpenCL equivalent
* Using the linear access syntax for buffers
2016-10-18 13:09:19 +01:00
Tom Deakin
d3b497a9ca
Add a CUDA dot kernel
2016-10-14 17:51:40 +01:00
Tom Deakin
2085cacea0
Add an OpenCL dot kernel
We have to name the kernel stream_dot (for example) because the
"dot" kernel already exists.
2016-10-14 17:07:55 +01:00
Tom Deakin
8a100f07b4
Add dot kernel to OpenMP 4.5 - tested with clang-ykt
2016-10-14 15:19:25 +01:00
Tom Deakin
abe423ac6b
Implement dot kernel in OpenMP 3
2016-10-14 15:05:06 +01:00
Tom Deakin
08fe695d51
Fix typo in main file
2016-10-14 15:04:04 +01:00
Tom Deakin
275bfb2066
Check result of the final reduction
2016-10-14 14:45:28 +01:00
Tom Deakin
04ca357159
Call the Dot kernel and print out results
2016-10-14 14:40:28 +01:00
Tom Deakin
0ef9b6691b
Implement the reduction in OpenACC
2016-10-14 14:40:08 +01:00
Tom Deakin
614613e7d4
Add the dot routine to the abstract class
2016-10-14 14:39:48 +01:00
Ruyman Reyes
23a43bfa6d
Adding support for ComputeCpp CE
This patch updates the CMake building options to support
the ComputeCpp Community Edition 0.1, including the FindComputeCpp
CMake module provided with the ComputeCpp SDK.
In order to build with ComputeCpp, only the standard CMake flags from
the SDK are required:
cmake ../ \
-DHAS_SYCL=ON \
-DCOMPUTECPP_PACKAGE_ROOT_DIR=/path/to/computecpp/package \
-DCMAKE_MODULE_PATH=GPU-STREAM/cmake/Modules/
2016-10-12 17:51:41 +01:00
pensun
300ccd01d9
move hip_runtime.h after copyright info
2016-10-12 10:41:50 -05:00
pensun
f9fea7f8b5
Add performance results for HIP version.
2016-09-06 11:19:33 -05:00
pensun
a1f9d9ece7
Add support of HIP version of GPU-STREAM.
This commit was tested with HIP developer preview branch.
2016-09-05 23:41:01 -05:00