668 Commits

Author SHA1 Message Date
James Price
7f4761ae52 Replace write_arrays with init_arrays
This allows each model to initialise its arrays in parallel, which
gives the first-touch page placement required for good performance
on NUMA architectures.
2016-11-02 11:22:01 +00:00
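The first-touch idea in the commit above can be sketched as follows. This is a minimal illustration, not the repository's code: the class name `HostArrays` and its accessors are hypothetical, and only the method name `init_arrays` comes from the commit message. It assumes an OpenMP build (for example `-fopenmp`); without it the pragma is ignored and the loop runs serially.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical container, named for illustration only.
template <typename T>
class HostArrays {
public:
  // Plain new[] leaves arithmetic element types uninitialised, so the
  // allocation itself does not write to the elements.
  explicit HostArrays(std::size_t n)
      : n_(n), a_(new T[n]), b_(new T[n]), c_(new T[n]) {}

  // The first write to each element happens inside the parallel loop.
  // With a static schedule, each thread writes one contiguous chunk, and a
  // later loop with the same schedule visits the same chunk per thread.
  void init_arrays(T initA, T initB, T initC) {
    const long n = static_cast<long>(n_);
    T* a = a_.get();
    T* b = b_.get();
    T* c = c_.get();
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < n; ++i) {
      a[i] = initA;
      b[i] = initB;
      c[i] = initC;
    }
  }

  const T* a() const { return a_.get(); }
  const T* b() const { return b_.get(); }
  const T* c() const { return c_.get(); }

private:
  std::size_t n_;
  std::unique_ptr<T[]> a_, b_, c_;
};
```

On operating systems that place a page on the NUMA node of the thread that first writes to it, this keeps each thread's chunk in local memory, provided the compute kernels use the same loop schedule.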
Tom Deakin
6f9512e5b5 Merge branch 'kernel-dot' of github.com:uob-hpc/gpu-stream into kernel-dot 2016-11-01 16:25:23 +00:00
James Price
829d21c1d6 Merge branch 'master' into kernel-dot 2016-11-01 16:20:27 +00:00
James Price
3045208aae [RAJA] Parallel first touch 2016-11-01 16:18:43 +00:00
Tom Deakin
54a966f99a Merge branch 'master' into kernel-dot 2016-10-31 19:04:59 +00:00
Tom Deakin
9acdba8b76 Add link to website 2016-10-31 18:31:33 +00:00
Tom Deakin
7150e047dd Add link to website 2016-10-31 18:31:07 +00:00
James Price
dd296d2231 [SYCL] Prebuild dot kernel like the others 2016-10-28 21:15:12 +01:00
James Price
b09b90f6fc Merge remote-tracking branch 'origin' into kernel-dot 2016-10-28 21:07:57 +01:00
James Price
cce8e78cae Merge pull request #12 from Ruyk/master
Updated the SYCL Stream benchmark with latest ComputeCpp CE 0.1.1 Edition
2016-10-28 21:06:34 +01:00
Tom Deakin
4e39572966 Add O3 to Kokkos CPU 2016-10-27 15:41:23 +01:00
Tom Deakin
98962c4aee Add Kokkos CPU Makefile 2016-10-27 15:19:16 +01:00
James Price
d7c48c5063 Slight tweak to dot config output to fix parsing scripts 2016-10-26 15:47:10 +01:00
James Price
cbf97dc7d9 [SYCL] Automatically determine dot NDRange config 2016-10-26 15:19:14 +01:00
James Price
21556af500 [OCL] Automatically determine dot NDRange config 2016-10-26 15:19:14 +01:00
Tom Deakin
bd0ba4dee9 Merge branch 'master' into kernel-dot 2016-10-26 14:37:15 +01:00
Tom Deakin
188165d729 [Kokkos] Add -O3 flag to Kokkos makefile 2016-10-26 14:37:00 +01:00
James Price
ed630e7dbc [SYCL] Implement dot kernel 2016-10-25 16:39:23 +01:00
Tom Deakin
e5b67ac969 Version bump 2016-10-25 12:22:01 +01:00
James Price
dfc79eeb4d Improve performance of CUDA dot implementation 2016-10-24 21:42:39 +01:00
James Price
d5482b74f4 Improve performance of OpenCL dot implementation 2016-10-24 21:26:09 +01:00
Tom Deakin
644ebc40ef Verify reduction result to 8 decimal places 2016-10-24 16:22:35 +01:00
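One way to read "8 decimal places" in the commit above is as a relative-error tolerance of 1e-8. The sketch below assumes that reading; the function name `dot_matches` and the zero-expected fallback are hypothetical and not taken from the repository.

```cpp
#include <cmath>

// Returns true when the computed dot product agrees with the expected value
// to within a relative error of `tolerance` (assumed here to be 1e-8).
bool dot_matches(double computed, double expected, double tolerance = 1.0e-8) {
  if (expected == 0.0) {
    // Relative error is undefined for a zero reference; fall back to absolute.
    return std::abs(computed) <= tolerance;
  }
  return std::abs((computed - expected) / expected) <= tolerance;
}
```

A relative check is used in this sketch because the magnitude of a dot product grows with the array size, so a fixed absolute threshold would become stricter as arrays get larger.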
Tom Deakin
f32cf3bad3 Merge branch 'master' into kernel-dot
Conflicts:
	main.cpp
2016-10-24 13:53:58 +01:00
Tom Deakin
963f3abfa0 Version bump 2016-10-24 13:52:03 +01:00
Tom Deakin
0bed614734 [OpenCL] Use global defined scalar value 2016-10-24 13:51:47 +01:00
Tom Deakin
ce5152fefd [HIP] Use global defined scalar value 2016-10-24 13:45:07 +01:00
Tom Deakin
47128d47c0 [SYCL] Use global defined scalar value 2016-10-24 13:37:43 +01:00
Tom Deakin
b54d94b82d [RAJA] Use global defined scalar value 2016-10-24 13:30:55 +01:00
Tom Deakin
d1bebf12d9 [Kokkos] Use global defined scalar value 2016-10-24 13:25:54 +01:00
Tom Deakin
ac6158fa31 [OpenACC] Use global defined scalar value 2016-10-24 13:24:11 +01:00
Tom Deakin
b120acaf87 [OMP45] Use global defined scalar value 2016-10-24 13:23:20 +01:00
Tom Deakin
7a81b63fbf [OMP3] Use global defined scalar value 2016-10-24 13:21:47 +01:00
Tom Deakin
5b1e67f666 [CUDA] Use new value of scalar 2016-10-24 13:19:54 +01:00
Tom Deakin
5ae613519d Change the value of scalar, and specify in a #define 2016-10-24 13:19:31 +01:00
James Price
cfc1aba2c0 Use WGSIZE=256 for dot for compatibility with AMD 2016-10-24 12:51:01 +01:00
James Price
c9b3d07b84 Fix OpenCL host code for dot kernel
Wrong number of blocks was being copied and summed, and the host sums
vector didn't have the correct size.
2016-10-24 12:49:58 +01:00
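The class of bug described in the commit above (a host-side sums vector whose size does not match the number of blocks) can be illustrated with a host-only sketch. This is not the repository's OpenCL code: `block_partial_sums` and `host_reduce` are hypothetical names, and the device work is simulated in plain C++ with a grid-stride loop.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Simulates what each block (work-group) would produce on the device:
// one partial sum per block, so the result holds exactly num_blocks entries.
std::vector<double> block_partial_sums(const std::vector<double>& a,
                                       const std::vector<double>& b,
                                       std::size_t num_blocks,
                                       std::size_t block_size) {
  std::vector<double> sums(num_blocks, 0.0);
  const std::size_t stride = num_blocks * block_size;
  for (std::size_t block = 0; block < num_blocks; ++block) {
    for (std::size_t local = 0; local < block_size; ++local) {
      // Grid-stride loop: each simulated work-item covers several elements.
      for (std::size_t i = block * block_size + local; i < a.size(); i += stride) {
        sums[block] += a[i] * b[i];
      }
    }
  }
  return sums;
}

// Final host-side reduction over exactly the entries that were produced.
double host_reduce(const std::vector<double>& sums) {
  return std::accumulate(sums.begin(), sums.end(), 0.0);
}
```

Keeping the number of blocks as the single source of truth for the kernel launch, the device-to-host copy, and the final sum avoids the mismatch the commit message describes.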
James Price
8a8f44b4ce Fix CUDA host code for dot kernel
Wrong number of blocks was being copied and summed.
2016-10-24 12:47:25 +01:00
James Price
1e94870859 Fix verification of dot kernel 2016-10-24 12:47:01 +01:00
Tom Deakin
28c2660b52 Merge branch 'master' into kernel-dot 2016-10-24 12:21:16 +01:00
Tom Deakin
7408ab0366 Add RAJA dot kernel 2016-10-24 11:34:40 +01:00
James Price
fe41771bd4 Move HIP results into new directory structure 2016-10-21 12:57:31 +01:00
James Price
856a520687 Merge pull request #10 from sunway513/pull-request-HIP
Pull request hip
2016-10-21 12:53:50 +01:00
Tom Deakin
823e12708f Add dot kernel to Kokkos 2016-10-21 10:58:26 +01:00
Tom Deakin
7203f96f6b Restructure results directory 2016-10-20 17:35:22 +01:00
Ruyman Reyes
d562283cde Minor performance tuning for SYCL benchmark
* Pre-compiling kernel binaries when setting up the benchmark,
   like OpenCL equivalent

* Using the linear access syntax for buffers
2016-10-18 13:09:19 +01:00
Tom Deakin
d3b497a9ca Add a CUDA dot kernel 2016-10-14 17:51:40 +01:00
Tom Deakin
2085cacea0 Add an OpenCL dot kernel
We have to name the kernel stream_dot (for example) because the
"dot" kernel already exists.
2016-10-14 17:07:55 +01:00
Tom Deakin
8a100f07b4 Add dot kernel to OpenMP 4.5 - tested with clang-ykt 2016-10-14 15:19:25 +01:00
Tom Deakin
abe423ac6b Implement dot kernel in OpenMP 3 2016-10-14 15:05:06 +01:00
Tom Deakin
08fe695d51 Fix typo in main file 2016-10-14 15:04:04 +01:00