Tom Deakin
|
6f9512e5b5
|
Merge branch 'kernel-dot' of github.com:uob-hpc/gpu-stream into kernel-dot
|
2016-11-01 16:25:23 +00:00 |
|
James Price
|
829d21c1d6
|
Merge branch 'master' into kernel-dot
|
2016-11-01 16:20:27 +00:00 |
|
James Price
|
3045208aae
|
[RAJA] Parallel first touch
|
2016-11-01 16:18:43 +00:00 |
|
Tom Deakin
|
54a966f99a
|
Merge branch 'master' into kernel-dot
|
2016-10-31 19:04:59 +00:00 |
|
Tom Deakin
|
9acdba8b76
|
Add link to website
|
2016-10-31 18:31:33 +00:00 |
|
Tom Deakin
|
7150e047dd
|
Add link to website
|
2016-10-31 18:31:07 +00:00 |
|
James Price
|
dd296d2231
|
[SYCL] Prebuild dot kernel like the others
|
2016-10-28 21:15:12 +01:00 |
|
James Price
|
b09b90f6fc
|
Merge remote-tracking branch 'origin' into kernel-dot
|
2016-10-28 21:07:57 +01:00 |
|
James Price
|
cce8e78cae
|
Merge pull request #12 from Ruyk/master
Updated the SYCL Stream benchmark with latest ComputeCpp CE 0.1.1 Edition
|
2016-10-28 21:06:34 +01:00 |
|
Tom Deakin
|
4e39572966
|
Add O3 to Kokkos CPU
|
2016-10-27 15:41:23 +01:00 |
|
Tom Deakin
|
98962c4aee
|
Add Kokkos CPU Makefile
|
2016-10-27 15:19:16 +01:00 |
|
James Price
|
d7c48c5063
|
Slight tweak to dot config output to fix parsing scripts
|
2016-10-26 15:47:10 +01:00 |
|
James Price
|
cbf97dc7d9
|
[SYCL] Automatically determine dot NDRange config
|
2016-10-26 15:19:14 +01:00 |
|
James Price
|
21556af500
|
[OCL] Automatically determine dot NDRange config
|
2016-10-26 15:19:14 +01:00 |
|
Tom Deakin
|
bd0ba4dee9
|
Merge branch 'master' into kernel-dot
|
2016-10-26 14:37:15 +01:00 |
|
Tom Deakin
|
188165d729
|
[Kokkos] Add -O3 flag to Kokkos makefile
|
2016-10-26 14:37:00 +01:00 |
|
James Price
|
ed630e7dbc
|
[SYCL] Implement dot kernel
|
2016-10-25 16:39:23 +01:00 |
|
Tom Deakin
|
e5b67ac969
|
Version bump
|
2016-10-25 12:22:01 +01:00 |
|
James Price
|
dfc79eeb4d
|
Improve performance of CUDA dot implementation
|
2016-10-24 21:42:39 +01:00 |
|
James Price
|
d5482b74f4
|
Improve performance of OpenCL dot implementation
|
2016-10-24 21:26:09 +01:00 |
|
Tom Deakin
|
644ebc40ef
|
Verify reduction result to 8 decimal places
|
2016-10-24 16:22:35 +01:00 |
|
Tom Deakin
|
f32cf3bad3
|
Merge branch 'master' into kernel-dot
Conflicts:
main.cpp
|
2016-10-24 13:53:58 +01:00 |
|
Tom Deakin
|
963f3abfa0
|
Version bump
|
2016-10-24 13:52:03 +01:00 |
|
Tom Deakin
|
0bed614734
|
[OpenCL] Use global defined scalar value
|
2016-10-24 13:51:47 +01:00 |
|
Tom Deakin
|
ce5152fefd
|
[HIP] Use global defined scalar value
|
2016-10-24 13:45:07 +01:00 |
|
Tom Deakin
|
47128d47c0
|
[SYCL] Use global defined scalar value
|
2016-10-24 13:37:43 +01:00 |
|
Tom Deakin
|
b54d94b82d
|
[RAJA] Use global defined scalar value
|
2016-10-24 13:30:55 +01:00 |
|
Tom Deakin
|
d1bebf12d9
|
[Kokkos] Use global defined scalar value
|
2016-10-24 13:25:54 +01:00 |
|
Tom Deakin
|
ac6158fa31
|
[OpenACC] Use global defined scalar value
|
2016-10-24 13:24:11 +01:00 |
|
Tom Deakin
|
b120acaf87
|
[OMP45] Use global defined scalar value
|
2016-10-24 13:23:20 +01:00 |
|
Tom Deakin
|
7a81b63fbf
|
[OMP3] Use global defined scalar value
|
2016-10-24 13:21:47 +01:00 |
|
Tom Deakin
|
5b1e67f666
|
[CUDA] Use new value of scalar
|
2016-10-24 13:19:54 +01:00 |
|
Tom Deakin
|
5ae613519d
|
Change the value of scalar, and specify in a #define
|
2016-10-24 13:19:31 +01:00 |
|
James Price
|
cfc1aba2c0
|
Use WGSIZE=256 for dot for compatibility with AMD
|
2016-10-24 12:51:01 +01:00 |
|
James Price
|
c9b3d07b84
|
Fix OpenCL host code for dot kernel
Wrong number of blocks was being copied and summed, and the host sums
vector didn't have the correct size.
|
2016-10-24 12:49:58 +01:00 |
|
James Price
|
8a8f44b4ce
|
Fix CUDA host code for dot kernel
Wrong number of blocks was being copied and summed.
|
2016-10-24 12:47:25 +01:00 |
|
James Price
|
1e94870859
|
Fix verification of dot kernel
|
2016-10-24 12:47:01 +01:00 |
|
Tom Deakin
|
28c2660b52
|
Merge branch 'master' into kernel-dot
|
2016-10-24 12:21:16 +01:00 |
|
Tom Deakin
|
7408ab0366
|
Add RAJA dot kernel
|
2016-10-24 11:34:40 +01:00 |
|
James Price
|
fe41771bd4
|
Move HIP results into new directory structure
|
2016-10-21 12:57:31 +01:00 |
|
James Price
|
856a520687
|
Merge pull request #10 from sunway513/pull-request-HIP
Pull request hip
|
2016-10-21 12:53:50 +01:00 |
|
Tom Deakin
|
823e12708f
|
Add dot kernel to Kokkos
|
2016-10-21 10:58:26 +01:00 |
|
Tom Deakin
|
7203f96f6b
|
Restructure results directory
|
2016-10-20 17:35:22 +01:00 |
|
Ruyman Reyes
|
d562283cde
|
Minor performance tuning for SYCL benchmark
* Pre-compiling kernel binaries when setting up the benchmark,
like OpenCL equivalent
* Using the linear access syntax for buffers
|
2016-10-18 13:09:19 +01:00 |
|
Tom Deakin
|
d3b497a9ca
|
Add a CUDA dot kernel
|
2016-10-14 17:51:40 +01:00 |
|
Tom Deakin
|
2085cacea0
|
Add an OpenCL dot kernel
We have to name the kernel stream_dot (for example) because the
"dot" kernel already exists.
|
2016-10-14 17:07:55 +01:00 |
|
Tom Deakin
|
8a100f07b4
|
Add dot kernel to OpenMP 4.5 - tested with clang-ykt
|
2016-10-14 15:19:25 +01:00 |
|
Tom Deakin
|
abe423ac6b
|
Implement dot kernel in OpenMP 3
|
2016-10-14 15:05:06 +01:00 |
|
Tom Deakin
|
08fe695d51
|
Fix typo in main file
|
2016-10-14 15:04:04 +01:00 |
|
Tom Deakin
|
275bfb2066
|
Check result of the final reduction
|
2016-10-14 14:45:28 +01:00 |
|