Tom Deakin
e20aecd845
[SYCL 1.2.1] Add check for FP64 support
...
Fixes #98
2021-05-17 15:25:43 +01:00
Tom Deakin
bda9525b95
Add SYCL 1.2.1 nstream kernel
2021-02-02 12:29:03 +00:00
Tom Deakin
9a69d3d5d9
use signed array size for SYCL
2021-01-12 10:24:00 +00:00
Tom Deakin
0919d95aa4
[SYCL] Use SYCL runtime device discovery
...
Fixes #63
2020-05-11 17:16:47 +01:00
Tom Deakin
1d6da069b3
[SYCL] Pass explicit async_handler to queue constructor
2020-05-11 17:13:36 +01:00
Tom Deakin
7f1637d679
[SYCL] Remove unused program variable
2020-05-11 17:10:48 +01:00
Tom Deakin
8776901733
[SYCL] Use the cl::sycl::id parameter in the parallel_for kernels
...
The cl::sycl::item provides extra features for extracting global/local
ids which aren't required by the kernels.
This also means the kernels don't need to extract the id from the item.
2019-11-01 15:19:01 +00:00
GeorgeWeb
e657bfa897
based on perf comparison and discussions, the use of pre-built kernels is unnecessary in this case
2019-06-20 14:24:46 +01:00
GeorgeWeb
54737d87cb
enclosing computecpp specific code in macros, rather than removing it
2019-06-20 10:13:39 +01:00
GeorgeWeb
a2e53d6728
remove use of pre-built kernel in parallel_for as it is not (yet) conformant with the SYCL spec
2019-06-18 17:31:40 +01:00
Georgi Mirazchiyski
60817e25a1
fix deprecated use of get_global() and get_local()
2019-06-18 17:22:49 +01:00
Ruyman Reyes
63f32fcb51
Manually clearing the global device vector
...
The vector of devices is a global object whose destruction order is
undefined. On some platforms, the OpenCL library has been unloaded
before this destructor is hit, which causes a segmentation fault after
the program ends. By clearing the global vector of devices on
destruction of the OpenCL and SYCL Stream benchmarks we avoid the
problem.
2018-05-02 15:21:41 +01:00
Peter Žužek
6958f070b1
Removed host_buffer target
2018-03-09 11:07:18 +00:00
Anton Rey
b6d9795476
SYCL implementation adapted to 1.2.1 interface
2017-12-08 12:49:21 +00:00
Vanya Yaneva
b8f7a5427e
Added exception after printing the SYCL exceptions
2017-07-31 17:44:58 +01:00
Vanya Yaneva
9916a81bc5
Small formatting change
2017-07-27 17:39:13 +01:00
Vanya Yaneva
8c4af581d1
Reverted changes in kernel build
2017-07-27 17:36:12 +01:00
Vanya Yaneva
05fc803858
Updated SYCL makefile and kernel build
2017-07-25 13:49:08 +01:00
James Price
db01715806
[SYCL] Explicitly use first dimension of ranges
2016-11-18 00:35:36 +00:00
James Price
1e976ff150
[SYCL] Fix multiple template specializations
2016-11-18 00:14:46 +00:00
James Price
66776d5839
[SYCL] Use consistent syntax for indexing
2016-11-17 23:52:13 +00:00
James Price
02bff60870
[SYCL] Fix start index in reduction loop
2016-11-17 21:01:30 +00:00
Tom Deakin
d42bcd4675
Merge remote-tracking branch 'origin/init-arrays' into devel
2016-11-04 09:17:54 +00:00
James Price
7f4761ae52
Replace write_arrays with init_arrays
...
This allows each model to initialise their arrays with a parallel
approach, which yields the first touch required for good performance
on NUMA architectures.
2016-11-02 11:22:01 +00:00
James Price
dd296d2231
[SYCL] Prebuild dot kernel like the others
2016-10-28 21:15:12 +01:00
James Price
b09b90f6fc
Merge remote-tracking branch 'origin' into kernel-dot
2016-10-28 21:07:57 +01:00
James Price
cce8e78cae
Merge pull request #12 from Ruyk/master
...
Updated the SYCL Stream benchmark with latest ComputeCpp CE 0.1.1 Edition
2016-10-28 21:06:34 +01:00
James Price
d7c48c5063
Slight tweak to dot config output to fix parsing scripts
2016-10-26 15:47:10 +01:00
James Price
cbf97dc7d9
[SYCL] Automatically determine dot NDRange config
2016-10-26 15:19:14 +01:00
James Price
ed630e7dbc
[SYCL] Implement dot kernel
2016-10-25 16:39:23 +01:00
Tom Deakin
47128d47c0
[SYCL] Use global defined scalar value
2016-10-24 13:37:43 +01:00
Ruyman Reyes
d562283cde
Minor performance tuning for SYCL benchmark
...
* Pre-compiling kernel binaries when setting up the benchmark,
like the OpenCL equivalent
* Using the linear access syntax for buffers
2016-10-18 13:09:19 +01:00
James Price
74a4a3b0bd
[SYCL] Set WGSIZE to more sensible value for AMD Fiji
2016-07-07 09:40:16 +01:00
Tom Deakin
31cb567e21
Switch data from 1.0, 2.0 and 3.0 to 0.1, 0.2, and 0.3 resp.
...
Using integers for maths gets unstable past 38 iterations even
in double precision. Using the original values/10 is safe up to
the default 100 iterations.
2016-05-11 15:51:19 +01:00
Tom Deakin
81fa9e1922
Require SYCL array size to be multiple of WGSIZE
2016-05-11 12:23:21 +01:00
James Price
084d7417b9
[SYCL] Remove unneeded cl_device_info line
2016-05-09 15:20:11 +01:00
James Price
6d913bab4b
[SYCL] Actually use device_index to select device
2016-05-08 21:35:24 +01:00
James Price
3b3f6dfc26
[SYCL] Implement device list/selection functionality
2016-05-08 19:22:09 +01:00
James Price
54834e05f4
[SYCL] Use nd_range instead of range to specify work-group size
2016-05-06 22:41:10 +01:00
James Price
d4b3b3533c
Update SYCL version to work with ComputeCpp
...
Still needs proper CMake rules and kernel names need to be fixed for
multiple template instantiations.
2016-05-06 00:38:30 +01:00
James Price
b45f311e0d
Add missing SYCL source files
2016-05-03 14:48:35 +01:00