Tom Deakin
1336400311
[SYCL 2020] Use unique pointer for queue constructor
2021-02-08 11:34:37 +00:00
Tom Deakin
1db9a6b648
[SYCL 2020] Use smart pointers instead of raw pointers
2021-02-08 11:33:03 +00:00
Tom Deakin
1517101ceb
[SYCL 2020] Remove work-group heuristic for reduction as unused
2021-02-08 11:07:38 +00:00
Tom Deakin
b825df0074
[SYCL 2020] Declare reduction inline to reduce one variable name
2021-01-21 18:18:35 +00:00
Tom Deakin
4726f3f0f1
[SYCL 2020] Specify no_init property when initialising buffers
2021-01-21 10:39:13 +00:00
Tom Deakin
42c8954789
[SYCL 2020] use new reduction for dot kernel
2021-01-21 10:38:12 +00:00
Tom Deakin
b611db8cab
[SYCL 2020] Use host accessor constructors
2021-01-12 11:58:14 +00:00
Tom Deakin
282fb1e5e3
[SYCL 2020] Use accessor constructors using CTAD and Tags instead of get_access
2021-01-12 11:54:39 +00:00
Tom Deakin
8f5357011a
[SYCL 2020] Use sycl::id for init kernel
2021-01-12 11:16:46 +00:00
Tom Deakin
501c61cfbd
[SYCL 2020] update namespace from cl::sycl to sycl::
...
Also remove the using namespace directive to make it clear what comes from SYCL
2021-01-12 11:14:43 +00:00
Tom Deakin
e8faf6843d
Remove old comment
2021-01-12 11:01:11 +00:00
Tom Deakin
8c72b52f16
[SYCL 2020] Use unnamed lambdas
2021-01-12 11:00:54 +00:00
Tom Deakin
9a69d3d5d9
use signed array size for SYCL
2021-01-12 10:24:00 +00:00
Tom Deakin
0919d95aa4
[SYCL] Use SYCL runtime device discovery
...
Fixes #63
2020-05-11 17:16:47 +01:00
Tom Deakin
1d6da069b3
[SYCL] Pass explicit async_handler to queue constructor
2020-05-11 17:13:36 +01:00
Tom Deakin
7f1637d679
[SYCL] Remove unused program variable
2020-05-11 17:10:48 +01:00
Tom Deakin
8776901733
[SYCL] Use the cl::sycl::id parameter in the parallel_for kernels
...
The cl::sycl::item provides extra features for extracting global/local
ids which aren't required by the kernels.
This also means the kernels don't need to extract the id from the item.
2019-11-01 15:19:01 +00:00
GeorgeWeb
e657bfa897
based on perf comparison, and discussions, the use of pre-built kernels is unnecessary in this case
2019-06-20 14:24:46 +01:00
GeorgeWeb
54737d87cb
enclosing computecpp specific code in macros, rather than removing it
2019-06-20 10:13:39 +01:00
GeorgeWeb
a2e53d6728
remove use of pre-built kernel in parallel_for as it is not conformant with the SYCL spec. (yet)
2019-06-18 17:31:40 +01:00
Georgi Mirazchiyski
60817e25a1
fix deprecated use of get_global() and get_local()
2019-06-18 17:22:49 +01:00
Ruyman Reyes
63f32fcb51
Manually clearing the global device vector
...
The vector of devices is a global object whose destruction order is
undefined. On some platforms, the OpenCL library has been unloaded
before this destructor is hit, which causes a segmentation fault after
the program ends. By clearing the global vector of devices on
destruction of the OpenCL and SYCL Stream benchmarks we avoid the
problem.
2018-05-02 15:21:41 +01:00
Peter Žužek
6958f070b1
Removed host_buffer target
2018-03-09 11:07:18 +00:00
Anton Rey
b6d9795476
SYCL implementation adapted to 1.2.1 interface
2017-12-08 12:49:21 +00:00
Vanya Yaneva
b8f7a5427e
Added exception after printing the SYCL exceptions
2017-07-31 17:44:58 +01:00
Vanya Yaneva
9916a81bc5
Small formatting change
2017-07-27 17:39:13 +01:00
Vanya Yaneva
8c4af581d1
Reverted changes in kernel build
2017-07-27 17:36:12 +01:00
Vanya Yaneva
05fc803858
Updated SYCL makefile and kernel build
2017-07-25 13:49:08 +01:00
James Price
db01715806
[SYCL] Explicitly use first dimension of ranges
2016-11-18 00:35:36 +00:00
James Price
1e976ff150
[SYCL] Fix multiple template specializations
2016-11-18 00:14:46 +00:00
James Price
66776d5839
[SYCL] Use consistent syntax for indexing
2016-11-17 23:52:13 +00:00
James Price
02bff60870
[SYCL] Fix start index in reduction loop
2016-11-17 21:01:30 +00:00
Tom Deakin
d42bcd4675
Merge remote-tracking branch 'origin/init-arrays' into devel
2016-11-04 09:17:54 +00:00
James Price
7f4761ae52
Replace write_arrays with init_arrays
...
This allows each model to initialise their arrays with a parallel
approach, which yields the first touch required for good performance
on NUMA architectures.
2016-11-02 11:22:01 +00:00
James Price
dd296d2231
[SYCL] Prebuild dot kernel like the others
2016-10-28 21:15:12 +01:00
James Price
b09b90f6fc
Merge remote-tracking branch 'origin' into kernel-dot
2016-10-28 21:07:57 +01:00
James Price
cce8e78cae
Merge pull request #12 from Ruyk/master
...
Updated the SYCL Stream benchmark with latest ComputeCpp CE 0.1.1 Edition
2016-10-28 21:06:34 +01:00
James Price
d7c48c5063
Slight tweak to dot config output to fix parsing scripts
2016-10-26 15:47:10 +01:00
James Price
cbf97dc7d9
[SYCL] Automatically determine dot NDRange config
2016-10-26 15:19:14 +01:00
James Price
ed630e7dbc
[SYCL] Implement dot kernel
2016-10-25 16:39:23 +01:00
Tom Deakin
47128d47c0
[SYCL] Use global defined scalar value
2016-10-24 13:37:43 +01:00
Ruyman Reyes
d562283cde
Minor performance tuning for SYCL benchmark
...
* Pre-compiling kernel binaries when setting up the benchmark,
like OpenCL equivalent
* Using the linear access syntax for buffers
2016-10-18 13:09:19 +01:00
James Price
74a4a3b0bd
[SYCL] Set WGSIZE to more sensible value for AMD Fiji
2016-07-07 09:40:16 +01:00
Tom Deakin
31cb567e21
Switch data from 1.0, 2.0 and 3.0 to 0.1, 0.2, and 0.3 resp.
...
Using integers for maths gets unstable past 38 iterations even
in double precision. Using the original values/10 is safe up to
the default 100 iterations.
2016-05-11 15:51:19 +01:00
Tom Deakin
81fa9e1922
Require SYCL array size to be multiple of WGSIZE
2016-05-11 12:23:21 +01:00
James Price
084d7417b9
[SYCL] Remove unneeded cl_device_info line
2016-05-09 15:20:11 +01:00
James Price
6d913bab4b
[SYCL] Actually use device_index to select device
2016-05-08 21:35:24 +01:00
James Price
3b3f6dfc26
[SYCL] Implement device list/selection functionality
2016-05-08 19:22:09 +01:00
James Price
54834e05f4
[SYCL] Use nd_range instead of range to specify work-group size
2016-05-06 22:41:10 +01:00
James Price
d4b3b3533c
Update SYCL version to work with ComputeCpp
...
Still needs proper CMake rules and kernel names need to be fixed for
multiple template instantiations.
2016-05-06 00:38:30 +01:00