Commit Graph

791 Commits

Author SHA1 Message Date
Tom Lin
dc42388df3 Fix CXX recognition issues for rocThrust
Fix CI check for min CMake version on CUDA Thrust
Temporarily disable CUDA Thrust w/ TBB for now
2021-11-12 03:25:18 +00:00
Tom Lin
fe4007b446 Fix CI ROCm quirks
Fix CI CUDA path
2021-11-12 02:26:31 +00:00
Tom Lin
0d55a7261b Fix CI not installing rocThrust
Fix CI CUDA flag version
2021-11-12 00:14:07 +00:00
Tom Lin
a463e88895 Fix CI rocThrust build variables
Fix CI CUDA cmake module include path
Bump CI NVHPC version
2021-11-11 23:50:27 +00:00
Tom Lin
c2f75b90b3 Fix CI NVHPC path
Fix CI ROCm install sources
2021-11-11 23:30:04 +00:00
Tom Lin
a66696d971 Initial Thrust implementation 2021-11-11 23:11:04 +00:00
Tom Lin
78b52a496c Use @simd instead of @fastmath for CPU reduction 2021-08-28 11:39:08 +01:00
Tom Lin
41f1767365 Pause GC during benchmark to reduce noise 2021-08-28 11:16:19 +01:00
Tom Lin
13cb8ffced Use custom static reduction for CPU 2021-08-28 11:10:49 +01:00
Tom Lin
4853457dca Add type annotation for all kernels
Update dependencies
2021-08-27 14:04:58 +01:00
Tom Lin
c445b64690 Address CUDA comments
Drop soft=false for AMDGPU as this option was removed
Update dependencies
2021-08-18 02:00:50 +01:00
Tom Lin
bb271dd046 Update PlainStream with context 2021-08-18 01:59:06 +01:00
Tom Lin
a26699c5b5 Add oneAPI and KA implementation
Isolate projects to avoid transitive dependency
Add parameter for passing devices
Incorporate further reviews
Update all dependencies
2021-08-17 14:28:47 +01:00
Tom Deakin
8f9ca7baa7 update references in README 2021-07-28 10:37:25 +01:00
Tom Deakin
b4d01160cb CITATION cannot yet handle external references 2021-07-28 10:31:39 +01:00
Tom Deakin
d8eba00132 Update CITATION.cff 2021-07-28 10:30:51 +01:00
Tom Deakin
064743299b Update CITATION.cff 2021-07-28 10:30:18 +01:00
Tom Deakin
b766b0c707 Update CITATION.cff 2021-07-28 10:29:56 +01:00
Tom Deakin
537ad3650e Add CITATION file 2021-07-28 10:28:34 +01:00
Tom Lin
867a8a32ee Use older fmt-maven-plugin for Java 8 compat. 2021-07-01 06:05:10 +01:00
Tom Lin
82084d407b +x for mvnw executable 2021-07-01 06:01:29 +01:00
Tom Lin
ab41475f10 Initial Java implementation 2021-07-01 05:59:48 +01:00
Tom Lin
7c1e04a42b Add comment about blockIdx/workgroupIdx in Julia 2021-06-30 19:31:42 +01:00
Tom Lin
2e957d3f60 Inline blocks in CUDAStream 2021-06-30 19:20:37 +01:00
Tom Lin
418315543c Use -p 2 and no arg for JuliaStream in CI 2021-06-30 19:09:37 +01:00
Tom Lin
d675875dcd Switch back to -p for DistributedStream 2021-06-30 19:03:39 +01:00
Tom Lin
fe180656d1 Merge branch 'main' into julia 2021-06-30 18:44:17 +01:00
Tom Lin
4e6c56729b Inline AMDGPU's hard_wait
Show the selected implementation and not a constant "threaded"
2021-06-30 18:09:54 +01:00
Tom Lin
6fe81e1955 Update CUDA to 11.3 for CI script 2021-06-30 16:31:14 +01:00
Tom Lin
ce7f013a8e Update NVHPC to 2.15 w/ CUDA 11.3 2021-06-30 16:04:27 +01:00
Tom Lin
cd367c7da3 Mirror Fujitsu flags for CMake 2021-06-29 17:53:32 +01:00
Tom Deakin
fa6433bab1 update changelog 2021-06-25 09:45:38 -05:00
Tom Deakin
eba2e79eab [OpenMP] Add Fujitsu compiler flags
For best performance on the A64FX with the Fujitsu compiler,
the array pointers also need to be labeled __restrict and const
as appropriate.

Closes #94.
2021-06-25 09:44:16 -05:00
Tom Lin
e3bd58378f Don't debug print args 2021-06-16 01:16:10 +01:00
Tom Lin
ce4d6cfbfb Add integration tests and CI
Fix wrong nstream in plain_stream
2021-06-16 01:11:40 +01:00
Tom Lin
fdb2c181cc Add Crossbeam implementation
Add rustfmt and use target-cpu=native
Add option for libc malloc, basic thread pinning, touch-free allocation
Split modules
2021-06-15 23:13:14 +01:00
Tom Lin
c70a5da45b Merge branch 'main' into rust 2021-06-10 05:37:03 +01:00
Tom Lin
d799535c96 Larger arraysize for CI 2021-06-10 05:06:48 +01:00
Tom Lin
c5ad3f34d9 Drop -p N for DistributedStream.jl CI 2021-06-10 05:01:24 +01:00
Tom Lin
2cf8ca5f8c Use addprocs() for DistributedStream 2021-06-10 04:57:52 +01:00
Tom Lin
63f471f880 set pwd to JuliaStream.jl for CI run 2021-06-10 04:33:12 +01:00
Tom Lin
b3efa6af67 Initial Julia implementation 2021-06-10 04:20:40 +01:00
Tom Deakin
5d9e408a06 [SYCL 2020] Make array size a size_t 2021-06-04 16:42:49 +00:00
Tom Deakin
25e021caa3 Update CHANGELOG.md 2021-06-03 16:08:14 +01:00
Tom Deakin
dd90598e20 Merge pull request #105 from UoB-HPC/tbb
Initial TBB implementation
2021-06-03 16:07:38 +01:00
Tom Lin
0e3727d8f8 Make partitioner a compile option
Inline all abstractions
Add intel compilers for Make
2021-06-03 13:43:12 +01:00
Tom Lin
0867115d8d Remove references to oneapi/tbb.h 2021-05-27 10:51:45 +01:00
Tom Lin
d3b676cb37 Include CL_MEM_CHANNEL_INTEL directly to avoid header precedence issues 2021-05-27 10:47:46 +01:00
Tom Lin
7a130a59bc Don't tie implementation to oneTBB specific headers
Fix wrong TBB_ROOT detection
2021-05-27 10:23:06 +01:00
Tom Lin
4d00a8699e Don't point to the CL dir for SYCL 2021-05-27 09:41:41 +01:00