6 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Tom Lin | f2f7f3a3de | Fix bad dot group initialiser in HIP and CUDA | 2023-10-07 11:12:08 +01:00 |
| Tom Deakin | 9954b7d38c | Set CUDA dot kernel to use number of blocks relative to device property. This aligns with the approach implemented in other models (SYCL 1.2.1 and HIP). Cherry-picks the CUDA updates from lmeadows in #122. | 2023-10-06 17:56:42 +01:00 |
| Tom Lin | bd6bb09b5d | Fix MEM flag for CUDA, resolves #163 | 2023-09-25 01:39:23 +01:00 |
| Tom Lin | 3dcafd1af1 | Fix max element guard overflow for CUDA, resolves #136 | 2023-09-22 02:31:14 +01:00 |
| Tom Deakin | a35c7b4bea | Fix CUDA memory check for large array sizes. Closes #123. | 2022-02-16 14:33:17 +00:00 |
| Tom Lin | 5318404249 | Use ./src instead of ./cpp. Create subdir for each cpp-based implementation. | 2021-05-26 17:46:07 +01:00 |