| Author | Commit | Message | Date |
|---|---|---|---|
| Tom Lin | f2f7f3a3de | Fix bad dot group initialiser in HIP and CUDA | 2023-10-07 11:12:08 +01:00 |
| Tom Deakin | 9954b7d38c | Set CUDA dot kernel to use number of blocks relative to device property. This aligns with the approach implemented in other models (SYCL 1.2.1 and HIP). Cherry-picks the CUDA updates from lmeadows in #122 | 2023-10-06 17:56:42 +01:00 |
| Tom Lin | bd6bb09b5d | Fix MEM flag for CUDA, resolves #163 | 2023-09-25 01:39:23 +01:00 |
| Tom Lin | 3dcafd1af1 | Fix max element guard overflow for CUDA, resolves #136 | 2023-09-22 02:31:14 +01:00 |
| Tom Deakin | a35c7b4bea | Fix CUDA memory check for large array sizes. Closes #123 | 2022-02-16 14:33:17 +00:00 |
| Tom Lin | 5318404249 | Use ./src instead of ./cpp. Create subdir for each cpp-based implementation | 2021-05-26 17:46:07 +01:00 |