BabelStream

Author	SHA1	Message	Date
Tom Lin	145e2a0649	Merge branch 'fix_num_type' into develop	2023-10-07 15:09:44 +01:00
Tom Lin	f2f7f3a3de	Fix bad dot group initialiser in HIP and CUDA	2023-10-07 11:12:08 +01:00
Tom Lin	ffae3ba83f	Fix CMAKE_CUDA_FLAGS, resolves #166	2023-10-07 09:45:16 +01:00
Tom Deakin	9954b7d38c	Set CUDA dot kernel to use number of blocks relative to device property. This aligns with the approach implemented in other models (SYCL 1.2.1 and HIP). Cherry-picks the CUDA updates from lmeadows in #122.	2023-10-06 17:56:42 +01:00
Tom Lin	bd6bb09b5d	Fix MEM flag for CUDA, resolves #163	2023-09-25 01:39:23 +01:00
Tom Lin	3dcafd1af1	Fix max element guard overflow for CUDA, resolves #136	2023-09-22 02:31:14 +01:00
Tom Deakin	092ee67764	Change CUDA DOT thread-blocks to 1024. This improves the performance on Ampere (A100) GPUs. Fixes #137.	2023-06-12 15:51:13 +01:00
Tom Deakin	a35c7b4bea	Fix CUDA memory check for large array sizes. Closes #123.	2022-02-16 14:33:17 +00:00
Tom Lin	f5fe55c204	[WIP] Drop CL headers and Makefiles; update README; move new models to /src	2021-11-30 18:22:55 +00:00
Tom Lin	5318404249	Use ./src instead of ./cpp; create subdir for each cpp-based implementation	2021-05-26 17:46:07 +01:00