BabelStream

Author	SHA1	Message	Date
Tom Deakin	05e3e5a127	Add CUDA nstream kernel	2021-02-02 12:32:33 +00:00
Tom Deakin	693a7e7478	use signed array size for CUDA	2021-01-12 10:20:44 +00:00
Tom Deakin	3bd65a0716	Merge branch 'master' into cuda-memory	2017-05-11 11:28:33 +01:00
James Price	94e0900377	Use static shared memory in dot for CUDA and HIP	2017-02-28 13:24:45 +00:00
Tom Deakin	8d66a27131	[CUDA] If using managed memory, use device pointer for host reduction	2016-12-19 05:08:19 -07:00
Tom Deakin	62860284b2	[CUDA] Add Managed memory and Page fault options. To use managed memory, compile the code defining MANAGED. To use CUDA 8 page-fault memory, compile the code defining PAGEFAULT.	2016-12-19 05:00:15 -07:00
Tom Deakin	b9c514fd9b	[CUDA] Free the sum device buffer	2016-12-19 11:42:45 +00:00
Tom Deakin	d42bcd4675	Merge remote-tracking branch 'origin/init-arrays' into devel	2016-11-04 09:17:54 +00:00
James Price	7f4761ae52	Replace write_arrays with init_arrays. This allows each model to initialise their arrays with a parallel approach, which yields the first touch required for good performance on NUMA architectures.	2016-11-02 11:22:01 +00:00
James Price	dfc79eeb4d	Improve performance of CUDA dot implementation	2016-10-24 21:42:39 +01:00
Tom Deakin	f32cf3bad3	Merge branch 'master' into kernel-dot. Conflicts: main.cpp	2016-10-24 13:53:58 +01:00
Tom Deakin	5b1e67f666	[CUDA] Use new value of scalar	2016-10-24 13:19:54 +01:00
James Price	8a8f44b4ce	Fix CUDA host code for dot kernel. Wrong number of blocks was being copied and summed.	2016-10-24 12:47:25 +01:00
Tom Deakin	d3b497a9ca	Add a CUDA dot kernel	2016-10-14 17:51:40 +01:00
James Price	f94e36f320	[CUDA] Fix device name output (OpenCL->CUDA)	2016-07-06 17:16:35 +01:00
Tom Deakin	31cb567e21	Switch data from 1.0, 2.0 and 3.0 to 0.1, 0.2, and 0.3 resp. Using integers for maths gets unstable past 38 iterations even in double precision. Using the original values/10 is safe up to the default 100 iterations.	2016-05-11 15:51:19 +01:00
Tom Deakin	2462023ed9	Set thread block size in CUDA with a #define, and check that array size is multiple of it	2016-05-11 12:21:29 +01:00
Tom Deakin	530b2adda2	Add License text to all files	2016-05-03 12:32:03 +01:00
Tom Deakin	a355acf2ee	Move source files to top level directory	2016-05-03 11:43:25 +01:00

19 Commits