BabelStream

Author	SHA1	Message	Date
James Price	1e976ff150	[SYCL] Fix multiple template specializations	2016-11-18 00:14:46 +00:00
Tom Deakin	d42bcd4675	Merge remote-tracking branch 'origin/init-arrays' into devel	2016-11-04 09:17:54 +00:00
James Price	7f4761ae52	Replace write_arrays with init_arrays. This allows each model to initialise their arrays with a parallel approach, which yields the first touch required for good performance on NUMA architectures.	2016-11-02 11:22:01 +00:00
Tom Deakin	644ebc40ef	Verify reduction result to 8 decimal places	2016-10-24 16:22:35 +01:00
Tom Deakin	f32cf3bad3	Merge branch 'master' into kernel-dot. Conflicts: main.cpp	2016-10-24 13:53:58 +01:00
Tom Deakin	5ae613519d	Change the value of scalar, and specify in a #define	2016-10-24 13:19:31 +01:00
James Price	1e94870859	Fix verification of dot kernel	2016-10-24 12:47:01 +01:00
Tom Deakin	28c2660b52	Merge branch 'master' into kernel-dot	2016-10-24 12:21:16 +01:00
Tom Deakin	08fe695d51	Fix typo in main file	2016-10-14 15:04:04 +01:00
Tom Deakin	275bfb2066	Check result of the final reduction	2016-10-14 14:45:28 +01:00
Tom Deakin	04ca357159	Call the Dot kernel and print out results	2016-10-14 14:40:28 +01:00
pensun	a1f9d9ece7	Add support of HIP version of GPU-STREAM. This commit was tested with HIP developer preview branch.	2016-09-05 23:41:01 -05:00
Tom Deakin	d420032c66	Remove warning about iteration count when using floats as new data values work for 100 iterations	2016-05-11 17:15:43 +01:00
Tom Deakin	31cb567e21	Switch data from 1.0, 2.0 and 3.0 to 0.1, 0.2, and 0.3 resp. Using integers for maths gets unstable past 38 iterations even in double precision. Using the original values/10 is safe up to the default 100 iterations.	2016-05-11 15:51:19 +01:00
Tom Deakin	55a858e0c0	Use 2^25 as default size because 2^26 gives too many thread blocks for CUDA	2016-05-11 15:43:52 +01:00
Tom Deakin	eb10c716f2	First attempt at OpenMP 4.5	2016-05-11 15:08:08 +01:00
Tom Deakin	207fd8f784	Default to power of two array size	2016-05-11 12:04:19 +01:00
Tom Deakin	0f8f191d0e	Require number of iterations to be at least 2	2016-05-11 11:55:33 +01:00
Tom Deakin	75ef78495c	Add print out of number of iterations	2016-05-11 11:53:51 +01:00
Tom Deakin	3227e5dbf0	Print out data type for float or double	2016-05-11 11:52:17 +01:00
Tom Deakin	5c8b07262b	Default to 100 iterations to get over any warm up times	2016-05-11 11:49:44 +01:00
Matthew Martineau	894829cb05	Adjusted the Kokkos implementation to fix view initialisation, and store local copies of views for lambda scoping	2016-05-06 21:02:44 +01:00
Matthew Martineau	57189e7ca5	Merge branch 'refactor' of https://github.com/UoB-hpc/gpu-stream into refactor	2016-05-06 10:54:18 +01:00
Matthew Martineau	3b266b8266	Fix for namespace collision with #define RAJA	2016-05-06 10:53:12 +01:00
James Price	d4b3b3533c	Update SYCL version to work with ComputeCpp. Still needs proper CMake rules and kernel names need to be fixed for multiple template instantiations.	2016-05-06 00:38:30 +01:00
Matthew Martineau	0a738efa54	Merging in changes from trunk	2016-05-05 17:23:47 +01:00
Matthew Martineau	7c28a6386b	Added the Kokkos and RAJA implementations	2016-05-05 17:22:29 +01:00
Tom Deakin	f0afa0c1e4	Add reference OpenMP 3.0 version	2016-05-04 10:41:41 +01:00
Tom Deakin	0b0de4e0c3	Implement the OpenACC device string functions, and device selector	2016-05-03 14:50:09 +01:00
James Price	da4f918788	Add initial SYCL implementation	2016-05-03 14:45:13 +01:00
Tom Deakin	1a38b18954	Add OpenACC version	2016-05-03 14:36:08 +01:00
Tom Deakin	530b2adda2	Add License text to all files	2016-05-03 12:32:03 +01:00
Tom Deakin	a355acf2ee	Move source files to top level directory	2016-05-03 11:43:25 +01:00

83 Commits