Tom Deakin | dafc63030f | Rename to BabelStream | 2017-04-08 12:16:29 +01:00
Tom Deakin | 9c08fdd184 | Minor version bump | 2017-04-06 10:38:48 +01:00
Tom Deakin | 50e3a1970f | Add RAJA CUDA build instructions | 2017-04-06 10:38:03 +01:00
Tom Deakin | c534600d04 | [RAJA] Use Index_type for iterator index type instead of hardcoding int | 2017-04-06 10:36:01 +01:00
Tom Deakin | 3331f62f42 | Add RAJA build instructions to README | 2017-04-06 10:16:34 +01:00
Tom Deakin | 5f9b288570 | [RAJA] Declare pointers using RAJA_RESTRICT | 2017-04-06 10:15:11 +01:00
Tom Deakin | 1bd4adfe7b | [RAJA] Align the memory to 2MB pages | 2017-04-06 10:14:51 +01:00
Tom Deakin | 1eb75f034a | [RAJA] Use xHost and streaming stores with the Intel compiler | 2017-04-06 10:02:25 +01:00
Tom Deakin | d7a93be739 | [Kokkos] Add a COMPILER option to Makefile, which turns on streaming stores for Intel | 2017-04-05 22:23:27 +01:00
Tom Deakin | d7e38c1ca9 | Add Kokkos build instructions to README | 2017-04-05 22:09:58 +01:00
Tom Deakin | d9dfc3f552 | [Kokkos] Use long for iterator variable | 2017-04-05 21:57:55 +01:00
Peter Steinbach | 04589d4d4f | added fixed bug in dot product | 2017-04-03 14:16:25 +02:00
Peter Steinbach | fd35d895d9 | added optimized flags to CXXFLAGS | 2017-04-03 14:16:06 +02:00
Peter Steinbach | 55f467e24d | moved experimental dot product implementation of dot_impl which is build only if -DHC_DEVELOP is given | 2017-03-27 14:22:56 +02:00
Peter Steinbach | 2882383324 | Merge remote-tracking branch 'upstream/master' into rocm_hc_support | 2017-03-24 15:46:41 +01:00
Peter Steinbach | 0e45f86588 | added cascaded reduction based on C++AMP book | 2017-03-24 15:19:48 +01:00
Peter Steinbach | 96bc566ce1 | added debug flag | 2017-03-24 15:19:22 +01:00
Peter Steinbach | 0535cbcd5b | renamed variables and introduced views | 2017-03-23 15:55:23 +01:00
Tom Deakin | bf57cf578d | [CUDA] Merge pull request #28 from psteinb/extra_cuda_make_variable: Allow specifying compiler and flags for build | 2017-03-17 14:22:37 +00:00
Peter Steinbach | d8cb7494e0 | pulled -O3 out into CXXFLAGS, refactored CUDA compiler into CUDA_CXX make variable to cope with clang as CUDA compiler as well | 2017-03-17 15:18:13 +01:00
James Price | 703eb945a2 | [OpenMP] Align memory (2MB by default) | 2017-03-13 17:17:20 +00:00
James Price | 4f288ddc3d | [OpenMP] Add -qopt-streaming-stores for Intel | 2017-03-13 17:15:10 +00:00
Peter Steinbach | 8c7a801a84 | put -O3 into CXXFLAGS to comply with OpenMP.make | 2017-03-13 15:22:26 +01:00
Peter Steinbach | ea12f2a9a1 | added EXTRA_FLAGS variable to CUDA Makefile to provide the freedom to specify debug flags or gencode flags | 2017-03-13 14:41:16 +01:00
James Price | 94e0900377 | Use static shared memory in dot for CUDA and HIP | 2017-02-28 13:24:45 +00:00
Tom Deakin | e7a619c63c | Merge pull request #27 from psteinb/fix_minus_for_an_equal: replaced - for = so that assignment takes place | 2017-02-28 12:46:43 +00:00
Peter Steinbach | e570b458a6 | replaced - for = so that assignment takes place | 2017-02-28 13:43:57 +01:00
James Price | 8a47b72764 | Merge pull request #26 from psteinb/fix_sharedmem_hip: Fix sharedmem hip | 2017-02-28 12:36:37 +00:00
Peter Steinbach | 58773a79b7 | removed extra lines introduced by hipify, removed obsolete commented code | 2017-02-28 13:33:21 +01:00
Peter Steinbach | 3fc0b57a2c | do initial assignment through parallel_for_each | 2017-02-28 13:31:37 +01:00
Peter Steinbach | ceada6922f | proper declaration of tb_sum with HIP_DYNAMIC_SHARED macro | 2017-02-28 10:07:48 +01:00
Peter Steinbach | 350a151c3b | removed CUDA_PATH sentinel from HIP.make | 2017-02-28 10:04:36 +01:00
Peter Steinbach | ee7cd066ac | renamed HIPStream implementation | 2017-02-28 10:03:23 +01:00
Peter Steinbach | 383c5a3ae7 | all required operations implemented, errors are too large | 2017-02-28 10:00:44 +01:00
Peter Steinbach | f7af8ebc91 | added printf style error messages to nail down memory access problems | 2017-02-27 17:01:38 +01:00
Peter Steinbach | 0fc6722684 | added Makefile and code for HC | 2017-02-27 16:35:03 +01:00
Peter Steinbach | 62ea5e3ed6 | Merge remote-tracking branch 'upstream/master' into bare_hc (Conflicts: CMakeLists.txt) | 2017-02-27 14:35:11 +01:00
Tom Deakin | cc90cefeeb | Minor version bump to signal build system update | 2017-02-25 14:14:59 +00:00
Tom Deakin | 4d24e2341f | Merge pull request #24 from UoB-HPC/bugfix/build: Simplify build system | 2017-02-25 14:14:29 +00:00
James Price | 2416727239 | Refactor compiler flag handling in RAJA Makefile | 2017-02-24 22:28:16 +00:00
Tom Deakin | 050a27ca83 | Add XL compiler support to OpenMP and RAJA makefiles | 2017-02-24 17:37:30 +00:00
James Price | dfe5503cba | Allow user to override CXX in OpenCL.make | 2017-02-24 09:33:59 -06:00
James Price | 569cfa1d31 | Make Cray OpenMP flag non-empty to fix error | 2017-02-24 09:02:51 -06:00
James Price | a7d7998326 | Use -framework OpenCL on Darwin | 2017-02-24 13:40:54 +00:00
James Price | 1aec057e48 | Add help messages to OpenMP.make and refactor | 2017-02-24 13:32:59 +00:00
James Price | 8fee86a232 | Add compiler help to OpenCL.make | 2017-02-24 13:17:12 +00:00
James Price | 82de818855 | Add support for Intel as host compiler for OpenCL | 2017-02-24 13:14:13 +00:00
James Price | 6008f8c536 | Add intermediate objects to OpenACC clean rule. PGI creates these, even though we don't ask for them. | 2017-02-24 13:14:13 +00:00
Tom Deakin | c470b88dee | Add compiler help text to OpenACC | 2017-02-24 13:13:08 +00:00
Tom Deakin | 3be4ebc1a2 | Add help messages to RAJA Makefile | 2017-02-24 13:11:07 +00:00