Tom Deakin
683b8fcf88
update README
2021-02-18 13:51:24 +00:00
Tom Deakin
13dfff45a6
Merge pull request #90 from UoB-HPC/generalise-run
...
Generalise run function
2021-02-18 13:18:38 +00:00
Tom Deakin
ba47571ab2
update changelog
2021-02-18 12:37:58 +00:00
Tom Deakin
3deb9f8eff
Generalise the run functions to share construction of models and check vectors
...
An enum is added and set by the command line options to choose the running mode:
all 5 kernels or triad only.
There is now only one run function, which calls the constructor of each
model implementation and then calls another routine to run the kernels.
The output is then determined by the enum value.
2021-02-18 12:35:12 +00:00
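The enum-driven design described in the commit above can be sketched as follows. This is a minimal illustration, not the repository's code: the names `RunMode`, `select_mode`, `kernels_for`, the `--triad-only` flag spelling, and the kernel labels are all assumptions made for the example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical enum: the commit only states that an enum selects between
// "all 5 kernels" and "triad only".
enum class RunMode { All, TriadOnly };

// Choose the mode from command line arguments (flag name is an assumption).
RunMode select_mode(const std::vector<std::string>& args) {
  for (const auto& arg : args) {
    if (arg == "--triad-only") return RunMode::TriadOnly;
  }
  return RunMode::All;
}

// One shared entry point decides which kernels run; the same enum value
// can later decide which output format to print.
std::vector<std::string> kernels_for(RunMode mode) {
  switch (mode) {
    case RunMode::TriadOnly:
      return {"Triad"};
    case RunMode::All:
      return {"Copy", "Mul", "Add", "Triad", "Dot"};
  }
  assert(false && "unhandled RunMode");
  return {};
}
```

Keeping the mode in one enum means model construction and check-vector handling are written once, and only the kernel loop and the reporting branch on the mode.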
Tom Deakin
018d8a4510
[OpenCL] Remove dot kernel object in destructor
2021-02-02 15:45:54 +00:00
Tom Deakin
30231575cb
Tidy CUDA memory mode Makefile
2021-02-02 12:33:18 +00:00
Tom Deakin
5d697fdfe9
Add missing OpenMP flag to Intel CPU builds
2021-02-02 11:49:16 +00:00
Tom Deakin
f99f8d35d9
Revert "Add nstream kernel from PRK"
...
This reverts commit 1e94a41f3c .
2021-02-02 11:25:27 +00:00
Tom Deakin
877f820282
Revert "Update README with nstream citations"
...
This reverts commit cb0c345ad5 .
2021-02-02 11:25:14 +00:00
Tom Deakin
cb0c345ad5
Update README with nstream citations
2021-02-02 11:24:41 +00:00
Tom Deakin
1e94a41f3c
Add nstream kernel from PRK
...
PRK has an nstream kernel, which is Triad with a += update.
This means there are 3 reads and a write, which is a higher
read/write ratio. In addition, non-temporal stores for the
write on CPUs will not be beneficial, and so compilers should
take care to emit these for the other kernels, but not for this one.
2021-02-01 17:41:30 +00:00
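To make the difference between the two kernels concrete, here is a minimal serial sketch. The function names and the use of `std::vector<double>` are assumptions for illustration; the commit itself only describes nstream as "Triad with a += update".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Triad: a[i] = b[i] + scalar * c[i]
// Reads b and c, writes a (2 reads, 1 write per element).
void triad(std::vector<double>& a, const std::vector<double>& b,
           const std::vector<double>& c, double scalar) {
  assert(a.size() == b.size() && b.size() == c.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    a[i] = b[i] + scalar * c[i];
  }
}

// nstream: a[i] += b[i] + scalar * c[i]
// Reads a, b and c, writes a (3 reads, 1 write per element). Because a is
// read before it is written, a non-temporal store for a brings no benefit.
void nstream(std::vector<double>& a, const std::vector<double>& b,
             const std::vector<double>& c, double scalar) {
  assert(a.size() == b.size() && b.size() == c.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    a[i] += b[i] + scalar * c[i];
  }
}
```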
Tom Deakin
435a104f6e
Check input array size is positive
2021-01-12 15:30:41 +00:00
Tom Deakin
903eb40d19
Add parseInt function for parsing CLI arguments for array size
2021-01-12 10:28:01 +00:00
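The two commits above (parsing the array size and checking it is positive) could look roughly like the sketch below. It is an assumption-laden illustration built on `std::strtol`; the helper names `parse_int` and `parse_array_size` are hypothetical and the repository's actual function may differ.

```cpp
#include <cassert>
#include <cerrno>
#include <climits>
#include <cstdlib>

// Parse a whole string as a base-10 int. Returns false on empty input,
// trailing characters, or a value that does not fit in int.
bool parse_int(const char* str, int* out) {
  assert(str != nullptr && out != nullptr);
  char* end = nullptr;
  errno = 0;
  const long value = std::strtol(str, &end, 10);
  if (end == str || *end != '\0') return false;  // nothing parsed, or junk after the number
  if (errno == ERANGE || value < INT_MIN || value > INT_MAX) return false;
  *out = static_cast<int>(value);
  return true;
}

// Array sizes must also be strictly positive. Storing the size as a signed
// int lets a negative argument be detected instead of wrapping around to a
// very large unsigned value.
bool parse_array_size(const char* str, int* out) {
  int value = 0;
  if (!parse_int(str, &value) || value <= 0) return false;
  *out = value;
  return true;
}
```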
Tom Deakin
15001000c5
use signed int for array size in RAJA
2021-01-12 10:25:45 +00:00
Tom Deakin
87ab797490
use signed ints for HC
2021-01-12 10:25:16 +00:00
Tom Deakin
66aaec281f
use signed ints for HIP
2021-01-12 10:24:27 +00:00
Tom Deakin
9a69d3d5d9
use signed array size for SYCL
2021-01-12 10:24:00 +00:00
Tom Deakin
20c3284629
Update CHANGELOG with signed int change
2021-01-12 10:23:21 +00:00
Tom Deakin
d01b46a87a
use signed ints for STD C++20
2021-01-12 10:22:53 +00:00
Tom Deakin
ecc47f5320
use signed ints for STD C++17
2021-01-12 10:22:29 +00:00
Tom Deakin
94c7c3dbd8
use signed array size for OpenCL
2021-01-12 10:21:48 +00:00
Tom Deakin
693a7e7478
use signed array size for CUDA
2021-01-12 10:20:44 +00:00
Tom Deakin
850c63d69b
use signed ints for ACC array size
2021-01-12 10:14:44 +00:00
Tom Deakin
e6c200a2d3
use signed int for Kokkos array size
2021-01-12 10:13:53 +00:00
Tom Deakin
00de932454
Save array size argument as signed integer
2021-01-12 10:09:55 +00:00
Tom Deakin
a9fd663471
Make OpenMP array size signed
2021-01-12 10:04:51 +00:00
Tom Deakin
4abb080a0e
Fix GCC AMD build for OpenMP offload
2020-12-30 14:40:21 +00:00
Tom Deakin
cf42335e7a
Merge branch 'cuda-memory' into main
2020-12-07 15:15:37 +00:00
Tom Deakin
9c211bca96
Update changelog for CUDA memory mode
2020-12-07 15:13:06 +00:00
Tom Deakin
e8fb3a6be4
Add C++20 version using for_each_n and range factories
...
Closes #85
2020-12-07 14:55:54 +00:00
Tom Deakin
ffa221fd35
Fix OpenMP Clang NVIDIA Target flags (missing sm architecture) with new NVARCH option
...
Example usage:
make -f OpenMP.make COMPILER=CLANG TARGET=NVIDIA NVARCH=sm_61
Fixes #61
2020-12-07 12:23:11 +00:00
Tom Deakin
5a93022fc1
Update OpenACC for Issue #80
2020-12-07 11:50:20 +00:00
Tom Deakin
b00120d346
Update STD C++17 for Issue #80
2020-12-07 11:32:22 +00:00
Tom Deakin
74f705cac9
Update OpenMP for Issue #80
2020-12-07 10:41:48 +00:00
Tom Deakin
829aa15da0
Allocate driver solution check vectors *after* the main computation
...
Each Stream implementation owns its own data, so the driver code
shouldn't allocate a large array just before. On processors with
strong NUMA effects and smaller memory capacities per NUMA domain,
these checking vectors can result in the main arrays being
allocated in the wrong NUMA domain.
The fix is simply to move the driver allocation to after the
computation has finished, when we want to check the answers.
This commit only changes the driver; each model will be updated
in subsequent commits.
Fixes #80 .
2020-12-07 10:39:37 +00:00
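The ordering change described in the commit above can be sketched as follows. `ToyStream` and `run_and_check` are hypothetical stand-ins, assuming a model class that owns its arrays and a driver that copies them back for validation; the real driver and models are more involved.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one model implementation: it owns its arrays.
class ToyStream {
 public:
  explicit ToyStream(std::size_t n) : a_(n, 0.1), b_(n, 0.2), c_(n, 0.0) {}

  // Add kernel: c[i] = a[i] + b[i]
  void add() {
    for (std::size_t i = 0; i < c_.size(); ++i) c_[i] = a_[i] + b_[i];
  }

  // Copy the result array into a driver-side vector for checking.
  void read_result(std::vector<double>& c) const { c = c_; }

 private:
  std::vector<double> a_, b_, c_;
};

// Driver: the check vector is allocated only after the timed loop, so it
// cannot occupy memory in a NUMA domain before the model's arrays exist.
bool run_and_check(int n, int ntimes) {
  assert(n > 0 && ntimes > 0);
  ToyStream stream(static_cast<std::size_t>(n));  // main arrays first
  for (int k = 0; k < ntimes; ++k) stream.add();  // main computation

  std::vector<double> c(static_cast<std::size_t>(n));  // allocated afterwards
  stream.read_result(c);

  const double expected = 0.1 + 0.2;
  for (double v : c) {
    if (std::fabs(v - expected) > 1e-12) return false;
  }
  return true;
}
```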
Tom Deakin
f373927ce8
Rename branch name
2020-12-07 10:23:27 +00:00
Tom Deakin
f271d5563d
Merge pull request #84 from gonzalobg/cxx_parallel_stl
...
Add NVIDIA HPC SDK C++ parallel STL implementation
2020-12-03 14:15:45 +00:00
Gonzalo Brito Gadeschi
0855805ce2
Add NVIDIA HPC SDK C++ parallel STL implementation
...
This commit adds an implementation using the C++ parallel STL.
The Makefile uses the NVIDIA HPC SDK `nvc++` compiler with the `-stdpar` flag.
Tested using the NVIDIA HPC SDK 20.9.
2020-11-23 03:08:44 -08:00
Tom Deakin
5182342403
Update CHANGELOG.md
2020-10-26 09:58:57 +00:00
Tom Deakin
8ae8c70188
Merge pull request #81 from Kerilk/master
...
Ensure OpenCL destructors are called in the "correct" order.
2020-10-26 09:58:05 +00:00
Brice Videau
e92d034f64
Ensure OpenCL destructors are called in the correct order.
2020-10-16 18:05:23 -05:00
Tom Deakin
6f46267e6c
Add AOMP build options
2020-08-13 17:46:45 +01:00
Tom Deakin
66d915fa2e
[OpenMP] Fix ARMCLANG Makefile bug where it didn't set the flags
2020-08-12 15:39:13 +01:00
Tom Deakin
f31181dedb
Add -O3 flag to HIP.make to fix segmentation fault
2020-08-12 14:09:22 +01:00
Tom Deakin
da3946a7d5
Add missing O3 flag for OpenMP ARMCLANG
2020-08-07 17:09:46 +01:00
Tom Deakin
0ff841bbf5
Update CHANGELOG.md
2020-08-07 12:29:28 +01:00
Tom Deakin
17f057c38a
Merge pull request #79 from tom91136/master
...
Update build flags for SYCL, Kokkos, and OpenMP, tracking newest versions of each compiler
2020-08-07 12:27:43 +01:00
Tom Lin
cdaf6cb88e
Fixed a bug where ComputeCpp's flags were omitted
...
Renamed INTEL_GT -> INTEL_GPU
Only use NVCC with Kokkos if not using HIPCC
2020-08-07 11:00:56 +01:00
Tom Lin
59274d6a91
Add NVIDIA as target for dpcpp
2020-08-05 08:54:40 +01:00
Tom Lin
603dc7d136
Add HIP compilers for Kokkos
2020-08-05 08:49:18 +01:00