GPU-STREAM

Measure memory transfer rates to/from global device memory on GPUs. This benchmark is similar in spirit to, and based on, the STREAM benchmark [1] for CPUs.

Unlike other GPU memory bandwidth benchmarks, this does not include the PCIe transfer time.
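As a rough illustration of what a STREAM-style measurement involves, the sketch below runs a triad kernel on the host and converts the elapsed time into a bandwidth figure. It is a minimal sketch, not the GPU-STREAM source: the function names (`triad`, `triad_bytes`, `bandwidth_gbs`, `time_triad`) are hypothetical, the arrays live in host memory rather than device memory, and it assumes that one triad pass moves three arrays' worth of data (two read, one written). Only the kernel itself is timed, which mirrors the idea of leaving transfer time out of the measurement.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// STREAM-style triad: a[i] = b[i] + scalar * c[i].
// Hypothetical host-side stand-in for a device kernel.
void triad(std::vector<double>& a, const std::vector<double>& b,
           const std::vector<double>& c, double scalar)
{
  assert(b.size() == a.size() && c.size() == a.size());
  for (std::size_t i = 0; i < a.size(); ++i)
    a[i] = b[i] + scalar * c[i];
}

// Bytes moved by one triad pass over n elements,
// assuming two arrays are read and one is written.
double triad_bytes(std::size_t n)
{
  return 3.0 * sizeof(double) * static_cast<double>(n);
}

// Convert a byte count and an elapsed time into GB/s (10^9 bytes per second).
double bandwidth_gbs(double bytes, double seconds)
{
  return bytes / seconds / 1.0e9;
}

// Time a single triad pass and return the elapsed time in seconds.
// Allocation and initialisation of the arrays are outside the timed region.
double time_triad(std::vector<double>& a, const std::vector<double>& b,
                  const std::vector<double>& c, double scalar)
{
  const auto start = std::chrono::steady_clock::now();
  triad(a, b, c, scalar);
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double>(stop - start).count();
}
```

A caller would allocate three equally sized arrays, call `time_triad` several times, and report `bandwidth_gbs(triad_bytes(n), t)` for the fastest run. Taking the best of several runs is a common convention in STREAM-like benchmarks, stated here as an assumption rather than a description of this code base.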

There are multiple implementations of this benchmark in a variety of programming models. Currently implemented are:

  • OpenCL
  • CUDA
  • OpenACC
  • OpenMP 3 and 4.5
  • Kokkos
  • RAJA
  • SYCL

Website

uob-hpc.github.io/GPU-STREAM/

Usage

Drivers, compilers and software applicable to whichever implementation you would like to build are required.

We have supplied a series of Makefiles, one for each programming model, to assist with building. The Makefiles contain common build options, and should be simple to customise for your needs too.

General usage is make -f <Model>.make. Common compiler flags and names can be set by passing a COMPILER option to Make, e.g. make COMPILER=GNU. Some models allow specifying a CPU or GPU style target, and this can be set by passing a TARGET option to Make, e.g. make TARGET=GPU.

Pass in extra flags via the EXTRA_FLAGS option.

The binaries are named in the form <model>-stream.

Results

Sample results can be found in the results subdirectory. If you would like to submit updated results, please submit a Pull Request.

Citing

You can view the Poster and Extended Abstract on GPU-STREAM presented at SC'15. Please cite GPU-STREAM via this reference:

Deakin T, Price J, Martineau M, McIntosh-Smith S. GPU-STREAM v2.0: Benchmarking the achievable memory bandwidth of many-core processors across diverse parallel programming models. 2016. Paper presented at P^3MA Workshop at ISC High Performance, Frankfurt, Germany.

Other GPU-STREAM publications:

Deakin T, McIntosh-Smith S. GPU-STREAM: Benchmarking the achievable memory bandwidth of Graphics Processing Units. 2015. Poster session presented at IEEE/ACM SuperComputing, Austin, United States.

[1]: McCalpin, John D., 1995: "Memory Bandwidth and Machine Balance in Current High Performance Computers", IEEE Computer Society Technical Committee on Computer Architecture (TCCA) Newsletter, December 1995.