sunway513 11053798ff Improved GPU-STREAM benchmark for HIP version: 1. Add optional looper kernels to take command line input for the number of groups and groupSize 2. Add GEOMEAN value calculation of the kernels 3. Instructions on configure HIP environment in the README.md 4. Add results for HIP on FIJI Nano, TITAN X; CUDA on TITAN X 5. Run script to optionally run HIP version with groups and groupSize options		2016-03-15 07:56:32 -05:00
CL	Update to latest OpenCL C++ header from Khronos	2016-02-25 20:50:27 +00:00
results	Pull request for HIP version	2016-03-14 11:44:30 -05:00
.gitignore	Removed driver warning message from result	2015-08-05 16:21:20 +01:00
common.cpp	Improved GPU-STREAM benchmark for HIP version:	2016-03-15 07:56:32 -05:00
common.h	Improved GPU-STREAM benchmark for HIP version:	2016-03-15 07:56:32 -05:00
cuda-stream.cu	Display CUDA driver version in output header	2015-09-24 12:03:44 +01:00
hip-stream.cpp	Improved GPU-STREAM benchmark for HIP version:	2016-03-15 07:56:32 -05:00
LICENSE	Remove trailing whitespaces	2015-07-31 15:35:40 +01:00
Makefile	Pull request for HIP version	2016-03-14 11:44:30 -05:00
ocl-stream-kernels.cl	Remove trailing whitespaces	2015-07-31 15:35:40 +01:00
ocl-stream.cpp	Print out OpenCL device version for chosen device in output header	2015-09-24 11:49:08 +01:00
README.md	Improved GPU-STREAM benchmark for HIP version:	2016-03-15 07:56:32 -05:00
runhip.sh	Improved GPU-STREAM benchmark for HIP version:	2016-03-15 07:56:32 -05:00

README.md

GPU-STREAM

Measure memory transfer rates to and from global device memory on GPUs. This benchmark is similar in spirit to, and based on, the STREAM benchmark [1] for CPUs.

Unlike other GPU memory bandwidth benchmarks, this one does not include the PCIe transfer time: only accesses to memory already resident on the device are measured.
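As a rough illustration of what a STREAM-style measurement computes, the sketch below runs a Triad loop on the host and converts a byte count and an elapsed time into a bandwidth figure. It is not the code in this repository: the function names (`bytes_moved`, `bandwidth_gbs`, `triad`, `time_triad`) are hypothetical, and a plain CPU loop stands in for the GPU kernel. The byte counts follow the usual STREAM convention, where Copy and Mul touch two arrays per element while Add and Triad touch three. The timer brackets only the kernel call, which mirrors the idea of excluding host-to-device transfers.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Bytes moved by one kernel pass over n elements.
// STREAM convention: Copy and Mul touch 2 arrays, Add and Triad touch 3.
double bytes_moved(std::size_t n, std::size_t elem_size, int arrays_touched) {
    return static_cast<double>(n) * static_cast<double>(elem_size) * arrays_touched;
}

// Convert a byte count and an elapsed time into GB/s (1 GB = 1e9 bytes).
double bandwidth_gbs(double bytes, double seconds) {
    assert(seconds > 0.0);
    return bytes / seconds / 1.0e9;
}

// Host stand-in for the Triad kernel: a[i] = b[i] + scalar * c[i].
template <typename T>
void triad(std::vector<T>& a, const std::vector<T>& b,
           const std::vector<T>& c, T scalar) {
    for (std::size_t i = 0; i < a.size(); ++i) {
        a[i] = b[i] + scalar * c[i];
    }
}

// Time only the kernel itself. The arrays are allocated and filled before
// the timer starts, so setup and transfer costs are not counted.
template <typename T>
double time_triad(std::vector<T>& a, const std::vector<T>& b,
                  const std::vector<T>& c, T scalar) {
    const auto start = std::chrono::steady_clock::now();
    triad(a, b, c, scalar);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

A caller would typically run `time_triad` several times, keep the fastest run, and pass that time to `bandwidth_gbs` together with `bytes_moved(n, sizeof(T), 3)`.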

Usage

Build the OpenCL and CUDA binaries with make (the CUDA version requires CUDA v6.5 or later).

Run the OpenCL version with ./gpu-stream-ocl and the CUDA version with ./gpu-stream-cuda.

For the HIP version, set up the HIP environment first:

Install the ROCK and ROCR drivers by following the instructions in this blog post: http://gpuopen.com/getting-started-with-boltzmann-components-platforms-installation/

Install the HCC compiler: https://bitbucket.org/multicoreware/hcc/wiki/Home

Install HIP: https://github.com/GPUOpen-ProfessionalCompute-Tools/HIP

Build the HIP binary with make gpu-stream-hip, then run it with ./gpu-stream-hip.

Android

Assuming you have a recent Android NDK available, you can use the toolchain that it provides to build GPU-STREAM. You should first use the NDK to generate a standalone toolchain:

# Select a directory to install the toolchain to
ANDROID_NATIVE_TOOLCHAIN=/path/to/toolchain

${NDK}/build/tools/make-standalone-toolchain.sh \
  --platform=android-14 \
  --toolchain=arm-linux-androideabi-4.8 \
  --install-dir=${ANDROID_NATIVE_TOOLCHAIN}

Make sure that the OpenCL headers and library (libOpenCL.so) are available in ${ANDROID_NATIVE_TOOLCHAIN}/sysroot/usr/.

You should then be able to build GPU-STREAM:

make CXX=${ANDROID_NATIVE_TOOLCHAIN}/bin/arm-linux-androideabi-g++

Copy the executable and OpenCL kernels to the device:

adb push gpu-stream-ocl /data/local/tmp
adb push ocl-stream-kernels.cl /data/local/tmp

Run GPU-STREAM from an adb shell:

adb shell
cd /data/local/tmp

# Use float if device doesn't support double, and reduce array size
./gpu-stream-ocl --float -n 6 -s 10000000
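To see why switching to float and shrinking the arrays helps on memory-constrained devices, the footprint can be estimated with a small calculation. The sketch below assumes that the benchmark keeps three arrays resident on the device (as STREAM does) and that -s sets the number of elements per array; both are assumptions for illustration, and `footprint_bytes` and `kNumArrays` are hypothetical names, not part of GPU-STREAM.

```cpp
#include <cstddef>

// Assumption: three device arrays (a, b, c), each holding n elements.
constexpr std::size_t kNumArrays = 3;

// Total device memory needed for the arrays, in bytes.
constexpr std::size_t footprint_bytes(std::size_t n, std::size_t elem_size) {
    return kNumArrays * n * elem_size;
}

// With 10,000,000 elements per array and 4-byte floats: 120 MB in total.
static_assert(footprint_bytes(10000000, 4) == 120000000,
              "three float arrays of 10M elements occupy 120 MB");

// The same element count with 8-byte doubles needs twice as much.
static_assert(footprint_bytes(10000000, 8) == 240000000,
              "three double arrays of 10M elements occupy 240 MB");
```

Under these assumptions, the command above needs roughly 120 MB of device memory for its arrays.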

Results

Sample results can be found in the results subdirectory. If you would like to contribute updated results, please open a pull request.

[1]: McCalpin, John D., 1995: "Memory Bandwidth and Machine Balance in Current High Performance Computers", IEEE Computer Society Technical Committee on Computer Architecture (TCCA) Newsletter, December 1995.