diff --git a/README.md b/README.md
index 5a84b50..fee2ef6 100644
--- a/README.md
+++ b/README.md
@@ -6,15 +6,30 @@ This benchmark is similar in spirit, and based on, the STREAM benchmark [1] for
 
 Unlike other GPU memory bandwidth benchmarks this does *not* include the PCIe transfer time.
 
+There are multiple implementations of this benchmark in a variety of programming models.
+Currently implemented are:
+ - OpenCL
+ - CUDA
+ - OpenACC
+ - OpenMP 3 and 4.5
+ - Kokkos
+ - RAJA
+ - SYCL
+
 Usage
 -----
 
-Build the OpenCL and CUDA binaries with `make` (CUDA version requires CUDA >= v6.5)
+CMake 3.2 or above is required.
+Drivers, compiler and software applicable to whichever implementation you would like to build against. Our build system is designed to only build implementations in programming models that your system supports.
 
-Run the OpenCL version with `./gpu-stream-ocl` and the CUDA version with `./gpu-stream-cuda`
+Generate the Makefile with `cmake .`
 
-Android
--------
+Build the various binaries with `make`
+
+This will generate binaries of the form `./gpu-stream-*` where `*` identifies the programming model.
+
+Android (outdated instructions)
+------------------
 
 Assuming you have a recent Android NDK available, you can use the
 toolchain that it provides to build GPU-STREAM. You should first
@@ -48,6 +63,7 @@ Run GPU-STREAM from an adb shell:
     # Use float if device doesn't support double, and reduce array size
     ./gpu-stream-ocl --float -n 6 -s 10000000
 
+
 Results
 -------
 