Spack Instructions
OpenMP
- There are 3 offloading options for OpenMP: NVIDIA, AMD and Intel.
- If a user provides a value for `cuda_arch`, the execution will be automatically offloaded to NVIDIA.
- If a user provides a value for `amdgpu_target`, the execution will be offloaded to AMD.
- In the absence of `cuda_arch` and `amdgpu_target`, the execution will be offloaded to Intel.
| Flag | Definition |
|---|---|
| cuda_arch | - The list of supported compute capabilities is provided here - Useful link for matching CUDA gencodes with NVIDIA architectures |
| amdgpu_target | The list of supported architectures is provided here |
# Example 1: for Intel offload
$ spack install babelstream%oneapi +omp
# Example 2: for Nvidia GPU for Volta (sm_70)
$ spack install babelstream +omp cuda_arch=70
# Example 3: for AMD GPU gfx701
$ spack install babelstream +omp amdgpu_target=gfx701
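The selection rule above can be sketched as a small shell function. This is purely illustrative; the real decision is made inside the babelstream Spack package recipe, and the function name is hypothetical:

```shell
# Illustrative sketch of the OpenMP offload selection described above;
# the actual logic lives in the babelstream Spack package.
select_offload() {
  local cuda_arch="$1" amdgpu_target="$2"
  if [ -n "$cuda_arch" ]; then
    echo "NVIDIA"            # cuda_arch given -> NVIDIA offload
  elif [ -n "$amdgpu_target" ]; then
    echo "AMD"               # amdgpu_target given -> AMD offload
  else
    echo "Intel"             # neither given -> Intel offload
  fi
}

select_offload 70 ""         # prints NVIDIA
select_offload "" gfx701     # prints AMD
select_offload "" ""         # prints Intel
```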
OpenCL
- There is no need to specify `amdgpu_target` or `cuda_arch` here, since the AMD and CUDA backends are selected via the `backend` flag instead.
| Flag | Definition |
|---|---|
| backend | 4 different backend options: - cuda - amd - intel - pocl |
# Example 1: CUDA backend
$ spack install babelstream%gcc +ocl backend=cuda
# Example 2: AMD backend
$ spack install babelstream%gcc +ocl backend=amd
# Example 3: Intel backend
$ spack install babelstream%gcc +ocl backend=intel
# Example 4: POCL backend
$ spack install babelstream%gcc +ocl backend=pocl
STD
- Minimum GCC version requirement: 10.1.0
- NVHPC offload will be added in a future release
# Example 1: data
$ spack install babelstream +stddata
# Example 2: ranges
$ spack install babelstream +stdranges
# Example 3: indices
$ spack install babelstream +stdindices
HIP (ROCm)
`amdgpu_target` and `flags` are optional here.
| Flag | Definition |
|---|---|
| amdgpu_target | The list of supported architectures is provided here |
| flags | Extra flags to pass |
# Example 1: ROCM default
$ spack install babelstream +rocm
# Example 2: ROCM with GPU target
$ spack install babelstream +rocm amdgpu_target=<gfx701>
# Example 3: ROCM with extra flags option
$ spack install babelstream +rocm flags=<xxx>
# Example 4: ROCM with GPU target and extra flags
$ spack install babelstream +rocm amdgpu_target=<gfx701> flags=<xxx>
CUDA
- The `cuda_arch` value is mandatory here.
- If a user provides a value for `mem`, the device memory mode will be chosen accordingly.
- If a user provides a value for `flags`, additional CUDA flags will be passed to NVCC.
- In the absence of `mem` and `flags`, the DEFAULT device memory mode is used and no additional flags are passed.
| Flag | Definition |
|---|---|
| cuda_arch | - The list of supported compute capabilities is provided here - Useful link for matching CUDA gencodes with NVIDIA architectures |
| mem | Device memory mode: - DEFAULT: allocate host and device memory pointers - MANAGED: use CUDA Managed Memory - PAGEFAULT: shared memory, only host pointers allocated |
| flags | Extra flags to pass |
# Example 1: CUDA with no mem or flags specified
$ spack install babelstream +cuda cuda_arch=<70>
# Example 2: CUDA with mem specified, for an NVIDIA Volta GPU (sm_70)
$ spack install babelstream +cuda cuda_arch=<70> mem=<managed>
# Example 3: CUDA with mem and flags specified
$ spack install babelstream +cuda cuda_arch=<70> mem=<managed> flags=<CUDA_EXTRA_FLAGS>
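As a quick reference for choosing `cuda_arch`, a few common compute-capability values and the NVIDIA architectures they correspond to can be sketched as below (a subset for illustration; consult NVIDIA's gencode documentation for the full list — the helper function is hypothetical):

```shell
# Illustrative mapping of common cuda_arch values to NVIDIA architecture
# names (subset only; see NVIDIA's gencode documentation for all values).
arch_name() {
  case "$1" in
    60) echo "Pascal" ;;
    70) echo "Volta" ;;
    75) echo "Turing" ;;
    80) echo "Ampere" ;;
    90) echo "Hopper" ;;
    *)  echo "unknown" ;;
  esac
}

arch_name 70   # prints Volta
arch_name 80   # prints Ampere
```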
Kokkos
- The Kokkos implementation requires the Kokkos source folder to be provided because it builds Kokkos from scratch.
| Flag | Definition |
|---|---|
| dir | Download a Kokkos release from the GitHub repository ( https://github.com/kokkos/kokkos ), extract the archive to a directory of your choice, and point the `dir` flag at that directory |
| backend | 2 different backend options: - cuda - omp |
| cuda_arch | - List of supported compute capabilities are provided here - Useful link for matching CUDA gencodes with NVIDIA architectures |
# Example 1: No Backend option specified
$ spack install babelstream +kokkos dir=</home/user/Downloads/kokkos-x.x.xx>
# Example 2: CUDA backend
$ spack install babelstream +kokkos backend=cuda cuda_arch=70 dir=</home/user/Downloads/kokkos-x.x.xx>
# Example 3: OMP backend
$ spack install babelstream +kokkos backend=omp dir=</home/user/Downloads/kokkos-x.x.xx>
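The download-and-extract step for the `dir` flag can be sketched as follows. In real use you would fetch a release tarball from https://github.com/kokkos/kokkos; here a dummy source tree stands in for the download, and the version number is a placeholder:

```shell
# Illustrative preparation of the directory passed to dir=.
# A dummy tree stands in for the real Kokkos release tarball.
workdir=$(mktemp -d)
mkdir -p "$workdir/kokkos-x.x.xx"                 # version is a placeholder
touch "$workdir/kokkos-x.x.xx/CMakeLists.txt"
tar -czf "$workdir/kokkos.tar.gz" -C "$workdir" kokkos-x.x.xx

# Extract the archive where you want it, then point dir= at the folder:
tar -xzf "$workdir/kokkos.tar.gz" -C "$workdir"
kokkos_dir="$workdir/kokkos-x.x.xx"
echo "spack install babelstream +kokkos backend=omp dir=$kokkos_dir"
```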
SYCL2020
- Instructions for installing the Intel compilers are provided here
| Flag | Definition |
|---|---|
| implementation | 3 different implementation options: - OneAPI-ICPX - OneAPI-DPCPP - Compute-CPP |
# Example 1: No implementation option specified (build for OneAPI-ICPX)
$ spack install babelstream%oneapi +sycl2020
# Example 2: OneAPI-DPCPP implementation
$ spack install babelstream +sycl2020 implementation=ONEAPI-DPCPP
SYCL
| Flag | Definition |
|---|---|
| implementation | 2 different implementation options: - OneAPI-DPCPP - Compute-CPP |
# Example 1: OneAPI-DPCPP implementation
$ spack install babelstream +sycl implementation=ONEAPI-DPCPP
ACC
- Target device selection is automatic, with 2 options:
  - gpu: globally set the target device to an NVIDIA GPU if `cuda_arch` is specified
  - multicore: globally set the target device to the host CPU if `cpu_arch` is specified
| Flag | Definition |
|---|---|
| cuda_arch | - The list of supported compute capabilities is provided here - Useful link for matching CUDA gencodes with NVIDIA architectures |
| cpu_arch | Sets the -tp (target processor) flag. Possible values: - px: Generic x86 processor - bulldozer: AMD Bulldozer processor - piledriver: AMD Piledriver processor - zen: AMD Zen architecture (Epyc, Ryzen) - zen2: AMD Zen 2 architecture (Ryzen 2) - sandybridge: Intel SandyBridge processor - haswell: Intel Haswell processor - knl: Intel Knights Landing processor - skylake: Intel Skylake Xeon processor - host: link the native version of the HPC SDK CPU math library - native: alias for -tp host |
# Example 1: For GPU Run
$ spack install babelstream +acc cuda_arch=<70>
# Example 2: For Multicore CPU Run
$ spack install babelstream +acc cpu_arch=<bulldozer>
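As a sketch of the relationship described in the table, the `cpu_arch` value is forwarded to the NVHPC target-processor flag. The exact `-tp=<value>` spelling below is an assumption for illustration, and the helper function is hypothetical:

```shell
# Illustrative only: cpu_arch is forwarded to NVHPC's target-processor
# flag, so cpu_arch=zen would correspond to -tp=zen (assumed spelling).
tp_flag() { echo "-tp=$1"; }

tp_flag zen          # prints -tp=zen
tp_flag bulldozer    # prints -tp=bulldozer
```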
RAJA
- The RAJA implementation requires the RAJA source folder to be provided because it builds RAJA from scratch.
| Flag | Definition |
|---|---|
| dir | Download a RAJA release from the GitHub repository, extract the archive to a directory of your choice, and point the `dir` flag at that directory |
| backend | 2 different backend options: - cuda - omp |
| offload | Choose the offloading platform: offload=cpu or offload=nvidia |
# Example 1: For CPU offload with backend OMP
$ spack install babelstream +raja offload=cpu backend=omp dir=/home/dir/raja
TBB
# Example:
$ spack install babelstream +tbb