Spack Instructions
OpenMP
- There are 3 offloading options for OpenMP: NVIDIA, AMD and Intel.
- If a user provides a value for `cuda_arch`, the execution will be automatically offloaded to NVIDIA.
- If a user provides a value for `amdgpu_target`, the execution will be offloaded to AMD.
- In the absence of `cuda_arch` and `amdgpu_target`, the execution will be offloaded to Intel.
| Flag | Definition |
|---|---|
| cuda_arch | - The list of supported compute capabilities is provided here - Useful link for matching CUDA gencodes with NVIDIA architectures |
| amdgpu_target | The list of supported architectures is provided here |
# Example 1: for Intel offload
$ spack install babelstream%oneapi +omp
# Example 2: for Nvidia GPU for Volta (sm_70)
$ spack install babelstream +omp cuda_arch=70
# Example 3: for AMD GPU gfx701
$ spack install babelstream +omp amdgpu_target=gfx701
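The selection rule above can be sketched as a small shell function. This is purely illustrative; the real decision is made inside the babelstream Spack package recipe, and the function name is hypothetical:

```shell
# Illustrative sketch of the OpenMP offload selection described above;
# the actual logic lives in the babelstream Spack package.
select_offload() {
  local cuda_arch="$1" amdgpu_target="$2"
  if [ -n "$cuda_arch" ]; then
    echo "NVIDIA"            # cuda_arch given -> NVIDIA offload
  elif [ -n "$amdgpu_target" ]; then
    echo "AMD"               # amdgpu_target given -> AMD offload
  else
    echo "Intel"             # neither given -> Intel offload
  fi
}

select_offload 70 ""         # prints NVIDIA
select_offload "" gfx701     # prints AMD
select_offload "" ""         # prints Intel
```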
OpenCL
- There is no need to specify `amdgpu_target` or `cuda_arch` here, since the AMD and CUDA backends are selected via the `backend` flag instead.
| Flag | Definition |
|---|---|
| backend | 4 different backend options: - cuda - amd - intel - pocl |
# Example 1: CUDA backend
$ spack install babelstream%gcc +ocl backend=cuda
# Example 2: AMD backend
$ spack install babelstream%gcc +ocl backend=amd
# Example 3: Intel backend
$ spack install babelstream%gcc +ocl backend=intel
# Example 4: POCL backend
$ spack install babelstream%gcc +ocl backend=pocl
STD
- Minimum GCC version requirement: 10.1.0
- NVHPC offload will be added in a future release
# Example 1: data
$ spack install babelstream +stddata
# Example 2: ranges
$ spack install babelstream +stdranges
# Example 3: indices
$ spack install babelstream +stdindices
HIP (ROCm)
`amdgpu_target` and `flags` are optional here.
| Flag | Definition |
|---|---|
| amdgpu_target | The list of supported architectures is provided here |
| flags | Extra flags to pass |
# Example 1: ROCM default
$ spack install babelstream +rocm
# Example 2: ROCM with GPU target
$ spack install babelstream +rocm amdgpu_target=<gfx701>
# Example 3: ROCM with extra flags option
$ spack install babelstream +rocm flags=<xxx>
# Example 4: ROCM with GPU target and extra flags
$ spack install babelstream +rocm amdgpu_target=<gfx701> flags=<xxx>
CUDA
- The `cuda_arch` value is mandatory here.
- If a user provides a value for `mem`, the device memory mode will be chosen accordingly.
- If a user provides a value for `flags`, additional CUDA flags will be passed to NVCC.
- In the absence of `mem` and `flags`, the DEFAULT device memory mode is used and no additional flags are passed.
| Flag | Definition |
|---|---|
| cuda_arch | - The list of supported compute capabilities is provided here - Useful link for matching CUDA gencodes with NVIDIA architectures |
| mem | Device memory mode: - DEFAULT: allocate host and device memory pointers - MANAGED: use CUDA Managed Memory - PAGEFAULT: shared memory, only host pointers allocated |
| flags | Extra flags to pass |
# Example 1: CUDA with no mem or flags specified
$ spack install babelstream +cuda cuda_arch=<70>
# Example 2: CUDA with mem specified, for an NVIDIA Volta GPU (sm_70)
$ spack install babelstream +cuda cuda_arch=<70> mem=<managed>
# Example 3: CUDA with mem and flags specified
$ spack install babelstream +cuda cuda_arch=<70> mem=<managed> flags=<CUDA_EXTRA_FLAGS>
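As a quick reference for choosing `cuda_arch`, a few common compute-capability values and the NVIDIA architectures they correspond to can be sketched as below (a subset for illustration; consult NVIDIA's gencode documentation for the full list — the helper function is hypothetical):

```shell
# Illustrative mapping of common cuda_arch values to NVIDIA architecture
# names (subset only; see NVIDIA's gencode documentation for all values).
arch_name() {
  case "$1" in
    60) echo "Pascal" ;;
    70) echo "Volta" ;;
    75) echo "Turing" ;;
    80) echo "Ampere" ;;
    90) echo "Hopper" ;;
    *)  echo "unknown" ;;
  esac
}

arch_name 70   # prints Volta
arch_name 80   # prints Ampere
```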
Kokkos
- The Kokkos implementation requires the Kokkos source folder to be provided because it builds Kokkos from scratch.
| Flag | Definition |
|---|---|
| dir | Download a Kokkos release from the GitHub repository ( https://github.com/kokkos/kokkos ), extract the archive to a directory of your choice, and point the `dir` flag at that directory |
| backend | 2 different backend options: - cuda - omp |
| cuda_arch | - List of supported compute capabilities are provided here - Useful link for matching CUDA gencodes with NVIDIA architectures |
# Example 1: No Backend option specified
$ spack install babelstream +kokkos dir=</home/user/Downloads/kokkos-x.x.xx>
# Example 2: CUDA backend
$ spack install babelstream +kokkos backend=cuda cuda_arch=70 dir=</home/user/Downloads/kokkos-x.x.xx>
# Example 3: OMP backend
$ spack install babelstream +kokkos backend=omp dir=</home/user/Downloads/kokkos-x.x.xx>
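The download-and-extract step for the `dir` flag can be sketched as follows. In real use you would fetch a release tarball from https://github.com/kokkos/kokkos; here a dummy source tree stands in for the download, and the version number is a placeholder:

```shell
# Illustrative preparation of the directory passed to dir=.
# A dummy tree stands in for the real Kokkos release tarball.
workdir=$(mktemp -d)
mkdir -p "$workdir/kokkos-x.x.xx"                 # version is a placeholder
touch "$workdir/kokkos-x.x.xx/CMakeLists.txt"
tar -czf "$workdir/kokkos.tar.gz" -C "$workdir" kokkos-x.x.xx

# Extract the archive where you want it, then point dir= at the folder:
tar -xzf "$workdir/kokkos.tar.gz" -C "$workdir"
kokkos_dir="$workdir/kokkos-x.x.xx"
echo "spack install babelstream +kokkos backend=omp dir=$kokkos_dir"
```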
SYCL2020
- Instructions for installing the Intel compilers are provided here
| Flag | Definition |
|---|---|
| implementation | 3 different implementation options: - OneAPI-ICPX - OneAPI-DPCPP - Compute-CPP |
# Example 1: No implementation option specified (build for OneAPI-ICPX)
$ spack install babelstream%oneapi +sycl2020
# Example 2: OneAPI-DPCPP implementation
$ spack install babelstream +sycl2020 implementation=ONEAPI-DPCPP
SYCL
| Flag | Definition |
|---|---|
| implementation | 2 different implementation options: - OneAPI-DPCPP - Compute-CPP |
# Example 1: OneAPI-DPCPP implementation
$ spack install babelstream +sycl implementation=ONEAPI-DPCPP
ACC
- Target device selection is automatic, with 2 options:
  - gpu: globally set the target device to an NVIDIA GPU if `cuda_arch` is specified
  - multicore: globally set the target device to the host CPU if `cpu_arch` is specified
| Flag | Definition |
|---|---|
| cuda_arch | - The list of supported compute capabilities is provided here - Useful link for matching CUDA gencodes with NVIDIA architectures |
| cpu_arch | Sets the -tp (target processor) flag. Possible values: - px: Generic x86 processor - bulldozer: AMD Bulldozer processor - piledriver: AMD Piledriver processor - zen: AMD Zen architecture (Epyc, Ryzen) - zen2: AMD Zen 2 architecture (Ryzen 2) - sandybridge: Intel SandyBridge processor - haswell: Intel Haswell processor - knl: Intel Knights Landing processor - skylake: Intel Skylake Xeon processor - host: link the native version of the HPC SDK CPU math library - native: alias for -tp host |
# Example 1: For GPU Run
$ spack install babelstream +acc cuda_arch=<70>
# Example 2: For Multicore CPU Run
$ spack install babelstream +acc cpu_arch=<bulldozer>
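As a sketch of the relationship described in the table, the `cpu_arch` value is forwarded to the NVHPC target-processor flag. The exact `-tp=<value>` spelling below is an assumption for illustration, and the helper function is hypothetical:

```shell
# Illustrative only: cpu_arch is forwarded to NVHPC's target-processor
# flag, so cpu_arch=zen would correspond to -tp=zen (assumed spelling).
tp_flag() { echo "-tp=$1"; }

tp_flag zen          # prints -tp=zen
tp_flag bulldozer    # prints -tp=bulldozer
```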
RAJA
- The RAJA implementation requires the RAJA source folder to be provided because it builds RAJA from scratch.
| Flag | Definition |
|---|---|
| dir | Download a RAJA release from the GitHub repository, extract the archive to a directory of your choice, and point the `dir` flag at that directory |
| backend | 2 different backend options: - cuda - omp |
| offload | Choose the offloading platform: offload=cpu or offload=nvidia |
# Example 1: For CPU offload with backend OMP
$ spack install babelstream +raja offload=cpu backend=omp dir=/home/dir/raja
TBB
# Example:
$ spack install babelstream +tbb