Tom Deakin
693a7e7478
use signed array size for CUDA
2021-01-12 10:20:44 +00:00
Tom Deakin
3bd65a0716
Merge branch 'master' into cuda-memory
2017-05-11 11:28:33 +01:00
James Price
94e0900377
Use static shared memory in dot for CUDA and HIP
2017-02-28 13:24:45 +00:00
Tom Deakin
8d66a27131
[CUDA] If using managed memory, use device pointer for host reduction
2016-12-19 05:08:19 -07:00
Tom Deakin
62860284b2
[CUDA] Add Managed memory and Page fault options
...
To use managed memory, compile the code with MANAGED defined.
To use CUDA 8 page-fault memory, compile the code with PAGEFAULT defined.
2016-12-19 05:00:15 -07:00
Tom Deakin
b9c514fd9b
[CUDA] Free the sum device buffer
2016-12-19 11:42:45 +00:00
Tom Deakin
d42bcd4675
Merge remote-tracking branch 'origin/init-arrays' into devel
2016-11-04 09:17:54 +00:00
James Price
7f4761ae52
Replace write_arrays with init_arrays
...
This allows each model to initialise its arrays with a parallel
approach, which yields the first touch required for good performance
on NUMA architectures.
2016-11-02 11:22:01 +00:00
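The first-touch idea in this commit can be illustrated with a minimal host-side sketch. It is not the code from the commit: the signature of `init_arrays` shown here, the helper `alloc_untouched`, and the use of OpenMP are all assumptions made for illustration. The point is that the arrays are allocated without being written, and the first write to each element comes from the thread that will later work on it.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical helper: allocate an array without writing to it.
// new T[n] default-initialises, which performs no writes for arithmetic
// types, so no page is touched before the parallel loop below.
template <typename T>
std::unique_ptr<T[]> alloc_untouched(std::ptrdiff_t n)
{
  return std::unique_ptr<T[]>(new T[n]);
}

// Assumed signature, for illustration only.
// When built with OpenMP (e.g. -fopenmp), each thread writes its own
// static chunk first. On systems with a first-touch page placement policy,
// those pages then tend to live on the NUMA node of the writing thread.
// Without OpenMP the pragma is ignored and the loop runs serially.
template <typename T>
void init_arrays(T* a, T* b, T* c, std::ptrdiff_t n, T initA, T initB, T initC)
{
  #pragma omp parallel for schedule(static)
  for (std::ptrdiff_t i = 0; i < n; i++)
  {
    a[i] = initA;
    b[i] = initB;
    c[i] = initC;
  }
}
```

For the placement to pay off, later loops over the arrays would need to use the same static schedule, so that each thread revisits the elements it first touched. A container such as `std::vector` would defeat the purpose here, since it value-initialises its elements serially at construction.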
James Price
dfc79eeb4d
Improve performance of CUDA dot implementation
2016-10-24 21:42:39 +01:00
Tom Deakin
f32cf3bad3
Merge branch 'master' into kernel-dot
...
Conflicts:
main.cpp
2016-10-24 13:53:58 +01:00
Tom Deakin
5b1e67f666
[CUDA] Use new value of scalar
2016-10-24 13:19:54 +01:00
James Price
8a8f44b4ce
Fix CUDA host code for dot kernel
...
Wrong number of blocks was being copied and summed.
2016-10-24 12:47:25 +01:00
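The class of bug fixed here can be sketched on the host side. This assumes, as the later "host reduction" commits suggest but this log does not confirm, that the dot kernel writes one partial sum per thread block and the host adds them up. The function name `reduce_block_sums` and its signature are hypothetical.

```cpp
#include <cstddef>
#include <numeric>
#include <stdexcept>
#include <vector>

// Hypothetical host-side final reduction for a block-wise dot product.
// partial_sums holds the values copied back from the device; num_blocks is
// the number of blocks the kernel was launched with. Exactly num_blocks
// entries are summed: more would read stale or unset entries, fewer would
// drop part of the result.
template <typename T>
T reduce_block_sums(const std::vector<T>& partial_sums, std::size_t num_blocks)
{
  if (num_blocks > partial_sums.size())
    throw std::out_of_range("num_blocks exceeds the number of copied partial sums");

  const auto first = partial_sums.begin();
  const auto last = first + static_cast<std::ptrdiff_t>(num_blocks);
  return std::accumulate(first, last, T{});
}
```

Keeping the block count in one variable that is used for the kernel launch, the device-to-host copy, and this loop is one way to avoid the mismatch the commit message describes.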
Tom Deakin
d3b497a9ca
Add a CUDA dot kernel
2016-10-14 17:51:40 +01:00
James Price
f94e36f320
[CUDA] Fix device name output (OpenCL->CUDA)
2016-07-06 17:16:35 +01:00
Tom Deakin
31cb567e21
Switch data from 1.0, 2.0 and 3.0 to 0.1, 0.2, and 0.3 resp.
...
Using integers for maths gets unstable past 38 iterations even
in double precision. Using the original values/10 is safe up to
the default 100 iterations.
2016-05-11 15:51:19 +01:00
Tom Deakin
2462023ed9
Set thread block size in CUDA with a #define, and check that array size is multiple of it
2016-05-11 12:21:29 +01:00
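A check of this kind might look like the sketch below. The macro name `TBSIZE`, its default of 1024, and the helper `num_blocks_for` are assumptions for illustration; the log does not show the names or values the commit uses.

```cpp
#include <stdexcept>
#include <string>

// Assumed macro name and default value; could be overridden at compile
// time, e.g. with -DTBSIZE=256.
#ifndef TBSIZE
#define TBSIZE 1024
#endif

// Hypothetical helper: returns the number of thread blocks needed to cover
// array_size elements with one element per thread, or throws if array_size
// is not a positive multiple of the block size. Rejecting other sizes lets
// the kernels skip a per-thread bounds check.
inline long num_blocks_for(long array_size)
{
  if (array_size <= 0 || array_size % TBSIZE != 0)
  {
    throw std::runtime_error(
      "Array size must be a positive multiple of " + std::to_string(TBSIZE));
  }
  return array_size / TBSIZE;
}
```

The returned value would then serve as the grid dimension in a launch of the form `kernel<<<num_blocks, TBSIZE>>>(...)`.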
Tom Deakin
530b2adda2
Add License text to all files
2016-05-03 12:32:03 +01:00
Tom Deakin
a355acf2ee
Move source files to top level directory
2016-05-03 11:43:25 +01:00