Update K40 and K20 ECC-on results with v0.9 of code
parent ed656f5d8e
commit e9cfdfb1cb
@@ -1,12 +1,14 @@
GPU-STREAM
Version: 0.0
Version: 0.9
Implementation: CUDA
Precision: double

Warning: array size must divide 1024
Resizing array from 50000000 to 49999872
Running kernels 10 times
Array size: 400.0 MB (=0.4 GB)
Total size: 1200.0 MB (=1.2 GB)
Using CUDA device Tesla K20c
Function    MBytes/sec  Min (sec)   Max         Average
Copy        152227      0.0052553   0.00527829  0.0052649
Mul         152206      0.00525604  0.00527526  0.0052646
Add         150350      0.00798137  0.00800578  0.00799512
Triad       150349      0.00798142  0.00801054  0.00799636
Copy        152296.519  0.00551     0.00553     0.00552
Mul         152284.216  0.00551     0.00553     0.00552
Add         150549.336  0.00836     0.00838     0.00837
Triad       150597.842  0.00836     0.00838     0.00837

@@ -1,12 +1,14 @@
GPU-STREAM
Version: 0.0
Version: 0.9
Implementation: CUDA
Precision: double

Warning: array size must divide 1024
Resizing array from 50000000 to 49999872
Running kernels 10 times
Array size: 400.0 MB (=0.4 GB)
Total size: 1200.0 MB (=1.2 GB)
Using CUDA device Tesla K40c
Function    MBytes/sec  Min (sec)   Max         Average
Copy        194203      0.00411939  0.00412812  0.00412312
Mul         194111      0.00412135  0.00412981  0.00412629
Add         191683      0.00626033  0.00626831  0.0062641
Triad       191560      0.00626436  0.00628582  0.00626866
Copy        194335.669  0.00432     0.00433     0.00432
Mul         194171.527  0.00432     0.00433     0.00433
Add         191294.438  0.00658     0.00659     0.00658
Triad       191240.187  0.00658     0.00659     0.00658

@@ -1,10 +1,14 @@
GPU-STREAM
Version: 0.0
Version: 0.9
Implementation: OpenCL
Precision: double

Running kernels 10 times
Array size: 400.0 MB (=0.4 GB)
Total size: 1200.0 MB (=1.2 GB)
Using OpenCL device Tesla K20c
Function    MBytes/sec  Min (sec)   Max         Average
Copy        149700      0.00534404  0.00537675  0.00535703
Mul         149734      0.0053428   0.00537013  0.00535519
Add         151108      0.00794132  0.00797839  0.00795825
Triad       151008      0.0079466   0.00797772  0.0079599
Copy        150397.867  0.00558     0.00561     0.00559
Mul         150241.232  0.00558     0.00560     0.00559
Add         151673.787  0.00830     0.00833     0.00831
Triad       151699.186  0.00829     0.00833     0.00831

@@ -1,10 +1,14 @@
GPU-STREAM
Version: 0.0
Version: 0.9
Implementation: OpenCL
Precision: double

Running kernels 10 times
Array size: 400.0 MB (=0.4 GB)
Total size: 1200.0 MB (=1.2 GB)
Using OpenCL device Tesla K40c
Function    MBytes/sec  Min (sec)   Max         Average
Copy        190343      0.00420295  0.00422588  0.00421376
Mul         190266      0.00420465  0.00422835  0.00421347
Add         190144      0.00631101  0.00632369  0.00631618
Triad       190105      0.00631229  0.00632309  0.00631735
Copy        190354.872  0.00441     0.00443     0.00442
Mul         190199.107  0.00441     0.00443     0.00442
Add         190946.380  0.00659     0.00660     0.00660
Triad       190991.101  0.00659     0.00661     0.00660

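For readers checking the tables, the MBytes/sec column appears to follow from the element count and the minimum time. The sketch below is not taken from the GPU-STREAM source. It assumes the usual STREAM convention that Copy and Mul touch two arrays per element and Add and Triad touch three, and that 1 MByte is 10^6 bytes. The function name `stream_bandwidth_mb_per_s` and the table `ARRAYS_TOUCHED` are hypothetical names introduced here for illustration. Under those assumptions it comes within a few MBytes/sec of the Tesla K20c CUDA rows that were measured with the array resized to 49999872 doubles.

```python
# Sketch: recompute the MBytes/sec column from element count and minimum time.
# Assumptions (not stated in the results files themselves):
#   - 1 MByte = 10**6 bytes
#   - Copy and Mul touch two arrays per element, Add and Triad touch three
BYTES_PER_DOUBLE = 8  # the runs report "Precision: double"

ARRAYS_TOUCHED = {"Copy": 2, "Mul": 2, "Add": 3, "Triad": 3}


def stream_bandwidth_mb_per_s(kernel, num_elements, min_seconds):
    """Return bandwidth in MBytes/sec for one kernel, using the fastest run."""
    bytes_moved = ARRAYS_TOUCHED[kernel] * num_elements * BYTES_PER_DOUBLE
    return bytes_moved / min_seconds / 1e6


if __name__ == "__main__":
    # Minimum times from the Tesla K20c CUDA rows measured with
    # the array resized to 49999872 elements.
    rows = [
        ("Copy", 0.0052553),
        ("Mul", 0.00525604),
        ("Add", 0.00798137),
        ("Triad", 0.00798142),
    ]
    for kernel, min_seconds in rows:
        bandwidth = stream_bandwidth_mb_per_s(kernel, 49999872, min_seconds)
        print(f"{kernel:<8}{bandwidth:.0f}")
```

Small differences from the printed figures are expected, because the minimum times in the tables are themselves rounded.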