Update K40 and K20 ECC on results with v0.9 of code

Tom Deakin 2015-08-01 16:43:14 +01:00
parent ed656f5d8e
commit e9cfdfb1cb
4 changed files with 36 additions and 24 deletions

@@ -1,12 +1,14 @@
GPU-STREAM
Version: 0.0
Version: 0.9
Implementation: CUDA
Precision: double
Warning: array size must divide 1024
Resizing array from 50000000 to 49999872
Running kernels 10 times
Array size: 400.0 MB (=0.4 GB)
Total size: 1200.0 MB (=1.2 GB)
Using CUDA device Tesla K20c
Function MBytes/sec Min (sec) Max Average
Copy 152227 0.0052553 0.00527829 0.0052649
Mul 152206 0.00525604 0.00527526 0.0052646
Add 150350 0.00798137 0.00800578 0.00799512
Triad 150349 0.00798142 0.00801054 0.00799636
Copy 152296.519 0.00551 0.00553 0.00552
Mul 152284.216 0.00551 0.00553 0.00552
Add 150549.336 0.00836 0.00838 0.00837
Triad 150597.842 0.00836 0.00838 0.00837

@@ -1,12 +1,14 @@
GPU-STREAM
Version: 0.0
Version: 0.9
Implementation: CUDA
Precision: double
Warning: array size must divide 1024
Resizing array from 50000000 to 49999872
Running kernels 10 times
Array size: 400.0 MB (=0.4 GB)
Total size: 1200.0 MB (=1.2 GB)
Using CUDA device Tesla K40c
Function MBytes/sec Min (sec) Max Average
Copy 194203 0.00411939 0.00412812 0.00412312
Mul 194111 0.00412135 0.00412981 0.00412629
Add 191683 0.00626033 0.00626831 0.0062641
Triad 191560 0.00626436 0.00628582 0.00626866
Copy 194335.669 0.00432 0.00433 0.00432
Mul 194171.527 0.00432 0.00433 0.00433
Add 191294.438 0.00658 0.00659 0.00658
Triad 191240.187 0.00658 0.00659 0.00658

@@ -1,10 +1,14 @@
GPU-STREAM
Version: 0.0
Version: 0.9
Implementation: OpenCL
Precision: double
Running kernels 10 times
Array size: 400.0 MB (=0.4 GB)
Total size: 1200.0 MB (=1.2 GB)
Using OpenCL device Tesla K20c
Function MBytes/sec Min (sec) Max Average
Copy 149700 0.00534404 0.00537675 0.00535703
Mul 149734 0.0053428 0.00537013 0.00535519
Add 151108 0.00794132 0.00797839 0.00795825
Triad 151008 0.0079466 0.00797772 0.0079599
Copy 150397.867 0.00558 0.00561 0.00559
Mul 150241.232 0.00558 0.00560 0.00559
Add 151673.787 0.00830 0.00833 0.00831
Triad 151699.186 0.00829 0.00833 0.00831

@@ -1,10 +1,14 @@
GPU-STREAM
Version: 0.0
Version: 0.9
Implementation: OpenCL
Precision: double
Running kernels 10 times
Array size: 400.0 MB (=0.4 GB)
Total size: 1200.0 MB (=1.2 GB)
Using OpenCL device Tesla K40c
Function MBytes/sec Min (sec) Max Average
Copy 190343 0.00420295 0.00422588 0.00421376
Mul 190266 0.00420465 0.00422835 0.00421347
Add 190144 0.00631101 0.00632369 0.00631618
Triad 190105 0.00631229 0.00632309 0.00631735
Copy 190354.872 0.00441 0.00443 0.00442
Mul 190199.107 0.00441 0.00443 0.00442
Add 190946.380 0.00659 0.00660 0.00660
Triad 190991.101 0.00659 0.00661 0.00660
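
Note on reading the tables: the MBytes/sec column appears to be derived from the Min (sec) column. The sketch below is a rough cross-check, not code from GPU-STREAM. It assumes the usual STREAM-style accounting (Copy and Mul touch two arrays, Add and Triad touch three) and 1 MByte = 10^6 bytes. The helper name `bandwidth_mbytes_per_sec` and the `ARRAYS_TOUCHED` table are hypothetical. It is checked here only against the v0.0 Tesla K20c CUDA rows, using the resized array length of 49999872 from that log; the v0.9 rows are not covered.

```python
# Hypothetical helper for cross-checking the logs; not taken from GPU-STREAM.
DOUBLE_BYTES = 8  # the logs report "Precision: double"

# Assumed STREAM-style accounting: arrays read plus arrays written per kernel.
ARRAYS_TOUCHED = {"Copy": 2, "Mul": 2, "Add": 3, "Triad": 3}


def bandwidth_mbytes_per_sec(kernel, n_elements, min_seconds,
                             elem_bytes=DOUBLE_BYTES):
    """Return bandwidth in MBytes/sec, assuming 1 MByte = 10**6 bytes."""
    bytes_moved = ARRAYS_TOUCHED[kernel] * n_elements * elem_bytes
    return bytes_moved / min_seconds / 1e6


if __name__ == "__main__":
    n = 49999872  # "Resizing array from 50000000 to 49999872"
    # Minimum times from the v0.0 Tesla K20c CUDA rows.
    rows = [("Copy", 0.0052553), ("Mul", 0.00525604),
            ("Add", 0.00798137), ("Triad", 0.00798142)]
    for kernel, t in rows:
        print(f"{kernel:<6} {bandwidth_mbytes_per_sec(kernel, n, t):.0f}")
```

Under these assumptions the computed values agree with the reported v0.0 K20c CUDA figures to within about one MByte/sec, which is consistent with rounding in the printed times.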