Checking CPU and memory performance before database benchmarking

I am currently reading PostgreSQL 9.0 High Performance by Gregory Smith.


Running STREAM (the Java version) for one CPU:
http://www.cs.virginia.edu/stream/FTP/Contrib/Java/


Average cpu bandwidth:  Copy: 16334MB/sec/cpu Scale: 13104MB/sec/cpu Add: 16253MB/sec/cpu Triad: 15534MB/sec/cpu
Total system bandwidth: Copy: 16334MB/sec  Scale: 13104MB/sec  Add: 16253MB/sec  Triad: 15534MB/sec  

Running Greg's stream-scaling script (https://github.com/gregs1104/stream-scaling):


 WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function      Rate (MB/s)   Avg time     Min time     Max time
Copy:       10815.5366       0.0356       0.0354       0.0358
Scale:      10758.2016       0.0358       0.0356       0.0366
Add:        11943.3337       0.0483       0.0481       0.0489
Triad:      11913.1766       0.0484       0.0482       0.0488
-------------------------------------------------------------
Solution Validates
-------------------------------------------------------------

Number of Threads requested = 2
Function      Rate (MB/s)   Avg time     Min time     Max time
Triad:      13490.2814       0.0429       0.0426       0.0435

Number of Threads requested = 3
Function      Rate (MB/s)   Avg time     Min time     Max time
Triad:      13238.9345       0.0441       0.0434       0.0454

Number of Threads requested = 4
Function      Rate (MB/s)   Avg time     Min time     Max time
Triad:      13217.3784       0.0444       0.0435       0.0458

Number of Threads requested = 5
Function      Rate (MB/s)   Avg time     Min time     Max time
Triad:      13112.7815       0.0450       0.0438       0.0471

Number of Threads requested = 6
Function      Rate (MB/s)   Avg time     Min time     Max time
Triad:      12876.9001       0.0451       0.0446       0.0456

Number of Threads requested = 7
Function      Rate (MB/s)   Avg time     Min time     Max time
Triad:      12968.6780       0.0454       0.0443       0.0477

Number of Threads requested = 8
Function      Rate (MB/s)   Avg time     Min time     Max time
Triad:      12906.0583       0.0458       0.0445       0.0479
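To compare runs it helps to pull the Triad rate per thread count out of the pasted output. The following is a minimal sketch, not part of stream-scaling itself; the function name `triad_by_threads` is made up for illustration, and it assumes the output has the "Number of Threads requested" / "Triad:" layout shown above.

```python
import re

def triad_by_threads(output):
    """Map thread count -> Triad rate (MB/s) from stream-scaling style output.

    A Triad line is only recorded once a 'Number of Threads requested'
    line has been seen, so a truncated first block is skipped.
    """
    rates = {}
    threads = None
    for line in output.splitlines():
        m = re.match(r"\s*Number of Threads requested = (\d+)", line)
        if m:
            threads = int(m.group(1))
            continue
        m = re.match(r"\s*Triad:\s+([\d.]+)", line)
        if m and threads is not None:
            rates[threads] = float(m.group(1))
    return rates

# Two blocks copied from the output above.
sample = """\
Number of Threads requested = 2
Function      Rate (MB/s)   Avg time     Min time     Max time
Triad:      13490.2814       0.0429       0.0426       0.0435

Number of Threads requested = 3
Function      Rate (MB/s)   Avg time     Min time     Max time
Triad:      13238.9345       0.0441       0.0434       0.0454
"""

for n, rate in sorted(triad_by_threads(sample).items()):
    print(f"{n} threads: {rate:.1f} MB/s")
```

Feeding it the whole saved output of a run gives one number per thread count, which is easier to compare than the raw text.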

But while running it I saw that my CPU frequency was sitting at 800 MHz:

=== CPU Core Summary ===
processor       : 7
model name      : Intel(R) Core(TM) i7-2720QM CPU @ 2.20GHz
cpu MHz         : 800.000
siblings        : 8

After reading up on this, it seems that my Linux Mageia 3 system is using cpupower.

 cat /sys/devices/system/cpu/cpu0/cpufreq/bios_limit
2201000   
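The value above is in kHz, so 2201000 is 2201 MHz. A small sketch for reading the same cpufreq files from a script follows. The function names are hypothetical, and which files exist depends on the kernel and driver (bios_limit in particular is not always present), so missing files are simply skipped.

```python
from pathlib import Path

def read_cpufreq(cpu=0, root="/sys/devices/system/cpu"):
    """Return the cpufreq sysfs values that exist for one CPU."""
    base = Path(root) / f"cpu{cpu}" / "cpufreq"
    values = {}
    for name in ("scaling_governor", "scaling_cur_freq",
                 "cpuinfo_max_freq", "bios_limit"):
        try:
            values[name] = (base / name).read_text().strip()
        except OSError:
            # File not provided by this kernel/driver, or not readable.
            continue
    return values

def khz_to_mhz(khz):
    """cpufreq reports frequencies in kHz; convert to MHz."""
    return int(khz) / 1000.0

if __name__ == "__main__":
    for name, value in read_cpufreq(0).items():
        if value.isdigit():
            print(f"{name}: {khz_to_mhz(value):.0f} MHz")
        else:
            print(f"{name}: {value}")
```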
                    


I also ran wmcpufreq, which shows the current speed and profile.

watch grep \"cpu MHz\" /proc/cpuinfo
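The same check can be scripted, for example to log the clock speed of every core right before a benchmark starts. This is only a sketch; `cpu_mhz` is a made-up helper name, and it assumes /proc/cpuinfo has one "cpu MHz" line per logical CPU, as on this x86 machine.

```python
def cpu_mhz(cpuinfo_text):
    """Return the 'cpu MHz' value of each processor entry, in file order."""
    speeds = []
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "cpu MHz":
            speeds.append(float(value))
    return speeds

# Lines in the format /proc/cpuinfo uses, with the values seen on this machine.
sample = (
    "processor\t: 7\n"
    "model name\t: Intel(R) Core(TM) i7-2720QM CPU @ 2.20GHz\n"
    "cpu MHz\t\t: 800.000\n"
    "siblings\t: 8\n"
)
print(cpu_mhz(sample))
```

On a live system the input would come from `open("/proc/cpuinfo").read()`.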

I also tried i7z_GUI.

I changed the governor to performance by editing /etc/sysconfig/cpupower:
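Before starting a benchmark it is worth confirming that the change actually took effect on every core. The sketch below reads the standard scaling_governor file for each CPU; the helper names are hypothetical, and it assumes the cpufreq sysfs interface is available.

```python
from pathlib import Path

def governors(root="/sys/devices/system/cpu"):
    """Map CPU directory name (e.g. 'cpu0') -> current scaling governor."""
    result = {}
    for path in sorted(Path(root).glob("cpu[0-9]*/cpufreq/scaling_governor")):
        # path is <root>/cpuN/cpufreq/scaling_governor
        result[path.parent.parent.name] = path.read_text().strip()
    return result

def not_in_performance(govs):
    """Return the CPUs whose governor is anything other than 'performance'."""
    return sorted(cpu for cpu, gov in govs.items() if gov != "performance")

if __name__ == "__main__":
    govs = governors()
    slow = not_in_performance(govs)
    if not govs:
        print("no cpufreq information found")
    elif slow:
        print("not in performance mode:", ", ".join(slow))
    else:
        print("all CPUs use the performance governor")
```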

=== CPU Core Summary ===
processor       : 7
model name      : Intel(R) Core(TM) i7-2720QM CPU @ 2.20GHz
cpu MHz         : 2200.000
siblings        : 8
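To see how much the frequency change matters, the Triad rates from the run at 800 MHz above can be set against those from the run at full speed in the trace that follows. The numbers in this sketch are copied from the two outputs in this post; the helper `percent_change` is just for illustration.

```python
# Triad rate (MB/s) per thread count, copied from the two runs in this post.
before = {1: 11913.1766, 2: 13490.2814, 3: 13238.9345, 4: 13217.3784,
          5: 13112.7815, 6: 12876.9001, 7: 12968.6780, 8: 12906.0583}
after = {1: 13192.7874, 2: 13749.1610, 3: 13538.3089, 4: 13330.8034,
         5: 13221.0025, 6: 13234.7194, 7: 13181.8954, 8: 13202.8297}

def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0

for threads in sorted(before):
    change = percent_change(before[threads], after[threads])
    print(f"{threads} threads: {before[threads]:9.1f} -> "
          f"{after[threads]:9.1f} MB/s ({change:+.1f}%)")
```

The single-thread result moves the most; with more threads the runs are closer together, which fits the memory bus rather than the core clock being the limit there.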


Full trace after the change:
./stream-scaling
=== CPU cache information ===
CPU /sys/devices/system/cpu/cpu0 Level 1 Cache: 32K (Data)
CPU /sys/devices/system/cpu/cpu0 Level 1 Cache: 32K (Instruction)
CPU /sys/devices/system/cpu/cpu0 Level 2 Cache: 256K (Unified)
CPU /sys/devices/system/cpu/cpu0 Level 3 Cache: 6144K (Unified)
CPU /sys/devices/system/cpu/cpu1 Level 1 Cache: 32K (Data)
CPU /sys/devices/system/cpu/cpu1 Level 1 Cache: 32K (Instruction)
CPU /sys/devices/system/cpu/cpu1 Level 2 Cache: 256K (Unified)
CPU /sys/devices/system/cpu/cpu1 Level 3 Cache: 6144K (Unified)
CPU /sys/devices/system/cpu/cpu2 Level 1 Cache: 32K (Data)
CPU /sys/devices/system/cpu/cpu2 Level 1 Cache: 32K (Instruction)
CPU /sys/devices/system/cpu/cpu2 Level 2 Cache: 256K (Unified)
CPU /sys/devices/system/cpu/cpu2 Level 3 Cache: 6144K (Unified)
CPU /sys/devices/system/cpu/cpu3 Level 1 Cache: 32K (Data)
CPU /sys/devices/system/cpu/cpu3 Level 1 Cache: 32K (Instruction)
CPU /sys/devices/system/cpu/cpu3 Level 2 Cache: 256K (Unified)
CPU /sys/devices/system/cpu/cpu3 Level 3 Cache: 6144K (Unified)
CPU /sys/devices/system/cpu/cpu4 Level 1 Cache: 32K (Data)
CPU /sys/devices/system/cpu/cpu4 Level 1 Cache: 32K (Instruction)
CPU /sys/devices/system/cpu/cpu4 Level 2 Cache: 256K (Unified)
CPU /sys/devices/system/cpu/cpu4 Level 3 Cache: 6144K (Unified)
CPU /sys/devices/system/cpu/cpu5 Level 1 Cache: 32K (Data)
CPU /sys/devices/system/cpu/cpu5 Level 1 Cache: 32K (Instruction)
CPU /sys/devices/system/cpu/cpu5 Level 2 Cache: 256K (Unified)
CPU /sys/devices/system/cpu/cpu5 Level 3 Cache: 6144K (Unified)
CPU /sys/devices/system/cpu/cpu6 Level 1 Cache: 32K (Data)
CPU /sys/devices/system/cpu/cpu6 Level 1 Cache: 32K (Instruction)
CPU /sys/devices/system/cpu/cpu6 Level 2 Cache: 256K (Unified)
CPU /sys/devices/system/cpu/cpu6 Level 3 Cache: 6144K (Unified)
CPU /sys/devices/system/cpu/cpu7 Level 1 Cache: 32K (Data)
CPU /sys/devices/system/cpu/cpu7 Level 1 Cache: 32K (Instruction)
CPU /sys/devices/system/cpu/cpu7 Level 2 Cache: 256K (Unified)
CPU /sys/devices/system/cpu/cpu7 Level 3 Cache: 6144K (Unified)
Total CPU system cache: 52690944 bytes
Suggested minimum array elements needed: 23950429
Array elements used: 23950429

=== CPU Core Summary ===
processor       : 7
model name      : Intel(R) Core(TM) i7-2720QM CPU @ 2.20GHz
cpu MHz         : 2201.000
siblings        : 8

=== Check and build stream ===

=== Testing up to 8 cores ===

-------------------------------------------------------------
STREAM version $Revision: 5.9 $
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 23950429, Offset = 0
Total memory required = 548.2 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Number of Threads requested = 1
-------------------------------------------------------------
Printing one line per active thread....
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 24587 microseconds.
   (= 24587 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function      Rate (MB/s)   Avg time     Min time     Max time
Copy:       11542.3665       0.0335       0.0332       0.0343
Scale:      11572.0339       0.0340       0.0331       0.0368
Add:        13203.7697       0.0440       0.0435       0.0449
Triad:      13192.7874       0.0451       0.0436       0.0484
-------------------------------------------------------------
Solution Validates
-------------------------------------------------------------

Number of Threads requested = 2
Function      Rate (MB/s)   Avg time     Min time     Max time
Triad:      13749.1610       0.0422       0.0418       0.0431

Number of Threads requested = 3
Function      Rate (MB/s)   Avg time     Min time     Max time
Triad:      13538.3089       0.0434       0.0425       0.0455

Number of Threads requested = 4
Function      Rate (MB/s)   Avg time     Min time     Max time
Triad:      13330.8034       0.0436       0.0431       0.0445

Number of Threads requested = 5
Function      Rate (MB/s)   Avg time     Min time     Max time
Triad:      13221.0025       0.0446       0.0435       0.0472

Number of Threads requested = 6
Function      Rate (MB/s)   Avg time     Min time     Max time
Triad:      13234.7194       0.0441       0.0434       0.0450

Number of Threads requested = 7
Function      Rate (MB/s)   Avg time     Min time     Max time
Triad:      13181.8954       0.0442       0.0436       0.0460

Number of Threads requested = 8
Function      Rate (MB/s)   Avg time     Min time     Max time
Triad:      13202.8297       0.0447       0.0435       0.0466

[pvz@localhost stream-scaling-master]$
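The numbers in the trace can be cross-checked with a little arithmetic. The sketch below reproduces the reported cache total, array size and memory requirement, and recomputes the Copy and Triad rates from the minimum times. The cache sum (L1 data, L2 and L3 per CPU, times 8 CPUs) and the 10/22 factor are inferred here from the printed values, so treat them as assumptions about how stream-scaling sizes the arrays. STREAM itself counts two arrays for Copy and Scale and three for Add and Triad, with 1 MB taken as 10^6 bytes.

```python
KIB = 1024
BYTES_PER_WORD = 8  # "This system uses 8 bytes per DOUBLE PRECISION word."

# Assumption: L1 data + L2 + L3 per logical CPU; the L1 instruction
# cache does not appear to be included in the reported total.
per_cpu_cache = (32 + 256 + 6144) * KIB
total_cache = 8 * per_cpu_cache

# Assumption: array elements = cache bytes * 10 / 22, which matches the trace.
elements = total_cache * 10 // 22

# Three arrays of doubles, reported in units of 2**20 bytes.
memory_mib = 3 * BYTES_PER_WORD * elements / 2**20

def rate_mb_s(arrays_touched, n_elements, seconds):
    """STREAM-style rate: bytes moved / time, with 1 MB = 1e6 bytes."""
    return arrays_touched * BYTES_PER_WORD * n_elements / seconds / 1e6

print("total cache bytes:", total_cache)
print("array elements:   ", elements)
print(f"memory required:   {memory_mib:.1f} MB")
# The min times are rounded to 4 decimals in the trace, so the recomputed
# rates only agree approximately with the printed ones.
print(f"Copy  from min time 0.0332 s: {rate_mb_s(2, elements, 0.0332):.1f} MB/s")
print(f"Triad from min time 0.0436 s: {rate_mb_s(3, elements, 0.0436):.1f} MB/s")
```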


