I run a server with the following specs:
- Intel i7 920
- 8 GB RAM
- Linux 2.6.32-25-server #44-Ubuntu 10.04 SMP Fri Sep 17 21:13:39 UTC 2010 x86_64 GNU/Linux
- 75 Apache processes
- Low-end hardware RAID-1 with 2 disks
Historically, all our problems with scaling the service have been disk-bound, but we currently see higher load numbers than before, especially since updating to Ubuntu 10.04. The server handles around 50 requests per second. Swap is not used and should not be active. The MySQL dataset is a few gigabytes, but access should be fairly well optimized.
> top
top - 10:42:50 up 16 days, 18:49, 1 user, load average: 20.02, 16.17, 11.44
Tasks: 277 total, 4 running, 273 sleeping, 0 stopped, 0 zombie
Cpu0 : 38.6%us, 3.3%sy, 0.0%ni, 58.2%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 37.9%us, 3.3%sy, 0.0%ni, 58.8%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu2 : 25.9%us, 3.0%sy, 0.0%ni, 69.5%id, 1.3%wa, 0.0%hi, 0.3%si, 0.0%st
Cpu3 : 23.5%us, 2.0%sy, 0.0%ni, 67.9%id, 0.0%wa, 0.0%hi, 6.6%si, 0.0%st
Cpu4 : 16.4%us, 1.3%sy, 0.0%ni, 82.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu5 : 15.3%us, 1.3%sy, 0.0%ni, 83.4%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu6 : 14.3%us, 1.0%sy, 0.0%ni, 84.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu7 : 2.3%us, 0.6%sy, 0.0%ni, 97.1%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 8187668k total, 8117276k used, 70392k free, 178920k buffers
Swap: 4198968k total, 2084k used, 4196884k free, 6159328k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
32216 mysql 20 0 2026m 788m 4132 S 41 9.9 1292:40 mysqld
8104 www-data 20 0 491m 106m 95m S 4 1.3 1:57.62 apache2
27072 www-data 20 0 684m 112m 101m S 4 1.4 2:51.47 apache2
3391 www-data 20 0 683m 109m 98m S 4 1.4 2:22.29 apache2
16822 www-data 20 0 682m 114m 104m S 4 1.4 3:33.05 apache2
27068 www-data 20 0 555m 113m 102m S 4 1.4 2:53.77 apache2
27118 www-data 20 0 683m 119m 106m S 4 1.5 4:41.48 apache2
1036 www-data 20 0 685m 112m 100m S 3 1.4 2:27.24 apache2
3503 www-data 20 0 556m 81m 70m S 3 1.0 0:33.77 apache2
29803 www-data 20 0 682m 111m 101m S 3 1.4 2:47.09 apache2
1345 www-data 20 0 491m 115m 104m S 3 1.4 4:04.62 apache2
3001 www-data 20 0 379m 109m 98m S 3 1.4 2:13.36 apache2
[... 75 Apache processes with similar specs, but less CPU]
My question is: do you generally see any problems with the high load numbers? The response time has increased, but only by ~30%. Do the load numbers include disk activity to some extent? Do you have any comments on what I should focus on when optimizing? Thank you very much!
> iotop
Total DISK READ: 179.70 K/s | Total DISK WRITE: 1735.81 K/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
16512 be/4 mysql 0.00 B/s 22.94 K/s ?unavailable? mysqld
20701 be/4 mysql 0.00 B/s 0.00 B/s ?unavailable? mysqld
21556 be/4 mysql 0.00 B/s 22.94 K/s ?unavailable? mysqld
28998 be/4 www-data 0.00 B/s 3.82 K/s ?unavailable? apache2 -k start
12771 be/4 mysql 0.00 B/s 3.82 K/s ?unavailable? mysqld
16824 be/4 www-data 0.00 B/s 3.82 K/s ?unavailable? apache2 -k start
2700 be/4 mysql 0.00 B/s 7.65 K/s ?unavailable? mysqld
3074 be/4 mysql 22.94 K/s 0.00 B/s ?unavailable? mysqld
17585 be/4 mysql 0.00 B/s 15.29 K/s ?unavailable? mysqld
30723 be/4 mysql 7.65 K/s 0.00 B/s ?unavailable? mysqld
29906 be/4 www-data 0.00 B/s 3.82 K/s ?unavailable? apache2 -k start
29907 be/4 mysql 0.00 B/s 15.29 K/s ?unavailable? mysqld
13547 be/4 www-data 0.00 B/s 3.82 K/s ?unavailable? apache2 -k start
7444 be/4 www-data 0.00 B/s 3.82 K/s ?unavailable? apache2 -k start
1944 be/4 mysql 149.11 K/s 0.00 B/s ?unavailable? mysqld
16825 be/4 mysql 0.00 B/s 7.65 K/s ?unavailable? mysqld
32223 be/4 mysql 0.00 B/s 3.82 K/s ?unavailable? mysqld
7801 be/4 www-data 0.00 B/s 3.82 K/s ?unavailable? apache2 -k start
5808 be/4 mysql 0.00 B/s 11.47 K/s ?unavailable? mysqld
8104 be/4 www-data 0.00 B/s 3.82 K/s ?unavailable? apache2 -k start
18890 be/4 www-data 0.00 B/s 0.00 B/s ?unavailable? apache2 -k start
1 be/4 root 0.00 B/s 0.00 B/s ?unavailable? init
2 be/4 root 0.00 B/s 0.00 B/s ?unavailable? [kthreadd]
3 rt/4 root 0.00 B/s 0.00 B/s ?unavailable? [migration/0]
-
On Linux, the load average includes processes in uninterruptible sleep (which includes waiting on disk access). Your top output doesn't seem to indicate a lot of I/O wait time, however. Since top's percentages are averaged over the update interval, as a next step I would run top with a short update interval (maybe -d 0.1 or -d 0.5) and look for spikes in I/O wait that don't show up at the default polling frequency.
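To see how much of the load figure comes from tasks blocked in uninterruptible sleep rather than tasks that are runnable, one option is to count task states directly from /proc. The following is only a rough sketch: the function names are made up for illustration, it assumes a Linux /proc layout, and it looks at threads (under /proc/<pid>/task) because mysqld is multi-threaded.

```python
import glob
from collections import Counter


def parse_state(stat_line):
    """Return the one-letter state field from a /proc/<pid>/stat line.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split on the last ')' instead of on spaces.
    """
    return stat_line.rsplit(")", 1)[1].split()[0]


def count_task_states(pattern="/proc/[0-9]*/task/[0-9]*/stat"):
    """Count threads per state: R = runnable, D = uninterruptible sleep."""
    counts = Counter()
    for path in glob.glob(pattern):
        try:
            with open(path) as handle:
                counts[parse_state(handle.read())] += 1
        except (OSError, IndexError):
            # The task exited while we were scanning, or the line was odd.
            continue
    return counts


if __name__ == "__main__":
    states = count_task_states()
    print("runnable (R):", states.get("R", 0))
    print("uninterruptible (D):", states.get("D", 0))
```

Running this a few times per second during a busy period gives a rough idea of whether the load is made up mostly of R tasks (CPU contention) or D tasks (usually disk).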
From mark -
I'd personally worry about the high CPU usage of MySQL. top is only a snapshot, though; if you see the CPU for mysqld consistently pegged at around 50%, I would take some steps to find out why.
Load tends to grow exponentially. The time it took MySQL to reach 50% will be much longer than the time it takes to hit 100%.
symcbean : By default top shows a process's CPU usage as a percentage of a single CPU, so 41% really means it's only using about 5% of the total capacity (OK, in some cases the workload can't be shared across multiple CPUs, but it's certainly not enough to be constraining the performance)
From Evert -
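The per-CPU arithmetic in symcbean's comment above can be checked with a few lines. This is a small sketch using only numbers from the top output in the question (8 logical CPUs, Cpu0 to Cpu7, and their idle percentages); the function names are illustrative.

```python
def share_of_total(per_cpu_percent, logical_cpus):
    """Convert top's per-process %CPU (relative to one CPU) to a share of the whole box."""
    return per_cpu_percent / logical_cpus


def mean_busy(idle_percentages):
    """Average non-idle percentage across all CPUs."""
    return 100.0 - sum(idle_percentages) / len(idle_percentages)


# %id column for Cpu0..Cpu7, copied from the top output in the question.
IDLE = [58.2, 58.8, 69.5, 67.9, 82.3, 83.4, 84.7, 97.1]

if __name__ == "__main__":
    print("mysqld share of total CPU: %.3f%%" % share_of_total(41, 8))
    print("mean busy across CPUs: %.1f%%" % mean_busy(IDLE))
```

With mysqld at 41% of one CPU, its share of the machine comes to roughly 5%, and the CPUs are on average about three quarters idle in that snapshot.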
The standard system metrics (load, CPU, memory, etc.) are usually good indicators of how the performance of a system is constrained, but ultimately performance is all about how quickly the system can service a request. In practice it's a good idea to monitor these metrics and set thresholds, but they are only indicative of the actual performance of the system.
I think the architecture could be better. At a rough guess, the cost of the server you describe could have bought a 4 GB/dual-processor/RAID 1+(5/0) machine for the database and at least 2 low-spec machines to run the webservers on (I'm guessing there's mod_php or mod_perl in there somewhere too), which would probably be significantly faster.
Certainly it seems to be the mysqld process that's causing most of the pain here, but it looks like Apache is doing rather a lot of I/O. How much of your memory is getting used for I/O cache? The RSS for these Apache processes also looks high (the VIRT size too, but that's probably a consequence of the high RSS): approximately 10 times the value on the nearest LAMP box I could find.
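The I/O cache question can be answered from the Mem and Swap lines of the top output in the question: subtracting buffers and page cache from "used" leaves what applications actually hold. A minimal sketch of that arithmetic, with illustrative function names and the values (in KiB) copied from the question:

```python
def application_memory_kib(used, buffers, cached):
    """Memory held by processes, excluding kernel buffers and page cache."""
    return used - buffers - cached


def cache_fraction(total, buffers, cached):
    """Fraction of physical RAM currently used as buffers plus page cache."""
    return (buffers + cached) / total


# Values in KiB, taken from the Mem/Swap lines of the top output in the question.
TOTAL, USED, BUFFERS, CACHED = 8187668, 8117276, 178920, 6159328

if __name__ == "__main__":
    app = application_memory_kib(USED, BUFFERS, CACHED)
    print("application memory: %d KiB (~%.0f MiB)" % (app, app / 1024))
    print("buffers + cache: %.0f%% of RAM" % (100 * cache_fraction(TOTAL, BUFFERS, CACHED)))
```

On those figures, about 1.7 GiB is held by processes and roughly three quarters of RAM is serving as buffers and page cache, so the "70392k free" figure on its own is not a sign of memory pressure.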
I'd recommend following the usual recipe here, but looking at the mysql stuff first:
- MySQL: have you got slow query logging enabled? Have you analysed it to identify potential database optimizations?
- Have you run mysqltuner against your installation?
- HTTP caching: are you sending good caching information for static content? Disabling conditional requests?
- Why are your Apache processes so big? Do you really need all those modules?
- What's the range of RTTs to your users? Have you got compression enabled on static text/html content and for script output?
- If you're running a PHP site, have you got an opcode cache (e.g. APC, ionCube, Zend) running?
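For the slow query log step, a quick way to get started is to rank entries by query time. The sketch below is not a replacement for a proper tool such as mysqldumpslow; it assumes the usual "# Query_time: ... Lock_time: ... Rows_sent: ... Rows_examined: ..." header line, and the sample log text, table names and function name are all invented for illustration.

```python
import re

# Header line that precedes each statement in the slow query log (assumed format).
HEADER = re.compile(
    r"^# Query_time: (?P<qt>[\d.]+)\s+Lock_time: [\d.]+\s+"
    r"Rows_sent: \d+\s+Rows_examined: (?P<rows>\d+)"
)


def slowest_queries(log_text, top_n=3):
    """Return the top_n log entries sorted by query time, slowest first."""
    entries = []
    current = None
    for line in log_text.splitlines():
        match = HEADER.match(line)
        if match:
            current = {
                "query_time": float(match.group("qt")),
                "rows_examined": int(match.group("rows")),
                "sql": "",
            }
            entries.append(current)
        elif current is not None and line and not line.startswith("#"):
            if line.startswith("SET timestamp="):
                continue  # bookkeeping line, not part of the query itself
            current["sql"] = (current["sql"] + " " + line.strip()).strip()
    return sorted(entries, key=lambda e: e["query_time"], reverse=True)[:top_n]


# Hypothetical sample, made up for this example.
SAMPLE_LOG = """\
# User@Host: app[app] @ localhost []
# Query_time: 1.100000  Lock_time: 0.000050 Rows_sent: 1  Rows_examined: 1200
SET timestamp=1285929771;
SELECT id FROM users WHERE email = 'x';
# User@Host: app[app] @ localhost []
# Query_time: 4.200000  Lock_time: 0.000100 Rows_sent: 10  Rows_examined: 250000
SET timestamp=1285929770;
SELECT * FROM orders WHERE status = 'open';
"""

if __name__ == "__main__":
    for entry in slowest_queries(SAMPLE_LOG):
        print(entry["query_time"], entry["rows_examined"], entry["sql"])
```

Entries with a high rows-examined count relative to rows sent are typically the first candidates for an index.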
HTH
From symcbean