Load average on system

hostingdivine

Verified User
Joined
Jan 4, 2011
Messages
17
The load average on my system is increasing every day. How can I decrease it?

RAM: 512 MB
Processor: AMD Phenom X6
Currently 25 WordPress sites online.

This is today's warning message:
This is an automated message notifying you that the 5 minute load average on your system is 14.25.
This has exceeded the 10 threshold.

One Minute - 26.85
Five Minutes - 14.25
Fifteen Minutes - 6.56

top - 10:50:01 up 4:00, 0 users, load average: 26.85, 14.25, 6.56
Tasks: 125 total, 1 running, 124 sleeping, 0 stopped, 0 zombie
Cpu(s): 1.1%us, 0.4%sy, 0.0%ni, 90.4%id, 8.1%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 2097152k total, 1616972k used, 480180k free, 0k buffers
Swap: 0k total, 0k used, 0k free, 0k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
18191 admin 16 0 117m 45m 6012 D 2.0 2.2 0:00.80 php-cgi
28531 admin 16 0 117m 43m 6308 D 2.0 2.1 0:00.42 php-cgi
31787 admin 16 0 118m 46m 6012 D 2.0 2.3 0:01.53 php-cgi
31788 admin 16 0 118m 46m 6012 D 2.0 2.3 0:01.62 php-cgi
1 root 18 0 10352 740 620 S 0.0 0.0 0:00.18 init
1338 dovecot 18 0 45724 2724 2096 S 0.0 0.1 0:00.00 pop3-login
1770 root 15 0 1252 320 264 S 0.0 0.0 0:00.05 da-popb4smtp
1788 nobody 15 0 48756 1512 924 S 0.0 0.1 0:00.27 directadmin
1797 mail 15 0 56888 1196 616 S 0.0 0.1 0:00.00 exim
1872 dovecot 18 0 45724 2716 2096 S 0.0 0.1 0:00.00 pop3-login
1903 apache 15 0 66488 5552 1508 S 0.0 0.3 0:00.18 httpd
1926 apache 16 0 66936 5772 1516 S 0.0 0.3 0:00.19 httpd
3315 root 15 0 200m 57m 2952 S 0.0 2.8 0:01.06 spamd
3338 root 18 0 200m 55m 672 S 0.0 2.7 0:00.00 spamd
3342 root 18 0 200m 55m 584 S 0.0 2.7 0:00.00 spamd
3355 root 18 0 130m 26m 1588 S 0.0 1.3 0:04.10 lfd
3408 root 15 0 65984 5728 2148 S 0.0 0.3 0:00.08 httpd
3557 ftp 16 0 17356 1136 508 S 0.0 0.1 0:00.00 proftpd
3565 root 15 0 74804 1172 592 S 0.0 0.1 0:00.01 crond
5365 admin 16 0 118m 46m 6012 D 0.0 2.3 0:01.43 php-cgi
5376 admin 16 0 118m 46m 6012 D 0.0 2.3 0:01.43 php-cgi
5378 admin 17 0 118m 46m 6012 D 0.0 2.3 0:01.42 php-cgi
5394 apache 15 0 66520 5472 1488 S 0.0 0.3 0:00.01 httpd
 
WordPress can be extremely resource intensive, especially if sites get a lot of hits.

How are you running PHP? mod_php, suPHP, or mod_ruid2?

You should also have a swap partition, though it doesn't look as if that's the problem here. If necessary you can create and use a swap file.
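For reference, creating a swap file usually comes down to four commands. The sketch below only prints them for review rather than running them. The path /swapfile and the 1024 MB size are placeholder assumptions, not values from this thread, and on some container-based VPS platforms swapon is not permitted at all.

```shell
# Print (do not run) the usual commands for creating and enabling a swap file.
# Arguments: PATH (e.g. /swapfile) and SIZE_MB; both are illustrative assumptions.
swapfile_commands() {
    path="$1"
    size_mb="$2"
    echo "dd if=/dev/zero of=$path bs=1M count=$size_mb"   # allocate the file
    echo "chmod 600 $path"                                  # restrict access to root
    echo "mkswap $path"                                     # write the swap signature
    echo "swapon $path"                                     # enable it immediately
}

swapfile_commands /swapfile 1024
```

After checking the printed commands, run them as root. The swap file will not be re-enabled after a reboot unless it is also listed in /etc/fstab.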

Post the output of:
Code:
$ cat /proc/cpuinfo
and
Code:
$ cat /proc/meminfo
Jeff
 
@jlasman

18191 admin 16 0 117m 45m 6012 D 2.0 2.2 0:00.80 php-cgi

It seems to be suPHP.

Ram 512mb
processor: AMD Phenom X6

So it seems to be a VPS, probably OpenVZ, thus swap cannot be used.

@hostingdivine,

If you switch to mod_ruid2, it might decrease the load average. Additionally, you might want to install caching plugins for the CMS, disable unnecessary system daemons, and tune the MySQL settings.
 
My opinion is that it's a very overloaded or oversold box. VPSes are deceiving: they're really only glorified shared hosting, and can be even worse depending on how much the host oversells.
 
That's the direction I was going in, but I prefer to be sure before I respond. I believe it's an oversold VPS.

And I've learned from experience that suPHP isn't easy on system resources, even on boxes with a dedicated 3 GHz P4 and 2 GB of memory. Years ago I tried a box with these specs with ten WordPress sites, and it killed the box.

Jeff
 
I disagree about the oversold VPS theory ...

I sell VPS servers, and none of my boxes are oversold ... but in the past 2 months, unauthorized usage of exim has affected at least 70% of my VPS and dedicated boxes ... load averages spike into the triple digits, making just logging into the box difficult.

I am still at a loss how to control this, but the issue does seem to be with people gaining access to exim.

One thing that seems to help a bit, is limiting the DOVECOT children for imap-login and pop3-login to 3 (instead of default SIXTEEN @%#$ing children?!?)

But I am still trying to figure out how to deal with this on a very large number of servers ...

EDIT: Interestingly enough, and maybe worth pointing out for anybody else dealing with this issue, none of our cPanelX boxes/VPSes seem to be affected by this.
 
They said it could be oversold or overloaded, not necessarily both. If your load averages are that high while, based on your snapshot, your VPS was 90% idle, I would say one or more of the other VPSes were causing the spike.
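
A note on why load can be high while the CPU is mostly idle: on Linux the load average counts processes in uninterruptible sleep (state D, usually waiting on disk I/O) as well as runnable ones, and most of the php-cgi processes in the snapshot are in state D. A minimal sketch for tallying process states from inside the VPS follows; the function name count_states is made up for illustration.

```shell
# Tally process states from input that has one state code per line,
# such as the output of `ps -eo state=`.
# Only the first character is used, so modifiers like "Ss" or "D<" are grouped
# under their base state.
count_states() {
    awk '{ n[substr($1, 1, 1)]++ } END { for (s in n) print s, n[s] }' | sort
}

# Usage on a live system:
#   ps -eo state= | count_states
# D = uninterruptible sleep, R = runnable, S = sleeping, Z = zombie.
```

A large and persistent D count inside the VPS would point at slow disk I/O, which may or may not originate from this VPS.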

When you say "people gaining access to exim" what do you mean? They figured out how to use exim to spam (i.e. figured out username/password)? If you are using one of the spamblocker exim.conf files, then it is pretty secure, they shouldn't be able to gain access to exim.

Are you using the latest version of exim from custombuild?

I would work on figuring out how they are gaining access to exim, and plugging that hole.

Exim shouldn't be able to cause that much load. Are they also using SpamAssassin? I can see that causing a huge spike, but that would be on incoming mail, not outgoing.

Perhaps you should dig into some of the exim logs to see how these users are calling exim. From there it should be a little easier to find the source of the problem. It could be some vulnerable PHP/Perl script that is installed.
 
I am getting hit with these issues as well. Here is my output from the following commands:

Code:
cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 4
model name      :               Intel(R) Pentium(R) 4 CPU 3.20GHz
stepping        : 3
cpu MHz         : 3200.151
cache size      : 2048 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 1
apicid          : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl est cid cx16 xtpr
bogomips        : 6400.30
clflush size    : 64
cache_alignment : 128
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 15
model           : 4
model name      :               Intel(R) Pentium(R) 4 CPU 3.20GHz
stepping        : 3
cpu MHz         : 3200.151
cache size      : 2048 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 1
apicid          : 1
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl est cid cx16 xtpr
bogomips        : 6399.04
clflush size    : 64
cache_alignment : 128
address sizes   : 36 bits physical, 48 bits virtual
power management:

and

Code:
cat /proc/meminfo
MemTotal:      2049808 kB
MemFree:       1021768 kB
Buffers:         28052 kB
Cached:         348380 kB
SwapCached:      35712 kB
Active:         489080 kB
Inactive:       349980 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:      2049808 kB
LowFree:       1021768 kB
SwapTotal:     4194296 kB
SwapFree:      3977976 kB
Dirty:            3224 kB
Writeback:           0 kB
AnonPages:      447868 kB
Mapped:          28184 kB
Slab:           151284 kB
PageTables:      15516 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:   5219200 kB
Committed_AS:  1479236 kB
VmallocTotal: 34359738367 kB
VmallocUsed:    265644 kB
VmallocChunk: 34359472119 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     2048 kB
 
By "gaining access to exim" I mean that I receive warnings saying "[user] has sent XXXXX emails yesterday", but when I ask the client, they tell me they are not sending ANY emails ... :confused:

Also, when I log into DA and view the process monitor, it shows DOZENS of instances of dovecot children, of either the imap-login or pop3-login flavor ...

Reducing the minimum number of dovecot children (in /etc/dovecot.conf) to 3 of each type seems to be a band-aid solution at first, but then within 24-48 hours the load spikes all over again, and completely disabling dovecot and exim seems to be the only way to control it. :(
 
Dovecot children are normal; their number depends on how many people are connected to their IMAP/POP3 accounts. Dovecot also keeps a number of processes waiting to handle connections, but it has nothing to do with sending emails, only reading them.

The only way to verify that the customer has actually sent XXXXX number of emails, is to review the log files (/var/log/exim/mainlog to be exact). You should be able to see all emails sent. Pick a customer that you received one of those messages about, and grep all the lines in the exim log relating to that unix userid.

If you see a lot of entries that look similar to this:
cwd=/home/$UNIXUSER/domains/$DOMAIN/public_html 5 args: /usr/sbin/sendmail -t -i -f $VUSER@$DOMAIN

Then something from a website for the user $UNIXUSER is causing it. Of course, this also assumes you have the following in all your users' httpd.conf files:
php_admin_value sendmail_path '/usr/sbin/sendmail -t -i -f $UNIXUSER@$DOMAIN'

If you see a lot of sendmail calls in the log file that don't match the above, then they may have a script running on the server, pumping out the spam, and you need to find it and then patch the hole that allowed them in.
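
The log review described above can be started with a quick tally of which directories the sendmail calls come from. This is only a sketch: count_cwd is a name made up here, and it assumes the main log contains cwd= lines like the example above (Exim writes them only when its log_selector includes arguments).

```shell
# Count "cwd=" entries per directory in an Exim main log read from stdin,
# busiest directory first.
count_cwd() {
    grep -o 'cwd=[^ ]*' | sort | uniq -c | sort -rn
}

# Usage (log path as given in the post above):
#   count_cwd < /var/log/exim/mainlog | head
```

A directory under one user's public_html that dominates the list is a likely place to look for the script that is sending the mail.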
 
Someone is sending email as that username, from the server. And it's not exim. Exim is just doing what it's supposed to be doing.

Disable relay from 127.0.0.1 if it isn't already disabled, but read up on how it might affect your users first.

Depending on your Terms of Service and your customer relations policy, either block all outgoing email from them until they find and fix it, or find and fix it for them.

Jeff
 
>> Disable relay from 127.0.0.1 if it isn't already disabled, but read up on how it might affect your users first.


Could you - or somebody else - please point me to where I can read up on how to do this? Both searches here and Google are very inconclusive .. (search term: disable open relay) :(

I have a great number of servers with all the same issue, so the quicker I can eradicate it the better ...

THANX FOR ANY HELP!!
 
Hello guys,
I am very sorry for not responding here after my post and your replies.

I configured CSF on my VPS after I posted here, and everything seemed to be going OK. But today I got the warning message five times in my DirectAdmin admin panel. One of them is below.

I can't understand what actually happened here. There wasn't much bandwidth usage today.


********************
This is an automated message notifying you that the 5 minute load average on your system is 68.44.
This has exceeded the 10 threshold.

One Minute - 77.04
Five Minutes - 68.44
Fifteen Minutes - 49.6

top - 08:37:37 up 11 days, 16:33, 0 users, load average: 60.97, 63.64, 53.12
Tasks: 251 total, 3 running, 225 sleeping, 0 stopped, 23 zombie
Cpu(s): 0.4%us, 0.1%sy, 0.0%ni, 98.1%id, 1.4%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 2097152k total, 1474292k used, 622860k free, 0k buffers
Swap: 0k total, 0k used, 0k free, 0k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
26130 admin 15 0 87560 13m 5532 S 0.1 0.6 0:00.03 /usr/local/php5/bin/php-cgi
9647 admin 16 0 94308 19m 5512 D 0.0 1.0 0:00.17 /usr/local/php5/bin/php-cgi
26219 root 18 0 127m 9860 544 D 0.0 0.5 0:00.02 lfd - (child) process tracking...
1723 admin 16 0 95364 17m 5516 D 0.0 0.9 0:00.16 /usr/local/php5/bin/php-cgi
7527 admin 18 0 141m 14m 5812 D 0.0 0.7 0:00.10 /usr/local/php5/bin/php-cgi
7922 jainfara 18 0 88504 13m 5512 D 0.0 0.6 0:00.06 /usr/local/php5/bin/php-cgi
8028 admin 18 0 87960 10m 5512 D 0.0 0.5 0:00.06 /usr/local/php5/bin/php-cgi
13542 admin 18 0 141m 15m 5860 D 0.0 0.8 0:00.07 /usr/local/php5/bin/php-cgi
13880 admin 17 0 94308 20m 5556 D 0.0 1.0 0:00.16 /usr/local/php5/bin/php-cgi
14173 admin 16 0 98016 23m 5752 D 0.0 1.1 0:00.12 /usr/local/php5/bin/php-cgi
21767 admin 18 0 89284 14m 5624 D 0.0 0.7 0:00.07 /usr/local/php5/bin/php-cgi
22440 admin 16 0 94308 20m 5556 D 0.0 1.0 0:00.08 /usr/local/php5/bin/php-cgi
23740 root 18 0 12740 1280 840 R 0.0 0.1 0:00.02 /usr/bin/top -c -b -n 1
24152 admin 18 0 94308 20m 5556 D 0.0 1.0 0:00.08 /usr/local/php5/bin/php-cgi
24403 root 18 0 127m 10m 1500 D 0.0 0.5 0:06.13 lfd - scanning log files
25840 jainfara 18 0 83880 9140 4820 D 0.0 0.4 0:00.01 /usr/local/php5/bin/php-cgi
1 root 18 0 10352 628 600 S 0.0 0.0 0:00.13 init [3]
1432 dovecot 16 0 0 0 0 Z 0.0 0.0 0:00.00 [pop3-login] <defunct>
1595 root 18 0 99952 1096 896 S 0.0 0.1 0:00.00 crond
1644 apache 18 0 66256 2704 1448 S 0.0 0.1 0:00.00 /usr/sbin/httpd -k start -DSSL
1664 apache 15 0 66260 2628 1412 S 0.0 0.1 0:00.02 /usr/sbin/httpd -k start -DSSL
1665 apache 18 0 66260 2676 1408 S 0.0 0.1 0:00.00 /usr/sbin/httpd -k start -DSSL
1669 apache 18 0 66260 2588 1396 S 0.0 0.1 0:00.01 /usr/sbin/httpd -k start -DSSL
 
That load is most likely not coming from your VPS but another one on the same server. Not much you can do from your VPS, but you can notify the admin of the host server and they can do some investigation.
 
The hardware node might be oversold. In such cases there is little you can do without moving to another node or VPS provider.

Anyway, I would not say that's the only possible reason for your load. You might need to do some investigation on your VPS, and if everything is fine there, then you should contact your VPS provider, or even start looking for another one.
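
As a first step in that investigation, it can help to put the load figure in proportion to the number of CPUs the VPS sees. A rough sketch, with load_per_cpu being a made-up helper name; a value well above 1 alongside a mostly idle CPU suggests processes are waiting on something other than CPU time, typically disk I/O.

```shell
# Divide a load average by a CPU count and print the result with two decimals.
load_per_cpu() {
    awk -v load="$1" -v ncpu="$2" 'BEGIN { printf "%.2f\n", load / ncpu }'
}

# Usage on a live Linux system:
#   load_per_cpu "$(cut -d' ' -f1 /proc/loadavg)" "$(grep -c '^processor' /proc/cpuinfo)"
```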
 