"The system load average is 11.7" please help ASAP

FileSick

Hello,

My server was just down for two hours. When I talked to the hosting company's support team, they rebooted the server and everything worked without problems afterwards. The first thing I did was open my DirectAdmin account, where I found the system message below. Could you please explain what it means, what happened, and how to fix it?


Subject: Warning: The system load average is 11.7 Today at 18:52

This is an automated message notifying you that the 5 minute load average on your system is 11.7.
This has exceeded the 10 threshold.

One Minute - 20.89
Five Minutes - 11.7
Fifteen Minutes - 4.89

top - 18:52:14 up 6 days, 16:56, 0 users, load average: 19.94, 11.65, 4.91
Tasks: 249 total, 2 running, 246 sleeping, 0 stopped, 1 zombie
Cpu(s): 0.8%us, 0.6%sy, 0.0%ni, 97.9%id, 0.7%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 8054336k total, 7950908k used, 103428k free, 4048k buffers
Swap: 10288124k total, 3746096k used, 6542028k free, 7298588k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
79 root 20 0 0 0 0 D 3.7 0.0 5:28.27 [kswapd0]
16252 apache 20 0 2155m 1.0g 1.0g D 2.8 13.6 0:17.83 /usr/sbin/httpd -k start -DSSL
16271 apache 20 0 2155m 1.0g 1.0g D 2.8 13.1 0:17.80 /usr/sbin/httpd -k start -DSSL
16141 root 20 0 0 0 0 S 2.3 0.0 0:07.79 [flush-8:0]
16175 apache 20 0 2155m 704m 699m D 1.9 9.0 0:21.00 /usr/sbin/httpd -k start -DSSL
48 root 20 0 0 0 0 S 0.9 0.0 0:23.98 [kblockd/2]
16142 apache 20 0 2155m 689m 684m D 0.9 8.8 0:20.48 /usr/sbin/httpd -k start -DSSL
16176 apache 20 0 2155m 693m 688m D 0.9 8.8 0:21.22 /usr/sbin/httpd -k start -DSSL
1326 TheMagic 20 0 306m 1344 896 S 0.5 0.0 17:47.14 /usr/bin/transmission-daemon -g /home/TheMagicfs/.config/transmission
16456 apache 20 0 0 0 0 Z 0.5 0.0 0:00.01 [httpd] <defunct>
1 root 20 0 19356 500 328 S 0.0 0.0 0:00.52 /sbin/init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.01 [kthreadd]
3 root RT 0 0 0 0 S 0.0 0.0 0:01.14 [migration/0]
4 root 20 0 0 0 0 S 0.0 0.0 0:19.20 [ksoftirqd/0]
5 root RT 0 0 0 0 S 0.0 0.0 0:00.00 [migration/0]
6 root RT 0 0 0 0 S 0.0 0.0 0:00.99 [watchdog/0]
7 root RT 0 0 0 0 S 0.0 0.0 0:00.80 [migration/1]
8 root RT 0 0 0 0 S 0.0 0.0 0:00.00 [migration/1]
9 root 20 0 0 0 0 S 0.0 0.0 0:04.88 [ksoftirqd/1]
10 root RT 0 0 0 0 S 0.0 0.0 0:00.85 [watchdog/1]
11 root RT 0 0 0 0 S 0.0 0.0 0:00.32 [migration/2]
12 root RT 0 0 0 0 S 0.0 0.0 0:00.00 [migration/2]
13 root 20 0 0 0 0 S 0.0 0.0 0:07.32 [ksoftirqd/2]


================================
Automated Message Generated by DirectAdmin


Please help me with this. From what I understood, which isn't much, "This has exceeded the 10 threshold" means I'm getting too much load on the server. This isn't the first time, so how do I increase this threshold to at least 30, or make it unlimited? Please help ASAP, this is happening frequently.

Waiting for your response.
 
Guys, please help me ASAP. I think it's because of the cloud hosting site I'm running, which has no limits set. What limits should I set, and how high? For example, the maximum file size and the upload size are both unlimited. The server has been down for over 5 hours now, and every time I reboot it, it goes offline again.

Waiting for your response.
 
The server went down again and I got this message. Please help me.

This is an automated message notifying you that the 5 minute load average on your system is 12.92.
This has exceeded the 10 threshold.

One Minute - 23.2
Five Minutes - 12.92
Fifteen Minutes - 5.78

top - 21:36:52 up 1:10, 0 users, load average: 24.74, 13.95, 6.27
Tasks: 250 total, 5 running, 245 sleeping, 0 stopped, 0 zombie
Cpu(s): 1.0%us, 1.1%sy, 0.0%ni, 91.4%id, 6.4%wa, 0.0%hi, 0.1%si, 0.0%st
Mem: 8054336k total, 7946924k used, 107412k free, 13228k buffers
Swap: 10288124k total, 4275028k used, 6013096k free, 7367488k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1702 mysql 20 0 2073m 68m 3264 D 52.9 0.9 2:25.01 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/plugin --user=mysql --log-error=/var/lib/mysql/server.filesick.com.err --pid-file=/var/lib/mysql/server.filesick.com.pid
3473 apache 20 0 2155m 124m 116m D 11.4 1.6 0:28.11 /usr/sbin/httpd -k start -DSSL
3483 apache 20 0 2155m 108m 99m D 9.3 1.4 0:25.72 /usr/sbin/httpd -k start -DSSL
3864 apache 20 0 204m 11m 4184 D 6.9 0.1 0:02.09 /usr/sbin/httpd -k start -DSSL
3874 apache 20 0 204m 11m 4184 D 6.3 0.1 0:01.93 /usr/sbin/httpd -k start -DSSL
3968 apache 20 0 208m 13m 3084 D 6.0 0.2 0:00.51 /usr/sbin/httpd -k start -DSSL
3811 apache 20 0 208m 13m 3092 D 5.7 0.2 0:00.82 /usr/sbin/httpd -k start -DSSL
3986 apache 20 0 208m 13m 2920 D 5.4 0.2 0:00.46 /usr/sbin/httpd -k start -DSSL
3500 apache 20 0 2155m 107m 98m D 5.1 1.4 0:26.35 /usr/sbin/httpd -k start -DSSL
3857 apache 20 0 208m 13m 2916 D 5.1 0.2 0:02.00 /usr/sbin/httpd -k start -DSSL
3495 apache 20 0 2155m 110m 102m D 4.2 1.4 0:25.40 /usr/sbin/httpd -k start -DSSL
3812 apache 20 0 208m 13m 3068 D 4.2 0.2 0:01.75 /usr/sbin/httpd -k start -DSSL
3821 apache 20 0 208m 13m 3084 D 4.2 0.2 0:00.82 /usr/sbin/httpd -k start -DSSL
3850 apache 20 0 208m 14m 3236 D 4.2 0.2 0:00.79 /usr/sbin/httpd -k start -DSSL
3957 apache 20 0 208m 13m 3076 D 4.2 0.2 0:00.90 /usr/sbin/httpd -k start -DSSL
3482 apache 20 0 2155m 112m 104m D 3.9 1.4 0:26.48 /usr/sbin/httpd -k start -DSSL
3840 apache 20 0 208m 14m 3320 D 3.9 0.2 0:00.82 /usr/sbin/httpd -k start -DSSL
3883 apache 20 0 204m 11m 4180 D 3.9 0.1 0:01.93 /usr/sbin/httpd -k start -DSSL
79 root 20 0 0 0 0 D 3.0 0.0 0:18.43 [kswapd0]
3843 apache 20 0 208m 14m 3084 D 3.0 0.2 0:00.43 /usr/sbin/httpd -k start -DSSL
3656 root 20 0 0 0 0 S 2.7 0.0 0:05.28 [flush-8:0]
3964 apache 20 0 208m 14m 3084 D 2.4 0.2 0:00.34 /usr/sbin/httpd -k start -DSSL
3967 apache 20 0 204m 11m 3908 D 2.1 0.1 0:00.28 /usr/sbin/httpd -k start -DSSL

What should I do?
 
It could be a DDoS attack or something.
Try to have a look at http://yourdomain.com/server-status and /server-info. Maybe at this moment there is nothing to see.

Install the CSF/LFD firewall and also edit /etc/httpd/conf/extra/httpd-info.conf:
set ExtendedStatus On (that is, remove the comment mark in front of it).
In addition, allow localhost and your own IP in the server-status and server-info sections, then restart Apache.

If it happens again, CSF will also notify you by mail, 3 mails in total: one with ps output, one with a link to server-status, and one more.
There's a good chance you can find out that way which site is causing the trouble.
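
Once server-status is reachable, Apache's mod_status also offers a machine-readable variant at /server-status?auto, which is easier to check from a script than the HTML page. What follows is only a minimal sketch in Python: the sample text is made up for illustration (it is not from this server), and the function name parse_status is hypothetical.

```python
def parse_status(text):
    """Parse the 'Key: value' lines of mod_status ?auto output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:  # skip lines that are not 'Key: value'
            fields[key.strip()] = value.strip()
    return fields


# Made-up sample for illustration; real output has more fields.
SAMPLE = """Total Accesses: 1520
BusyWorkers: 148
IdleWorkers: 2
"""

status = parse_status(SAMPLE)
busy = int(status["BusyWorkers"])
idle = int(status["IdleWorkers"])
busy_share = busy / (busy + idle)
print(f"{busy} busy, {idle} idle ({busy_share:.0%} of workers busy)")
```

If nearly all workers are busy while the CPU is mostly idle, that would point towards requests piling up (an attack, or slow disk and swap) rather than towards heavy computation.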
 
I just enabled CSF and LFD, because I had disabled them a long time ago, and now everything has gone offline. I can't access DirectAdmin anymore. What's the problem now?
 
Provide a URL or an IP so we can check; the firewall has probably blocked you somehow.

Regarding the high load, there are two main possibilities:

1 - your server is under a DDoS attack, so check server-status (you need to enable it first to be able to check it)
2 - your server's performance is not sufficient for the load coming from web users

Regards
 
The website is FileNerd.com. Since I enabled CSF and LFD, all the websites are working properly except DirectAdmin, which I can't access. I don't know why, but I think you're right about the firewall blocking me, because the first email I received was about blocking an Egyptian IP, and that was my IP. I rebooted the router and still can't access it, so what should I do?

As for the high load, how can I be sure? I run a cloud hosting website, so it could easily be the users.

Waiting for your response.
 
Can you still access the server using SSH? Maybe you didn't open the DirectAdmin port in your firewall.

What are the VPS specs? You can check with server-status whether the load is caused by a DDoS attack or by valid visitors.

Also, please don't send me PMs to notify me about thread updates. The forum sends automatic mail, and once I have time I log in and reply. There is no need for PMs unless you want to hire me to work on your server.

Regards
 
Yes, I can access it through SSH. That's the only thing working besides the websites; FTP and DirectAdmin are not.

So do you think I can do something using SSH?

Waiting for your response.
 
Definitely, you can do everything using SSH.

You can:

1 - Disable CSF using the command "csf -x" so you can log into DirectAdmin and set up CSF correctly
2 - Edit the CSF configuration file /etc/csf/csf.conf to open the needed ports and then restart CSF using "csf -r"
3 - Add your IP to the whitelist file /etc/csf/csf.allow and then restart CSF using "csf -r" so you can log into DA and fix the CSF configuration
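
To illustrate option 2: in csf.conf the allowed incoming TCP ports are a comma-separated list on the TCP_IN line. The Python sketch below adds a port to that list if it is missing. It is only an illustration under assumptions: the config excerpt is shortened and made up, the helper name open_tcp_port is hypothetical, and 2222 is used because it is DirectAdmin's default port (yours may differ).

```python
import re


def open_tcp_port(conf_text, port):
    """Return conf_text with `port` added to the TCP_IN list if missing."""
    def add_port(match):
        ports = [p for p in match.group(2).split(",") if p]
        if str(port) not in ports:
            ports.append(str(port))
        return f'{match.group(1)}"{",".join(ports)}"'

    # Matches a line such as: TCP_IN = "20,21,22,80,443"
    return re.sub(r'^(TCP_IN\s*=\s*)"([^"]*)"', add_port, conf_text, flags=re.M)


# Shortened, made-up excerpt; the real file is /etc/csf/csf.conf.
SAMPLE_CONF = 'TCP_IN = "20,21,22,80,443"\nTCP_OUT = "20,21,22,80,443"\n'

print(open_tcp_port(SAMPLE_CONF, 2222))
```

The same change can of course be made by hand in a text editor; either way CSF only picks it up after "csf -r".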

Regards
 
Thank you so much for your fast response. I now have CSF disabled. Could you please tell me in detail how to do steps 2 and 3?

Waiting for your response.
 
Steps 2 and 3 can be done via the admin section of DA, where the CSF plugin is present and lets you edit the options.
Or you can log in via SSH and edit the files manually.
 
Thank you so much, Richard, for your fast response. I'd prefer to do these steps from DA because it's easier there, but I don't know exactly what to do with CSF. Could you please tell me how to do it?

Waiting for your response.
 
Thank you, guys. I edited the ports and now I can access DirectAdmin with CSF enabled.

Now, how can I tell whether the high system load was a DDoS attack or just the users?

Waiting for your response.
 
If it's a DDoS attack then CSF+LFD will go a long way towards stopping it. It may very well be that your users are using a lot of resources.

But what looks suspicious to me is the high use of swap memory. You're using over 3.5GB of swap memory (the top output shows 3746096k used out of 10288124k). Swap memory isn't in RAM, it's on your hard drive. No matter how fast your hard drive is, it's slow enough to cause huge server loads while memory is moved between the drive and RAM (where it can be used). And the output of the top command (that's what your paste-in shows us) does show the swap daemon (kswapd0, the kernel thread handling swap) at the top of the list.

Do you know what the server load is? It's the average number of processes running or waiting to run over a given period. On Linux it also counts processes in uninterruptible sleep (state D in top), which is where processes sit while that memory is being transferred.
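
To put the numbers in perspective, a load average is usually read relative to the number of CPUs. The Python sketch below pulls the three averages out of the first line of the top output posted earlier in the thread. The CPU count of 4 is purely an assumption (the thread never states it), and parse_load is a hypothetical helper name.

```python
import re


def parse_load(top_line):
    """Extract the 1, 5 and 15 minute load averages from top's first line."""
    match = re.search(r"load average: ([\d.]+), ([\d.]+), ([\d.]+)", top_line)
    if match is None:
        raise ValueError("no load average found")
    return tuple(float(x) for x in match.groups())


line = "top - 18:52:14 up 6 days, 16:56, 0 users, load average: 19.94, 11.65, 4.91"
one, five, fifteen = parse_load(line)

cpus = 4  # assumption: the thread never states the CPU count
print(f"5 min load per CPU: {five / cpus:.2f}")
print("rising" if one > five > fifteen else "not rising")
```

A one minute figure well above the fifteen minute figure, as here, suggests the load was still climbing when the mail was sent.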

Unfortunately there may be no easy way to clear the swap memory, and it could be caused by a DoS or DDoS attack, by misconfiguration, or even by heavy use by your users. You can shut down processes one by one in the hope that this will clear swap, but often the best thing to do is to restart the server. Then watch for a while, using a shell connection, to see how quickly it goes back up.

Jeff
 
I really don't know. I just restarted it and it seems like nothing changed; it's the same. What do you think?
 
Immediately after you restart the server it shouldn't be using any swap memory, so look more carefully at the output of the top command; post whatever appears in these lines (will be different; this is copied from your post above):
Code:
top - 18:52:14 up 6 days, 16:56, 0 users, load average: 19.94, 11.65, 4.91
Tasks: 249 total, 2 running, 246 sleeping, 0 stopped, 1 zombie
Cpu(s): 0.8%us, 0.6%sy, 0.0%ni, 97.9%id, 0.7%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 8054336k total, 7950908k used, 103428k free, 4048k buffers
Swap: 10288124k total, 3746096k used, 6542028k free, 7298588k cached
Jeff
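
For anyone who wants to read those Mem and Swap lines with a script instead of by eye, here is a small Python sketch. It uses the Swap line copied from the first post; the helper name parse_kb_fields is hypothetical. Note that in this older top layout the "cached" figure at the end of the Swap line is, as far as I know, page cache held in RAM and not swap space.

```python
import re


def parse_kb_fields(line):
    """Map labels such as 'used' to their kilobyte values on a Mem/Swap line."""
    return {label: int(kb) for kb, label in re.findall(r"(\d+)k (\w+)", line)}


swap = parse_kb_fields(
    "Swap: 10288124k total, 3746096k used, 6542028k free, 7298588k cached"
)
used_gib = swap["used"] / 1024 ** 2  # kilobytes to GiB
used_share = swap["used"] / swap["total"]
print(f"swap used: {used_gib:.2f} GiB ({used_share:.0%} of total)")
```

Running the same function on the lines posted right after a reboot makes it easy to compare how quickly swap use grows.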
 
You are right, I probably misread it, so sorry. I think it really was a DDoS attack, because since I enabled CSF everything has been fine, and it keeps blocking a lot of IPs from China as well as Hong Kong. So everything is fine now. Thank you so much for your great help and time.
 
If you find out which website is being attacked, and it doesn't have any visitors or customers from China, you could consider using a Chinese blocklist for that website.

I'm using it (it's for .htaccess) for a couple of forums, and it has now been extended to a lot more Asian countries, Russia, Vietnam and Brazil, and the number of attackers and spammers was greatly reduced by this.
http://www.wizcrafts.net/chinese-blocklist.html

I prefer the .htaccess version, but there's also one for iptables on his site. From within CSF you can also block by country. But I wouldn't advise using iptables for this, as it's server-wide and all the iptables lines could take up memory.
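
Whatever tool applies it, a blocklist like this comes down to checking whether a client address falls inside one of the listed network ranges. A minimal Python sketch of that check follows. The ranges are documentation-only examples (RFC 5737), not entries from the linked blocklist, and is_blocked is a hypothetical name.

```python
import ipaddress

# Hypothetical blocklist using documentation-only ranges (RFC 5737).
BLOCKLIST = [ipaddress.ip_network(n) for n in ("203.0.113.0/24", "198.51.100.0/24")]


def is_blocked(ip):
    """Return True if `ip` falls inside any blocklisted network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in BLOCKLIST)


print(is_blocked("203.0.113.45"))
print(is_blocked("192.0.2.10"))
```

Every request has to be compared against the whole list, which is one reason a very long list applied server-wide can cost more than one applied to a single site.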
 