CentOS DirectAdmin freezes at arbitrary moments

jhertum

Hi. Somehow my DA installation sometimes freezes. The server is not busy at all. There are a few small sites on it.

We work with two network adapters. On one adapter we allow http / ftp and on the other one we allow ssh (only reachable via our VPN).
When the sites become unresponsive, I can still log in over ssh and restart httpd, directadmin, mysqld and so on, but nothing seems to help. Only a reboot does.
Strangely enough, sometimes a reboot is needed only once in 4 weeks, but in the last few days it has been needed far too often.

The logs seem clean to me. Does anyone have an idea how I can investigate this issue in detail?

Some info:

CentOS release 6.3 (Final)

DA info
Compiled on CentOS 6.0 64-Bit
Compile Date Mar 16 2013, 22:28:50
Server Version 1.43.0
Current Available Version 1.430000
(I just updated before I posted this message from 1.42)

top - 15:42:26 up 1:30, 1 user, load average: 0.01, 0.06, 0.02
Tasks: 185 total, 1 running, 184 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 0.1%sy, 0.0%ni, 99.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 8059552k total, 748120k used, 7311432k free, 49424k buffers
Swap: 10305528k total, 0k used, 10305528k free, 230608k cached


Thanks !
 
Hello,

That might be a hardware issue; you might need to check dmesg and the system logs. You might also need to connect a monitor, or use an IP KVM, and see what is displayed on the console.
 
If you can still log in through your private network using ssh, then the server isn't hanging, though obviously some services may be. So you should first check to see if you can log in through ssh on your main IP# (if it's normally not allowed, allow it temporarily while still logged in through your vpn IP#). If you can't shell in there, then it may be a firewall issue.

Be sure to try disabling your firewall completely and then see if your server still appears to be unresponsive.

Also check top to see which services are taking up the most resources, as well as your server load and the memory in use (especially swap).

Jeff
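The load and swap checks suggested above only help if they are captured at the moment the sites stall. A minimal sketch of a snapshot helper follows, assuming a Linux host where `/proc/loadavg` and `/proc/meminfo` are readable; the function names are made up for illustration and are not part of DirectAdmin or CentOS.

```python
def parse_loadavg(text):
    """Return the 1, 5 and 15 minute load averages from /proc/loadavg content."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


def parse_meminfo(text):
    """Return a dict of /proc/meminfo fields; sizes are integers in kB."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields


def swap_used_kb(meminfo):
    """Swap in use, derived from the SwapTotal and SwapFree fields."""
    return meminfo["SwapTotal"] - meminfo["SwapFree"]


def snapshot(loadavg_path="/proc/loadavg", meminfo_path="/proc/meminfo"):
    """Read both files and return a one-line summary suitable for a log file."""
    with open(loadavg_path) as f:
        load = parse_loadavg(f.read())
    with open(meminfo_path) as f:
        mem = parse_meminfo(f.read())
    values = load + (mem["MemFree"], swap_used_kb(mem))
    return "load=%.2f/%.2f/%.2f free_kb=%d swap_used_kb=%d" % values
```

Calling `snapshot()` every minute (from cron, for example) and appending the result to a file would show whether load or swap changes around the time of a stall.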
 
Hi Jeff,

Thanks for your remarks.

The most difficult thing about this issue is that the system runs like a dream for weeks, and then at some point it hangs. Last week was terrible.
When I logged on last week (while the sites were not responsive) I copied the output of top; see below:

Code:
top - 15:42:26 up 1:30, 1 user, load average: 0.01, 0.06, 0.02
Tasks: 185 total, 1 running, 184 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 0.1%sy, 0.0%ni, 99.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 8059552k total, 748120k used, 7311432k free, 49424k buffers
Swap: 10305528k total, 0k used, 10305528k free, 230608k cached

Strangely enough, stopping and starting services (httpd, da, mysqld, named, etc.) did not resolve the issue.

Last weekend we installed the latest CentOS updates and the latest DA upgrade. Maybe this helps.
Our next step may be to swap the hardware, as we have some spare machines.


Thanks
Jeroen
 
So far we have tried many things. All the hardware has been replaced (except the hard disks).

There is enough memory. While it hangs, the server is more or less idle, about 1% busy.
And since there are not many sites on this server, there is almost no traffic.

However, strangely enough, the machine "stalls" several times per day for a minute or two. Or rather, not the machine itself, but the sites.

I have no clue anymore how to resolve this.
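Since the stalls are short and affect the sites rather than the whole machine, it may help to record exactly when each public port stops answering. The following is a rough sketch of a TCP probe using only the Python standard library; the helper names are hypothetical, and it should be run from a machine outside the server's own network so that it sees what visitors see.

```python
import socket
import time


def port_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def probe_once(host, ports, timeout=5.0):
    """Return one timestamped line with the up/down state of each port."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    states = []
    for port in ports:
        state = "up" if port_open(host, port, timeout) else "down"
        states.append("%d=%s" % (port, state))
    return stamp + " " + host + " " + " ".join(states)
```

For the setup described in this thread, something like `probe_once(public_ip, [80, 21, 2222])` in a loop with a short sleep, where `public_ip` is a placeholder for the server's internet-facing address, would show whether http, ftp and the DirectAdmin port all go down together or only one of them does.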
 
It does indeed look like the server blocks for 2-3 minutes, maybe a bit longer. But often by the time my clients call, nothing is wrong anymore and they can access their site again.

What is CSF? And how can I investigate/resolve this?

Thanks !
 
CSF is CSF/LFD (full name ConfigServer Firewall), a good firewall which also has a DirectAdmin plugin. It is very extensive, but also easy to use and configure.
 
I did some more investigation.... the server "stalls" more often than I expected:


Monitor downtime detail

Date                          Downtime duration
Apr 22, 2013  2:01 PM EEST    11 Mins 52 Secs
Apr 22, 2013 11:12 AM EEST    12 Mins 53 Secs
Apr 22, 2013 10:15 AM EEST    12 Mins 7 Secs
Apr 22, 2013  9:14 AM EEST    14 Mins 9 Secs
Apr 22, 2013  8:20 AM EEST    6 Mins 27 Secs
Apr 22, 2013  7:39 AM EEST    10 Mins 57 Secs
Apr 22, 2013  6:46 AM EEST    37 Mins 27 Secs
Apr 22, 2013  2:51 AM EEST    11 Mins 9 Secs
Apr 21, 2013  8:53 PM EEST    18 Mins 57 Secs
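To see whether the outages follow a pattern, the report above can be reduced to a total and to the gaps between outage start times. This is only an illustrative sketch: the rows are copied from the monitor report with the EEST suffix dropped, and the helper names are invented for the example.

```python
from datetime import datetime

# (start time, minutes, seconds) copied from the monitor report above.
OUTAGES = [
    ("Apr 22, 2013 2:01 PM", 11, 52),
    ("Apr 22, 2013 11:12 AM", 12, 53),
    ("Apr 22, 2013 10:15 AM", 12, 7),
    ("Apr 22, 2013 9:14 AM", 14, 9),
    ("Apr 22, 2013 8:20 AM", 6, 27),
    ("Apr 22, 2013 7:39 AM", 10, 57),
    ("Apr 22, 2013 6:46 AM", 37, 27),
    ("Apr 22, 2013 2:51 AM", 11, 9),
    ("Apr 21, 2013 8:53 PM", 18, 57),
]


def parse_outages(rows):
    """Return (start, duration_seconds) tuples sorted by start time."""
    parsed = []
    for stamp, minutes, seconds in rows:
        start = datetime.strptime(stamp, "%b %d, %Y %I:%M %p")
        parsed.append((start, minutes * 60 + seconds))
    return sorted(parsed)


def total_downtime_seconds(outages):
    """Sum of all outage durations in seconds."""
    return sum(duration for _, duration in outages)


def gaps_minutes(outages):
    """Minutes between consecutive outage start times."""
    starts = [start for start, _ in outages]
    return [int((b - a).total_seconds() // 60) for a, b in zip(starts, starts[1:])]
```

With the data above, `gaps_minutes(parse_outages(OUTAGES))` shows that the five outages between 6:46 AM and 11:12 AM started 41 to 61 minutes apart. A roughly hourly rhythm like that could point to something periodic, such as a scheduled job or an expiring network state, rather than random hardware failure, though the sample is small.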


The machine has two network interfaces. One we use as a backdoor with a hardware VPN connection (SSH is only accessible from there). The other port is connected to "the internet", but only accepts http, ftp and port 2222.

What I "think" I notice: if the server hangs, it is not responsive on the "internet side". When I log on via the VPN network interface, the sites on the other interface also come alive again. Is this a coincidence?
 
Try checking your active routes when the server is not responsive on the "internet side" and you connect to the VPN network interface. Maybe you've got an issue either with the routes or with the Ethernet card.
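One way to compare routes during a stall and after it clears is to log the default routes from `/proc/net/route`, which on Linux lists each route with its addresses as little-endian hex. The sketch below assumes that file format; the function names are hypothetical, and any interface names or addresses used with it are examples rather than values from this server.

```python
import socket
import struct


def hex_to_ip(hex_le):
    """Convert a little-endian hex address from /proc/net/route to dotted form."""
    return socket.inet_ntoa(struct.pack("<L", int(hex_le, 16)))


def default_routes(route_text):
    """Return (interface, gateway, metric) for every default route in the text."""
    routes = []
    for line in route_text.splitlines()[1:]:  # skip the header line
        cols = line.split()
        if len(cols) < 8:
            continue
        iface, dest, gateway, metric, mask = cols[0], cols[1], cols[2], cols[6], cols[7]
        # A destination and mask of all zeros marks a default route.
        if dest == "00000000" and mask == "00000000":
            routes.append((iface, hex_to_ip(gateway), int(metric)))
    return routes
```

Running `default_routes(open("/proc/net/route").read())` while the sites are unreachable, and again after logging in over the VPN interface, would show whether the default route or its interface changes between the two states. On a machine with two network adapters, more than one default route, or a default route on the VPN-side adapter, would be worth a closer look.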
 