[WORKAROUND!] Bug in Apache 2? - Service stopped (cannot be started)

Mikej0h · May 29, 2007

First of all: I don't know if this is the right place to post this, but I think it can be very useful for other people, since we spent hours searching for a workaround or solution!

Since today, all our servers running Apache 2 have had problems starting the 'httpd' daemon.
From 11:00 today, all servers using Apache 2 (2.0.x), in our case used for PHP4 + PHP5 with suPHP, went to status 'stopped'. Nothing we tried helped.
We checked the logfiles (nothing showed up) and recompiled Apache to make sure nothing was wrong with it. Of course we first updated the data files with the customapache script.
However, after recompiling it still didn't work.

We then suspected the problem could be the logfiles, which are set up on a per-domain basis.
We edited the files in /usr/local/directadmin/data/templates/custom, and uncommented the part where the logfiles are defined.
We ran 'echo "action=rewrite&value=httpd" >> /usr/local/directadmin/data/task.queue' and '/usr/local/directadmin/dataskq d' to commit this operation.
All user configs were reset to default, and all domains were uncommented for logfile usage.
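The template edit described above (which the author clarifies further down means putting a # before the log lines) could be sketched roughly like this. This is only an illustration: the directive names treated as log definitions, the function name comment_out_logs, and the sample vhost with its paths are assumptions, not copied from the actual DirectAdmin templates.

```python
import re

# Directives treated as per-domain log definitions.
# Assumption: the real templates may use a different set.
LOG_DIRECTIVE = re.compile(r"^(\s*)(CustomLog|ErrorLog|TransferLog)\b", re.IGNORECASE)


def comment_out_logs(template_text):
    """Return the template with every active log directive prefixed by '#'."""
    out = []
    for line in template_text.splitlines():
        match = LOG_DIRECTIVE.match(line)
        if match:
            indent = match.group(1)
            # Keep the indentation, put the '#' right before the directive.
            line = f"{indent}#{line[len(indent):]}"
        out.append(line)
    return "\n".join(out)


# Hypothetical vhost block, for illustration only.
sample = """<VirtualHost 192.0.2.1:80>
    ServerName example.com
    CustomLog /var/log/httpd/domains/example.com.log combined
    ErrorLog /var/log/httpd/domains/example.com.error.log
</VirtualHost>"""

print(comment_out_logs(sample))
```

Lines that are already commented are left alone, so running it twice gives the same result. The rewrite still has to be queued through task.queue as described above.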

We restarted Apache, and that seemed to solve the problem.
Note that after performing the steps above, stats tools such as Webalizer and AWStats no longer work, because they build their databases from the Apache logfiles.

I hope someone can use this, because we had this problem on 13 of our servers (yes, all at once).

Of course this is only a workaround, BUT the websites can be reached again, and your customers will not experience any downtime once the above has been carried out.

nobaloney · May 29, 2007

I'm not sure I understand. You commented them out? or you uncommented them, as you wrote?

How many domains are you running? If you're running too many domains then apache can't keep error log files open for all of them; it creates too many file descriptors. If this is the problem, then it's been documented several times in these forums.

Jeff
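To put rough numbers on the descriptor argument above, here is a back-of-the-envelope sketch. Everything numeric in it is an assumption for illustration: three log files per domain, a baseline of 50 descriptors for everything else httpd opens, and a limit of 1024. It also only reads the per-process limit (RLIMIT_NOFILE), which may or may not be the limit that actually applies on a given system.

```python
import resource


def estimated_log_descriptors(domains, logs_per_domain=3):
    """Descriptors needed if every domain keeps logs_per_domain files open."""
    return domains * logs_per_domain


def fits_within_limit(domains, limit, logs_per_domain=3, baseline=50):
    """True if the estimated demand (logs plus an assumed baseline) fits."""
    return estimated_log_descriptors(domains, logs_per_domain) + baseline <= limit


# Per-process limit of the current shell/process; httpd's own limit may differ.
soft_limit, hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"soft limit: {soft_limit}, hard limit: {hard_limit}")

for domains in (300, 400, 700):
    needed = estimated_log_descriptors(domains)
    print(f"{domains} domains -> about {needed} log descriptors")
```

The point is only that the demand grows with the number of log files per domain, not with traffic, so a server can hit the ceiling with no visitors at all.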

Mikej0h · May 29, 2007

jlasman said:
I'm not sure I understand. You commented them out? or you uncommented them, as you wrote?

How many domains are you running? If you're running too many domains then apache can't keep error log files open for all of them; it creates too many file descriptors. If this is the problem, then it's been documented several times in these forums.

Jeff

I commented them (put a # before the lines).

The servers where this happened had about 300-400 domains each.
At how many domains is the limit you are talking about?

nobaloney · May 29, 2007

Is this what you're referring to?

I don't remember the exact number, but we've done it for clients before when they had this problem.

It's not a DA problem or a bug. It's a limit of how your OS is compiled and how many files it can keep open at one time.

Jeff

Mikej0h · May 29, 2007

jlasman said:
Is this what you're referring to?

I don't remember the exact number, but we've done it for clients before when they had this problem.

It's not a DA problem or a bug. It's a limit of how your OS is compiled and how many files it can keep open at one time.

Jeff

I know it's not a problem of DirectAdmin, but people come here to look for a solution. That's the reason I posted it here!
However, I don't think it's the 'MaxClients' setting, because the problem occurs even when there are no noticeable visitors.

I will try option two, but as I said before, I don't think that's the solution either. Why? Because we recently split one server (about 700 domains, apache 1.37.x) into two servers (one with ~300, one with ~400). Both are now running Apache 2, and both are experiencing problems. Apache 1 didn't 'feature' this bug, so the fault must be in Apache 2.0.xx.

The tutorial you are referring to says the limit is about 800 sites/domains.
The servers we use are handling less than half that limit.

Any advice on what else we could do, apart from point two of the tutorial you are referring to?
Thanks so far anyway!

nobaloney · May 29, 2007

If the fix resolved the problem then it did. DA thought (and I thought) that it would work up to about 1000 domains, but it's not dependent on domains; it's dependent on how many open file descriptors you have on your system. And it doesn't matter whether the system or any domains are active or not; httpd opens all the error logs when it starts. Which is why the fix works.

Sure, you can resolve the problem the right way; if you need more file descriptors recompile your kernel manually. Not for the fainthearted, and definitely way beyond the scope of the forum.

Jeff
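Since httpd opens its logs at startup, one way to see how many that is on a given server is to count the log directives in the generated configuration. The sketch below does that on a text snippet. The directive list, the function name log_targets, and the sample config are assumptions for illustration. It counts one file per directive and does not model whether httpd shares a descriptor when the same path appears twice.

```python
LOG_DIRECTIVES = {"errorlog", "customlog", "transferlog"}


def log_targets(config_text):
    """List the file targets of active log directives, one entry per directive."""
    targets = []
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and commented-out directives cost nothing
        parts = line.split()
        if len(parts) >= 2 and parts[0].lower() in LOG_DIRECTIVES:
            target = parts[1].strip('"')
            # Piped loggers and syslog targets are not plain log files.
            if not target.startswith("|") and not target.startswith("syslog"):
                targets.append(target)
    return targets


# Hypothetical configuration fragment, for illustration only.
sample_conf = """
ErrorLog /var/log/httpd/error_log
<VirtualHost 192.0.2.1:80>
    ServerName example.com
    CustomLog /var/log/httpd/domains/example.com.log combined
    ErrorLog /var/log/httpd/domains/example.com.error.log
</VirtualHost>
<VirtualHost 192.0.2.1:80>
    ServerName example.org
    #CustomLog /var/log/httpd/domains/example.org.log combined
    #ErrorLog /var/log/httpd/domains/example.org.error.log
</VirtualHost>
"""

print(len(log_targets(sample_conf)), "log files would be opened at startup")
```

In the sample, the second vhost has its log lines commented out, which is the effect of the workaround from the first post: those files are no longer counted.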

Mikej0h · May 30, 2007

Yes, but I have no entries in my log files.
The tutorial you gave me ended with:

Related error messages:
[error] System: Too many open files in system (errno: 23)

host: isc_socket_create: not enough free resources socket.c:2117: REQUIRE(maxfd <= (int)1024) failed.
host: isc_socket_create: not enough free resources

I didn't get those messages!!
The only message it produced (when I ran /usr/lib/apachectl start instead of service httpd start) was 'Unable to open logs'.
The logs had the right permissions and were writable by the correct user.
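Because 'Unable to open logs' does not say why the open failed, a small probe can help tell a permission problem apart from a descriptor limit. The sketch below is only an assumption about how one might check this; probe_log and its result labels are made up for illustration. Note that a fresh process has very few descriptors open, so it will normally not reproduce a limit that httpd hits; its main use is to rule out permissions when run as the same user httpd starts as.

```python
import errno
import os
import tempfile


def probe_log(path):
    """Try to open an existing log for appending and classify the outcome."""
    try:
        fd = os.open(path, os.O_WRONLY | os.O_APPEND)
    except OSError as exc:
        if exc.errno in (errno.EMFILE, errno.ENFILE):
            return "descriptor limit"
        if exc.errno in (errno.EACCES, errno.EPERM):
            return "permission"
        if exc.errno == errno.ENOENT:
            return "missing path"
        return f"other: {errno.errorcode.get(exc.errno, exc.errno)}"
    os.close(fd)
    return "ok"


# Demonstration on throwaway files rather than real server logs.
with tempfile.TemporaryDirectory() as tmp:
    existing = os.path.join(tmp, "example.com.error.log")
    open(existing, "w").close()
    print(existing, "->", probe_log(existing))
    absent = os.path.join(tmp, "absent", "example.org.error.log")
    print(absent, "->", probe_log(absent))
```

EMFILE is the per-process limit and ENFILE (errno 23, as in the tutorial's error message) is the system-wide one.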

nobaloney · May 30, 2007

I didn't give you the tutorial. I pointed you to a tutorial written by DirectAdmin staff, and asked you if it was what you followed.

I've never used /usr/lib/apachectl so I don't know what error messages it will return, nor do I know which log they'll end up in. Nor do I know what you consider the right permissions; perhaps you're right, and if you give your logs what the server considers the right permissions that will fix things.

Nor do I know where DirectAdmin found that list of related errors; I never found anything like that.

(My guess is they'll show up wherever your OS distribution logs kernel errors as that's what they are.)

You're welcome to post some of your log directory and the name of the OS distribution you're using so someone who uses the same can advise you if the ownership and permissions are set correctly.

Jeff

felosi · May 31, 2007

This has been a problem for me as well, and if you look on the forum it has happened to a few other people too. I'll try this out and hope it works.

smtalk · Oct 21, 2007

Found this on a few servers with the same problem (Apache starts to segfault with many vhosts):

Code:

(gdb) b ap_process_request
Note: breakpoints 1 and 3 also set at pc 0x809f9e9.
Breakpoint 4 at 0x809f9e9: file http_request.c, line 252.
(gdb) run -X -d /etc/httpd
Starting program: /usr/sbin/httpd -X -d /etc/httpd
[Thread debugging using libthread_db enabled]
[New Thread -1208407840 (LWP 30464)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1208407840 (LWP 30464)]
0x00e21d63 in RAND_SSLeay () from /lib/libcrypto.so.4
(gdb) n
Single stepping until exit from function RAND_SSLeay,
which has no line number information.

Program terminated with signal SIGSEGV, Segmentation fault.
The program no longer exists.

smtalk · Oct 29, 2007

"This is an OpenSSL bug, fixed in 0.9.8c and later (it uses select() rather than poll() and doesn't check for the FD_SETSIZE overflow)."

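The select() versus poll() point in the quote can be illustrated with a short sketch. It is not OpenSSL's code. It only shows the general idea that select() works with a fixed-size descriptor set, so a descriptor numbered at or above FD_SETSIZE cannot be used safely, while poll() has no such ceiling. The value 1024 for FD_SETSIZE is assumed here (it is the usual value on Linux) rather than queried from the system, and both function names are made up.

```python
import os
import select

FD_SETSIZE = 1024  # assumed typical value; not read from the system headers


def safe_for_select(fd, fd_setsize=FD_SETSIZE):
    """True if fd fits into a select() descriptor set of the given size."""
    return 0 <= fd < fd_setsize


def wait_readable(fd, timeout_ms):
    """Wait for fd to become readable using poll(), which has no FD_SETSIZE cap."""
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    return bool(poller.poll(timeout_ms))


read_end, write_end = os.pipe()
os.write(write_end, b"x")
print("readable:", wait_readable(read_end, 100))
print("fd 1500 safe for select():", safe_for_select(1500))
os.close(read_end)
os.close(write_end)
```

With many vhosts, httpd may already hold more than a thousand log descriptors, so any descriptor opened afterwards gets a high number. That would fit the crash inside libcrypto shown in the gdb output above.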