Apache getting stuck

Wunk

We have a DA machine (Red Hat 9, 1 GB memory, Dell 650) on which Apache regularly crashes.


The error log says:
[Mon Jan 26 20:37:02 2004] [warn] child process 11927 still did not exit, sending a SIGTERM
[Mon Jan 26 20:37:02 2004] [warn] child process 11972 still did not exit, sending a SIGTERM
[Mon Jan 26 20:37:02 2004] [warn] child process 11995 still did not exit, sending a SIGTERM
[Mon Jan 26 20:37:07 2004] [error] Cannot remove module mod_frontpage.c: not found in module list
[Mon Jan 26 20:37:09 2004] [warn] module perl_module is already loaded, skipping
[Mon Jan 26 20:37:10 2004] [crit] (98)Address already in use: make_sock: could not bind to port 8090


I can't find out what causes this, but it happens quite often and customers are complaining. The only thing that helps is running kill -9 on all running httpd processes and restarting the service.

Any suggestions on what could cause this, or even better, a solution?
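
To see whether the failures cluster around a particular time of day, the timestamps of the "[crit] ... could not bind to port" lines can be pulled out of the error log. This is only a minimal sketch, assuming the log line format shown above; the function name bind_failures is made up for illustration, and the log path in the usage comment is an assumption.

```shell
# Print the timestamp of every "could not bind to port" failure read on stdin.
# bind_failures is a hypothetical helper name, not part of Apache or DirectAdmin.
bind_failures() {
    grep -F 'could not bind to port' |
        sed -E 's/^\[([^]]*)\].*$/\1/'
}

# Usage (the log path is an assumption; use whatever your ErrorLog points at):
#   bind_failures < /var/log/httpd/error_log
```

If the timestamps all land within the same few minutes each day, a scheduled job is a likely suspect.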
 
Hello,

I think an update / recompile might be in order:
Code:
cd /usr/local/directadmin/customapache
rm -f configure.*
./build clean
./build update
./build all
John
 
It downloaded an updated PHP and Apache, so let's hope this will fix it..

Thanks :)
 
It didn't fix it; it crashed again last night around 1:00 am :(

Any other suggestions?
 
Hello,

Have any Apache modules been added via rpm? That can't really be done. Try this:

rm -rf /usr/lib/apache/*

and then do the "./build all" again, as mentioned above.

John
 
Nope, it's a plain RedHat 9 server without modules or anything installed..

I've emptied the modules directory, and rebuilt DA, let's hope it'll stay stable now :)
 
I don't know what exactly drew my attention to this thread, but I have exactly the same problem on an RH 9.0 server. It can't be the hardware, since it's a rather new server.

Recompiling Apache didn't work.
The configure.php from the custombuilder only has one extra line: support for Freetype.

The problems started after I last had to recompile (2 Feb or thereabouts). It happens every night at about 1 am (the daily cron runs at 0:01 am).

I still have an old ./build on another server, and I'm thinking of using it to rebuild this server, since this problem is really getting irritating.

What exactly has changed in the custombuilder, or in the way it compiles, since January 1, 2004?
 
Hello,

Is your gd library compiled with freetype? If you include freetype support in PHP, it will try to call functions in the gd library that may not exist. You can check by running:

ldd /usr/local/lib/libgd.so
and
ldd /usr/lib/apache/libphp4.so

A few changes were made to the configure.apache_ssl file, mainly just the environment headers at the top. You can try removing them, as they were only added to support FreeBSD. They didn't seem to affect Red Hat when I was testing, but I could be wrong.
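
The ldd check can be scripted so that both libraries are tested the same way. This is a rough sketch rather than anything from the custombuilder; links_freetype is a hypothetical helper name, and the only thing it looks for is a libfreetype entry in the ldd output.

```shell
# Succeed if the ldd output on stdin lists libfreetype as a dependency.
# links_freetype is a hypothetical helper name used only for this sketch.
links_freetype() {
    grep -q 'libfreetype\.so'
}

# Usage, with the two library paths mentioned in this post:
#   for lib in /usr/local/lib/libgd.so /usr/lib/apache/libphp4.so; do
#       if ldd "$lib" | links_freetype; then
#           echo "$lib: linked against freetype"
#       else
#           echo "$lib: NOT linked against freetype"
#       fi
#   done
```

The case to watch for is PHP built with freetype support while libgd.so itself shows no freetype dependency.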

John
 
The ldd output :
Code:
# ldd /usr/local/lib/libgd.so
        libfreetype.so.6 => /usr/local/lib/libfreetype.so.6 (0x4004f000)
        libpng.so.3 => /usr/local/lib/libpng.so.3 (0x400ac000)
        libz.so.1 => /usr/local/lib/libz.so.1 (0x400d4000)
        libm.so.6 => /lib/tls/libm.so.6 (0x400e8000)
        libc.so.6 => /lib/tls/libc.so.6 (0x42000000)
        /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x80000000)
# ldd /usr/lib/apache/libphp4.so
        libcrypt.so.1 => /lib/libcrypt.so.1 (0x401d5000)
        libmcrypt.so.4 => /usr/local/lib/libmcrypt.so.4 (0x40202000)
        libltdl.so.3 => /usr/lib/libltdl.so.3 (0x40234000)
        libfreetype.so.6 => /usr/local/lib/libfreetype.so.6 (0x4023b000)
        libpng.so.3 => /usr/local/lib/libpng.so.3 (0x40298000)
        libz.so.1 => /usr/local/lib/libz.so.1 (0x402c0000)
        libresolv.so.2 => /lib/libresolv.so.2 (0x402ce000)
        libm.so.6 => /lib/tls/libm.so.6 (0x402e0000)
        libdl.so.2 => /lib/libdl.so.2 (0x40302000)
        libnsl.so.1 => /lib/libnsl.so.1 (0x40306000)
        libcurl.so.2 => /usr/local/lib/libcurl.so.2 (0x4031b000)
        libssl.so.4 => /lib/libssl.so.4 (0x4033f000)
        libcrypto.so.4 => /lib/libcrypto.so.4 (0x40375000)
        libgssapi_krb5.so.2 => /usr/kerberos/lib/libgssapi_krb5.so.2 (0x40466000)
        libkrb5.so.3 => /usr/kerberos/lib/libkrb5.so.3 (0x40479000)
        libcom_err.so.3 => /usr/kerberos/lib/libcom_err.so.3 (0x404d7000)
        libk5crypto.so.3 => /usr/kerberos/lib/libk5crypto.so.3 (0x404d9000)
        libc.so.6 => /lib/tls/libc.so.6 (0x42000000)
        /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x80000000)

So I think it should be working :D

The older build script, from Nov 2003, didn't have that FreeBSD support, did it ?
On another server with exactly the same configs a compile with that build file works perfectly... (Don't know if this is significant)
 
I created a temporary cron job on this server that restarts httpd every hour. As far as I can see, it finally stayed stable all night.

I know it's just a temporary solution, but I was hoping to minimize further downtime for users.

If Wunk doesn't have any problems anymore, I will also empty that directory, which currently contains:

Code:
total 9.7M
drwxr-xr-x    2 root     root         4.0K Feb 10 22:28 .
drwxr-xr-x   47 root     root          16K Jan 29 22:40 ..
lrwxrwxrwx    1 root     root           20 Feb 10 22:28 apache -> ../../usr/lib/apache
-rw-r--r--    1 root     root         8.2K Feb 10 22:28 httpd.exp
-rwxr-xr-x    1 root     root         1.2M Feb 10 22:36 libperl.so
-rwxr-xr-x    1 root     root         7.4M Feb 10 22:35 libphp4.so
-rwxr-xr-x    1 root     root          99K Feb 10 22:28 libproxy.so
-rwxr-xr-x    1 root     root         219K Feb 10 22:28 libssl.so
-rwxr-xr-x    1 root     root          12K Feb 10 22:28 mod_access.so
-rwxr-xr-x    1 root     root         9.7K Feb 10 22:28 mod_actions.so
-rwxr-xr-x    1 root     root          13K Feb 10 22:28 mod_alias.so
-rwxr-xr-x    1 root     root         8.0K Feb 10 22:28 mod_asis.so
-rwxr-xr-x    1 root     root         9.6K Feb 10 22:28 mod_auth_anon.so
-rwxr-xr-x    1 root     root          13K Feb 10 22:28 mod_auth.so
-rwxr-xr-x    1 root     root          31K Feb 10 22:28 mod_autoindex.so
-rwxr-xr-x    1 root     root          11K Feb 10 22:28 mod_cern_meta.so
-rwxr-xr-x    1 root     root          17K Feb 10 22:28 mod_cgi.so
-rwxr-xr-x    1 root     root          11K Feb 10 22:28 mod_define.so
-rwxr-xr-x    1 root     root          12K Feb 10 22:28 mod_digest.so
-rwxr-xr-x    1 root     root         9.4K Feb 10 22:28 mod_dir.so
-rwxr-xr-x    1 root     root         9.4K Feb 10 22:28 mod_env.so
-rwxr-xr-x    1 root     root          15K Feb 10 22:28 mod_example.so
-rwxr-xr-x    1 root     root          11K Feb 10 22:28 mod_expires.so
-rwxr-xr-x    1 root     root          19K Feb 10 22:28 mod_frontpage.so
-rwxr-xr-x    1 root     root         9.4K Feb 10 22:28 mod_headers.so
-rwxr-xr-x    1 root     root          18K Feb 10 22:28 mod_imap.so
-rwxr-xr-x    1 root     root          38K Feb 10 22:28 mod_include.so
-rwxr-xr-x    1 root     root          21K Feb 10 22:28 mod_info.so
-rwxr-xr-x    1 root     root         8.6K Feb 10 22:28 mod_log_agent.so
-rwxr-xr-x    1 root     root          20K Feb 10 22:28 mod_log_config.so
-rwxr-xr-x    1 root     root         9.7K Feb 10 22:28 mod_log_referer.so
-rwxr-xr-x    1 root     root          26K Feb 10 22:28 mod_mime_magic.so
-rwxr-xr-x    1 root     root          17K Feb 10 22:28 mod_mime.so
-rwxr-xr-x    1 root     root          12K Feb 10 22:28 mod_mmap_static.so
-rwxr-xr-x    1 root     root          30K Feb 10 22:28 mod_negotiation.so
-rwxr-xr-x    1 root     root          61K Feb 10 22:28 mod_rewrite.so
-rwxr-xr-x    1 root     root          12K Feb 10 22:28 mod_setenvif.so
-rwxr-xr-x    1 root     root          13K Feb 10 22:28 mod_speling.so
-rwxr-xr-x    1 root     root          21K Feb 10 22:28 mod_status.so
-rwxr-xr-x    1 root     root          10K Feb 10 22:28 mod_unique_id.so
-rwxr-xr-x    1 root     root          10K Feb 10 22:28 mod_userdir.so
-rwxr-xr-x    1 root     root          14K Feb 10 22:28 mod_usertrack.so
-rwxr-xr-x    1 root     root          11K Feb 10 22:28 mod_vhost_alias.so
I should note that the apache symlink in it is broken; it's shown in red.
 
My output is below. The only thing we added was --enable-dbx support, but the problems existed before that was added.


[root@shared-dedi-1 root]# ldd /usr/local/lib/libgd.so
libpng.so.3 => /usr/local/lib/libpng.so.3 (0x40057000)
libm.so.6 => /lib/tls/libm.so.6 (0x40086000)
libc.so.6 => /lib/tls/libc.so.6 (0x42000000)
/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x80000000)


[root@shared-dedi-1 root]# ldd /usr/lib/apache/libphp4.so
libcrypt.so.1 => /lib/libcrypt.so.1 (0x401e6000)
libmcrypt.so.4 => /usr/local/lib/libmcrypt.so.4 (0x40213000)
libltdl.so.3 => /usr/lib/libltdl.so.3 (0x40245000)
libpng.so.3 => /usr/local/lib/libpng.so.3 (0x4024c000)
libresolv.so.2 => /lib/libresolv.so.2 (0x40274000)
libm.so.6 => /lib/tls/libm.so.6 (0x40287000)
libdl.so.2 => /lib/libdl.so.2 (0x402a9000)
libnsl.so.1 => /lib/libnsl.so.1 (0x402ac000)
libcurl.so.2 => /usr/local/lib/libcurl.so.2 (0x402c1000)
libssl.so.4 => /lib/libssl.so.4 (0x402e5000)
libcrypto.so.4 => /lib/libcrypto.so.4 (0x4031a000)
libgssapi_krb5.so.2 => /usr/kerberos/lib/libgssapi_krb5.so.2 (0x40412000)
libkrb5.so.3 => /usr/kerberos/lib/libkrb5.so.3 (0x40425000)
libcom_err.so.3 => /usr/kerberos/lib/libcom_err.so.3 (0x40483000)
libk5crypto.so.3 => /usr/kerberos/lib/libk5crypto.so.3 (0x40485000)
libc.so.6 => /lib/tls/libc.so.6 (0x42000000)
/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x80000000)
libz.so.1 => /usr/lib/libz.so.1 (0x40495000)
 
Hello,

How about removing the environment lines from the top of the configure.apache_ssl file and doing a clean compile of Apache? Not sure what else to check.

John
 
DirectAdmin Support said:
Hello,

How about removing the environment lines from the top of the configure.apache_ssl file and doing a clean compile of Apache? Not sure what else to check.

John

Hi,

That sounds like an idea :)
I'll do it tonight (Dutch time), together with a mysqld update to the latest version (a friend of mine straced the daemon and found a few disturbing things), or at least I will if the client in question allows me to.
 
Well, it crashed again last night. Each time it seems to crash around 0:15 am (a DA cron issue somewhere?).

I'll try removing the SSL part.
 
Hello,

Is the "LoadModule ssl_module modules/libssl.so" bit still there? Send me your login info again if you want me to have a look. Also, let me know if you commented it out while fixing it, so we can determine whether that's what is taking it down.
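
A quick way to answer that without opening the file is to grep for an uncommented LoadModule line. This is a sketch only: ssl_module_enabled is a made-up name, and the config path in the usage comment is an assumption about where httpd.conf lives on this box.

```shell
# Succeed if the httpd.conf text on stdin has an uncommented
# "LoadModule ssl_module" line. ssl_module_enabled is a hypothetical name.
ssl_module_enabled() {
    grep -Eq '^[[:space:]]*LoadModule[[:space:]]+ssl_module[[:space:]]'
}

# Usage (config path is an assumption):
#   if ssl_module_enabled < /etc/httpd/conf/httpd.conf; then
#       echo "LoadModule ssl_module line is present and not commented out"
#   fi
```

Note that this only inspects the text of the file. If the line sits inside an IfDefine block, the module is loaded only when httpd is started with the matching define.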

If you want to test if the nightly tally is doing it, you can just run:
Code:
echo "action=tally&value=all" >> /usr/local/directadmin/data/task.queue
and wait a few minutes to see what happens.
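
For repeated tests it may be handy to wrap the queue write in a small function and watch the error log while the task runs. This is a sketch under assumptions: queue_task is a hypothetical wrapper (not a DirectAdmin command), and the error log path in the usage comment may differ on your server.

```shell
# Append one task line to a DirectAdmin-style task queue file.
# queue_task is a hypothetical wrapper; the default path is the one used above.
queue_task() {
    task=$1
    queue=${2:-/usr/local/directadmin/data/task.queue}
    printf '%s\n' "$task" >> "$queue"
}

# Usage (as root, then watch the log from a second terminal):
#   queue_task 'action=tally&value=all'
#   tail -f /var/log/httpd/error_log
```

The optional second argument exists only so the function can be tried against a scratch file first.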

John
 
Well, I stayed up for it today, and it just went down again, same log entries:

[Fri Feb 13 23:12:47 2004] [error] [client 213.93.42.151] (13)Permission denied: cannot read directory for multi: /home/robende/domains/sharedip/
[Sat Feb 14 00:12:01 2004] [warn] child process 27179 still did not exit, sending a SIGTERM
[Sat Feb 14 00:12:01 2004] [warn] child process 27236 still did not exit, sending a SIGTERM
[Sat Feb 14 00:12:06 2004] [error] Cannot remove module mod_frontpage.c: not found in module list
[Sat Feb 14 00:12:06 2004] [warn] module perl_module is already loaded, skipping
[Sat Feb 14 00:12:06 2004] [crit] (98)Address already in use: make_sock: could not bind to port 8090


So basically it looks like the cron job that causes it is:
10 0 * * * root echo 'action=tally&value=all' >> /usr/local/directadmin/data/task.queue

Since we don't have other cron jobs running around that time, AND it happens at this hour each time, I'm pretty sure that's what causes it. (Which tasks does it run around that time that could involve Apache?)
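
To double-check that nothing else is scheduled in that hour, the system crontab files can be filtered on the hour field. A minimal sketch, assuming the standard five time fields at the start of each entry; hour_zero_jobs is a hypothetical name, and entries whose hour field is "*" or a range are deliberately not matched.

```shell
# Print crontab lines (read on stdin) whose hour field is exactly 0,
# skipping comment lines. hour_zero_jobs is a hypothetical helper name.
hour_zero_jobs() {
    awk '/^[ \t]*#/ { next } $2 ~ /^0+$/'
}

# Usage (these are the usual system-wide locations; per-user crontabs are separate):
#   cat /etc/crontab /etc/cron.d/* 2>/dev/null | hour_zero_jobs
```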


The thing is, it can go fine for a few days, or be dead two days in a row (like now).

Would you like to take a look? (Let me know your email address and I'll provide a login.)

For the moment I'm adding the following cron job as a workaround for the next crash:

0 20 * * * /etc/init.d/httpd stop ; sleep 10 ; killall -9 httpd ; sleep 30 ; /etc/init.d/httpd restart > /dev/null 2> /dev/null


So far this is the only server acting up like this. We host well over 30 DirectAdmin panels, all without trouble, and the hardware configurations are nearly all identical: Dell PowerEdge 650s. We tried upping the RAM to 1 GB, to no avail.
 
btw, the libssl is still loaded, entry in /etc/httpd/conf/httpd.conf:

<IfDefine HAVE_SSL>
LoadModule ssl_module modules/libssl.so
</IfDefine>
 
Hi,

We have exactly the same problem here, but only on one server; all the others have no problems.

The server suddenly gets a load of 60 and Apache crashes.

[Attachment: loadavg-day.png]
 
Hello,

That's the nightly tally. The dataskq program should spike the load while it recounts everything and runs webalizer on all the logs. At the end of the recount, Apache is restarted. The tally can take anywhere from 5 minutes to 1.5 hours, depending on how many sites you have and how busy they are. So the only things I can think of that would cause Apache to die are:

1) Not enough memory during the tally. I'm not sure this is the case, but keep an eye on the dataskq program during the tally and see if anything is spiking out of control.

2) When the logs are rotated, they are deleted, but Apache isn't reloaded until the end of the tally. Perhaps Apache isn't handling the removal of the error logs too well. Only the ErrorLog keeps the file descriptor open, so when DA removes the file, perhaps Apache panics and doesn't know what to do. To check this, you could edit the virtual_host*.conf templates and comment out the "ErrorLog" lines. This will put all errors into the /var/log/httpd/error_log file, which is not deleted by DA. Then we can see if that was causing it. (Run echo "action=rewrite&value=httpd" >> /usr/local/directadmin/data/task.queue after modifying the templates to rewrite the httpd.conf files.)
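
The template edit in point 2 could be done with sed instead of by hand. This is a sketch under assumptions: comment_errorlog is a hypothetical helper name, and TEMPLATE is a placeholder for one of the virtual_host*.conf templates, whose location is not given here.

```shell
# Comment out every active ErrorLog directive in the text read on stdin.
# comment_errorlog is a hypothetical helper name for this sketch.
comment_errorlog() {
    sed -E 's/^([[:space:]]*)(ErrorLog[[:space:]])/\1#\2/'
}

# Usage (TEMPLATE is a placeholder; keep a backup so the change can be reverted):
#   cp "$TEMPLATE" "$TEMPLATE.bak"
#   comment_errorlog < "$TEMPLATE.bak" > "$TEMPLATE"
#   echo "action=rewrite&value=httpd" >> /usr/local/directadmin/data/task.queue
```

Lines that are already commented out are left alone, so running it twice does no harm.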

Let me know if that does anything.

John
 
DirectAdmin Support said:
Hello,

That's the nightly tally.

1) Not enough memory during the tally. I'm not sure this is the case, but keep an eye on the dataskq program during the tally and see if anything is spiking out of control.

2) When the logs are rotated, they are deleted, but Apache isn't reloaded until the end of the tally. Perhaps Apache isn't handling the removal of the error logs too well.
John

I'm currently working on the Subhosting server.
The problems were:
* Load went too high (up to 20 during normal operation)
* Apache sometimes crashed after the nightly tally

My first idea was to try to get the load down. This took me about 5 to 12 days, but I finally figured out what was causing it (I think).
John: for the next custombuilder release, please build in a check for FreeBSD or RH. I removed the added code as you recommended (without editing any other files), and the load now peaks at 1.90 and is currently stable at 0.80.
I also updated mysqld, since it was using enormous amounts of CPU time for no reason (strace showed it had a few problems).

Next I'm going to simulate the nightly tally to see what the max load will be and whether it kills Apache.
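
One way to capture the peak load during such a simulated tally is to sample /proc/loadavg at intervals and take the maximum afterwards. This is only a sketch: max_load is a hypothetical name, the sample file name is an assumption, and it looks at the 1-minute average (first field) only.

```shell
# Print the highest 1-minute load average found in /proc/loadavg-style
# lines on stdin (first field of each line). max_load is a hypothetical name.
max_load() {
    awk 'NR == 1 || $1 + 0 > max { max = $1 + 0 } END { if (NR) print max }'
}

# Usage: sample once a minute while the tally runs, stop with Ctrl-C,
# then read back the peak (the sample file name is an assumption):
#   while sleep 60; do cat /proc/loadavg >> /tmp/loadavg.samples; done
#   max_load < /tmp/loadavg.samples
```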

I don't think the Apache reload is the problem: on my own server I use a script that parses the http://<ip>/~<user> traffic and messes around with the logs, and Apache doesn't have any problems with it.
 