plakjeworst
On several freshly installed servers, the number of semaphores that Apache uses keeps rising until it reaches the system limit and Apache refuses to start. There are many threads about this problem, and some DirectAdmin KB articles explain how to clear the semaphores and raise the limit, but none of them address the cause of the problem.
Apache uses semaphore arrays (9 in my setup), which you can view with "ipcs -s". Because semaphores are not bound to a process, killing the process outright leaves its semaphores in the system after every unclean termination. At about 9 semaphore arrays per restart, this quickly hits the limit, which is typically 128 semaphore arrays.
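To keep an eye on how close a server is to the limit, a small helper along these lines may be useful. It is only a sketch: the function name count_sem_arrays is made up for illustration, and it assumes the usual util-linux "ipcs -s" layout where data rows start with a 0x key and the owner is in the third column.

```shell
# Count the semaphore arrays owned by a given user.
# Reads `ipcs -s` output on stdin; assumes data rows start with a 0x... key
# and that the owner is the third column (util-linux layout).
count_sem_arrays() {
    owner="$1"
    awk -v owner="$owner" '$1 ~ /^0x/ && $3 == owner { n++ } END { print n + 0 }'
}

# Example (run on the server itself):
#   ipcs -s | count_sem_arrays apache
```

On Linux the fourth field of /proc/sys/kernel/sem is the maximum number of semaphore arrays (SEMMNI), so the count can be compared against "awk '{print $4}' /proc/sys/kernel/sem".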
It seems that every time Apache is restarted by DirectAdmin (after adding a domain, or simply clicking restart in the service monitor), it is not terminated cleanly, leaving semaphores behind. I am using CentOS 7 with the latest DirectAdmin. I've done some testing, and it looks like the systemd Apache service itself is not restarting Apache cleanly. Counting the semaphores before and after the restart, I get this:
Code:
$ ipcs -s | grep apache | wc -l
9
$ systemctl restart httpd
$ ipcs -s | grep apache | wc -l
18
The 9 semaphores from the first process are left in the system. Note that apachectl does it right:
Code:
$ ipcs -s | grep apache | wc -l
9
$ apachectl restart
$ ipcs -s | grep apache | wc -l
9
Looking at the /etc/systemd/system/httpd.service file (written by custombuild), it appears the problem is the combination of the ExecStop command and KillSignal. Part of the original file:
Code:
[Service]
Type=forking
ExecStart=/usr/sbin/httpd $OPTIONS -k start
ExecReload=/usr/sbin/httpd $OPTIONS -k graceful
ExecStop=/bin/kill -TERM ${MAINPID}
# We want systemd to give httpd some time to finish gracefully, but still want
# it to kill httpd after TimeoutStopSec if something went wrong during the
# graceful stop. If it's not stopped in 5 seconds, it dies.
KillSignal=SIGKILL
TimeoutStopSec=5s
The intent seems to be that ExecStop sends the TERM signal to shut Apache down gracefully, and that the processes are killed if they are not gone after 5 seconds. This is not what these settings do.
What systemd actually does is first execute ExecStop. Immediately after that command returns, it sends the remaining processes (main and children) SIGTERM, unless another signal is specified by KillSignal, which is the case here. Then, after the delay specified by TimeoutStopSec, it sends a hard SIGKILL.
Because ExecStop does not wait, most likely all processes are still running when it returns. Normally this would still work, since systemd would send SIGTERM to all processes itself and then wait, but because KillSignal is overridden, it sends SIGKILL to all processes immediately.
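The difference between the two signals can be shown with a small experiment that has nothing to do with Apache itself. This is only an analogy: a temporary marker file stands in for the semaphores, and the function name leaves_marker_after is made up for the illustration. The child process removes the marker when it receives SIGTERM, but SIGKILL cannot be trapped, so the marker is left behind.

```shell
# Start a child that cleans up a marker file on SIGTERM, then send it a signal.
# Returns 0 if the marker is left behind (cleanup did not run), 1 otherwise.
leaves_marker_after() {
    sig="$1"
    marker="$(mktemp)"
    # The child traps TERM to clean up, much like a daemon releasing its
    # IPC resources on a graceful shutdown.
    sh -c 'trap "rm -f \"$1\"; exit 0" TERM; while :; do sleep 1; done' _ "$marker" \
        >/dev/null 2>&1 &
    pid=$!
    sleep 1                       # give the child time to install its trap
    kill -s "$sig" "$pid"
    wait "$pid" 2>/dev/null || true
    if [ -e "$marker" ]; then
        rm -f "$marker"           # tidy up after the experiment
        return 0
    fi
    return 1
}

# Example:
#   leaves_marker_after TERM && echo "leaked" || echo "cleaned up"
#   leaves_marker_after KILL && echo "leaked" || echo "cleaned up"
```

With TERM the trap runs and the marker is gone; with KILL the process never gets a chance to clean up, which matches what happens to the semaphores.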
When the KillSignal setting is removed (and defaults to SIGTERM), the problem with the semaphores is solved. The ExecStop line can probably be removed as well, as systemd will send SIGTERM to all processes anyway.
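If you would rather not edit the file that custombuild writes, a systemd drop-in might achieve the same thing. The sketch below is an assumption on my part and not tested against custombuild: the function name write_httpd_override and the file name override.conf are arbitrary, and instead of removing KillSignal it sets it back to the systemd default of SIGTERM, which should be equivalent.

```shell
# Write a drop-in that puts KillSignal back to SIGTERM for httpd.service.
# The target directory can be passed as the first argument; the default is
# the standard drop-in location for a unit in /etc/systemd/system.
write_httpd_override() {
    dropin_dir="${1:-/etc/systemd/system/httpd.service.d}"
    mkdir -p "$dropin_dir"
    cat > "$dropin_dir/override.conf" <<'EOF'
[Service]
# Let httpd receive SIGTERM first so it can release its semaphores.
KillSignal=SIGTERM
EOF
}

# Example (as root):
#   write_httpd_override && systemctl daemon-reload && systemctl restart httpd
```

Afterwards the count from "ipcs -s | grep apache | wc -l" should stay the same across a "systemctl restart httpd".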
On CentOS 6 (init.d instead of systemd) I have similar problems, but I cannot easily debug them since it is a live server. It might be that DirectAdmin kills the processes directly instead of using the init.d script (which seems to send a TERM signal first), but I'm not sure.