Wouter0100
Over the last few months we improved our monitoring and started monitoring systemd services. As a result, we discovered that from time to time something gracefully reloads HTTPD at the same moment that HTTPD is being restarted, which results in an error. I don't know how long this has been going on; it may also have started when we added new servers.
For example:
Sep 16 14:51:05 hostname systemd[1]: Stopping The Apache HTTP Server...
Sep 16 14:51:06 hostname systemd[1]: Started /usr/sbin/httpd -k graceful.
Sep 16 14:51:07 hostname httpd[1730168]: (98)Address already in use: AH00072: make_sock: could not bind to address [::]:80
Sep 16 14:51:07 hostname httpd[1730168]: (98)Address already in use: AH00072: make_sock: could not bind to address 0.0.0.0:80
Sep 16 14:51:07 hostname httpd[1730168]: no listening sockets available, shutting down
Sep 16 14:51:07 hostname httpd[1730168]: AH00015: Unable to open logs
Sep 16 14:51:07 hostname httpd[1730168]: httpd not running, trying to start
Sep 16 14:51:07 hostname systemd[1]: run-r9269d5650ee14cb18a559d9f75ac83b3.service: Main process exited, code=exited, status=1/FAILURE
Sep 16 14:51:07 hostname systemd[1]: run-r9269d5650ee14cb18a559d9f75ac83b3.service: Failed with result 'exit-code'.
Sep 16 14:51:10 hostname systemd[1]: httpd.service: State 'stop-sigterm' timed out. Killing.
Sep 16 14:51:10 hostname systemd[1]: httpd.service: Killing process 223181 (httpd) with signal SIGKILL.
Sep 16 14:51:10 hostname systemd[1]: httpd.service: Killing process 1347277 (httpd) with signal SIGKILL.
Sep 16 14:51:10 hostname systemd[1]: httpd.service: Killing process 1717512 (httpd) with signal SIGKILL.
Sep 16 14:51:10 hostname systemd[1]: httpd.service: Killing process 1727902 (lsphp) with signal SIGKILL.
Sep 16 14:51:10 hostname systemd[1]: httpd.service: Killing process 1728045 (lsphp) with signal SIGKILL.
Sep 16 14:51:10 hostname systemd[1]: httpd.service: Killing process 1728489 (lsphp) with signal SIGKILL.
Sep 16 14:51:10 hostname systemd[1]: httpd.service: Killing process 1728542 (lsphp) with signal SIGKILL.
Sep 16 14:51:10 hostname systemd[1]: httpd.service: Killing process 1728555 (lsphp) with signal SIGKILL.
Sep 16 14:51:10 hostname systemd[1]: httpd.service: Killing process 1728641 (lsphp) with signal SIGKILL.
Sep 16 14:51:10 hostname systemd[1]: httpd.service: Killing process 1728651 (lsphp) with signal SIGKILL.
Sep 16 14:51:10 hostname systemd[1]: httpd.service: Killing process 1728675 (lsphp) with signal SIGKILL.
Sep 16 14:51:10 hostname systemd[1]: httpd.service: Killing process 1728711 (lsphp) with signal SIGKILL.
Sep 16 14:51:10 hostname systemd[1]: httpd.service: Killing process 1728787 (lsphp) with signal SIGKILL.
Sep 16 14:51:10 hostname systemd[1]: httpd.service: Killing process 1728795 (lsphp) with signal SIGKILL.
Sep 16 14:51:10 hostname systemd[1]: httpd.service: Killing process 1730179 (lsphp) with signal SIGKILL.
Sep 16 14:51:10 hostname systemd[1]: httpd.service: Main process exited, code=killed, status=9/KILL
Sep 16 14:51:10 hostname systemd[1]: httpd.service: Failed with result 'timeout'.
Sep 16 14:51:10 hostname systemd[1]: Stopped The Apache HTTP Server.
Here you see that Apache is being stopped and, directly after, `/usr/sbin/httpd -k graceful` is executed as a transient systemd service. Because the old Apache processes still hold port 80, the graceful command fails, which makes the transient service fail and leaves systemd in an "unhealthy" or "error" state.
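To check how often this overlap occurs, the journal can be scanned mechanically. The following is a minimal Python sketch, not a finished tool: it assumes the short syslog-style journal format shown above, and the names `LINE_RE` and `find_overlapping_reloads` are made up for illustration. The three message strings are copied from the log excerpt in this post.

```python
import re

# Assumed line format: "<month> <day> <time> <host> <ident>[<pid>]: <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\S+\s+\d+\s+[\d:]+)\s+\S+\s+(?P<ident>[\w.-]+)\[\d+\]:\s+(?P<msg>.*)$"
)

# Messages copied from the journal excerpt above.
STOPPING = "Stopping The Apache HTTP Server..."
STOPPED = "Stopped The Apache HTTP Server."
GRACEFUL = "Started /usr/sbin/httpd -k graceful."


def find_overlapping_reloads(lines):
    """Return timestamps of graceful reloads started while httpd was stopping."""
    stopping = False
    overlaps = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match or match["ident"] != "systemd":
            continue  # only systemd's own state messages are relevant here
        msg = match["msg"]
        if msg == STOPPING:
            stopping = True
        elif msg == STOPPED:
            stopping = False
        elif msg == GRACEFUL and stopping:
            overlaps.append(match["ts"])
    return overlaps


sample = [
    "Sep 16 14:51:05 hostname systemd[1]: Stopping The Apache HTTP Server...",
    "Sep 16 14:51:06 hostname systemd[1]: Started /usr/sbin/httpd -k graceful.",
    "Sep 16 14:51:10 hostname systemd[1]: Stopped The Apache HTTP Server.",
]
print(find_overlapping_reloads(sample))
```

Feeding it the saved output of `journalctl` for a longer period should list every moment where the graceful reload landed inside a stop window. It only tracks the stop phase, so a reload that fires right after a start is not flagged.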
As far as our research goes, the command being executed (`/usr/sbin/httpd -k graceful`) does not come from the `ExecReload=` directive of the httpd service. I tested that theory, and it does not seem to be the case.
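One way to rule out `ExecReload=` is to compare the unit's reload commands with the command seen in the journal. Below is a minimal Python sketch under a few assumptions: the unit text would come from something like `systemctl cat httpd.service`, the `HYPOTHETICAL_UNIT` content is invented for illustration and is not the real DirectAdmin unit, and the helper names are made up. It ignores line continuations and section boundaries to stay short.

```python
def exec_reload_commands(unit_text):
    """Collect ExecReload= command lines from unit file text."""
    commands = []
    for raw in unit_text.splitlines():
        line = raw.strip()
        if not line.startswith("ExecReload="):
            continue
        value = line[len("ExecReload="):].strip()
        if value == "":
            commands = []  # an empty assignment resets the list in systemd
        else:
            # Drop systemd's special executable prefixes such as "-" or "@".
            commands.append(value.lstrip("-@:+!"))
    return commands


def matches_reload(unit_text, command):
    """True if `command` is one of the unit's ExecReload= command lines."""
    return command in exec_reload_commands(unit_text)


# Hypothetical unit content, for illustration only.
HYPOTHETICAL_UNIT = """\
[Service]
ExecStart=/usr/sbin/httpd -k start -DFOREGROUND
ExecReload=/bin/kill -USR1 $MAINPID
"""

print(exec_reload_commands(HYPOTHETICAL_UNIT))
print(matches_reload(HYPOTHETICAL_UNIT, "/usr/sbin/httpd -k graceful"))
```

The journal itself points the same way: the failing unit is named `run-r….service`, which as far as I know is the pattern used for transient units created with `systemd-run`, whereas an `ExecReload=` command would normally be logged under `httpd.service` itself.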
It happens from time to time, and it seems to occur not only when HTTPD is being stopped but also when it is being started. The last occurrence I recorded was:
okt 14 02:17:36 hostname systemd[1]: Stopping The Apache HTTP Server...
okt 14 02:17:40 hostname systemd[1]: Started /usr/sbin/httpd -k graceful.
...
okt 14 02:17:41 hostname httpd[2533510]: (98)Address already in use: AH00072: make_sock: could not bind to address [::]:80
okt 14 02:17:41 hostname httpd[2533510]: (98)Address already in use: AH00072: make_sock: could not bind to address 0.0.0.0:80
okt 14 02:17:41 hostname httpd[2533510]: no listening sockets available, shutting down
okt 14 02:17:41 hostname httpd[2533510]: AH00015: Unable to open logs
okt 14 02:17:41 hostname httpd[2533510]: httpd not running, trying to start
okt 14 02:17:41 hostname systemd[1]: run-r515fe7c22ab641e88dfe152e50e3d512.service: Main process exited, code=exited, status=1/FAILURE
okt 14 02:17:41 hostname systemd[1]: run-r515fe7c22ab641e88dfe152e50e3d512.service: Failed with result 'exit-code'.
okt 14 02:17:41 hostname systemd[1]: httpd.service: State 'stop-sigterm' timed out. Killing.
okt 14 02:17:41 hostname systemd[1]: httpd.service: Killing process 1116854 (httpd) with signal SIGKILL.
okt 14 02:17:41 hostname systemd[1]: httpd.service: Killing process 2468867 (httpd) with signal SIGKILL.
okt 14 02:17:41 hostname systemd[1]: httpd.service: Killing process 2469320 (httpd) with signal SIGKILL.
okt 14 02:17:41 hostname systemd[1]: httpd.service: Killing process 2531813 (lsphp) with signal SIGKILL.
okt 14 02:17:41 hostname systemd[1]: httpd.service: Killing process 2532040 (lsphp) with signal SIGKILL.
okt 14 02:17:41 hostname systemd[1]: httpd.service: Killing process 2532080 (lsphp) with signal SIGKILL.
okt 14 02:17:41 hostname systemd[1]: httpd.service: Killing process 2532085 (lsphp) with signal SIGKILL.
okt 14 02:17:41 hostname systemd[1]: httpd.service: Main process exited, code=killed, status=9/KILL
okt 14 02:17:41 hostname systemd[1]: httpd.service: Failed with result 'timeout'.
okt 14 02:17:41 hostname systemd[1]: Stopped The Apache HTTP Server.
okt 14 02:17:44 hostname systemd[1]: Started The Apache HTTP Server.
okt 14 02:17:45 hostname systemd[1]: Started /usr/sbin/httpd -k graceful.
Does anyone know what is going on?
The setup where this occurs is fairly standard: DirectAdmin, CloudLinux and Imunify360. We have already e-mailed CloudLinux and Imunify360 to ask whether their software is responsible for the above, but they denied that it was.