Weird sshd restart problem after creating new users (sometimes)

kevinbentlage

Hi,

For a few weeks now we have been experiencing a weird problem on multiple DA boxes.

When we add a new user with SSH access through DirectAdmin, the new user can't log in over SSH. sshd also can't be restarted manually:

[root@uat01 ~]# service sshd restart
Redirecting to /bin/systemctl restart sshd.service
Job for sshd.service failed because a timeout was exceeded. See "systemctl status sshd.service" and "journalctl -xe" for details.

[root@uat01 ~]# systemctl status sshd.service
● sshd.service - OpenSSH server daemon
Loaded: loaded (/usr/lib/systemd/system/sshd.service; enabled; vendor preset: enabled)
Active: activating (start) since Mon 2017-01-23 15:38:43 CET; 1min 14s ago
Docs: man:sshd(8)
man:sshd_config(5)
Process: 1016738 ExecReload=/bin/kill -HUP $MAINPID (code=exited, status=0/SUCCESS)
Process: 1018705 ExecStart=/usr/sbin/sshd $OPTIONS (code=exited, status=0/SUCCESS)
Main PID: 1009034 (code=exited, status=0/SUCCESS)
CGroup: /system.slice/sshd.service
└─1016740 /usr/sbin/sshd

Jan 23 15:38:43 uat01.****.nl systemd[1]: Starting OpenSSH server daemon...
Jan 23 15:38:43 uat01.****.nl sshd[1018706]: error: Bind to port 22 on 0.0.0.0 failed: Address already in use.
Jan 23 15:38:43 uat01.****.nl systemd[1]: PID file /var/run/sshd.pid not readable (yet?) after start.

The strange thing is that these problems sometimes occur after adding a new user and sometimes don't, in which case everything works well. We have these problems on multiple boxes: some with a high number of users (50+), and some with only a few users (between 1 and 10).

The only way to get sshd restarted is to reboot the whole server (sshd then starts normally) or to kill the sshd process with "kill -9" and then start it again.
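
Before reaching for "kill -9", it may help to confirm which process is still holding port 22 (the "Address already in use" error above suggests an old sshd is). A rough sketch, not from the original post: the helper name listener_pids is made up, and it assumes the column layout that "ss -tlnp" usually prints (state in column 1, local address in column 4, pid=N in the process column).

```shell
# Hypothetical helper: read `ss -tlnp` output on stdin and print the
# PIDs of processes listening on the given TCP port, one per line.
listener_pids() {
    port="$1"
    awk -v port="$port" '
        # Keep only LISTEN rows whose local address ends in ":<port>".
        $1 == "LISTEN" && $4 ~ (":" port "$") {
            s = $0
            # A row can list several processes; collect every pid=N.
            while (match(s, /pid=[0-9]+/)) {
                print substr(s, RSTART + 4, RLENGTH - 4)
                s = substr(s, RSTART + RLENGTH)
            }
        }' | sort -u
}
```

Typical use, as root: ss -tlnp | listener_pids 22. If the PID printed differs from what "systemctl show -p MainPID sshd.service" reports, that old process is probably the one blocking the restart.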

What we tried:
- Enabled PidFile /var/run/sshd.pid in sshd_config (disabled by default)
- Removed all ssh public keys that are added to several users

We're experiencing these problems on CentOS 7.1 boxes with up-to-date DirectAdmin installs.

Does anyone have any idea what causes these problems?
 
I also have this problem.
Because you are already connected over SSH, the connection can't be closed.
It only works when you use KVM or the console to restart the service.

There is a solution that kills the PID on stop,
but it will disconnect you.
 
You should be careful when applying the changes: restarting sshd with KillMode=mixed will kick you off the server (if you are connected via SSH).
 
It's not going to kick you... I tested it on all my servers before =]
 
Well, what am I missing then?

Anyway, if your session was not terminated but mine was, then you may or may not get kicked off. So we need to identify what exactly affects this.

Do you have any idea why my SSH session was terminated after the modification, and yours were not? I have not dug into the issue yet.
 
Hi all,

We face the same issue. I worked with a customer to see where it goes wrong. Adding a user with SSH access causes no problem; we tested this several times. But when we removed a user, it tried to restart sshd while the old sshd process kept running. Restarting sshd from the service monitor also failed with a timeout, because the parent PID does not match the running process. For now I suggested that the customer stop sshd before removing a user and start it again after 5 minutes. This seems to work as a workaround.

The main problem lies in removing a user with SSH access, which leaves the parent sshd process running; restarts won't work again until this process is killed.
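
The "parent PID does not match" condition can be checked by hand. A small sketch with made-up helper names (main_pid, sshd_is_stale); it assumes "systemctl show -p MainPID sshd.service" prints a line like MainPID=1234, with 0 meaning systemd tracks no main process.

```shell
# Hypothetical helpers, not part of DirectAdmin or systemd.

# Read `systemctl show -p MainPID sshd.service` output on stdin and
# print just the number (0 means systemd tracks no main process).
main_pid() {
    sed -n 's/^MainPID=\([0-9][0-9]*\)$/\1/p'
}

# Succeed (exit 0) when some sshd is running as listener but it is not
# the process systemd considers the main one for the service.
sshd_is_stale() {
    listener="$1"
    main="$2"
    [ -n "$listener" ] && [ "$listener" != "$main" ]
}
```

One possible use, assuming the oldest sshd process is the listening parent: sshd_is_stale "$(pgrep -o -x sshd)" "$(systemctl show -p MainPID sshd.service | main_pid)" && echo "stale sshd still running".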

We tested this on CentOS 7.3.1611 and DirectAdmin 1.50, which is now probably 1.51.

Hope someone can dig into this and find a solution.

Regards
 
I have the same problem with CentOS 7 + DirectAdmin: sometimes these problems will occur after adding a new user, but sometimes they don't, and everything is working well.
 
I had the same problem after removing a user. I was forced to do this: http://forum.directadmin.com/showthread.php?t=54322&p=278650#post278650

I also needed to run "systemctl daemon-reload" before "systemctl restart sshd.service".

Then, once sshd was able to restart normally, I changed back to KillMode=process and again ran "systemctl daemon-reload" before "systemctl restart sshd.service".
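
The KillMode switch can also be done with a systemd drop-in instead of editing the unit file. This is only a sketch under assumptions: the linked post may well edit /usr/lib/systemd/system/sshd.service directly, and the function name write_killmode_dropin and the file name killmode.conf are made up. /etc/systemd/system/sshd.service.d is the usual drop-in directory for this unit.

```shell
# Hypothetical helper: write a systemd drop-in that overrides KillMode
# for sshd.service. The second argument is the drop-in directory
# (defaults to the usual location, which needs root).
write_killmode_dropin() {
    mode="$1"
    dropin_dir="${2:-/etc/systemd/system/sshd.service.d}"
    case "$mode" in
        control-group|process|mixed|none) ;;
        *) echo "unknown KillMode: $mode" >&2; return 2 ;;
    esac
    mkdir -p "$dropin_dir" || return 1
    printf '[Service]\nKillMode=%s\n' "$mode" > "$dropin_dir/killmode.conf"
}

# Intended sequence on the server (left commented out; it may drop
# open SSH sessions, so KVM/console access is the safer place to run it):
#   write_killmode_dropin mixed
#   systemctl daemon-reload
#   systemctl restart sshd.service
#   rm /etc/systemd/system/sshd.service.d/killmode.conf   # back to the unit's own KillMode
#   systemctl daemon-reload
#   systemctl restart sshd.service
```

Removing the drop-in afterwards puts the unit back on whatever KillMode its own file sets, which matches the "change back" step without touching the packaged unit.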

It seems this happens every time I remove a user (with SSH access). I am also using CentOS 7.3. Hopefully this bug can be fixed in a future DirectAdmin version.

Edit: Also please note that when this happens after removing a user, sshd continues to work for existing users; only new users created after the problem occurs will not have working SSH access.
 
I got the same issue. I tried changing to KillMode=mixed, and after "systemctl daemon-reload" sshd could be restarted normally.
It did kick me out, but I restarted sshd from the DirectAdmin Service Monitor tool and was able to log back in.
After fixing the issue, I changed back to KillMode=process and the problem is still gone. I am able to restart sshd normally now (both from the command line and from DA).
Thanks for pointing this out. I can confirm it fixes the issue.

Regards,

Isaias
 
I'm in the middle of a migration and I can't log in over SSH. What options do I have? Reboot the whole server? Thank you.
 