Thread: Reload of named.service fails, and multiple named processes on Debian 9

  1. #1
    Join Date
    Nov 2005
    Location
    Norway
    Posts
    91

    Reload of named.service fails, and multiple named processes on Debian 9

    Greetings!

    I am experiencing a problem with bind not reloading properly on one of our servers. On an identical server, it is working as it should, but it has some strange issues as well. I can't figure out why they are different, or why one of them is failing to reload bind. Both servers have been set up using Ansible (automation), so they should be exactly the same.

    The first issue is the reload problem on web32. The error in the log is:

    Code:
    Apr  4 11:26:01 web32 systemd[1]: named.service: Unit cannot be reloaded because it is inactive.
    And sure enough, when checking the status for named.service, this is the output on web32:

    Code:
    # systemctl status named.service
    ● named.service - BIND Domain Name Server
       Loaded: loaded (/etc/systemd/system/named.service; enabled; vendor preset: enabled)
       Active: failed (Result: exit-code) since Mon 2019-03-25 14:29:29 CET; 1 weeks 2 days ago
         Docs: man:named(8)
     Main PID: 4241 (code=exited, status=0/SUCCESS)
    
    Apr 03 12:37:02 web32 systemd[1]: named.service: Unit cannot be reloaded because it is inactive.
    While on web33, where everything works as expected, the output is different:

    Code:
    # systemctl status named.service
    ● named.service - BIND Domain Name Server
       Loaded: loaded (/etc/systemd/system/named.service; enabled; vendor preset: enabled)
       Active: active (running) since Tue 2019-04-02 17:02:25 CEST; 1 day 18h ago
         Docs: man:named(8)
     Main PID: 18657 (named)
        Tasks: 7 (limit: 4915)
       CGroup: /system.slice/named.service
               └─18657 /usr/sbin/named -f -u bind
    
    Apr 04 09:40:25 web33 rndc[24184]: server reload successful
    Checking the status of bind9.service on the two servers shows this for web32:

    Code:
    # systemctl status bind9.service 
    ● bind9.service - LSB: Start and stop bind9
       Loaded: loaded (/etc/init.d/bind9; generated; vendor preset: enabled)
       Active: active (running) since Thu 2019-04-04 10:05:57 CEST; 1h 50min ago
         Docs: man:systemd-sysv-generator(8)
       CGroup: /system.slice/bind9.service
               └─12475 /usr/sbin/named -u bind
    And web33:

    Code:
    # systemctl status bind9.service
    ● bind9.service - LSB: Start and stop bind9
       Loaded: loaded (/etc/init.d/bind9; generated; vendor preset: enabled)
       Active: active (running) since Thu 2019-04-04 10:08:35 CEST; 1h 48min ago
         Docs: man:systemd-sysv-generator(8)
        Tasks: 7 (limit: 4915)
       CGroup: /system.slice/bind9.service
               └─26631 /usr/sbin/named -u bind
    Both servers have /etc/init.d/bind9 and /etc/systemd/system/named.service and they are identical.

    Looking at the process list on each server, I see two named processes running, which I find strange. In addition, the two servers differ in that regard as well:

    Web32:

    Code:
    # ps auxfww | grep name[d]
    root     21444  0.0  0.2 389260 21364 ?        Ssl  Mar21   0:12 named
    bind     12475  0.0  0.3 411260 26788 ?        Ssl  10:05   0:01 /usr/sbin/named -u bind
    Web33:

    Code:
    # ps auxfww | grep name[d]
    bind     18657  0.0  0.3 411000 26332 ?        Ssl  Apr02   0:09 /usr/sbin/named -f -u bind
    bind     26631  0.0  0.3 407620 29176 ?        Ssl  10:08   0:00 /usr/sbin/named -u bind
    The servers are both running Debian 9(.8), Linux 4.9.0-5-amd64 x86_64, with DirectAdmin 1.56:

    Code:
    # lsb_release -a
    No LSB modules are available.
    Distributor ID:	Debian
    Description:	Debian GNU/Linux 9.8 (stretch)
    Release:	9.8
    Codename:	stretch
    
    # uname -srm
    Linux 4.9.0-5-amd64 x86_64
    
    # /usr/local/directadmin/directadmin o
    Compiled on 'Debian 9.0 64-bit'
    Compile time: Mar 18 2019 at 02:18:53
    Timestamp: '1552897108'
    Compiled with IPv6
    
    # /usr/local/directadmin/directadmin v
    Version: DirectAdmin v.1.56.0
    So I guess my three questions are:

    1 - Why does one of my servers have an inactive named.service? A reload of bind9.service works as expected.
    2 - Why do I have two named processes running in the first place?
    3 - How do I resolve 1 and 2?
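
One way to approach question 2 might be to check which systemd unit each running named process was started under. The sketch below is only an assumption about how that could be done: it reads /proc/<pid>/cgroup and prints any .service name found there. The function names unit_of_cgroup and list_named_units are made up for illustration, and pgrep is assumed to be installed.

```shell
#!/bin/sh
# Extract the systemd service name(s) from /proc/<pid>/cgroup content on stdin.
# Lines typically look like "1:name=systemd:/system.slice/bind9.service".
unit_of_cgroup() {
    sed -n 's|^.*/\([^/]*\.service\)$|\1|p' | sort -u
}

# Print "<pid> <unit>" for every process whose name is exactly "named".
# Hypothetical helper; it is not called automatically.
list_named_units() {
    for pid in $(pgrep -x named); do
        unit=$(unit_of_cgroup < "/proc/$pid/cgroup")
        printf '%s %s\n' "$pid" "${unit:-(no service unit)}"
    done
}
```

Running list_named_units as root on each server should show whether a given process belongs to named.service, to bind9.service, or to no service unit at all (which would suggest it was started by hand or by another script).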

    This is confusing me big time. Any help appreciated!
    Kristian Rønningen - CTO Nordhost
    Cloud services, server management, consultancy

  2. #2
    Join Date
    Aug 2006
    Location
    LT, EU
    Posts
    7,445
    Martynas Bendorius
    MB Martynas IT. Professional server management company. Official DirectAdmin, CloudLinux, LiteSpeed and Comodo partners.

  3. #3
    Join Date
    Nov 2005
    Location
    Norway
    Posts
    91
    Ah, that is useful indeed. Not sure what has happened on my servers to cause one to work without that setting, and the other to not work though.

  4. #4
    Join Date
    Nov 2005
    Location
    Norway
    Posts
    91
    For some reason, when setting named_service_override=bind9 in the directadmin.conf file, and restarting directadmin, there seems to be no attempt whatsoever to reload bind. I can't see any trace of DirectAdmin even trying in any of the logs. The changed zone is not available if I query the local named either.
    Last edited by kristian; 04-04-2019 at 07:03 AM.

  5. #5
    A few steps to test with:
    1. Add/remove a test record from any dns zone. Quickly type:
      Code:
      cat /usr/local/directadmin/data/task.queue
      to see if DA has added the changed value for the action. If not, confirm the setting with
      Code:
      ./directadmin c | grep named_service_override
    2. If you do see the correct action, you can test it manually and repeatedly by appending the same entry to the task.queue again and running the dataskq with
      Code:
      cd /usr/local/directadmin
      echo 'action=bind9&value=reload' >> data/task.queue; ./dataskq d2000
      and check /var/log/directadmin/system.log to see whether it mentions bind9 or not.


    John

  6. #6
    Join Date
    Nov 2005
    Location
    Norway
    Posts
    91
    After changing a DNS zone, /usr/local/directadmin/data/task.queue contained:

    Code:
    action=bind%39&value=reload
    So it seems the 9 is URL-encoded before being inserted, for some reason. The config does not contain the URL-encoded value:

    Code:
    # ./directadmin c | grep named_service_override
    named_service_override=bind9
    I also tried manually inserting it with a proper name:

    Code:
    echo 'action=bind9&value=reload' >> /usr/local/directadmin/data/task.queue
    In both these cases, nothing was logged to /var/log/directadmin/system.log or /var/log/directadmin/errortaskq.log or anywhere else that I could find, but the entry in the queue file disappeared.

  7. #7
    URL encoding shouldn't affect anything, as it's decoded in the dataskq.
    I'm interested in the dataskq output, so please run it as described in step 2 of my previous post.
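
To illustrate why the encoded queue entry is equivalent: %39 is just the percent-encoded form of the character 9 (hex 0x39). Below is a minimal sketch of percent-decoding in POSIX shell with awk. It is not DirectAdmin's actual decoder, and the function name urldecode is hypothetical.

```shell
#!/bin/sh
# Decode %XX sequences in the first argument and print the result.
# Illustrative sketch only; handles one line of input.
urldecode() {
    printf '%s\n' "$1" | awk '
    BEGIN { digits = "0123456789ABCDEF" }
    {
        rest = $0
        out = ""
        # Replace each %XX with the character whose code is hex XX.
        while (match(rest, /%[0-9A-Fa-f][0-9A-Fa-f]/)) {
            hex = toupper(substr(rest, RSTART + 1, 2))
            hi = index(digits, substr(hex, 1, 1)) - 1
            lo = index(digits, substr(hex, 2, 1)) - 1
            code = hi * 16 + lo
            out = out substr(rest, 1, RSTART - 1) sprintf("%c", code)
            rest = substr(rest, RSTART + RLENGTH)
        }
        print out rest
    }'
}
```

For example, urldecode 'action=bind%39&value=reload' prints action=bind9&value=reload, the same string as the manually inserted queue entry.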

    John

  8. #8
    Join Date
    Nov 2005
    Location
    Norway
    Posts
    91
    Apologies, I expected it to show in the log as well. Here's the output:

    Code:
    # echo 'action=bind9&value=reload' >> data/task.queue; ./dataskq d2000
    Debug mode. Level 2000
    
    root priv set: uid:0 gid:0 euid:0 egid:0
    pidfile written
    starting queue
    dataskq: command: action=bind9&value=reload
    done queue
