Rsyslogd and Journald

jonium

Hello,
can you confirm that rsyslogd is required for CSF to run correctly?
I have some CentOS boxes running both rsyslogd and journald, and sometimes rsyslogd uses 100% of the CPU and the server goes down...

DirectAdmin, latest version
CustomBuild 2
 
rsyslogd is part of the OS. The system needs it to generate the system log files and write to them.
CSF is not the only thing that uses it; everything does.

It's best to investigate why it starts using 100% of the CPU. Check whether the system logs contain massive numbers of entries, or whether some other log file gives a clue.
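One way to check for massive log entries is to count how many lines each program wrote. The sketch below is only an illustration: the function names are made up for this example, the path `/var/log/messages` is an assumption (it is the usual location on CentOS), and the regular expression assumes the traditional "Mon DD HH:MM:SS host program[pid]:" prefix.

```python
import re
from collections import Counter

# Matches the traditional syslog prefix "Mon DD HH:MM:SS host program[pid]:".
# The exact layout is an assumption; adjust it if your rsyslog template differs.
SYSLOG_LINE = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]{8}\s+\S+\s+([^\s:\[]+)(?:\[\d+\])?:"
)


def count_by_program(lines):
    """Return a Counter of how many lines each program tag produced."""
    counts = Counter()
    for line in lines:
        match = SYSLOG_LINE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def top_talkers(path="/var/log/messages", limit=10):
    """Read a log file and return the `limit` noisiest program tags."""
    with open(path, errors="replace") as handle:
        return count_by_program(handle).most_common(limit)
```

Calling `top_talkers()` as root should show quickly whether a single service is responsible for most of the log volume.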

Maybe others can also give good tips for this. Maybe @Zhenyapan has some good ideas, because he runs a lot of servers.
 
If you don't silence or disable logging for a noisy channel, it can flood the logging system.


From my experience, I suspect BIND 9 is the one misbehaving. For a more reliable answer, hire someone to check the server directly.
 
Here are some recent Bind logs:
Code:
Jan 03 17:58:05 myserver.xxx named[3833]: limit responses to 66.249.93.0/24 for www.domain2.it IN A  (0b0f904b)
Jan 03 17:58:05 myserver.xxx named[3833]: client @0x7f57a41bbfd0 66.249.93.160#60012 (www.domain2.it): rate limit slip response to 66.249.93.0/24 for www.domain2.it IN A  (0b0f904b)
Jan 03 17:58:05 myserver.xxx named[3833]: client @0x7f57a41bbfd0 66.249.93.14#36571 (www.domain2.it): rate limit drop response to 66.249.93.0/24 for www.domain2.it IN A  (0b0f904b)
Jan 03 17:59:05 myserver.xxx named[3833]: stop limiting responses to 66.249.93.0/24 for www.domain2.it IN A  (0b0f904b)
Jan 03 18:19:45 myserver.xxx named[3833]: limit responses to 51.178.111.0/24 for www.domain1.fr IN A  (a0ed7875)
Jan 03 18:19:45 myserver.xxx named[3833]: client @0x7f57a41908f0 51.178.111.233#17725 (www.domain1.fr): rate limit slip response to 51.178.111.0/24 for www.domain1.fr IN A  (a0ed7875)
Jan 03 18:19:45 myserver.xxx named[3833]: client @0x7f57a41908f0 51.178.111.240#21270 (www.domain1.fr): rate limit drop response to 51.178.111.0/24 for www.domain1.fr IN A  (a0ed7875)
Jan 03 18:19:45 myserver.xxx named[3833]: client @0x7f57a4182150 51.178.111.233#47406 (www.domain1.fr): rate limit slip response to 51.178.111.0/24 for www.domain1.fr IN A  (a0ed7875)
Jan 03 18:19:45 myserver.xxx named[3833]: client @0x7f57a4182150 51.178.111.225#33627 (www.domain1.fr): rate limit drop response to 51.178.111.0/24 for www.domain1.fr IN A  (a0ed7875)
Jan 03 18:19:45 myserver.xxx named[3833]: client @0x7f57a4182150 51.178.111.237#53646 (www.domain1.fr): rate limit slip response to 51.178.111.0/24 for www.domain1.fr IN A  (a0ed7875)
Jan 03 18:19:45 myserver.xxx named[3833]: client @0x7f57a4182150 51.178.111.240#52638 (www.domain1.fr): rate limit drop response to 51.178.111.0/24 for www.domain1.fr IN A  (a0ed7875)
Jan 03 18:19:45 myserver.xxx named[3833]: client @0x7f57a4182150 51.178.111.225#37799 (www.domain1.fr): rate limit slip response to 51.178.111.0/24 for www.domain1.fr IN A  (a0ed7875)
Jan 03 18:19:45 myserver.xxx named[3833]: client @0x7f57a4212d90 51.178.111.240#50440 (www.domain1.fr): rate limit drop response to 51.178.111.0/24 for www.domain1.fr IN A  (a0ed7875)
 
Code:
named[3833]: increase from 500 to 750 RRL entries with 503 bins; average search length 2.0
named[3833]: client @0x7f57a4212d90 213.219.38.223#43752 (tripadvisor.com): query (cache) 'tripadvisor.com/A/IN' denied
named[3833]: client @0x7f57a41bbfd0 213.219.38.223#27749 (uber.com): query (cache) 'uber.com/A/IN' denied
Obviously tripadvisor.com and uber.com aren't hosted on my server...
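To see which subnets and names account for most of these entries, the lines can be tallied. This is a rough sketch: the function name is invented for the example, and the two patterns are modeled only on the named messages quoted in this post, so other BIND versions may word them differently.

```python
import re
from collections import Counter

# Patterns modeled on the named log lines quoted in this thread; other BIND
# versions may word these messages differently.
RATE_LIMIT = re.compile(r"rate limit (?:slip|drop) response to (\S+) for (\S+)")
DENIED = re.compile(r"query \(cache\) '([^/']+)/\S*' denied")


def summarize_named(lines):
    """Count rate-limited (subnet, name) pairs and denied query names."""
    limited, denied = Counter(), Counter()
    for line in lines:
        hit = RATE_LIMIT.search(line)
        if hit:
            limited[(hit.group(1), hit.group(2))] += 1
            continue
        hit = DENIED.search(line)
        if hit:
            denied[hit.group(1)] += 1
    return limited, denied
```

Pass it any iterable of log lines, for example an open file containing the named messages, and look at `most_common()` on each counter.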
 
I don't know for sure, but that could be a reason.

Try disabling the rate-limit logs and capping the other logs at 20 MB.


Put this at the end of the file, or merge it into the related existing block.
#named.conf
Code:
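logging {
        // Size-capped log file: keeps 3 old versions, rotates at 20 MB.
        channel default_log {
                file "/var/named/log/default" versions 3 size 20m;
                print-time yes;
                print-category yes;
                print-severity yes;
                severity info;
        };

        // Discard rate-limit messages (the BIND category is "rate-limit").
        category rate-limit { null; };

        // A channel does nothing until a category uses it.
        category default { default_log; };
};

About the "/var/named/log/default" location: adjust it to match your server configuration. RHEL and Debian store these files in different locations.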
 
Yes, you may need to create the log folder manually.

That's how DDoS attacks and bots behave. They don't care what happens; they just spam your server for as long as they can.
 
Normally all logs should go to the "default" channel.
You may need to change the owner of the "log" folder to "named:named". Basically, I turn the log on when I need to debug something and turn it off when I'm finished, so that odd bot or DDoS traffic can't flood it.
 
The server seems more stable, but it still reboots sometimes.
After the latest of many reboots I found the following log:

Code:
Jan 12 15:08:43 localhost kernel: Switched APIC routing to cluster x2apic.
Jan 12 15:08:43 localhost kernel: ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
Jan 12 15:08:43 localhost kernel: smpboot: CPU0: Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz (fam: 06, model: 9e, stepping: 0d)
Jan 12 15:08:43 localhost kernel: TSC deadline timer enabled
Jan 12 15:08:43 localhost kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 4: be00000000800400
Jan 12 15:08:43 localhost kernel: mce: [Hardware Error]: TSC 0 ADDR ffffffff91e0553b MISC ffffffff91e0553b
Jan 12 15:08:43 localhost kernel: mce: [Hardware Error]: PROCESSOR 0:906ed TIME 1705068520 SOCKET 0 APIC 0 microcode f0
Jan 12 15:08:43 localhost kernel: Performance Events: PEBS fmt3+, Skylake events, 32-deep LBR, full-width counters, Intel PMU driver.
Jan 12 15:08:43 localhost kernel: ... version:                4
Jan 12 15:08:43 localhost kernel: ... bit width:              48
Jan 12 15:08:43 localhost kernel: ... generic registers:      4
Jan 12 15:08:43 localhost kernel: ... value mask:             0000ffffffffffff
Jan 12 15:08:43 localhost kernel: ... max period:             00007fffffffffff
Jan 12 15:08:43 localhost kernel: ... fixed-purpose events:   3
Jan 12 15:08:43 localhost kernel: ... event mask:             000000070000000f
Jan 12 15:08:43 localhost kernel: NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
Jan 12 15:08:43 localhost kernel: mce: [Hardware Error]: Machine check events logged
Jan 12 15:08:43 localhost kernel: mce: [Hardware Error]: CPU 3: Machine Check: 0 Bank 3: be00000000800400
Jan 12 15:08:43 localhost kernel: smpboot: Booting Node   0, Processors  #1 #2 #3 #4
Jan 12 15:08:43 localhost kernel: mce: [Hardware Error]: TSC 0
Jan 12 15:08:43 localhost kernel: ADDR ffffffff91e0553b MISC ffffffff91e0553b
Jan 12 15:08:43 localhost kernel: mce: [Hardware Error]: PROCESSOR 0:906ed TIME 1705068520 SOCKET 0 APIC 6 microcode f0
Jan 12 15:08:43 localhost kernel:  #5 #6 #7 #8
Jan 12 15:08:43 localhost kernel: TAA CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html for more details.
Jan 12 15:08:43 localhost kernel: MMIO Stale Data CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/processor_mmio_stale_data.html for more details.
Jan 12 15:08:43 localhost kernel:  #9 #10 #11 #12 #13 #14 #15 OK

I'm now reading the kernel documentation pages linked in that log, such as
https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/processor_mmio_stale_data.html

Has anyone else ever had this problem?
 
It looks like something is wrong with your CPU.

I have never worked with bare-metal servers; I mostly work with VPSes on Xen or Proxmox, so it is hard for me to say whether this is a kernel problem or a hardware problem.
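For anyone who wants to read the "Machine Check" value from that log, here is a small sketch that splits it into its documented fields. It assumes the value after "Bank N:" is an IA32_MCi_STATUS register as laid out in Intel's Software Developer's Manual; the function name is made up for this example, and the result is only a hint, not a diagnosis.

```python
# Flag bits in the upper part of an IA32_MCi_STATUS value (Intel SDM layout).
STATUS_FLAGS = {
    63: "VAL (register holds a valid error)",
    62: "OVER (an earlier error was overwritten)",
    61: "UC (uncorrected error)",
    60: "EN (error reporting enabled)",
    59: "MISCV (MISC register valid)",
    58: "ADDRV (ADDR register valid)",
    57: "PCC (processor context corrupt)",
}


def decode_mci_status(hex_value):
    """Split a machine-check status value into flags and error codes."""
    status = int(hex_value, 16)
    flags = [name for bit, name in STATUS_FLAGS.items() if (status >> bit) & 1]
    return {
        "flags": flags,
        "mca_error_code": status & 0xFFFF,               # bits 15:0
        "model_specific_code": (status >> 16) & 0xFFFF,  # bits 31:16
    }
```

For `be00000000800400` this reports an uncorrected error with MCA error code 0x0400, which as far as I know Intel lists as an internal timer error. That would fit the CPU suspicion above, but it does not prove it.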
 
After more than 3 weeks, I haven't had that problem again.
Thank you
 