mysql crash on automatic backup

Golovior

New member
Joined
Dec 12, 2015
Messages
15
For about a month and a half I've had a CentOS 7 VPS with DirectAdmin and a bunch of websites and forums. Before that the server was hosted on a CentOS 5 VPS. DirectAdmin is a different install.

Since the server moved to CentOS 7, I've had a problem with an automatic backup. As soon as the backup reaches MySQL, it crashes on a table of 1.1 GB. This happens early every Saturday morning. After that, connecting to MySQL is no longer possible.
To get the server working again I need to restart the MySQL service, repair the table and restart the MySQL service again.

So there are 2 possible explanations for what happens:
1) The backup does not get through the table and keeps trying. When I restart the MySQL service the table crashes, so the backup never finishes.
2) The backup crashes on this table and keeps the database locked.


Now I really need to solve this as soon as possible. People are starting to make a fuss about it and are starting to point at me for breaking it.

I see 2 possible ways to solve this:
1) Stop the automatic backup, which runs in the early morning from Friday to Saturday.
2) Find out why MySQL is crashing on the server and fix it.

I tried the first and didn't succeed.
The second gives me the following information in the logs:

Excessive resource usage: mysql
Excessive resource usage: libstoragemgmt

But I can't find how to solve these messages.

Is there anyone who can help me with this?
 
I can't help you with mysql. But where is that "Excessive resource usage" coming from? Was this a notice from csf/lfd firewall?
If yes, that is not the cause, because it is only a warning that some process is using more time than would be normal,
which is to be expected with tables that big.
If you did not change anything in CSF/LFD to kill the process when it takes too long (it only warns by default), this is no issue.
In csf.pignore you could add these lines:
Code:
exe:/usr/sbin/mysqld
exe:/usr/sbin/mysqld_safe
Then the MySQL notice should be gone. It won't solve the problem though.
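Adding such lines can be done in a way that does not create duplicates on repeated runs. A minimal sketch, assuming the file lives at /etc/csf/csf.pignore as named later in this thread; the function name add_pignore_entry is hypothetical:

```shell
# Append an entry to a pignore-style file only if that exact line is not there yet.
add_pignore_entry() {
    file="$1"
    entry="$2"
    # -F: fixed string, -x: whole-line match, -q: quiet (exit status only)
    grep -Fxq -- "$entry" "$file" 2>/dev/null || printf '%s\n' "$entry" >> "$file"
}

# Example (run as root on the server, then restart lfd so it rereads the file):
# add_pignore_entry /etc/csf/csf.pignore 'exe:/usr/sbin/mysqld'
# add_pignore_entry /etc/csf/csf.pignore 'exe:/usr/sbin/mysqld_safe'
```

The whole-line match keeps an existing `exe:/usr/sbin/mysqld_safe` line from being mistaken for `exe:/usr/sbin/mysqld`.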

However, I don't know about libstoragemgmt. I don't have that on my servers (CentOS 6). Maybe that has something to do with the problem, causing a timeout?
Hopefully somebody else can help you with that part and with finding the real cause of the problem.
 
Hello Richard,

Thank you for your response. I did see some things about adding those lines to csf.pignore, but couldn't find a clear explanation of what it does. Your explanation is very clear and it will greatly cut down the number of mails I get, so thank you for clearing this up.

I also hope someone can help me with the MySQL crash.
 
Hello,

Is LFD on your server configured to kill processes? If so you need to add MySQL to the ignore list. But first try to disable LFD and run the backup manually.
 
Hello Alex,

If I disable LFD, would that mean the IP addresses that try to log in by brute force are no longer blacklisted? If so, I will not disable LFD. I get those attacks almost every day, so I need this blacklisting.
Where can I find this part of the configuration file? I searched through the configuration, but can't find it anywhere.
 
Yes, it would. Read /var/log/lfd.log for the details, for example:

Code:
grep Kill -i /var/log/lfd.log | less
 
Hello Alex,

It does indeed kill these two processes (mysql and libstoragemgmt). This happens every hour + 1 second. (08:32:19 - 09:32:20 - 10:32:21 - etc.)

How do I stop this in a secure way? Because this could definitely be the reason why the backup is failing.
 
Update /etc/csf/csf.pignore with:

Code:
exe:/sbin/ntpd
exe:/usr/bin/dbus-daemon
exe:/usr/bin/dbus-daemon-1
exe:/usr/bin/fetchmail
exe:/usr/bin/freshclam
exe:/usr/libexec/dovecot/anvil
exe:/usr/libexec/dovecot/imap
exe:/usr/libexec/dovecot/imap-login
exe:/usr/libexec/dovecot/lmtp
exe:/usr/libexec/dovecot/managesieve
exe:/usr/libexec/dovecot/managesieve-login
exe:/usr/libexec/dovecot/pop3
exe:/usr/libexec/dovecot/pop3-login
exe:/usr/libexec/gam_server
exe:/usr/libexec/hald-addon-acpi
exe:/usr/libexec/hald-addon-keyboard
exe:/usr/local/bin/clamd
exe:/usr/local/bin/freshclam
exe:/usr/local/bin/pureftpd_uploadscan.sh
exe:/usr/local/directadmin/dataskq
exe:/usr/local/directadmin/directadmin
exe:/usr/local/libexec/dovecot/imap
exe:/usr/local/libexec/dovecot/imap-login
exe:/usr/local/libexec/dovecot/pop3
exe:/usr/local/libexec/dovecot/pop3-login
exe:/usr/local/php53/bin/php53
exe:/usr/local/php53/bin/php-cgi53
exe:/usr/local/php53/bin/php_uploadscan.sh
exe:/usr/local/php53/sbin/php-fpm53
exe:/usr/local/php54/bin/php54
exe:/usr/local/php54/bin/php-cgi54
exe:/usr/local/php54/bin/php_uploadscan.sh
exe:/usr/local/php54/sbin/php-fpm54
exe:/usr/local/php55/bin/php55
exe:/usr/local/php55/bin/php-cgi55
exe:/usr/local/php55/bin/php_uploadscan.sh
exe:/usr/local/php55/sbin/php-fpm55
exe:/usr/local/php56/bin/php56
exe:/usr/local/php56/bin/php-cgi56
exe:/usr/local/php56/bin/php_uploadscan.sh
exe:/usr/local/php56/sbin/php-fpm56
exe:/usr/local/sbin/nginx
exe:/usr/sbin/exim
exe:/usr/sbin/hald
exe:/usr/sbin/httpd
exe:/usr/sbin/mysqld
exe:/usr/sbin/mysqld_safe
exe:/usr/sbin/named
exe:/usr/sbin/nginx
exe:/usr/sbin/ntpd
exe:/usr/sbin/proftpd
exe:/usr/sbin/pure-ftpd
exe:/usr/sbin/sshd

Critically important to have these lines:

Code:
exe:/usr/local/directadmin/dataskq
exe:/usr/local/directadmin/directadmin

and restart

Code:
service lfd restart
 
At the moment my /etc/csf/csf.pignore contains:

exe:/usr/sbin/sshd
exe:/usr/sbin/proftpd
exe:/usr/libexec/gam_server
exe:/usr/sbin/named
exe:/usr/sbin/exim
exe:/usr/sbin/mysqld
exe:/usr/sbin/mysqld_safe
exe:/usr/libexec/hald-addon-acpi
exe:/usr/sbin/hald
exe:/bin/dbus-daemon
exe:/usr/bin/dbus-daemon-1
exe:/usr/libexec/hald-addon-keyboard
exe:/usr/libexec/dovecot/pop3-login
exe:/usr/libexec/dovecot/imap-login
exe:/usr/local/directadmin/directadmin
exe:/usr/local/directadmin/dataskq
exe:/usr/sbin/httpd
exe:/usr/bin/dbus-daemon
exe:/usr/local/mysql-5.1.54-linux-x86_64/bin/mysqld
exe:/usr/libexec/dovecot/anvil
exe:/usr/sbin/ntpd
exe:/sbin/ntpd
exe:/usr/libexec/dovecot/pop3
exe:/usr/libexec/dovecot/imap
exe:/usr/local/libexec/dovecot/pop3
exe:/usr/local/libexec/dovecot/pop3-login
exe:/usr/local/libexec/dovecot/imap
exe:/usr/local/libexec/dovecot/imap-login
exe:/usr/libexec/dovecot/anvil
exe:/usr/sbin/chronyd
exe:/usr/lib/polkit-1/polkitd
exe:/usr/sbin/exim
exe:/usr/bin/dbus-daemon
exe:/usr/sbin/nscd
exe:/usr/sbin/mysqld
exe:/usr/sbin/mysqld_safe


The last 2 rows were added last Saturday.
Am I missing any important lines whose absence could make the backup fail?

Hopefully the last 2 lines will make the backup work.

I'll let you all know on Saturday.
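The file above contains a few repeated entries, and comparing it by eye against the longer suggested list is error-prone. A minimal sketch of both checks, assuming each list is saved as a plain text file with one entry per line; the function names and the file name recommended.txt are hypothetical:

```shell
# List entries that appear more than once in a pignore-style file.
pignore_duplicates() {
    sort "$1" | uniq -d
}

# List entries from a wanted list ($2) that are absent from the current file ($1).
# -F: fixed strings, -x: whole-line match, -v: print non-matching lines.
pignore_missing() {
    grep -Fxv -f "$1" "$2" || true
}

# Example:
# pignore_duplicates /etc/csf/csf.pignore
# pignore_missing /etc/csf/csf.pignore recommended.txt
```

Duplicate lines should be harmless, but removing them makes it easier to see what was actually added.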
 
Apparently this wasn't the reason for the crash.
This morning the backup crashed again...
 
When searching for the reason why the backup still failed, I looked at the log for killed processes, and there I see what I don't want to see:

Code:
Dec 19 06:03:49 fvandenberg lfd[20654]: *User Processing* PID:28771 Kill:0 User:mysql Time:592775 EXE:/usr/bin/bash CMD:/bin/sh /usr/bin/mysqld_safe --basedir=/usr

What to do with this? Why is lfd still killing my mysqld_safe process?
I did restart lfd.
 
No, you've got Kill:0, which means lfd only reported the process and did not kill it.
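The Kill field of every *User Processing* line can be tallied to check whether lfd ever did more than report. A minimal sketch, assuming the log format quoted in this thread and assuming that any value other than Kill:0 marks a killed process; the function name lfd_kill_summary is hypothetical:

```shell
# Count lfd log lines by their Kill field.
lfd_kill_summary() {
    log_file="${1:-/var/log/lfd.log}"
    # Lines where the process was only reported.
    reported=$(grep -c 'Kill:0' "$log_file" || true)
    # Lines with any other Kill value (assumed here to mean the process was killed).
    killed=$(grep 'Kill:' "$log_file" | grep -vc 'Kill:0' || true)
    echo "reported=$reported killed=$killed"
}

# Example:
# lfd_kill_summary /var/log/lfd.log
```

If the killed count stays at zero, lfd can be ruled out as the cause and attention can move to the MySQL and DirectAdmin logs.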

Run the backup (of the DBs) live (not via cron) with debug mode enabled and see what goes wrong, and/or read the MySQL and DirectAdmin logs for more details. Without seeing the logs we can hardly help you further.

Code:
/usr/local/directadmin/dataskq d800
 
I found the log file and searched it for the error.

The error starts at 1:02 on Saturday night:

Code:
151226  1:02:16 [Warning] mysqld: Disk is full writing '/tmp/STg8ceuY' (Errcode: 28). Waiting for someone to free space... (Expect up to 60 secs delay for server to continue after freeing disk space)

It seems the backup doesn't have enough room to create itself, which is odd, because I've got at least 100 GB of free space on the server. Why can't it create the backup then?
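Errcode 28 means "No space left on device", and the warning names a file under /tmp, so the space that matters is on whichever filesystem holds /tmp, which may be much smaller than the rest of the server. A minimal sketch for comparing free space per directory; the function names are hypothetical and /home is only an example of a second location:

```shell
# Print the space available (in 1 KB blocks) on the filesystem holding a directory.
avail_kb() {
    # -P forces one output line per filesystem; column 4 of that line is "Available".
    df -Pk "$1" | awk 'NR==2 {print $4}'
}

# Report free space for each directory given, skipping ones that do not exist.
report_free_space() {
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        echo "$dir: $(avail_kb "$dir") KB available"
    done
}

# /tmp is where the failing temporary file was written, per the log line above.
report_free_space /tmp /home
```

MySQL's own temporary directory can be confirmed from a MySQL client with `SHOW VARIABLES LIKE 'tmpdir';`. If /tmp turns out to be a small separate filesystem, that would explain the warning despite the free space elsewhere.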
 
Probably not the expected result:

Code:
-bash: /usr/local/directadmin: Is a directory
 
Type (or copy/paste) and execute fully as it is written here, do NOT cut it:

Code:
/usr/local/directadmin/directadmin c | grep tmpdir
 
Sorry, my bad.

Copy-pasting doesn't work in the console (I'm using the web host's console) and I copied the entire string. Sorry.

The reply is:
Code:
tmpdir=/home/tmp
backup_tmpdir=/home/tmp

Thank you for your patience and your help.
 