mysql error

jonium

Hi all,
Since 4 a.m. last night the MySQL service on one of my servers has been stopped, and I can no longer access the server via SSH.
All other services seem to be running; only clamd stopped a few minutes ago.
I can access DirectAdmin and work in it, but I can't get in via SSH.
I suspected a disk-space error and deleted some sites that were due for deletion to free up space, but the problem is still unresolved.
Maybe the space that needs freeing is on the tmp partition; can you help me understand whether that is possible?
Here are the last messages from /var/log/messages:
Dec 26 15:02:06 servername kernel: ata2.00: exception Emask 0x50 SAct 0x0 SErr 0x4090800 action 0xe frozen

Dec 26 15:02:06 servername kernel: ata2.00: irq_stat 0x00400040, connection status changed

Dec 26 15:02:06 servername kernel: ata2: SError: { HostInt PHYRdyChg 10B8B DevExch }

Dec 26 15:02:06 servername kernel: ata2.00: failed command: FLUSH CACHE EXT

Dec 26 15:02:06 servername kernel: ata2.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 26#012 res 40/00:00:50:68:2b/00:00:c7:00:00/40 Emask 0x50 (ATA bus error)

Dec 26 15:02:06 servername kernel: ata2.00: status: { DRDY }

Dec 26 15:02:06 servername kernel: ata2: hard resetting link

Dec 26 15:02:07 servername kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)

Dec 26 15:02:07 servername kernel: ata2.00: failed to read native max address (err_mask=0x1)

Dec 26 15:02:07 servername kernel: ata2.00: HPA support seems broken, skipping HPA handling

Dec 26 15:02:07 servername kernel: ata2.00: revalidation failed (errno=-5)

Dec 26 15:02:12 servername kernel: ata2: hard resetting link

Dec 26 15:02:12 servername kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)

Dec 26 15:02:12 servername kernel: ata2.00: failed to enable AA (error_mask=0x1)

Dec 26 15:02:12 servername kernel: ata2.00: failed to enable AA (error_mask=0x1)

Dec 26 15:02:12 servername kernel: ata2.00: configured for UDMA/133 (device error ignored)

Dec 26 15:02:12 servername kernel: ata2.00: retrying FLUSH 0xea Emask 0x50

Dec 26 15:02:12 servername kernel: ata2.00: FLUSH failed Emask 0x1

Dec 26 15:02:12 servername kernel: ata2.00: device reported invalid CHS sector 0

Dec 26 15:02:12 servername kernel: ata2: EH complete

Dec 26 15:02:12 servername kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0

Dec 26 15:02:12 servername kernel: ata2.00: irq_stat 0x40000001

Dec 26 15:02:12 servername kernel: ata2.00: failed command: FLUSH CACHE EXT

Dec 26 15:02:12 servername kernel: ata2.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 28#012 res 51/10:00:00:00:00/00:00:00:00:00/00 Emask 0x81 (invalid argument)

Dec 26 15:02:12 servername kernel: ata2.00: status: { DRDY ERR }

Dec 26 15:02:12 servername kernel: ata2.00: error: { IDNF }

Dec 26 15:02:12 servername kernel: ata2.00: failed to enable AA (error_mask=0x1)

Dec 26 15:02:12 servername kernel: ata2.00: failed to enable AA (error_mask=0x1)

Dec 26 15:02:12 servername kernel: ata2.00: configured for UDMA/133 (device error ignored)

Dec 26 15:02:12 servername kernel: ata2.00: device reported invalid CHS sector 0

Dec 26 15:02:12 servername kernel: ata2: EH complete

Dec 26 15:02:12 servername kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0

Dec 26 15:02:12 servername kernel: ata2.00: irq_stat 0x40000001

Dec 26 15:02:12 servername kernel: ata2.00: failed command: FLUSH CACHE EXT

Dec 26 15:02:12 servername kernel: ata2.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 1#012 res 51/10:00:00:00:00/00:00:00:00:00/00 Emask 0x81 (invalid argument)

Dec 26 15:02:12 servername kernel: ata2.00: status: { DRDY ERR }

Dec 26 15:02:12 servername kernel: ata2.00: error: { IDNF }

Dec 26 15:02:12 servername kernel: ata2.00: failed to enable AA (error_mask=0x1)

Dec 26 15:02:12 servername kernel: ata2.00: failed to enable AA (error_mask=0x1)

Dec 26 15:02:12 servername kernel: ata2.00: configured for UDMA/133 (device error ignored)

Dec 26 15:02:12 servername kernel: ata2.00: device reported invalid CHS sector 0

Dec 26 15:02:12 servername kernel: ata2: EH complete

Dec 26 15:02:12 servername kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0

Dec 26 15:02:12 servername kernel: ata2.00: irq_stat 0x40000001

Dec 26 15:02:12 servername kernel: ata2.00: failed command: FLUSH CACHE EXT

Dec 26 15:02:12 servername kernel: ata2.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 5#012 res 51/10:00:00:00:00/00:00:00:00:00/00 Emask 0x81 (invalid argument)

Dec 26 15:02:12 servername kernel: ata2.00: status: { DRDY ERR }

Dec 26 15:02:12 servername kernel: ata2.00: error: { IDNF }

Dec 26 15:02:12 servername kernel: ata2.00: failed to enable AA (error_mask=0x1)

Dec 26 15:02:13 servername kernel: ata2.00: failed to enable AA (error_mask=0x1)

Dec 26 15:02:13 servername kernel: ata2.00: configured for UDMA/133 (device error ignored)

Dec 26 15:02:13 servername kernel: ata2.00: device reported invalid CHS sector 0

Dec 26 15:02:13 servername kernel: ata2: EH complete

Dec 26 15:02:13 servername kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0

Dec 26 15:02:13 servername kernel: ata2.00: irq_stat 0x40000001

Dec 26 15:02:13 servername kernel: ata2.00: failed command: FLUSH CACHE EXT

Dec 26 15:02:13 servername kernel: ata2.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 9#012 res 51/10:00:00:00:00/00:00:00:00:00/00 Emask 0x81 (invalid argument)

Dec 26 15:02:13 servername kernel: ata2.00: status: { DRDY ERR }

Dec 26 15:02:13 servername kernel: ata2.00: error: { IDNF }

Dec 26 15:02:13 servername kernel: ata2.00: failed to enable AA (error_mask=0x1)

Dec 26 15:02:13 servername kernel: ata2.00: failed to enable AA (error_mask=0x1)

Dec 26 15:02:13 servername kernel: ata2.00: configured for UDMA/133 (device error ignored)

Dec 26 15:02:13 servername kernel: ata2.00: device reported invalid CHS sector 0

Dec 26 15:02:13 servername kernel: ata2: EH complete

Dec 26 15:02:13 servername kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0

Dec 26 15:02:13 servername kernel: ata2.00: irq_stat 0x40000001

Dec 26 15:02:13 servername kernel: ata2.00: failed command: FLUSH CACHE EXT

Dec 26 15:02:13 servername kernel: ata2.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 13#012 res 51/10:00:00:00:00/00:00:00:00:00/00 Emask 0x81 (invalid argument)

Dec 26 15:02:13 servername kernel: ata2.00: status: { DRDY ERR }

Dec 26 15:02:13 servername kernel: ata2.00: error: { IDNF }

Dec 26 15:02:13 servername kernel: ata2.00: failed to enable AA (error_mask=0x1)

Dec 26 15:02:13 servername kernel: ata2.00: failed to enable AA (error_mask=0x1)

Dec 26 15:02:13 servername kernel: ata2.00: configured for UDMA/133 (device error ignored)

Dec 26 15:02:13 servername kernel: ata2.00: device reported invalid CHS sector 0

Dec 26 15:02:13 servername kernel: ata2.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 13#012 res 51/10:00:00:00:00/00:00:00:00:00/00 Emask 0x81 (invalid argument)

Dec 26 15:02:13 servername kernel: ata2.00: status: { DRDY ERR }

Dec 26 15:02:13 servername kernel: ata2.00: error: { IDNF }

Dec 26 15:02:13 servername kernel: ata2.00: failed to enable AA (error_mask=0x1)

Dec 26 15:02:13 servername kernel: ata2.00: failed to enable AA (error_mask=0x1)

Dec 26 15:02:13 servername kernel: ata2.00: configured for UDMA/133 (device error ignored)

Dec 26 15:02:13 servername kernel: ata2.00: device reported invalid CHS sector 0

Dec 26 15:02:13 servername kernel: ata2.00: failed to enable AA (error_mask=0x1)

Dec 26 15:02:13 servername kernel: ata2.00: failed to enable AA (error_mask=0x1)

Dec 26 15:02:13 servername kernel: ata2.00: configured for UDMA/133 (device error ignored)

Dec 26 15:02:13 servername kernel: ata2.00: device reported invalid CHS sector 0

Dec 26 15:02:17 servername journal: Runtime journal is using 1.5G (max allowed 1.5G, trying to leave 2.3G free of 14.0G available ? current limit 1.5G).

Dec 26 15:02:17 servername journal: Journal started

Dec 26 15:02:17 servername kernel: sd 1:0:0:0: [sdb] tag#13 Sense Key : Illegal Request [current] [descriptor]

Dec 26 15:02:17 servername kernel: sd 1:0:0:0: [sdb] tag#13 Add. Sense: Logical block address out of range

Dec 26 15:02:17 servername kernel: sd 1:0:0:0: [sdb] tag#13 CDB: Synchronize Cache(10) 35 00 00 00 00 00 00 00 00 00

Dec 26 15:02:17 servername kernel: blk_update_request: I/O error, dev sdb, sector 2559872

Dec 26 15:02:17 servername kernel: md: super_written gets error=-5, uptodate=0

Dec 26 15:02:17 servername kernel: md/raid1:md3: Disk failure on sdb3, disabling device.#012md/raid1:md3: Operation continuing on 1 devices.
After those entries there are no further logs until today, which is strange...
Is it safe to try restarting the server? I think that would empty tmp.

Thanks for your help.
 
Your hard disk is physically dying/dead.

You need to contact your host and get the disk swapped out for a new one.

I'm hoping you have off-site backups available.
 
It's dangerous.
Back up your data first, then replace the hardware. Or, to be safer, buy more disks and use RAID 6.

RAID 1 is still risky for your data.

No RAID level is completely safe, but it can help you prevent data loss.
 
I want to be sure that the problem is the disk, because I suspect that the tmp partition is full.

Whilst that may also be true, you definitely have at least one faulty disk - quite possibly two.

You should really be monitoring your drives on a regular basis using something like smartctl:

[root@cello ~]# smartctl -a /dev/nvme1n1 | grep -i "test result"
SMART overall-health self-assessment test result: PASSED
[root@cello ~]#

If at any point you see a drive has failed, it needs replacing ASAP.

The biggest security will always be daily off-site backups though.
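
As a rough sketch of how that check could be scripted for several drives, the helper below pulls the verdict out of the "test result" line shown above. The function name `smart_verdict` and the device names in the usage comment are assumptions for illustration, and it only recognises the output format quoted in this thread:

```shell
#!/bin/sh
# Reads smartctl output on stdin and prints the overall-health verdict.
# Prints UNKNOWN when no "test result" line is present.
smart_verdict() {
    # Keep only the text after "test result:" on the overall-health line.
    verdict=$(sed -n 's/^SMART overall-health self-assessment test result: *//p' | head -n 1)
    echo "${verdict:-UNKNOWN}"
}

# Usage sketch (needs root and smartmontools; /dev/sda and /dev/sdb are assumed names):
# for disk in /dev/sda /dev/sdb; do
#     echo "$disk: $(smartctl -H "$disk" | smart_verdict)"
# done
```

Anything other than PASSED would be a reason to look at the full `smartctl -a` output for that drive.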
 
It is already certain that something is wrong with your drive:

Dec 26 15:02:17 servername kernel: md/raid1:md3: Disk failure on sdb3, disabling device.#012md/raid1:md3: Operation continuing on 1 devices.
 
Thank you, I already do that; I received no notifications from SMART.
 
Sometimes drives just go offline without any notification. I hope your monitoring service checks the SMART attributes (reallocated sectors, wear level, reserved space, power-on hours, unrecoverable errors) at least once an hour, and RAID consistency every 10 minutes. That will give you enough time to replace the disk and rebuild before the unbalanced load kills the remaining disks.
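
For the RAID side of that, a minimal sketch, assuming Linux software RAID (md) as in the log above: the kernel lists each array in /proc/mdstat with a member map such as [UU], and an underscore in that map (for example [U_]) marks a missing or failed member. The function name `degraded_arrays` is made up for this example:

```shell
#!/bin/sh
# Reads /proc/mdstat-style text on stdin and prints the name of every
# array whose member map contains "_" (a missing or failed member).
degraded_arrays() {
    awk '
        /^md/                { name = $1 }    # remember the array on its header line
        /\[[U_]*_[U_]*\]/    { print name }   # status line with at least one "_"
    '
}

# Usage sketch, e.g. from a cron job every 10 minutes:
# bad=$(degraded_arrays < /proc/mdstat)
# [ -n "$bad" ] && echo "Degraded md arrays: $bad"
```

Piping the message to whatever alerting you already use is left out here on purpose.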
 
You tested the wrong device, I guess. You should use the device that is reported as defective in the command:

Disk failure on sdb3
So try:
smartctl -a /dev/sdb3 | grep -i "test result"

I would advise following the tips above. And make sure that afterwards you never let /tmp get that full.
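
To keep an eye on /tmp, something along these lines might do; it is only a sketch. The function names and the 90% limit are arbitrary assumptions, and it relies on the POSIX `df -P` column layout, where the fifth column is the used percentage:

```shell
#!/bin/sh
# Reads `df -P <path>` output on stdin and prints the used percentage
# as a bare number (second line, fifth column, "%" stripped).
usage_percent() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Prints a warning when the filesystem holding the given path is at or
# above the limit. The default limit of 90 is an arbitrary assumption.
check_usage() {
    path=$1
    limit=${2:-90}
    used=$(df -P "$path" | usage_percent)
    if [ "$used" -ge "$limit" ]; then
        echo "WARNING: $path is ${used}% full"
    else
        echo "OK: $path is ${used}% full"
    fi
}

# Usage sketch:
# check_usage /tmp 90
```

Running `df -i /tmp` as well is worth it, since a partition can also run out of inodes while still showing free space.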
 