Extremely High Server Load - can't access

shadowq (Verified User, joined Dec 5, 2010, 97 messages)
Hi there,

I've had a new issue start over the past 4-8 weeks. Once every 5-10 days, the server's load climbs to 100+ and it becomes completely unresponsive: I can't access it via SSH, can't load DA's login panel, etc. The only way forward is to SSH into the node, grab a list of processes, and kill the container's mysqld process. This calms the server down instantly and allows me to connect again. However, there is nothing at all concerning in the mysqld log file, and I've looked through every other log file in /var/log for anything coinciding with the high load. Nothing. I'm not sure where to look next. Any assistance would be great!
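For reference, the emergency routine from the node looks roughly like this (the PID is a placeholder, it's whatever ps reports for the container's mysqld):

Code:
# ### From the node: list processes sorted by CPU to spot the runaway
ps aux --sort=-%cpu | head -15

# ### Kill the container's mysqld by PID (placeholder PID)
kill 12345

# ### Escalate only if it ignores SIGTERM
kill -9 12345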

The node is a Xeon E-2236, dual 1.92TB NVMe drives, 64GB ECC RAM.
Container has 8 cores, 16GB RAM, with very quiet neighbours.

Thanks in advance!
Jarrod.
 
Using WordPress? Then check your plugins and install Wordfence to throttle the number of connections per minute.
Without any logs it's hard to say.
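If you want to confirm it's traffic before tuning Wordfence, here's a quick sketch for spotting connection floods in a site's access log (the path assumes DirectAdmin's usual layout and the domain is a placeholder):

Code:
# ### Top 10 client IPs by request count
awk '{print $1}' /var/log/httpd/domains/example.com.log | sort | uniq -c | sort -rn | head

# ### Hits on the usual WordPress brute-force targets
grep -c 'wp-login.php' /var/log/httpd/domains/example.com.log
grep -c 'xmlrpc.php'   /var/log/httpd/domains/example.com.log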
 
Configure MySQL to write a slow query log for requests longer than 10 seconds at least.
Check with mysqltuner whether there is enough RAM for the allowed number of connections.
You can also configure MySQL to kill long-running queries (if you don't normally have any, such as dumps or imports).
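Something like this in my.cnf would do it (the log path and thresholds are just examples; max_statement_time is MariaDB-only, MySQL 5.7+ uses max_execution_time in milliseconds and only for SELECTs):

Code:
[mysqld]
# ### Log anything running longer than 10 seconds
slow_query_log      = 1
slow_query_log_file = /var/log/mysql/mysql-slow.log
long_query_time     = 10

# ### Optional: abort statements running longer than 60 seconds
# ### (MariaDB only; on MySQL 5.7+ use max_execution_time, in ms, SELECTs only)
max_statement_time  = 60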
 
If you don't have sar sampling your system (i.e. sa1 and sa2 crontab entries), I suggest you enable it. Then when something like this occurs you can run sar and see what your load looked like. That will tell you whether the bottleneck is the CPU (high CPU usage) or the disk subsystem (a high iowait percentage).
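For example (the paths below are the RHEL/CentOS sysstat layout; Debian keeps sa1/sa2 under /usr/lib/sysstat):

Code:
# ### /etc/cron.d/sysstat - sample every 10 minutes, daily summary at 23:53
*/10 * * * * root /usr/lib64/sa/sa1 1 1
53 23 * * * root /usr/lib64/sa/sa2 -A

# ### After the next incident, review that day's history:
sar -u    # ### CPU usage and %iowait
sar -q    # ### run queue and load averages
sar -d    # ### per-device I/O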
 
Using WordPress? Then check your plugins and install Wordfence to throttle the number of connections per minute.
Without any logs it's hard to say.
Multiple users are running WP, though all of them use WF. Agreed that it's difficult without logs!
 
Configure MySQL to write a slow query log for requests longer than 10 seconds at least.
Check with mysqltuner whether there is enough RAM for the allowed number of connections.
You can also configure MySQL to kill long-running queries (if you don't normally have any, such as dumps or imports).
mysqltuner seems happy with the results:
[OK] Maximum reached memory usage: 1.3G (5.32% of installed RAM)
[OK] Maximum possible memory usage: 2.5G (10.59% of installed RAM)
[OK] Overall possible memory usage with other process is compatible with memory available
[OK] Slow queries: 0% (0/216K)
[OK] Highest usage of available connections: 31% (48/151)
 
So just wait until something appears in the logs.
The longest it has been left unresponsive at a super-high load is about an hour or so. IMO, if there's nothing in the logs by then, there isn't going to be anything.
 
You may check whether it is CPU-bound or I/O-bound:

Code:
# ### Which process is using CPU; check wa (I/O wait)
top -d1

# ### Check %util to see which device is busy
iostat -x 1

# ### Which process is using I/O heavily
iotop

# ### Monitor MySQL's running processlist
watch -n 1 "echo 'show full processlist;' | mysql --defaults-extra-file=/usr/local/directadmin/conf/my.cnf"

I had a similar experience with an OpenVZ container: for an unknown reason, at around/over 100 days of uptime, the host-level swap grew to around 5GB and suddenly caused very high I/O wait (loadavg over 100, compared to < 2 normally). My solution was simply to restart all containers.
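If you're on OpenVZ as well, here's a quick sketch for checking host swap and per-container pressure (the vzlist field names can vary between versions):

Code:
# ### On the node: overall memory and swap usage
free -m

# ### Per-container load averages
vzlist -o ctid,hostname,laverage

# ### failcnt (last column) shows containers hitting resource limits
cat /proc/user_beancounters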
 