Standard Practice to monitor active client scripts?

Vibe · Feb 20, 2009

Hi everyone,

Over the weekend we had a client Perl script go haywire, causing their Apache error log (/var/log/https/domains) to grow to 12GB within an hour - WHAT FUN!

This of course brought the server to its knees, as /var was 100% full. Unable to SSH in (CPU at 100% plus extreme latency), we had to make a trip to the data center, reboot into single-user mode, and delete the log file to bring everything back to normal.

We have never had this happen before (bear in mind that we host just a few sites and are not as experienced as most in this forum). We use Munin to monitor the server and send various system alerts. However, because everything happened within an hour, we were not able to stop the event. Disk alerts fire at 80% and 92% capacity, and in a short while we were already at 100%.

Munin (and, I believe, MRTG) are great for displaying raw data and learning patterns over time; however, they offer little information about specific users.

May I ask whether there are standard practices used by the pros in the forum for monitoring "user" events such as (1) user script access times, (2) CPU usage by user or user script, and (3) active user scripts? Rather than issuing a "ps -aux" while sitting at the helm, I would think something else is available?

Thanks for any input/suggestions that you can provide!

squirrelhost · Feb 20, 2009

I use a tiddly little Perl script, run from cron, to monitor disk space.
(Each partition, that is: it monitors /home, /usr, etc.,
and handles multiple disks, each examined separately.)

It simply emails, but if you have an email-to-SMS gateway set up,
it could SMS you instead by changing the address to that one.

Installing and configuring complicated software to do it was a bit of overkill I thought.
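
The cron-driven check described above could look roughly like the following. This is only a sketch in Python, not the Perl script mentioned in the post: the partition list, the 80% threshold, the email addresses, and the local SMTP server on "localhost" are all assumptions to adjust for your own setup. Note that the percentage here is simply used/total, so it can differ slightly from what `df` reports on filesystems with reserved blocks.

```python
#!/usr/bin/env python3
"""Cron-driven disk space check: one email listing every partition over a threshold."""
import shutil
import smtplib
import sys
from email.message import EmailMessage

# Hypothetical settings, not taken from the thread; adjust for your server.
PARTITIONS = ["/", "/var", "/home", "/usr"]
THRESHOLD_PCT = 80.0
ALERT_TO = "admin@example.com"        # could be an email-to-SMS gateway address
ALERT_FROM = "diskcheck@example.com"


def percent_used(path, usage_fn=shutil.disk_usage):
    """Return the used space on the filesystem holding `path`, as a percentage."""
    usage = usage_fn(path)
    return 100.0 * usage.used / usage.total


def find_full_partitions(paths, threshold_pct, usage_fn=shutil.disk_usage):
    """Check each path separately; return (path, percent) pairs at or over the threshold."""
    full = []
    for path in paths:
        try:
            pct = percent_used(path, usage_fn)
        except (OSError, ZeroDivisionError):
            # Path missing, unreadable, or a pseudo-filesystem of size 0: skip it.
            continue
        if pct >= threshold_pct:
            full.append((path, pct))
    return full


def build_alert(full, sender, recipient):
    """Build a plain-text email naming every partition that is over the threshold."""
    msg = EmailMessage()
    msg["Subject"] = "Disk space warning: " + ", ".join(path for path, _ in full)
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("\n".join(f"{path} is {pct:.1f}% full" for path, pct in full))
    return msg


def main():
    full = find_full_partitions(PARTITIONS, THRESHOLD_PCT)
    if full:
        # Assumes a mail server is listening locally on port 25.
        with smtplib.SMTP("localhost") as smtp:
            smtp.send_message(build_alert(full, ALERT_FROM, ALERT_TO))
    return 0


if __name__ == "__main__":
    sys.exit(main())
```

Run from cron every few minutes (for example `*/5 * * * * /usr/local/bin/diskcheck.py`, where the path is hypothetical), a check like this would probably notice a log growing at 12GB an hour sooner than an alert that is polled less often, though it only warns and does nothing to stop the runaway script.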
