Monitoring for heavy users (domains)

elbarto — Verified User, joined Oct 8, 2008, 133 messages
Do you know of any tool that can help me debug which domains (virtual hosts) are being most heavily used?

I have a web server with many users and about 300 registered domains. I want to see which of them receive more requests per minute/hour/day, so that I can move them to other web servers.

My guess would be that some tool which analyzes all of Apache's access logs could do the trick. I could code something like that myself, but since it seems a pretty standard task, I thought maybe someone has done it before.

Thanks in advance
 
Hi.

I'm not that good a programmer, but the script below may help you (I hope :))

Code:
#!/bin/bash

LOG="/var/tmp/domains.log"
DA="/usr/local/directadmin/data/users"

# Start with an empty result file.
rm -f "$LOG"
touch "$LOG"

for username in $(ls "$DA"); do
    while read -r domain; do
        if [ -f "/var/log/httpd/domains/${domain}.log" ]; then
            # One log line per request, so the line count is the request count.
            REQ=$(wc -l < "/var/log/httpd/domains/${domain}.log")
            if [ "$REQ" -gt 100 ]; then
                echo "$REQ $domain" >> "$LOG"
            fi
        fi
    done < "${DA}/${username}/domains.list"
done

sort -nr "$LOG" | more

# EOF

Save the code to a file (for example, file.sh) and make it executable:
chmod 755 file.sh
The script will show the number of requests per day for each domain.

(Yes, I know the code isn't perfect, but it works :))
 
Thanks snk, that gives me some ideas.

The problem there is that it gives me the number of requests per domain over a period of time determined by the log files I have. That raises at least two problems:

- The number of requests, though important, doesn't say much about the actual traffic in bytes. I guess a more complete approach would sum the bytes of each request.

- If the log files rotate when they reach a specific size (say 1 MB), then each domain will have logs covering a different period of time. Let's say I have domain1.com and domain2.com. The first one receives lots of visits every day and the other one receives very few, but has been online for a long time. Both access logs could have the same number of lines (2500 lines, about 240 kB), but the difference would be that domain1.com got its 2500 requests in one day, while domain2.com got its 2500 requests over 3 months.

However, it was helpful to learn that I could get a list of domains from /usr/local/directadmin/data/users/<user>/domains.list

I think I would need a sort of real-time approach: something I could run at a specific moment (when the server load is high) that pipes all access logs into a single file and then processes it, grouping by domain, summing the request sizes and sorting by total size.
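For what it's worth, the grouping-and-summing part can be sketched with awk. This is only a sketch, assuming the Apache common/combined log format (where the response size is the 10th whitespace-separated field, with "-" when no body was sent) and one <domain>.log access log per domain as under /var/log/httpd/domains:

```shell
#!/bin/bash
# Hypothetical sketch: total response bytes per domain, largest first.
# Assumes Apache common/combined log format (response size is field 10,
# "-" when no body was sent) and one <domain>.log file per domain.

bytes_per_domain() {
    local logdir="${1:-/var/log/httpd/domains}"
    local log domain bytes
    for log in "$logdir"/*.log; do
        [ -f "$log" ] || continue
        domain=$(basename "$log" .log)
        # Sum field 10, skipping "-"; the "+ 0" forces numeric output
        # even for an empty log.
        bytes=$(awk '$10 != "-" { total += $10 } END { print total + 0 }' "$log")
        echo "$bytes $domain"
    done | sort -nr
}

bytes_per_domain "$@"
```

Run when the load is high: the first column is the byte total, so the domains at the top are the heaviest by traffic, not just by request count.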
 
I haven't found the perfect solution yet. I still need something that gives me a real-time idea of which domain is receiving more requests at a specific moment.

But here you can see the little script I made (using snk's idea): http://www.tail-f.com.ar/files/access_log_parser.sh.txt

I published an explanation on my blog (in Spanish), in case it's of any help.

If you have any new ideas I would be very pleased to hear them. Thanks for your support!
 
You can also use Apache's built-in ExtendedStatus:

Code:
ExtendedStatus On
<Location /httpd-status>
SetHandler server-status
</Location>
 
Thanks Floyd... that's useful too, but it's also pretty static...

Do you know if there is any way to check that extended status from the console? (I mean without making an HTTP request).

The ideal application I'm thinking of is some sort of "top", where I could see which domains are receiving the most requests in real time. I know it's a lot to ask, but it could be done with the same information the extended status provides and a little coding in your favourite programming language hehe.

Maybe if I had a command which returned the same information, I could write a parser and read the data once every second to make a dynamic sorting like in the top command. Of course I could make a localhost HTTP request, but if that could be avoided, it would be much better.
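One thing worth knowing: mod_status has a machine-readable mode. Appending ?auto to the status URL returns plain "key: value" lines instead of HTML, which is much easier to parse from a script. It still goes through a localhost HTTP request, but a console poller could be sketched like this (assuming the <Location /httpd-status> block above and that curl is installed; ReqPerSec only appears when ExtendedStatus is On):

```shell
#!/bin/bash
# Hypothetical sketch: extract top-like counters from mod_status's
# "?auto" output. ReqPerSec requires ExtendedStatus On.

parse_status() {
    awk -F': ' '/^ReqPerSec/   { print "req/s:", $2 }
                /^BusyWorkers/ { print "busy workers:", $2 }'
}

# Poll once a second, top-style (commented out so the sketch is inert):
# watch -n 1 'curl -s "http://localhost/httpd-status?auto" | parse_status'
```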
 
pretty neat!

that seems to be what I was looking for... I will look into it and let you know how it went...

So far I think I might have to tweak it to work with DirectAdmin's access log file structure, because I think this script assumes that every file in the directory you give it is an access log, and I have many different files in /var/log/httpd/domains (access logs, error logs, log backups, etc.).
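That tweak could be as simple as a small filter that keeps only the access logs. The excluded suffixes below are my assumptions about what else lives in /var/log/httpd/domains (error logs, bandwidth counters, backups); adjust the patterns to whatever you actually see there:

```shell
#!/bin/bash
# Hypothetical filter: decide whether a file in the domains log
# directory is an access log. The excluded suffixes are assumptions;
# extend the case patterns as needed.

is_access_log() {
    case "$1" in
        *.error.log|*.bytes|*.gz|*.tar.gz) return 1 ;;  # errors, bandwidth, backups
        *.log) return 0 ;;                              # plain access log
        *) return 1 ;;
    esac
}

# Example: feed only the access logs to another tool.
for f in /var/log/httpd/domains/*; do
    is_access_log "$f" && echo "$f"
done
```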

thanks so much scsi!
 