Excessive awstats usage?

Richard G

I don't know if this is normal, but more than 2000 seconds is more than 33 minutes. Is that normal?

Code:
Resource:     Process Time
Exceeded:     2057 > 2000 (seconds)
Executable:   /bin/bash
Command Line: sh -c \"/usr/local/awstats/wwwroot/cgi-bin/awstats.pl\" -config=domain.nl -staticlinks=awstats.domain.nl.1811 -diricons=icon -configdir=/home/user/domains/domain.nl/awstats/.data -output=browserdetail  2>&1

It is only happening on 1 domain. But this could be because it's the only forum running on the server.

Is this normal or is this taking up too much time?
 
Hello Richard,

Taking too much time for what? To parse a 1 MB log file? Then I'd say yes, it's too much... but if you are on a single-core server with 10+ GB of Apache logs, then it might be expected....
 
I would also recommend disabling HTML file generation by adding /usr/local/directadmin/scripts/custom/awstats_process.sh and changing line 10 to:

Code:
ADD_HTML=0
 
@Alex:
To parse a 1 MB log file? Then I'd say yes, it's too much...
The domain log for that domain is at this moment (20.00 hours) 3.4 MB, if that is the log you mean.
However, the server has an 8-core i7-3770 CPU with 16 GB RAM. So if that is too much time, do you have any ideas on how to fix this? Or where to look for a cause?

@Ditto:
Thank you. But what are these HTML files used for? Aren't they used for the detail info? Besides that, I never had this issue before on any of the servers, and it's only occurring on 1 domain.
 
And what about CPU load? Disk IO usage? Are they high?

AWStats is a Perl-based application, and I assume there are no relevant details in the system logs? If the issue recurs with every nightly tally, I'd use strace -p <pid> to attach to the process while it parses logs and generates HTML files, to see what might be wrong.
 
Oof, I have no clue. It does indeed recur with every nightly tally. But I don't know exactly what time that is, so I only see the CSF notice emails about it. And I don't know if CPU load and disk IO are high at that point.
This is my top output for this moment (not running the tally):
Code:
Tasks: 311 total,   1 running, 310 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.4%us,  0.5%sy,  0.0%ni, 99.2%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  16147176k total, 15124176k used,  1023000k free,   862664k buffers
Swap:  8388540k total,    64828k used,  8323712k free, 10684596k cached

As far as I know I also don't have a high disk IO usage. This is from atop at this moment:
Code:
DSK |          sdb  | busy      7%  | read       0  | write    109 |  KiB/w     10 |  MBr/s   0.00 |  MBw/s   0.12 |  avio 6.45 ms |
DSK |          sda  | busy      6%  | read       0  | write    114 |  KiB/w     10 |  MBr/s   0.00 |  MBw/s   0.12 |  avio 5.47 ms |
 
@Ditto:
Thank you. But what are these HTML files used for? Aren't they used for the detail info? Besides that, I never had this issue before on any of the servers, and it's only occurring on 1 domain.

The HTML files are generated to display static HTML pages of the statistics. They are not needed; without them you will only have the dynamically generated AWStats page, built directly from the .txt data files. Please see the bottom of this page: https://www.directadmin.com/features.php?id=1125

Code:
If you wish to increase system efficiency, you can disable the html files from being generated (as they are the bulk of the cpu-time) and have only the data files
be computed.  To do this, type:
cd /usr/local/directadmin/scripts/custom
cp ../awstats_process.sh .

edit the custom/awstats_process.sh, and set:
ADD_HTML=0

and the html files will no longer be generated, increasing performance.
 
@Ditto:
Thank you. I might consider doing that. But I'm still curious why only this one domain is having this issue.
I had a similar thing on a cPanel server once. Their solution was to delete some file, which would then be rebuilt, and the problem was gone. I thought maybe there was a similar solution in DA.

Maybe it's taking more time because it's run during the tally and lots of things are done then.
 
If you don't need the old data, you might also clean out all existing data in /awstats/ under your domain and see whether or not it helps.

I don't have any other ideas.
 
Okay, thank you. I was just too late last night. There were 2 awstats processes running simultaneously, but I was just too late to run the strace. There was no high CPU load, though.
I'll try your idea and if that does not help, I'll just disable the HTML like Ditto said.

Thank you.
 
I did some further investigation and found something, and I don't know which way is the correct one.
On the server with the issue I found several accounts having files like this in the /awstats/.data directory:

Code:
awstats112017.some-domain.nl.txt
awstats112018.some-domain.nl.txt
and then suddenly one like this:
Code:
awstats112018.www.some-domain.nl.txt

So all are without www, and this month there is one with www in front of it.
The directories also contain a symlink like this:
Code:
-rw-r--r-- 1 user user 61K 2018-11-23 00:13 awstats.some-domain.conf
lrwxrwxrwx 1 user user 27 2017-02-14 00:15 awstats.www.some-domain.nl.conf -> awstats.some-domain.nl.conf

So I looked at another server, which does -not- have the .txt file with the www in front of it. It does have the symlink.

Checking this problem domain further, I found this, which certainly proves something is wrong. For years this site had files of a couple of hundred K in there.
And now this... within 3 months. That is not possible imho.

Have a look at the growing size:
Code:
216K 2018-10-01 00:12 awstats092018.some-domain.nl.txt
4.5M 2018-11-01 00:13 awstats102018.some-domain.nl.txt
10M 2018-11-23 00:13 awstats112018.some-domain.nl.txt
It seems this sudden growth in size is causing the timing issue imho.
Meanwhile the file -with- the www is small again:
Code:
246K 2018-11-22 00:10 awstats112018.www.some-domain.nl.txt
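
To compare the data files month by month without reading ls output by eye, a small script can help. This is only a minimal sketch: it assumes the file names follow the awstatsMMYYYY.<config>.txt pattern seen in the listings above, and the function name and the directory you pass in are placeholders, not anything DirectAdmin or AWStats provides.

```python
import re
from pathlib import Path

# Assumed naming pattern, taken from the listings in this thread:
#   awstats<MM><YYYY>.<config name>.txt
NAME_RE = re.compile(r"^awstats(\d{2})(\d{4})\.(.+)\.txt$")


def data_files_by_month(directory):
    """Return (year, month, config_name, size_bytes) tuples, oldest first."""
    rows = []
    for path in Path(directory).iterdir():
        match = NAME_RE.match(path.name)
        if match and path.is_file():
            month, year, name = match.groups()
            rows.append((int(year), int(month), name, path.stat().st_size))
    return sorted(rows)


# Example call (hypothetical path, adjust to the account in question):
#   for row in data_files_by_month("/home/user/domains/domain.nl/awstats/.data"):
#       print(row)
```

Sorting on (year, month) first makes a sudden jump in size stand out, and the www and non-www variants of the same month end up next to each other.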

So I have 2 questions.
1.) How can I fix this? Because such sudden growth in size is not normal.
2.) Why does 1 server have a single www.domain.nl.txt among the non-www ones in the data directory, while another server with the exact same setup does not have such a www file? A 3rd server, running CentOS 7, also does not have the awstats file with www in the filename.
 
Looking at my servers, the www version of the .txt file seems to be generated almost exclusively on the 1st day of every month (only very few www files are generated on other days). It is likely not a bug, but just the way it works. I would not know, though, why some www versions are created on a different day of the month.

Because most of the www versions of the file on my servers are created on the 1st day of each month, I would guess it is related to the monthly summary being generated, but I have not looked closer into the files to confirm that. If that is the case, it might explain why the www files on my servers are a little bigger than the others.

Also, your customer's site could have received a lot of extra traffic on the day the file got so big.

Edit: By the way, I notice that the only www files on my servers that were not generated on the 1st day of the month are from today's date. That tells me a temporary www version of the file is generated every day and removed the next day, and only the ones from the 1st day of each month are left behind. So, in other words:

The newest file/date will also have a www version of the file; that www version is always removed the next time/day the tally runs, and it is only kept for the 1st day of each month.
 
The newest file/date will also have a www version of the file; that www version is always removed the next time/day the tally runs, and it is only kept for the 1st day of each month.
That would be a logical explanation, but unfortunately it is incorrect.
In my case the www version of the file is from the day before the current day. My www versions are from the 22nd and today is the 23rd, so the tally has already run.
It could be that it was from the day before then, but I still wonder why they are not present on my other servers. But if you also have them on your server, then it's probably nothing to worry about.

Still... that leaves the enormous growth of the file. So there would have been more traffic on the 1st of November, and more than twice that traffic by the 23rd? That's not really believable.
Can I check that somehow? Because this site does not have that much traffic.
From Statistics:
Code:
901876 	100.00% 	784637 	100.00% 	10870740 	100.00% 	Unresolved/Unknown
So traffic until today is 10870740 kbytes, which is 10 MB. In October it was even 15 MB, but that would not produce a 15 MB .txt file. The October file size was 275 K.
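
One thing worth double-checking here is the unit of that last column, because it changes the picture by a factor of 1024. The snippet below is just a sketch of the arithmetic (the helper name is made up, and it assumes binary units): it formats the same number once as bytes and once as kbytes.

```python
def human_size(n_bytes):
    """Format a byte count using binary (1024-based) units."""
    for unit in ("B", "KB", "MB", "GB", "TB"):
        if n_bytes < 1024 or unit == "TB":
            return f"{n_bytes:.1f} {unit}"
        n_bytes /= 1024


reported = 10870740  # third numeric column of the statistics line above

# Interpretation 1: the column is a plain byte count.
print("if bytes: ", human_size(reported))
# Interpretation 2: the column is in kbytes (1 kbyte = 1024 bytes).
print("if kbytes:", human_size(reported * 1024))
```

Read as bytes the figure comes to about 10.4 MB; read as kbytes it comes to about 10.4 GB, so it matters which of the two the statistics page actually reports.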

I'm not saying it's a bug, because all other sites seem to be working well. Only this site needed more than 33 minutes last night to generate the AWStats output. I know I can disable the HTML files. But this should not happen, and this file size is odd for this site, so imho there must be something wrong somewhere.
 
Definitely something wrong here.

Top shows awstats.pl using 97-100% CPU time.
I did an strace, but I don't understand the output, so I investigated further and it looks like the browserdetail page is causing this issue.
Look at this:
Code:
-rw-r--r-- 1 user user  26M 2018-10-31 00:20 awstats.userdomain.nl.1810.browserdetail.html
And now the one from today:
Code:
-rw-r--r-- 1 user user 124M 2018-11-25 00:50 awstats.userdomain.nl.1811.browserdetail.html

So for some reason, only on this domain, and only in October and November, the browserdetail.html is growing out of proportion and stalling the awstats process.
This is in spite of the fact that this site does not have a large number of visitors.
Looking at any other month or year, this never happened; the file was always a couple of hundred K.
Look at the one from September:
Code:
-rw-r--r-- 1 user user  80K 2018-09-30 00:12 awstats.userdomain.nl.1809.browserdetail.html
Which proves this site is not visited much. 80K and then suddenly 26M and 124M? Odd.

I did find this, though, which is rather unbelievable for that site, but I can't imagine it is causing this enormous growth of the files:
Most users ever online was 599 on 31 Oct 2018 19:49
 
And this was some of the strace output of the awstats.pl pid trace:
Code:
write(1, ".1599.66</td><td>No</td><td>3,16"..., 4096) = 4096
rt_sigaction(SIG_0, NULL, {0x9f0ed807, [], SA_NODEFER|SA_NOCLDWAIT|0x2c67268}, 8) = -1 EINVAL (Invalid argument)
rt_sigaction(SIGHUP, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGINT, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGQUIT, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGILL, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGTRAP, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGABRT, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGBUS, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGFPE, NULL, {SIG_IGN, [FPE], SA_RESTORER|SA_RESTART, 0x7f3d4167e570}, 8) = 0
rt_sigaction(SIGKILL, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGUSR1, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGSEGV, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGUSR2, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGPIPE, NULL, {SIG_IGN, [], 0}, 8) = 0
rt_sigaction(SIGALRM, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGTERM, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGSTKFLT, NULL, {SIG_DFL, [], 0}, 8) = 0
 
Almost all the domains, with the exception of 5 (out of 100+ hosted on one server), have had their awstats.www.domain.com.conf linked to awstats.domain.com.conf since Feb 11 2017.


The strace output does not show anything useful.

You probably have some hacking attempts through the User-Agent header, so you might need to check the raw logs for non-standard user agents.

And/or use SkipUserAgents="" and OnlyUserAgents="" in /home/USERNAME/domains/DOMAIN/awstats/.data/awstats.DOMAIN.conf
 
I have investigated with John and it's 1 user agent causing the issue. Chrome has an overdose of entries. Like this:
Code:
chrome52.0.3630.745 1 1
chrome57.0.4184.567 1 1
chrome55.0.8804.78 1 1
chrome41.0.8440.894 1 1
chrome59.0.5191.423 1 1
chrome49.0.6195.794 1 1
and so on.
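
To see how many of these one-off Chrome versions show up in the raw log itself, something like the following could be used. It is a rough sketch under assumptions: it expects the Apache "combined" log format with the User-Agent as the last quoted field, the function name is made up, and the sample lines are invented for illustration (only the two odd version strings are taken from the list above).

```python
import re
from collections import Counter

# Assumption: Apache "combined" log format, where the User-Agent is the
# last double-quoted field on each line.
UA_RE = re.compile(r'"([^"]*)"\s*$')
CHROME_RE = re.compile(r"Chrome/([\d.]+)")


def chrome_versions(lines):
    """Count requests per Chrome version string seen in the User-Agent."""
    counts = Counter()
    for line in lines:
        ua = UA_RE.search(line)
        if ua is None:
            continue
        version = CHROME_RE.search(ua.group(1))
        if version:
            counts[version.group(1)] += 1
    return counts


# Invented sample lines (documentation IP ranges). In practice, pass an open
# log file instead of this list.
SAMPLE = [
    '203.0.113.5 - - [25/Nov/2018:00:01:02 +0100] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) '
    'Chrome/52.0.3630.745 Safari/537.36"',
    '203.0.113.9 - - [25/Nov/2018:00:01:03 +0100] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) '
    'Chrome/57.0.4184.567 Safari/537.36"',
    '198.51.100.7 - - [25/Nov/2018:00:02:00 +0100] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) '
    'Chrome/70.0.3538.102 Safari/537.36"',
    '198.51.100.8 - - [25/Nov/2018:00:02:05 +0100] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) '
    'Chrome/70.0.3538.102 Safari/537.36"',
    '198.51.100.9 - - [25/Nov/2018:00:03:00 +0100] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (X11; Linux x86_64; rv:63.0) Gecko/20100101 Firefox/63.0"',
]

counts = chrome_versions(SAMPLE)
one_offs = [v for v, n in counts.items() if n == 1]
print(f"{len(counts)} distinct Chrome versions, {len(one_offs)} seen only once")
```

A very high share of versions that appear only once would fit the pattern in the list above, where every entry has a count of 1.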

Last evening I already tried something like you suggested:
LevelForBrowsersDetection=0 # 0 disables Browsers detection.
and
ShowBrowsersStats=None

I don't know exactly what to fill in for "SkipUserAgents", because Internet Explorer would be "msie", hence the example given in the file.

For Firefox it's probably Mozilla, but is it just Chrome for the Chrome browser? Or...? Because the logs often also show Mozilla 5.0 and AppleWebKit in lines with Chrome.
So I'm not sure.
I searched for a way to disable the browser statistics completely but found no solution, except maybe SkipUserAgents.
 
It seems the bot that is giving you all this traffic does not identify itself with a unique user agent name, so it would be impossible to block a specific user agent. Maybe study the log files and see if you are lucky and all the traffic comes from the same IP or a small IP range, and then block that IP or IP range? Other than that, I think you only have 3 options at this time:

1: Don't do anything and let it consume the CPU time it needs, as it does not look like a bug, but rather a high number of bot visits.

2: Disable HTML file generation in AWStats, I think it would help reduce the CPU time a lot: https://forum.directadmin.com/showthread.php?t=57113&p=292120#post292120

3: In DirectAdmin 1.51.3 a new feature was added to disable AWStats on a per user account basis, so if this user does not need to use AWStats, maybe disable it for that account: https://www.directadmin.com/features.php?id=1935 Edit: Also see: https://www.directadmin.com/features.php?id=2123
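
For the idea of checking whether the traffic comes from one IP or a small IP range, a quick tally per address and per /24 network can be sketched like this. It assumes the client address is the first field on each log line (as in the Apache common/combined formats); the function name and the sample lines are made up for illustration.

```python
import ipaddress
from collections import Counter


def count_clients(lines):
    """Tally requests per client address and per IPv4 /24 network."""
    per_ip, per_net = Counter(), Counter()
    for line in lines:
        fields = line.split(maxsplit=1)
        if not fields:
            continue  # skip blank lines
        client = fields[0]
        per_ip[client] += 1
        try:
            addr = ipaddress.ip_address(client)
        except ValueError:
            continue  # first field is not an IP, e.g. a resolved hostname
        if addr.version == 4:
            net = ipaddress.ip_network(f"{addr}/24", strict=False)
            per_net[str(net)] += 1
    return per_ip, per_net


# Invented sample lines (documentation IP ranges); pass an open log file
# instead of this list in practice.
SAMPLE = [
    '203.0.113.5 - - [25/Nov/2018:00:01:02 +0100] "GET /a HTTP/1.1" 200 512',
    '203.0.113.9 - - [25/Nov/2018:00:01:03 +0100] "GET /b HTTP/1.1" 200 512',
    '203.0.113.5 - - [25/Nov/2018:00:01:04 +0100] "GET /c HTTP/1.1" 200 512',
    '198.51.100.7 - - [25/Nov/2018:00:02:00 +0100] "GET /a HTTP/1.1" 200 512',
]

per_ip, per_net = count_clients(SAMPLE)
print(per_ip.most_common(3))
print(per_net.most_common(3))
```

If one address or one /24 clearly dominates, blocking it in the firewall is an option; if the requests are spread thinly over many networks, blocking by IP will probably not get you far.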
 
So you think it's a bot? That could well be the case. I just discovered something from Israel that was hammering all topics, so I blocked that IP. We'll see what happens.
I also temporarily moved the 2 big files away, because I'm not sure whether they are counted again every time AWStats builds; if they are, that would explain why the file stays that big.

1.) I also think it's not a bug. If it was, then more domains (and servers) would have issues.
2.) That was indeed one of the options presented by you earlier too, but I did not want to disable this globally.

3.) That seems like a nice option. However, I don't understand the second link.
So in /usr/local/directadmin/data/users/username/user.conf I have to add:
awstats=0
But then the second link:
Code:
To save the data, it's similar to the other per-item options:
CMD_CHANGE_INFO
awstats=anytext
awstatsvalue=0|1
For json, just include json=yes.
Is this just information, or do I have to do anything with this? Like put json=yes in user.conf or something like that? I'm not English, I don't understand this part completely.
 
The second link is about the feature now being available to end users in DirectAdmin. So don't worry about the other information in that link; all you have to do is log in to DirectAdmin as the user, go to "Site Summary / Statistics / Logs", and scroll down the page until you see this line:

"Awstats" - "On" - "Save AWstats"

Just set it to "Off". At least, this feature is available by default in the Enhanced skin.
 