Specific robots.txt overridden by default robots.txt

patrickkasie
Verified User · Joined Sep 21, 2021 · Messages: 243 · Location: Een echte Hollander ("a true Dutchman")
Dear DirectAdmin forum,

I have been trying to get a default robots.txt working for all domains without overriding any site-specific robots.txt files that already exist on the websites.

Here's what I have:
vpsxx.domain0.nl
domain1.nl
domain2.com
domain3.be

They all have a robots.txt by default, which is set like this:
/etc/httpd/conf/extra/httpd-alias.conf
Code:
(...)
Alias /robots.txt /var/www/html/robots.txt

This provides every single domain on the server with a default robots.txt. However, when domain1.nl wants its own unique robots.txt, the default robots.txt overrides it.

My question is: how do you make the domain-specific domain1.nl/robots.txt take precedence over the default one?
 
Why don't you let the customer choose?

You can set a default robots.txt file in your /home/admin/domains/default directory (same for resellers) to have it created on domain creation.
Then users can overwrite it if they want their own.
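As a hedged sketch of that suggestion (the skeleton path comes from the post above; this demo writes to a temp directory instead so it can run anywhere):

```shell
# Default robots.txt in the admin's skeleton directory, so every NEWLY
# created domain gets its own copy (existing domains are not changed).
skel="$(mktemp -d)"   # in production: /home/admin/domains/default
printf 'User-agent: *\nDisallow:\n' > "$skel/robots.txt"
cat "$skel/robots.txt"
```

Because each domain then owns a real file in its docroot, users can edit or replace it freely.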

If you don't want that, maybe somebody else knows how to make an exclusion for one domain with the method you are using now.
 
Why don't you let the customer choose?
I do let the customer choose. When they have a robots.txt under their public_html folder (or wherever it is served from), their file should override the default one, but that needs to happen retroactively for every single domain right now. On top of that, when we want a change, we should be able to make it in one central place, exactly like security.txt, which suffers the same fate.

Basically, my current method is the reason why, when you create a page called roundcube and visit it, you still end up at the aliased Roundcube. In my case, I'd like to make this overridable, for robots.txt and for any other file I want users to be able to override.
 
I let the customer choose.
Not really, because otherwise this change for one domain wouldn't have been needed. Every request for robots.txt is pointed to a different robots.txt because of this alias.

when you try to make a page called roundcube and visit that, you'll end up at roundcube.
So how did you do that? If I make a page called roundcube and visit it, I end up at the page I created. Is that what you mean?
The reason is that roundcube is not a file but a directory in the alias.
Try making a /roundcube directory and visiting it as a customer. :)
So it's not as if they can create their own roundcube directory now.

if we want a change, we should be able to do that on 1 central place, exactly like security.txt - which also suffers the same fate.
With security.txt that would be a violation of the AVG/GDPR, unless you put your own (company) contact info in there, of course.
I haven't tested this, but you could probably put it in the same place I suggested before and then make it immutable so the user can't change it. You can test this easily, and then this problem would no longer exist for you.
However, just as with a domain_create_post.sh script (in which you could set the owner of security.txt to root), a problem could occur when the domain is deleted by the user. Maybe not if diradmin is used as the owner, but you can easily test this yourself.
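A hypothetical sketch of such a hook (the hook path and the $domain variable follow DirectAdmin's custom-script conventions, but verify them against your version's docs; the root-only commands are left commented so the sketch runs standalone):

```shell
#!/bin/sh
# Sketch of /usr/local/directadmin/scripts/custom/domain_create_post.sh
# DirectAdmin passes the new domain in $domain (assumed hook variable).
domain="${domain:-domain1.nl}"       # fallback so the sketch runs standalone
docroot="${docroot:-$(mktemp -d)}"   # real hook: the new domain's public_html

# Seed the new domain with the central default robots.txt.
cp /var/www/html/robots.txt "$docroot/robots.txt" 2>/dev/null \
  || printf 'User-agent: *\nDisallow:\n' > "$docroot/robots.txt"

# Root-only hardening, as discussed above (the hook runs as root):
# chown root:root "$docroot/robots.txt"   # user cannot replace the file
# chattr +i "$docroot/robots.txt"         # immutable: cannot edit or delete
echo "robots.txt installed for $domain in $docroot"
```

Note the deletion concern raised above still applies: an immutable file may block domain removal, so test that path before using this in production.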

Providing or changing robots.txt is not a hosting provider's task. However, if you choose to do so, feel free, but I doubt it can be done this way with httpd-alias.conf, as that will always serve the server's robots.txt file.

Updating users' robots.txt files might also be possible via a perl -pi command or something similar. However, again, you would then be changing or adding things to a file the user may want to keep different.

Anyway, for the current situation, you could edit the user's httpd.conf and point the alias at their own robots.txt there; that might work.
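A sketch of that per-user override (the user name and paths are example values; since mod_alias directives in a VirtualHost take precedence over the server-wide ones, the domain's own alias wins):

```apache
# In domain1.nl's own VirtualHost / custom httpd.conf include:
Alias /robots.txt /home/user1/domains/domain1.nl/public_html/robots.txt
```

Keep in mind DirectAdmin regenerates per-user httpd.conf files from templates, so a custom template or include is needed to make this survive a rewrite.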
 
This is possible, but I'm not gonna complete the tutorial on this question. 😁


For Apache,
you can combine an If/Else-style check, for example using <IfFile> to test whether the domain's own file exists, with an Alias to the target global file.
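One caveat with that approach: <IfFile> (Apache 2.4.34+) is evaluated once at startup, so in a single server-wide include it cannot react to each vhost's docroot. An untested per-request alternative using mod_rewrite might look like this (the default-file path follows the httpd-alias.conf shown earlier):

```apache
# Serve the global default only when the vhost's docroot has no
# robots.txt of its own; requires mod_rewrite to be enabled.
RewriteEngine On
RewriteCond %{DOCUMENT_ROOT}/robots.txt !-f
RewriteRule ^/robots\.txt$ /var/www/html/robots.txt [L]
```

Because the substitution resolves to an existing filesystem path, mod_rewrite serves the file directly, and domains that upload their own robots.txt bypass the rule automatically.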
 