Robots.txt blocking Addon Domains
-
I have this site as my primary domain: http://www.libertyresourcedirectory.com/
I don't want to give spiders any access to the site, so I added a simple Disallow: / to the robots.txt. As a test, I tried to crawl the site with Screaming Frog afterwards and it crawled nothing. (Excellent.)
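For reference, the blanket rule described above can be checked offline with Python's standard `urllib.robotparser`. This is only a minimal sketch: the user-agent string and the sample page path are illustrative assumptions, not something taken from the site.

```python
from urllib.robotparser import RobotFileParser

# The blanket block described in the post: every crawler, every path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if `user_agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "Googlebot" and "/any-page" are assumed values used only for illustration.
blocked = not is_allowed(
    ROBOTS_TXT, "Googlebot", "http://www.libertyresourcedirectory.com/any-page"
)
```

An empty `Disallow:` line has the opposite meaning and permits everything, which is why a shared file affects every domain that ends up serving it.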
However, there's a problem. In GWT (Google Webmaster Tools), I got an alert that Google couldn't crawl ANY of my sites because of robots.txt issues. Changing the robots.txt on my primary domain changed it for ALL my addon domains (e.g. http://ethanglover.biz/). From a directory point of view, this makes sense; from a spider point of view, it doesn't.
As a solution, I changed the robots.txt file back and added a robots meta tag (noindex, nofollow) to the primary domain. But this doesn't seem to be having any effect. As I understand it, the robots.txt takes priority.
How can I separate all this out to allow domains to have different rules? I've tried uploading a separate robots.txt to the addon domain folders, but it's completely ignored. Even going to ethanglover.biz/robots.txt gave me the primary domain version of the file. (SERIOUSLY! I've tested this 100 times in many ways.)
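One way to confirm the symptom described above is to download /robots.txt from each hostname and see which hosts return identical bytes. The following is a rough sketch rather than a definitive diagnostic: the function names are hypothetical, and it assumes each host answers over plain HTTP.

```python
import hashlib
from collections import defaultdict
from urllib.request import urlopen


def fetch_robots(host: str, timeout: float = 10.0) -> bytes:
    """Download http://<host>/robots.txt and return the raw body."""
    with urlopen(f"http://{host}/robots.txt", timeout=timeout) as response:
        return response.read()


def group_by_content(bodies: dict[str, bytes]) -> list[list[str]]:
    """Group hosts whose robots.txt bodies are byte-for-byte identical."""
    groups: dict[str, list[str]] = defaultdict(list)
    for host, body in bodies.items():
        groups[hashlib.sha256(body).hexdigest()].append(host)
    return [sorted(hosts) for hosts in groups.values()]


# Example (performs network requests, so it is left commented out):
# hosts = ["www.libertyresourcedirectory.com", "ethanglover.biz"]
# print(group_by_content({h: fetch_robots(h) for h in hosts}))
```

If both hostnames land in the same group, the server is answering every domain with one file, whatever is uploaded to the addon folders.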
Has anyone experienced this? Am I in the twilight zone? Any known fixes? Thanks.
Proof I'm not crazy in attached video.
-
Sort of resolved; this may be the wrong place to ask any further. The above is a working fix for what seems like a legit bug. I'll update if the WordPress forums say anything.
-
No, I don't like to waste memory and bandwidth. If you can do it yourself, you should probably do it yourself. I'm moving this question to WordPress.
-
Hi Ethan
One thing I have heard of people trying is a plugin that serves dynamic robots.txt files. I don't use add-on sites, so you will probably have to test the behavior. Here is an example of one such plugin.
https://wordpress.org/plugins/wp-robots-txt/
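The general idea behind serving robots.txt dynamically can be sketched as choosing the response body from the requested hostname. This is not how the plugin above is implemented (that is not shown in this thread). The hostnames are taken from the question, while the rule assigned to each host and every function name are assumptions made for illustration.

```python
# Hypothetical per-host rules: hostnames come from the thread,
# the policy assigned to each one is an assumption.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
ALLOW_ALL = "User-agent: *\nDisallow:\n"

RULES_BY_HOST = {
    "libertyresourcedirectory.com": BLOCK_ALL,
    "ethanglover.biz": ALLOW_ALL,
}


def normalize_host(host_header: str) -> str:
    """Lowercase the Host header and drop any port and leading 'www.'."""
    host = host_header.strip().lower().split(":", 1)[0]
    return host[4:] if host.startswith("www.") else host


def robots_for_host(host_header: str, default: str = ALLOW_ALL) -> str:
    """Pick the robots.txt body to serve for the requesting hostname."""
    return RULES_BY_HOST.get(normalize_host(host_header), default)
```

Whatever handles the /robots.txt request would call `robots_for_host` with the incoming Host header, so each domain gets its own rules even though all of them share one document root.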
Hope this helps,
Anthony
-
Ethan
It sounds like the issue has been resolved. I'm not too familiar with domain add-ons but if you have any more trouble let us know and I'll be sure another Moz Associate takes a look.
-Dan (Moz Associate)
-
Hi Ethan
Sorry, I wasn't clear. I was thinking you could drop the robots.txt altogether and just use the meta tag approach, since the robots.txt seems to be having a global impact on your sites. Search engines will still crawl the pages, but the tag should exclude them from the index.
Hope this helps,
Anthony
-
Anthony, based on your response it's obvious you haven't read the question or follow-up.
-
Hi Ethan
One approach may be to try using the Robots Meta Tag. You can use noindex to tell Google not to index. This won't prevent crawling, but Google should respect the request to not index your site. I have included a good guide below to get you started.
https://developers.google.com/webmasters/control-crawl-index/docs/robots_meta_tag
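To verify that a page really carries the tag, the HTML can be scanned for a robots meta element. Below is a minimal sketch using Python's standard `html.parser`. The sample markup and the helper names are assumptions for illustration. Keep in mind that a crawler only sees this tag if it is allowed to fetch the page, so a robots.txt block on the same URL hides it.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").strip().lower() == "robots":
            # The content attribute is a comma-separated directive list.
            for directive in attributes.get("content", "").split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def robots_directives(html: str) -> set[str]:
    """Return the set of robots meta directives found in `html`."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Assumed sample page, matching the (noindex, nofollow) tag from the question.
page = '<html><head><meta name="robots" content="noindex, nofollow"></head><body></body></html>'
directives = robots_directives(page)
```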
Hope this helps,
Anthony B
Biondo Creative
biondocreative.com
-
I've found a quick fix for now: http://ethanglover.biz/using-robots-txt-with-addon-domains/
This is still an issue, and it may be exclusive to WordPress.