Robots.txt blocking Addon Domains
-
I have this site as my primary domain: http://www.libertyresourcedirectory.com/
I don't want to give spiders access to the site at all, so I added a simple Disallow: / rule to the robots.txt. As a test, I tried to crawl the site with Screaming Frog afterwards and it crawled nothing. (Excellent.)
However, there's a problem. In Google Webmaster Tools (GWT), I got an alert that Google couldn't crawl ANY of my sites because of robots.txt issues. Changing the robots.txt on my primary domain changed it for ALL of my addon domains (e.g. http://ethanglover.biz/). From a directory point of view this makes sense; from a spider's point of view it doesn't.
As a solution, I changed the robots.txt file back and added a robots meta tag (noindex, nofollow) to the primary domain. But this doesn't seem to be having any effect. As I understand it, robots.txt takes priority.
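For anyone wanting to reproduce the test without a crawler, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt body and the helper name is_blocked are assumptions for illustration; the actual file on the server may have contained more than these two lines.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt body: blocks every crawler from every path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_blocked(robots_body: str, url: str, user_agent: str = "*") -> bool:
    """Return True if robots_body forbids user_agent from fetching url.

    is_blocked is a hypothetical helper name, not part of any library.
    """
    parser = RobotFileParser()
    parser.parse(robots_body.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "http://www.libertyresourcedirectory.com/"
    for agent in ("Googlebot", "Screaming Frog SEO Spider"):
        print(agent, "blocked:", is_blocked(ROBOTS_TXT, url, agent))
```

This only checks how a well-behaved parser reads the rules; it says nothing about which file the server actually returns for a given domain.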
How can I separate all this out to allow domains to have different rules? I've tried uploading a separate robots.txt to the addon domain folders, but it's completely ignored. Even going to ethanglover.biz/robots.txt gave me the primary domain version of the file. (SERIOUSLY! I've tested this 100 times in many ways.)
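One way to confirm the symptom is to download robots.txt from each host and check whether the bodies are identical. The sketch below assumes plain HTTP and uses the two hostnames from this question; fetch_robots and group_by_content are hypothetical helper names.

```python
import hashlib
from collections import defaultdict
from urllib.request import urlopen


def fetch_robots(host: str, timeout: float = 10.0) -> bytes:
    """Download http://<host>/robots.txt and return the raw body."""
    with urlopen(f"http://{host}/robots.txt", timeout=timeout) as response:
        return response.read()


def group_by_content(bodies: dict[str, bytes]) -> list[list[str]]:
    """Group hosts whose robots.txt bodies are byte-for-byte identical."""
    groups = defaultdict(list)
    for host, body in bodies.items():
        groups[hashlib.sha256(body).hexdigest()].append(host)
    return [sorted(hosts) for hosts in groups.values()]


if __name__ == "__main__":
    # Hostnames taken from the question; add any other addon domains here.
    hosts = ["www.libertyresourcedirectory.com", "ethanglover.biz"]
    bodies = {host: fetch_robots(host) for host in hosts}
    for group in group_by_content(bodies):
        print("same robots.txt:", ", ".join(group))
```

If every host ends up in one group even though different files were uploaded to the addon folders, something on the server is probably answering /robots.txt before the static file is reached.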
Has anyone experienced this? Am I in the twilight zone? Any known fixes? Thanks.
Proof I'm not crazy in attached video.
-
Sort of resolved, though this may be the wrong place to ask any further. The above is a working fix for what seems like a legit bug. I'll update if the WordPress forums say anything.
-
No, I don't like to waste memory and bandwidth. If you can do it yourself, you should probably do it yourself. I'm moving this question to WordPress.
-
Hi Ethan
One thing I have heard of people trying is a plugin that serves dynamic robots.txt files. I don't use add-on sites, so you will probably have to test the behavior. Here is an example of one such plugin.
https://wordpress.org/plugins/wp-robots-txt/
hope this helps,
Anthony
-
Ethan
It sounds like the issue has been resolved. I'm not too familiar with domain add-ons but if you have any more trouble let us know and I'll be sure another Moz Associate takes a look.
-Dan (Moz Associate)
-
Hi Ethan
Sorry, I wasn't clear. I was thinking you could drop the robots.txt altogether and just use the meta tag approach, since the robots.txt seems to be affecting all of your sites. Search engines will still crawl the pages, but the tag should keep them out of the index.
Hope this helps,
Anthony
-
Anthony, based on your response it's obvious you haven't read the question or follow-up.
-
Hi Ethan
One approach may be to try using the Robots Meta Tag. You can use noindex to tell Google not to index. This won't prevent crawling, but Google should respect the request to not index your site. I have included a good guide below to get you started.
https://developers.google.com/webmasters/control-crawl-index/docs/robots_meta_tag
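To check that a page really carries the tag, a rough sketch with Python's standard html.parser is below. The sample HTML is an assumption for illustration, and RobotsMetaParser and robots_directives are hypothetical names, not part of any library.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() != "robots":
            return
        for directive in attributes.get("content", "").split(","):
            if directive.strip():
                self.directives.add(directive.strip().lower())


def robots_directives(html: str) -> set[str]:
    """Return the lowercased robots meta directives found in html."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Assumed sample page head, for illustration only.
SAMPLE = (
    '<html><head><meta name="robots" content="noindex, nofollow">'
    "</head><body></body></html>"
)

if __name__ == "__main__":
    print(sorted(robots_directives(SAMPLE)))
```

A crawler can only see this tag if it is allowed to fetch the page, so a Disallow rule covering the same URL would stop the noindex from ever being read.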
Hope this helps,
Anthony B
Biondo Creative
biondocreative.com
-
I've found a quick fix for now: http://ethanglover.biz/using-robots-txt-with-addon-domains/
This is still an issue, and it may be exclusive to WordPress.