Subdomain Robots.txt
-
I have a subdomain (a blog) whose tag and category pages are being indexed when they shouldn't be, because they create duplicate content. Can I block them using a robots.txt file? Can I, or do I need to, have a separate robots.txt file for my subdomain?
If so, how would I format it? Do I need to specify that it is a subdomain robots.txt file, or will the search engines pick that up automatically?
Thanks!
-
Thanks Wissam. I was thinking this was the way to go, and I appreciate your input.
I do use the Yoast SEO plugin for WordPress on another site, but the blog in question is through BlogEngine. I will do what you have suggested.
Cheers!
-
If the URL is http://blog.website.com, then the robots.txt should be accessible at http://blog.website.com/robots.txt.
I would suggest these steps:
- Verify your blog in Google Webmaster Tools.
- Generate a robots.txt file with Google Webmaster Tools.
- Upload it to the root of the subdomain.
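For reference, a minimal robots.txt for the subdomain might look like the following sketch. The /tag/ and /category/ paths are assumptions; adjust them to whatever paths the blog actually uses for its tag and category archives:

```
User-agent: *
Disallow: /tag/
Disallow: /category/
```

Because the file lives at the subdomain's own root, its rules apply only to blog.website.com and do not affect the main domain's crawling.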
There is another way if you are using WordPress. Both the All in One SEO plugin and WordPress SEO by Yoast let you add a NOINDEX tag to all category, tag, author, and other archive pages through their settings. It's faster and less error-prone.
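Whichever route you take, you can sanity-check the finished robots.txt before relying on it. A minimal sketch using Python's standard urllib.robotparser; the /tag/ and /category/ rules and the example URLs are assumptions for illustration:

```python
from urllib import robotparser

# The rules we expect the subdomain's robots.txt to contain (assumed paths)
rules = """User-agent: *
Disallow: /tag/
Disallow: /category/
""".splitlines()

rp = robotparser.RobotFileParser()
rp.parse(rules)

# Archive pages should be blocked for all crawlers...
print(rp.can_fetch("*", "http://blog.website.com/tag/seo"))       # False
# ...while ordinary posts remain crawlable.
print(rp.can_fetch("*", "http://blog.website.com/2014/my-post"))  # True
```

In production you would point `rp.set_url()` at the live http://blog.website.com/robots.txt and call `rp.read()` instead of parsing a local string.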
Related Questions
-
One robots.txt file for multiple sites?
I have 2 sites hosted with Blue Host and was told to put the robots.txt in the root folder and just use the one robots.txt for both sites. Is this right? It seems wrong. I want to block certain things on one site. Thanks for the help, Rena
Technical SEO | renalynd270 -
Blog on subdomain of e-commerce site
Hi guys. I've got an e-commerce site which we have very little control over. As such, we've created a subdomain and are hosting a WordPress install there instead. This means that all the great content we're putting out (via bespoke pages on the subdomain) is less effective than if it were on the main domain. I've looked at proxy forwarding, but unfortunately it isn't possible through our servers, leaving the only option I can see being permanent redirects... What would be the best solution given the limitations of the root site? I'm thinking of wildcard rewrite rules (e.g. link site.com/blog/articleTitle to blog.site.com/articleTitle), but I'm wondering if there's much of an SEO benefit in doing this? Thanks in advance for everyone's help 🙂
Technical SEO | JAR8970 -
Robots.txt and Magento
Hi, I am working on getting my robots.txt up and running, and I'm having lots of problems with the robots.txt my developers generated: www.plasticplace.com/robots.txt. I ran the robots.txt through a syntax-checking tool (http://www.sxw.org.uk/computing/robots/check.html). This is what the tool came back with: http://www.dcs.ed.ac.uk/cgi/sxw/parserobots.pl?site=plasticplace.com. There seem to be many errors in the file. Additionally, I looked at our robots.txt in WMT and it said the crawl was postponed because the robots.txt is inaccessible. What does that mean? A few questions:
1. Is there a need for all the lines of code that have the "#" before them? I don't think it's necessary, but correct me if I'm wrong.
2. Furthermore, why are we blocking so many things on our website? The robots can't get past anything that requires a password to access anyhow, but again correct me if I'm wrong.
3. Is there a reason why it can't just look like this?
User-agent: *
Disallow: /onepagecheckout/
Disallow: /checkout/cart/
I do understand that Magento has certain folders that you don't want crawled, but is this necessary, and why are there so many errors?
Technical SEO | EcomLkwd0 -
Robots.txt query
Quick question: if this appears in a client's robots.txt file, what does it mean?
Disallow: /*/_/
Does it mean no pages can be indexed? I have checked and there are no pages in the index, but it's a new site too, so not sure if this is the problem. Thanks, Karen
Technical SEO | Karen_Dauncey0 -
Same URL in "Duplicate Content" and "Blocked by robots.txt"?
How can the same URL show up in the SEOmoz Crawl Diagnostics "Most common errors and warnings" in both the "Duplicate Content" list and the "Blocked by robots.txt" list? Shouldn't the latter exclude it from the first list?
Technical SEO | alsvik0 -
Impact of "restricted by robots" crawler error in WT
I have been wondering about this for a while now with regard to several of my sites. I am getting a list of pages that I have blocked in the robots.txt file. If I restrict Google from crawling them, then how can it consider their existence an error? In one case, I have even removed the URLs from the index. Do you have any idea of the negative impact associated with these errors, and how do you suggest I remedy the situation? Thanks for the help
Technical SEO | phogan0 -
Robots.txt question
What is this robots.txt telling the search engines?
User-agent: *
Disallow: /stats/
Technical SEO | DenverKelly0 -
Starting a new product, should we use new domain or subdomain
I'm working with a company that has a high page rank on its main domain and is looking to launch a new business / product offering. They are evaluating either creating a subdomain or launching a brand-new domain. In either case, their current site will link contextually to the new site. Is there one method that would be better for SEO than the other? The new business / product is related to the main offering, but may appeal to different / new customers. The new business / product does need its own homepage and will have a different conversion funnel than the existing business.
Technical SEO | gallantc0