Can URLs blocked with robots.txt hurt your site?
-
We have about 20 testing environments blocked by robots.txt, and these environments contain duplicates of our indexed content. These environments are all blocked by robots.txt, and appearing in google's index as blocked by robots.txt--can they still count against us or hurt us?
I know the best practice to permanently remove these would be to use the noindex tag, but I'm wondering if we leave them they way they are if they can still hurt us.
-
90% not, first of all, check if google indexed them, if not, your robots.txt should do it, however I would reinforce that by making sure those URLs are our of your sitemap file and make sure your robots's disallows are set to ALL *, not just google for example.
Google's duplicity policies are tough, but they will always respect simple policies such as robots.txt.
I had a case in the past when a customer had a dedicated IP, and google somehow found it, so you could see both the domain's pages and IP's pages, both the same, we simply added a .htaccess rule to point the IP requests to the domain, and even when the situation was like that for long, it doesn't seem to have affected them. In theory google penalizes duplicity but not in this particular cases, it is a matter of behavior.
Regards!
-
I've seen people say that in "rare" cases, links blocked by Robots.txt will be shown as search results but there's no way I can imagine that would happen if it's duplicates of your content.
Robots.txt lets a search engine know not to crawl a directory - but if another resource links to it, they may know it exists, just not the content of it. They won't know if it's noindex or not because they don't crawl it - but if they know it exists, they could rarely return it. Duplicate content would have a better result, therefore that better result will be returned, and your test sites should not be...
As far as hurting your site, no way. Unless a page WAS allowed, is duplicate, is now NOT allowed, and hasn't been recrawled. In that case, I can't imagine it would hurt you that much either. I wouldn't worry about it.
(Also, noindex doesn't matter on these pages. At least to Google. Google will see the noindex first and will not crawl the page. Until they crawl the page it doesn't matter if it has one word or 300 directives, they'll never see it. So noindex really wouldn't help unless a page had already slipped through.)
-
I don't believe they are going to hurt you, it is more of a warning that if you are trying to have these indexed that at the moment they can't be accessed. When you don't want them to be indexed i.e. in this case, I don't believe you are suffering because of it.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
URL indexed but not submitted in sitemap, however the URL is in the sitemap
Dear Community, I have the following problem and would be super helpful if you guys would be able to help. Cheers Symptoms : On the search console, Google says that some of our old URLs are indexed but not submitted in sitemap However, those URLs are in the sitemap Also the sitemap as been successfully submitted. No error message Potential explanation : We have an automatic cache clearing process within the company once a day. In the sitemap, we use this as last modification date. Let's imagine url www.example.com/hello was modified last time in 2017. But because the cache is cleared daily, in the sitemap we will have last modified : yesterday, even if the content of the page did not changed since 2017. We have a Z after sitemap time, can it be that the bot does not understands the time format ? We have in the sitemap only http URL. And our HTTPS URLs are not in the sitemap What do you think?
Intermediate & Advanced SEO | | ZozoMe0 -
Can subdomains hurt your primary domain's SEO?
Our primary website https://domain.com has a subdomain https://subDomain.domain.com and on that subdomain we have a jive-hosted community, with a few links to and fro. In GA they are set up as different properties but there are many SEO issues in the jive-hosted site, in which many different people can create content, delete content, comment, etc. There are issues related to how jive structures content, broken links, etc. My question is this: Aside from the SEO issues with the subdomain, can the performance of that subdomain negatively impact the SEO performance and rank of the primary domain? I've heard and read conflicting reports about this and it would be nice to hear from the MOZ community about options to resolve such issues if they exist. Thanks.
Intermediate & Advanced SEO | | BHeffernan1 -
Is it posible to improve site rankings working only with an other site?
Hi everyone, i´ll try to explain a situation is happening to me, i´m goint to try to explain the case (im writing the sites without links for explication purposes. Site 1: Adventurerooms Site 2: Adventureroomsmallorca Site 3: Adventureroomsmadrid (the new site) What happen is that at first there was only Adventurerooms and Adventureroomsmallorca, Adventurerooms was for Madrid and linked to the one in Mallorca too, was kind of giving the information for Madrid but in first page split with a link to Mallorca. In a new strategy we create Adventureroomsmadrid for Madrid, and leave Adventurerooms for Spain (with links to Adventureroomsmadrid and Adventureroomsmallorca. We redirect the info for Madrid in Adventurerooms to Adventureroomsmadrid with 301 redirections. We work during this 3 months in Adventureroomsmadrid making content in the blog, and improving (now Adventureroomsmadrid is Moz 15 (perhaps even more), and Adventurerooms is Moz 10. Surprising Adventurerooms is getting better in its search rankings, even when we took away content from it and even without working well. Adventureroomsmadrid is also improving but not as much as Adventurerooms (i know that is a new site, only 3 months), but Adventurerooms gets better results with no content and only DA of 10. I hope i´ve explain the case with my english so the question is: "Is it posible to improve site rankings working only with an other site?" Thanks in advance
Intermediate & Advanced SEO | | webtematica0 -
Duplicate Multi-site Content, Duplicate URLs
We have 2 ecommerce sites that are 95% identical. Both sites carry the same 2000 products, and for the most part, have the identical product descriptions. They both have a lot of branded search, and a considerable amount of domain authority. We are in the process of changing out product descriptions so that they are unique. Certain categories of products rank better on one site than another. When we've deployed unique product descriptions on both sites, we've been able to get some double listings on Page 1 of the SERPs. The categories on the sites have different names, and our URL structure is www.domain.com/category-name/sub-category-name/product-name.cfm. So even though the product names are the same, the URLs are different including the category names. We are in the process of flattening our URL structures, eliminating the category and subcategory names from the product URLs: www.domain.com/product-name.cfm. The upshot is that the product URLs will be the same. Is that going to cause us any ranking issues?
Intermediate & Advanced SEO | | AMHC0 -
Can I add Title/Description tags to site map
I have started working on a website that it written in JAVA. It has 26 URL's But because of the way it is written it is all shown on the home page code and does not have the ability to add unique title and description tags. Is there a work around for SEO on websites like this aside from adding content? I was wondering if there is a way to submit a sitemat with title and description tags. Any advice? Chris.K
Intermediate & Advanced SEO | | CKerr0 -
Can you Canonical to a URL in a different folder under the same domain?
I want to know if it's possible to add a canonical tag to a URL that points to a URL under a different folder. Content is just about the same. Here's an example (fake urls and product, but structure and parameters are similar to my client's website): domain.com/toy-ducks-results.aspx?color=Purple&model=Elvis domain.com/toy-ducks-details.aspx?color=Purple&model=Elvis&style=Sparkly Let's say that my purple Elvis ducks are really popular. Is there any harm in putting a rel=canonical on the Sparkly Elvis ducks page to the purple Elvis ducks page? Even though they are two different folders? /toy-ducks-results and /toy-ducks-details So, in effect, the preferred folder is /toy-ducks-results Thanks in advance for any help.
Intermediate & Advanced SEO | | EEE30 -
Will disallowing in robots.txt noindex a page?
Google has indexed a page I wish to remove. I would like to meta noindex but the CMS isn't allowing me too right now. A suggestion o disallow in robots.txt would simply stop them crawling I expect or is it also an instruction to noindex? Thanks
Intermediate & Advanced SEO | | Brocberry0 -
Changing a URL structure within a site
Im moving a page over from www.domain.com/page to www.domain/category/page currently www.domain.com/page is ranking in google. Would a 301 do the job? is there anything else needed to help keep SE positions?
Intermediate & Advanced SEO | | SEODinosaur0