Robots.txt and redirected backlinks
-
Hey there,
Since a client's global website has a very complex structure, which led to major duplicate content problems, we decided to disallow crawler access to the site and allow access to only a few relevant subdirectories. Indexing has improved since then, but I was wondering whether we might have cut off link juice. Several backlinks point to the disallowed root directory and are redirected (301) from there to an allowed directory. Could this cause any problems?
Example: a backlink points to example.com (disallowed in robots.txt) and is redirected from there to example.com/uk/en (allowed in robots.txt). Would this cut off the link juice?
Thanks a lot for your thoughts on this.
Regards,
Jochen
-
A noindexed page can still accumulate and pass link equity, although results vary on whether or not some of that link juice "evaporates" along the way. I'm inclined to agree with Chris, though, that there's probably no need to noindex a page that redirects to a page that you do want indexed.
-
Hi Jochen,
It's an interesting situation and, to be honest, I don't know for sure how search engines will deal with that "link juice". It comes down to whether search engines act on robots.txt or on the redirect (set in .htaccess) first. If they check robots.txt first (which is my suspicion), they can't see that page or its redirect, so it can't pass the strength.
I suppose to test this, you could submit the redirected page to index via Search Console and see if it shows you the redirect or says it's blocked.
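The robots.txt side of this can also be checked offline. The sketch below uses Python's standard `urllib.robotparser` to ask which URLs a compliant crawler would be allowed to request. The robots.txt content is an assumption modelled on the question (root disallowed, /uk/en allowed); the client's actual file is not shown in the thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt modelled on the question: everything is
# disallowed except the /uk/en subdirectory. The real file is not shown
# in the thread, so these rules are an assumption.
ROBOTS_TXT = """\
User-agent: *
Allow: /uk/en
Disallow: /
"""


def crawlable(url: str, user_agent: str = "*") -> bool:
    """Return True if the rules above let `user_agent` request `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://example.com/", "https://example.com/uk/en"):
        print(url, "->", "allowed" if crawlable(url) else "blocked")
```

Under these assumed rules the root URL is blocked, so a crawler that honours robots.txt would not request it and would have no opportunity to see the 301. `urllib.robotparser` applies the first matching rule, which is why `Allow` is listed before `Disallow` here; Google documents a most-specific-match rule instead, which gives the same result for this file.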
Interesting question aside, there's no real need to block access to a 301'd page.
Also, apologies if I'm just highlighting the obvious here but it would be far better to clean up the site structure and remove that duplication rather than just masking it with robots; the user experience is at least as important as the algorithms!
Along the same lines, cleaning up those pages is going to help your crawl budget immensely.
Related Questions
-
Please guide me - Is this a good or bad backlink? The website has all the same type of backlinks.
Website niche - Animation and 3D Rendering Studios. Backlink from - http://www.adamfrisby.com/create-home-design-and-interior-decor-in-2d-3d.html The anchor is the image URL of one of the many images in that post. Please let me know whether links of this type are good or bad.
Intermediate & Advanced SEO | | varunrupal0 -
Redirect Chain Advice
Hi, I hope you can help. My site crawl is showing that I have a redirect chain on my home page. Basically it shows I am going from: http: > https: > https://www. I need everything to go from http:// and http://www directly to https://www. without the chain. Below is a copy of the htaccess; can anyone see if there is an error in there that could be causing it?
RewriteEngine On
RewriteCond %{HTTPS} off
RewriteRule ^(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
In addition, I have seen that they have a plugin called SSL Insecure Content Fixer installed. It is showing this under its status: Array ( [HTTPS] => on [PHPHANDLER] => /usr/local/php70/bin/php [HTTP_X_REAL_IP] => 109.158.20.158 [HTTP_X_FORWARDED_PROTO] => https ) I think this might possibly have something to do with the issue. Any thoughts are appreciated. Thanks
Intermediate & Advanced SEO | | DaleZon0 -
What does Disallow: /french-wines/?* actually do - robots.txt
Hello Mozzers - Just wondering what this robots.txt instruction means: Disallow: /french-wines/?* Does it stop Googlebot crawling and indexing URLs in that "French Wines" folder - specifically the URLs that include a question mark? Would it stop the crawling of deeper folders - e.g. /french-wines/rhone-region/ that include a question mark in their URL? I think this has been done to block URLs containing query strings. Thanks, Luke
Intermediate & Advanced SEO | | McTaggart0 -
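For the `Disallow: /french-wines/?*` question above, a small matcher can show which URLs such a pattern covers. This is a minimal sketch of the wildcard rules described in RFC 9309 and Google's robots.txt documentation (`*` matches any run of characters, a trailing `$` anchors the end); the function names and the sample URLs are made up for illustration, and real crawlers may differ in details.

```python
import re


def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex.

    `*` matches any run of characters and a trailing `$` anchors the end
    of the URL; everything else is literal and matched as a prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with ".*" for each "*".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(pattern: str, path_and_query: str) -> bool:
    """Return True if a Disallow `pattern` matches the given path + query."""
    # re.match anchors at the start, which gives the prefix behaviour.
    return robots_pattern_to_regex(pattern).match(path_and_query) is not None


if __name__ == "__main__":
    rule = "/french-wines/?*"
    for url in (
        "/french-wines/?colour=red",            # hypothetical URLs
        "/french-wines/",
        "/french-wines/rhone-region/?page=2",
    ):
        print(url, "->", "blocked" if is_blocked(rule, url) else "not matched")
```

Read this way, the rule only matches URLs that begin with `/french-wines/?`, so a query string on a deeper folder such as `/french-wines/rhone-region/` is not covered, and the trailing `*` adds nothing. Note that a match stops crawling; it does not by itself keep a URL out of the index.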
Robots.txt for Facet Results
Hi, does anyone know how to properly add facet URLs to robots.txt? An example of our facet URLs: http://www.key.co.uk/en/key/platform-trolleys-trucks#facet:-10028265807368&productBeginIndex:0&orderBy:5&pageView:list& Everything after the # will need to be blocked on all pages with a facet. Thank you
Intermediate & Advanced SEO | | BeckyKey0 -
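For the facet URL quoted above, it may help to see which part of the URL is actually sent to the server. The sketch below splits it with Python's standard `urllib.parse`; `request_target` is a hypothetical helper written for this illustration.

```python
from urllib.parse import urlsplit

# Facet URL quoted in the question above.
FACET_URL = (
    "http://www.key.co.uk/en/key/platform-trolleys-trucks"
    "#facet:-10028265807368&productBeginIndex:0&orderBy:5&pageView:list&"
)


def request_target(url: str) -> str:
    """Return the path (plus query, if any) that a client sends to the server.

    The fragment (everything after '#') is dropped because HTTP clients
    do not send it as part of the request.
    """
    parts = urlsplit(url)
    return parts.path + (f"?{parts.query}" if parts.query else "")


if __name__ == "__main__":
    print("requested:", request_target(FACET_URL))
    print("fragment: ", urlsplit(FACET_URL).fragment)
```

Because robots.txt rules are matched against the path and query string, the faceted and unfaceted versions of this URL look the same to a robots.txt rule, so a pattern cannot single out the part after the `#`.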
Robots.txt Blocking - Best Practices
Hi All, We have a web provider who's not willing to remove the wildcard line of code blocking all agents from crawling our client's site (user-agent: *, Disallow: /). They have other lines allowing certain bots to crawl the site but we're wondering if they're missing out on organic traffic by having this main blocking line. It's also a pain because we're unable to set up Moz Pro, potentially because of this first line. We've researched and haven't found a ton of best practices regarding blocking all bots, then allowing certain ones. What do you think is a best practice for these files? Thanks!
User-agent: *
Disallow: /
User-agent: Googlebot
Disallow:
Crawl-delay: 5
User-agent: Yahoo-slurp
Disallow:
User-agent: bingbot
Disallow:
User-agent: rogerbot
Disallow:
User-agent: *
Crawl-delay: 5
Disallow: /new_vehicle_detail.asp
Disallow: /new_vehicle_compare.asp
Disallow: /news_article.asp
Disallow: /new_model_detail_print.asp
Disallow: /used_bikes/
Disallow: /default.asp?page=xCompareModels
Disallow: /fiche_section_detail.asp
Intermediate & Advanced SEO | | ReunionMarketing0 -
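For the robots.txt quoted above, one way to see how the blanket block interacts with the named bots is to run an abridged copy through Python's standard `urllib.robotparser`. This is a sketch only: the second `User-agent: *` group is left out for brevity (parsers can differ in how they combine repeated groups), `SomeOtherBot` is a made-up agent name, and this parser is not guaranteed to behave exactly like any particular search engine.

```python
from urllib.robotparser import RobotFileParser

# Abridged copy of the robots.txt quoted in the question above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
User-agent: Googlebot
Disallow:
Crawl-delay: 5
User-agent: rogerbot
Disallow:
"""


def can_crawl(user_agent: str, path: str = "/") -> bool:
    """Return True if `user_agent` may request `path` under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # "SomeOtherBot" is a hypothetical agent with no group of its own.
    for agent in ("Googlebot", "rogerbot", "SomeOtherBot"):
        print(agent, "->", "allowed" if can_crawl(agent) else "blocked")
```

In this parser, a bot with its own group follows that group rather than the `*` group, so the blanket `Disallow: /` only applies to bots that are not named.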
301 Redirect htaccess
Hi Guys, I have a website that has plenty of links with parameters. For example:
http://www.domainname.co.uk/index.php?app=ecom&ns=catshow&ref=Brandname-Golf-Shorts&sid=201v04gxs2hlozv161tfo43qk98583el
I want to place a wildcard redirect in the .htaccess but don't know the exact code for this. Ideally I want the URLs above to be: http://www.domainname.co.uk/Category/Brandname-Golf-Shorts Any help, please. Thanks,
Brucz
Intermediate & Advanced SEO | | UrbanMark0 -
301 redirects within same domain
If I 301 redirect all URLs from http://domain.com/folder/keyword to http://domain.com/folder/keyword.htm, are the new URLs likely to keep most of the link juice from the source URLs and maintain their rankings in the SERPs?
Intermediate & Advanced SEO | | Bull1350 -
302 redirect
Aloha, I am doing a small study of 302 redirects. I wonder if you have any examples of sites where a 302 is used.
For example, ski resorts, where there is a summer version and a winter version. In this case, the 302 sends visitors to the version for the relevant season. ex: http://www.valmorel.com/ >> 302 >> http://www.valmorel.com/fr/hiver/accueil-hiver.html I wonder if the use of a 302 is the right solution.
What do you think? D.
Intermediate & Advanced SEO | | android_lyon0 -