Will blocking urls in robots.txt void out any backlink benefits? - I'll explain...
-
Ok...
So I add tracking parameters to some of my social media campaigns but block those parameters via robots.txt. This helps avoid duplicate content issues (Yes, I do also have correct canonical tags added)... but my question is -- Does this cause me to miss out on any backlink magic coming my way from these articles, posts or links?
Example url: www.mysite.com/subject/?tracking-info-goes-here-1234
- Canonical tag is: www.mysite.com/subject/
- I'm blocking anything with "?tracking-info-goes-here" via robots.txt
- The url with the tracking info of course IS NOT indexed in Google but IT IS indexed without the tracking parameters.
What are your thoughts?
- Should I nix the robots.txt stuff since I already have the canonical tag in place?
- Do you think I'm getting the backlink "juice" from all the links with the tracking parameter?
What would you do?
Why?
Are you sure?
-
Thanks Guys...
Yeah, I figure that's the right path to take based on what we know... But I love to hear others chime in so I can blame it all on you if something goes wrong - ha!
Another Note: Do you think this will cause some kind of unnatural anomaly when the robots.txt file is edited? All of a sudden these links will now be counted (we assume).
It's likely the answer is no because Google still knows about the links.. they just don't count them - but still thought I'd throw that thought out there.
-
I agree with what Andrea wrote above - just one additional point - blocking a file via robots.txt doesn't prevent the search engine from not indexing the page. It just prevents the search engine from crawling the page and seeing the content on the page. The page may very well still show up in the index - you'll just see a snippet that your robots.txt file is preventing google from crawling the site and caching it and providing a snippet or preview. If you have canonical tags put in place properly, remove the block on the parameters in your robots.txt and let the engines do things the right way and not have to worry about this question.
-
If you block with robots.txt link juice can't get passed along. If your canonicals are good, then ideally you wouldn't need the robots. Also, it really removes value of the social media postings.
So, to your question, if you have the tracking parameter blocked via robots, then no, I don't think you are getting the link juice.
http://www.rickrduncan.com/robots-txt-file-explained
When I want link juice passed on but want to avoid duplicate content, I'm more a fan of the no index, follow tags and using canonicals where it makes sense, too. But since you say your URLs with the parameters aren't being indexed then you must be using tags anyway to make that happen and not just relying on robots.
To your point of "are you sure":
http://www.evergreensearch.com/minimum-viable-seo-8-ways-to-get-startup-seo-right/
(I do like to cite sources - there's so many great articles out there!)
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Canonical URL's searchable in Google?
Hi - we have a newly built site using Drupal, and Drupal likes to create canonical tags on pretty much everything, from their /node/ url's to the URL Alias we've indicated. Now, when I pull a moz crawl report, I get a huge list of all the /node/ plus other URL's. That's beside the point though... Question: when I directly enter one of the /node/ url's into a google search, a result is found. Clicking on it redirects to the new URL, but should Google even be finding these non-canonical URL's?? I don't feel like I've seen this before.
Intermediate & Advanced SEO | | Jenny10 -
How would you address these URLS
Hey Mozzers, long time no post. Just a quick one for you regarding URLS, this is an example of a url on a site https://www.thisismyurl.co.uk/products/spacehoppers/special-spacehopper.html Many of these pages are getting flagged for having a url that is too long. The target of this page is "special spacehoppers". Should i be concerned with the url being to long given my keyword is at the end? Would this be a suitable idea? https://www.thisismyurl.co.uk/p/spacehoppers/special.html Would changing products to p be worthwhile? It would remove length from nearly all urls but would require a site wide re-direct. 2)Would removing the "spacehoppers" bit from the url be worth it? Yes it would shorten the url but would also remove the exact keyword from the url which could be detrimental to rankings.
Intermediate & Advanced SEO | | ATP0 -
Question about Syntax in Robots.txt
So if I want to block any URL from being indexed that contains a particular parameter what is the best way to put this in the robots.txt file? Currently I have-
Intermediate & Advanced SEO | | DRSearchEngOpt
Disallow: /attachment_id Where "attachment_id" is the parameter. Problem is I still see these URL's indexed and this has been in the robots now for over a month. I am wondering if I should just do Disallow: attachment_id or Disallow: attachment_id= but figured I would ask you guys first. Thanks!0 -
Why is /home used in this company's home URL?
Just working with a company that has chosen a home URL with /home latched on - very strange indeed - has anybody else comes across this kind of homepage URL "decision" in the past? I can't see why on earth anybody would do this! Perhaps simply a logic-defying decision?
Intermediate & Advanced SEO | | McTaggart0 -
Robots.txt Blocked Most Site URLs Because of Canonical
Had a bit of a "Gotcha" in Magento. We had Yoast Canonical Links extension which worked well , but then we installed Mageworx SEO Suite.. which broke Canonical Links. Unfortunately it started putting www.mysite.com/catalog/product/view/id/516/ as the Canonical Link - and all URLs with /catalog/productview/* is blocked in Robots.txt So unfortunately We told Google that the correct page is also a blocked page. they haven't been removed as far as I can see but traffic has certainly dropped. We have also , at the same time had some Site changes grouping some pages & having 301 redirects. Resubmitted site map & did a fetch as google. Any other ideas? And Idea how long it will take to become unblocked?
Intermediate & Advanced SEO | | s_EOgi_Bear0 -
Is our robots.txt file correct?
Could you please review our robots.txt file and let me know if this is correct. www.faithology.com/robots.txt Thank you!
Intermediate & Advanced SEO | | BMPIRE0 -
OSE Confusion on 'External' Links
Hello All, I am still very new to this but am starting to get a grasp of things in the SEO world, but there are still a few things that I just don't get yet. For example, I've been trying to find out a great strategy for Link Building, what better way than looking at already existing SEO companies? So I did a quick search on a website (http://www.opensiteexplorer.org/links?site=www.springer-marketing.co.uk) and tried to look at all of the External incoming links. So I did a filter of Followed+301, Only External and all subdomains. But about 20 of the links for this site are coming from itself. Now, i'm not an expert, but presumably you can't just give yourself strong links? Is this some kind of trick, how or why would somebody do this? Mind Blows Paul
Intermediate & Advanced SEO | | Paul_Tovey0 -
Will pages irrelevant to a site's core content dilute SEO value of core pages?
We have a website with around 40 product pages. We also have around 300 pages with individual ingredients used for the products and on top of that we have some 400 pages of individual retailers which stock the products. Ingredient pages have same basic short info about the ingredients and the retail pages just have the retailer name, adress and content details. Question is, should I add noindex to all the ingredient and or retailer pages so that the focus is entirely on the product pages? Thanks for you help!
Intermediate & Advanced SEO | | ArchMedia0