Will blocking urls in robots.txt void out any backlink benefits? - I'll explain...
-
Ok...
So I add tracking parameters to some of my social media campaigns but block those parameters via robots.txt. This helps avoid duplicate content issues (Yes, I do also have correct canonical tags added)... but my question is -- Does this cause me to miss out on any backlink magic coming my way from these articles, posts or links?
Example url: www.mysite.com/subject/?tracking-info-goes-here-1234
- Canonical tag is: www.mysite.com/subject/
- I'm blocking anything with "?tracking-info-goes-here" via robots.txt
- The url with the tracking info of course IS NOT indexed in Google but IT IS indexed without the tracking parameters.
What are your thoughts?
- Should I nix the robots.txt stuff since I already have the canonical tag in place?
- Do you think I'm getting the backlink "juice" from all the links with the tracking parameter?
What would you do?
Why?
Are you sure?
-
Thanks Guys...
Yeah, I figure that's the right path to take based on what we know... But I love to hear others chime in so I can blame it all on you if something goes wrong - ha!
Another Note: Do you think this will cause some kind of unnatural anomaly when the robots.txt file is edited? All of a sudden these links will now be counted (we assume).
It's likely the answer is no because Google still knows about the links.. they just don't count them - but still thought I'd throw that thought out there.
-
I agree with what Andrea wrote above - just one additional point - blocking a file via robots.txt doesn't prevent the search engine from not indexing the page. It just prevents the search engine from crawling the page and seeing the content on the page. The page may very well still show up in the index - you'll just see a snippet that your robots.txt file is preventing google from crawling the site and caching it and providing a snippet or preview. If you have canonical tags put in place properly, remove the block on the parameters in your robots.txt and let the engines do things the right way and not have to worry about this question.
-
If you block with robots.txt link juice can't get passed along. If your canonicals are good, then ideally you wouldn't need the robots. Also, it really removes value of the social media postings.
So, to your question, if you have the tracking parameter blocked via robots, then no, I don't think you are getting the link juice.
http://www.rickrduncan.com/robots-txt-file-explained
When I want link juice passed on but want to avoid duplicate content, I'm more a fan of the no index, follow tags and using canonicals where it makes sense, too. But since you say your URLs with the parameters aren't being indexed then you must be using tags anyway to make that happen and not just relying on robots.
To your point of "are you sure":
http://www.evergreensearch.com/minimum-viable-seo-8-ways-to-get-startup-seo-right/
(I do like to cite sources - there's so many great articles out there!)
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Do you know if there is a tool that can tell you if a url have backlink?
Hi, Do you know if there is a tool that I can check backlinks for thousands of URLs Thanks Roy
Intermediate & Advanced SEO | | kadut0 -
Syndicated content with meta robots 'noindex, nofollow': safe?
Hello, I manage, with a dedicated team, the development of a big news portal, with thousands of unique articles. To expand our audiences, we syndicate content to a number of partner websites. They can publish some of our articles, as long as (1) they put a rel=canonical in their duplicated article, pointing to our original article OR (2) they put a meta robots 'noindex, follow' in their duplicated article + a dofollow link to our original article. A new prospect, to partner with with us, wants to follow a different path: republish the articles with a meta robots 'noindex, nofollow' in each duplicated article + a dofollow link to our original article. This is because he doesn't want to pass pagerank/link authority to our website (as it is not explicitly included in the contract). In terms of visibility we'd have some advantages with this partnership (even without link authority to our site) so I would accept. My question is: considering that the partner website is much authoritative than ours, could this approach damage in some way the ranking of our articles? I know that the duplicated articles published on the partner website wouldn't be indexed (because of the meta robots noindex, nofollow). But Google crawler could still reach them. And, since they have no rel=canonical and the link to our original article wouldn't be followed, I don't know if this may cause confusion about the original source of the articles. In your opinion, is this approach safe from an SEO point of view? Do we have to take some measures to protect our content? Hope I explained myself well, any help would be very appreciated, Thank you,
Intermediate & Advanced SEO | | Fabio80
Fab0 -
When the site's entire URL structure changed, should we update the inbound links built pointing to the old URLs?
We're changing our website's URL structures, this means all our site URLs will be changed. After this is done, do we need to update the old inbound external links to point to the new URLs? Yes the old URLs will be 301 redirected to the new URLs too. Many thanks!
Intermediate & Advanced SEO | | Jade1 -
SSL and robots.txt question - confused by Google guidelines
I noticed "Don’t block your HTTPS site from crawling using robots.txt" here: http://googlewebmastercentral.blogspot.co.uk/2014/08/https-as-ranking-signal.html Does this mean you can't use robots.txt anywhere on the site - even parts of a site you want to noindex, for example?
Intermediate & Advanced SEO | | McTaggart0 -
Mixing static.htm urls and dynamic urls on a Windows IIS Server?
Hi all, We've had a website originally built using static html with .htm extensions ranking well in Google hence we want to keep those pages/urls. We are on a dedicated sever (Windows IIS). However our developer has custom made a new DYNAMIC section for the site which shows new added products dynamically and allows them to be booked online via shopping cart. We are having problems displaying them both on the same domain even if we put the dynamic section withing its own subfolder and keep the static htms in the root. Is it possible to have both function on IIS (even if they may have to function a little separately)? Does anyone have previous experience of this kind of issue or a way of making both work? What setup do we need to do on the dedicated server.
Intermediate & Advanced SEO | | emerald0 -
Google suddenly indexing and displaying URLs that haven't existed for years?
We recently noticed google is showing approx 23,000 indexed .jsp urls for our site. These are ancient pages that haven't existed in years and have long been 301 redirected to valid urls. I'm talking 6 years. Checking the serps the other day (and our current SEOMoz pro campaign), I see that a few of these urls are now replacing our correct ones in the serps for important, competitive phrases. What the heck is going on here? Is Google suddenly ignoring rewrite rules and redirects? Here's an example of the rewrite rules that we've used for 6+ years: RewriteRule ^(.*)/xref_interlux_antifoulingoutboards&keels.jsp$ $1/userportal/search_subCategory.do?categoryName=Bottom%20Paint&categoryId=35&refine=1&page=GRID [R=301] Now, this 'bottom paint' url has been incredibly stable in the serps for over a half decade. All of a sudden, a google search for 'bottom paint' (no quotes) brings up the jsp page at position 2-3. This is just one example of something very bizarre happening. Has anyone else had something similar happen lately? Thank You <colgroup><col width="64"></colgroup>
Intermediate & Advanced SEO | | jamestown
| RewriteRule ^(.*)/xref_interlux_antifoulingoutboards&keels.jsp$ $1/userportal/search_subCategory.do?categoryName=Bottom%20Paint&categoryId=35&refine=1&page=GRID [R=301] |0 -
202 error page set in robots.txt versus using crawl-able 404 error
We currently have our error page set up as a 202 page that is unreachable by the search engines as it is currently in our robots.txt file. Should the current error page be a 404 error page and reachable by the search engines? Is there more value or is it a better practice to use 404 over a 202? We noticed in our Google Webmaster account we have a number of broken links pointing the site, but the 404 error page was not accessible. If you have any insight that would be great, if you have any questions please let me know. Thanks, VPSEO
Intermediate & Advanced SEO | | VPSEO0 -
Charity project for local women's shelter - need help: will Google notice if you alter the document title with Javascript after the page loads?
I am doing some pro-bono work with a local shelter for female victims of domestic abuse. I am trying to help visitors to the site cover their tracks by employing a document.title change when the page loads using JavaScript. This shelter receives a lot of traffic from Google. I worry that the Google bots will see this javascript change and somehow penalize this site or modify the title in the SERPs. Has anyone had any experience with this kind of javascript maneuver? All help would be greatly appreciated!
Intermediate & Advanced SEO | | jkonowitch0