Should I block robots from URLs containing query strings?
-
I'm about to block all URLs that have a query string using robots.txt. They're mostly URLs with Coremetrics tags and other referrer info. I figured that search engines don't need to see these, as they're always better off with the original URL.
Might there be any downside to this that I need to consider?
Appreciate your help / experiences on this one.
Thanks
Jenni
-
Thanks for your suggestions. I've already got canonical tags on every page, but they're not all being adhered to and lots of URLs with query strings are still getting organic traffic.
I don't think passing referrer info behind the scenes is an option with Coremetrics. Is it?
Interested to know more about number 1 though. How would you do that in WMT other than blocking with robots.txt?
Thanks
-
Instead of blocking them with robots.txt (which isn't very effective), try using the canonical tag.
For instance, for a URL like this:
http://wwww.testdomain.com/page.html?utm_source=Google&utm_medium=Banner&utm_campaign=Campaign
You could add this canonical tag in the head:
<link rel="canonical" href="http://wwww.testdomain.com/page.html" />
With this solution you don't have to worry about losing quality links OR having your query tracking show up in any of the major search engines.
Cheers- Kyle
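If the canonical tag is generated by the site's templates, the clean URL can be derived from the requested one. Below is a minimal Python sketch of that idea, assuming the tracking parameters are the utm_ ones from the example above; the names TRACKING_PARAMS, canonical_url and canonical_tag are made up for illustration, and you would add whatever parameter names your Coremetrics setup appends.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these are the tracking parameters to drop. Extend the set with
# the parameter names your own analytics tool appends.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content"}


def canonical_url(url):
    """Return the URL with tracking parameters and any fragment removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]
    # Non-tracking parameters (e.g. a product id) are kept in their original order.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_tag(url):
    """Build the <link rel="canonical"> tag for the page's <head>."""
    return '<link rel="canonical" href="{}" />'.format(escape(canonical_url(url), quote=True))


if __name__ == "__main__":
    tagged = ("http://wwww.testdomain.com/page.html"
              "?utm_source=Google&utm_medium=Banner&utm_campaign=Campaign")
    print(canonical_tag(tagged))
```

Parameters that change the page content are left alone, so only the tracking variants collapse onto one canonical URL.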
-
The downside is that if someone links to one of your pages using a URL with the query string, the search engines won't crawl that URL, so the link juice from that link won't flow properly to the rest of your site.
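To see which URLs a blanket rule would shut out, here is a rough Python sketch. The thread doesn't state the exact rule, so "Disallow: /*?" is an assumption, and the matcher below is a simplified take on the * and $ wildcards that the major search engines support in robots.txt. It ignores Allow rules and rule precedence, and the function names are hypothetical.

```python
import re
from urllib.parse import urlsplit


def rule_to_regex(rule):
    """Translate a robots.txt path rule using * and $ into a compiled regex."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Each * matches any run of characters; everything else is literal.
    pattern = ".*".join(re.escape(piece) for piece in rule.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))


def is_blocked(url, disallow_rule="/*?"):
    """Return True if the URL's path plus query matches the Disallow rule."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # Rules are matched from the start of the path.
    return rule_to_regex(disallow_rule).match(target) is not None


if __name__ == "__main__":
    for url in (
        "http://wwww.testdomain.com/page.html",
        "http://wwww.testdomain.com/page.html?utm_source=Google",
    ):
        print(url, "blocked" if is_blocked(url) else "crawlable")
```

Every tagged URL that someone links to falls on the blocked side, which is the situation described above.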
Other options:
1. Use Google and Bing WMT to ignore those query string parameters.
2. Make sure the canonical tag is on those pages, pointing back to the version without the query string.
3. Try to pass referrer info behind the scenes if possible.
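For the canonical tag option, a quick way to spot-check pages is to parse the HTML and compare the canonical href with the URL minus its query string. This is only a sketch using Python's standard library: CanonicalFinder, find_canonical and points_to_clean_url are illustrative names, the HTML is assumed to have been fetched already, and it assumes the clean version is simply the URL with the whole query string dropped.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit, urlunsplit


class CanonicalFinder(HTMLParser):
    """Record the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and self.canonical is None:
            self.canonical = attributes.get("href")


def find_canonical(html):
    """Return the canonical href found in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


def points_to_clean_url(page_url, html):
    """Check that the canonical href equals page_url without query or fragment."""
    parts = urlsplit(page_url)
    clean = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return find_canonical(html) == clean


if __name__ == "__main__":
    sample = ('<html><head><link rel="canonical" '
              'href="http://wwww.testdomain.com/page.html" /></head></html>')
    print(points_to_clean_url("http://wwww.testdomain.com/page.html?utm_source=Google", sample))
```

A page with no canonical tag, or one pointing at the tagged URL, fails the check and is worth a closer look.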