Should I disallow all URL query strings/parameters in Robots.txt?
-
Webmaster Tools correctly identifies the query strings/parameters used in my URLs, but still reports duplicate title tags and meta descriptions for the original URL and the versions with parameters. For example, Webmaster Tools would report duplicates for the following URLs, despite it correctly identifying the "cat_id" and "kw" parameters:
/Mulligan-Practitioner-CD-ROM
/Mulligan-Practitioner-CD-ROM?cat_id=87
/Mulligan-Practitioner-CD-ROM?kw=CROMAdditionally, theses pages have self-referential canonical tags, so I would think I'd be covered, but I recently read that another Mozzer saw a great improvement after disallowing all query/parameter URLs, despite Webmaster Tools not reporting any errors.
As I see it, I have two options:
- Manually tell Google that these parameters have no effect on page content via the URL Parameters section in Webmaster Tools (in case Google is unable to automatically detect this, and I am being penalized as a result).
- Add "Disallow: *?" to hide all query/parameter URLs from Google. My concern here is that most backlinks include the parameters, and in some cases these parameter URLs outrank the original.
Any thoughts?
-
Correct. They won't be indexed but are still followed.
-
The statement was in a response to a question I asked earlier.
"I was having an issue like this where moz was showing a lot more duplicate content than webmaster tools was, actually webmaster tools showed none, but I was being penalized. I realized this when I added an exclusion to robots.txt to exclude any query strings on my site. After I did this I saw my rankings shoot through the roof."
Thanks for the info. I did edit the settings in the URL parameters section to tell Google that these parameters do not change the page content, so it should now index only one representative URL. My only concern was that the kw (keyword) parameter does change page content for search result pages, but I just read that Matt Cutts encourages disallowing those pages anyway.
Just to verify, disallowing those pages with parameters won't affect the "link juice" passed from external links?
-
Hi there
I recently answered a question in a similar question in the Q+A that references resources that can help you help Google understand these parameters and categorize them. You can read that here.
That being said, blocking these parameters in your robots.txt will not affect your rankings, especially if those parameter or query strings are properly canonicalized to the proper product page.
That being said, I would make sure you understand the resources above and the options, as you understand your users and website better than anyone - test on a few pages to see what happens and go from there.
Hope this helps! Good luck!
-
"I recently read that another Mozzer saw a great improvement after disallowing all query/parameter URLs" - do you have a link for this?
Canonicals should be enough but Google does mess up and the more clues you can give them, the better.
You can also manually tell Google parameter meanings (if you check out your parameter page now in search console, you should see all of the parameters they've detected for you - you can just change their meaning).
I don't see any harm in disallowing parameters via robots.txt. They will still be crawled and internal links followed, just not indexed in serps.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Robots.txt - blocking JavaScript and CSS, best practice for Magento
Hi Mozzers, I'm looking for some feedback regarding best practices for setting up Robots.txt file in Magento. I'm concerned we are blocking bots from crawling essential information for page rank. My main concern comes with blocking JavaScript and CSS, are you supposed to block JavaScript and CSS or not? You can view our robots.txt file here Thanks, Blake
Intermediate & Advanced SEO | | LeapOfBelief0 -
Can you nofollow a URL?
Hey Moz Community, My questions sounds pretty simple but unfortunately, it isn't. I have a domain name (we'll use example.com for this) http://example.com which 301 re-directs to http://www.example.com. http://example.com has bad links pointing to it and http://www.example.com does not. So essentially, I want to stop negative influences from http://example.com being passed on to http://www.example.com. A 302 re-direct sounds like it would work in theory but is this the best way to go about this? Just so you know, we have completed a reconsideration request a long time ago but I think the bad links are still negatively affecting the website as it does not rank for it's own name which is bizarre. Actual Question: How do I re-direct http://example.com to http://www.example.com without passing on the negative SEO attached to http://example.com? Thanks in advance!
Intermediate & Advanced SEO | | RiceMedia0 -
Ending URLs in .html versus /
Hi there! Currently all the URLs on my website, even the home page, end it .html, such as http://www,consumerbase.com/index.html Is this bad?
Intermediate & Advanced SEO | | Travis-W
Is there any benefit to this? Should I remove it and just have them end with a forward slash?
If I 301 redirect the old .html URLs to the forward slash URLs, will I lose PA? Thanks!0 -
Can Anybody Link to my URL to Hurt SEO? Weird URL pointing at my Domaine!
Our ranking has drop since a few weeks. I did not do any major change in my site. Surfing WebMaster Tool, I found lots of new URL linking at our site: url.org linkarena.com seoprofiler.com folkd.com digitalhome.ca bustingprice.com surepurchase.com lowpricetoday.com oyax.com couponfollow.com aspringcleaning.com pamabuy.com etzone.ca How do I find if those was done intentionelly to hurt SEO? Could it be possible? Thank you, BigBlaze
Intermediate & Advanced SEO | | BigBlaze2050 -
Lots of incorrect urls indexed - Googlebot found an extremely high number of URLs on your site
Hi, Any assistance would be greatly appreciated. Basically, our rankings and traffic etc have been dropping massively recently google sent us a message stating " Googlebot found an extremely high number of URLs on your site". This first highligted us to the problem that for some reason our eCommerce site has recently generated loads (potentially thousands) of rubbish urls hencing giving us duplication everywhere which google is obviously penalizing us with in the terms of rankings dropping etc etc. Our developer is trying to find the route cause of this but my concern is, How do we get rid of all these bogus urls ?. If we use GWT to remove urls it's going to take years. We have just amended our Robot txt file to exclude them going forward but they have already been indexed so I need to know do we put a redirect 301 on them and also a HTTP Code 404 to tell google they don't exist ? Do we also put a No Index on the pages or what . what is the best solution .? A couple of example of our problems are here : In Google type - site:bestathire.co.uk inurl:"br" You will see 107 results. This is one of many lot we need to get rid of. Also - site:bestathire.co.uk intitle:"All items from this hire company" Shows 25,300 indexed pages we need to get rid of Another thing to help tidy this mess up going forward is to improve on our pagination work. Our Site uses Rel=Next and Rel=Prev but no concanical. As a belt and braces approach, should we also put concanical tags on our category pages whereby there are more than 1 page. I was thinking of doing it on the Page 1 of our most important pages or the View all or both ?. Whats' the general consenus ? Any advice on both points greatly appreciated? thanks Sarah.
Intermediate & Advanced SEO | | SarahCollins0 -
Duplicate content via dynamic URLs where difference is only parameter order?
I have a question about the order of parameters in an URL versus duplicate content issues. The URLs would be identical if the parameter order was the same. E.g.
Intermediate & Advanced SEO | | anthematic
www.example.com/page.php?color=red&size=large&gender=male versus
www.example.com/page.php?gender=male&size=large&color=red How smart is Google at consolidating these, and do these consolidated pages incur any penalty (is their combined “weight” equal to their individual selves)? Does Google really see these two pages as DISTINCT, or does it recognize that they are the same because they have the exact same parameters? Is this worth fixing in or does it have a trivial impact? If we have to fix it and can't change our CMS, should we set a preferred, canonical order for these URLs or 301 redirect from one version to the other? Thanks a million!0 -
Subdirectory URLs
If I have category pages for my site; is it better to use http://example.com/category/category or just http://example.com/category? Also, I'm creating a new section of the site; a resource center. Should the URLs of the pages in the resource center be http://example.com/learn/page or just http://example.com/page What are the reasons for the better choice?
Intermediate & Advanced SEO | | Visually0 -
Questions regarding Google's "improved url handling parameters"
Google recently posted about improving url handling parameters http://googlewebmastercentral.blogspot.com/2011/07/improved-handling-of-urls-with.html I have a couple questions: Is it better to canonicalize urls or use parameter handling? Will Google inform us if it finds a parameter issue? Or, should we have a prepare a list of parameters that should be addressed?
Intermediate & Advanced SEO | | nicole.healthline0