Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Should I block robots from URLs containing query strings?
-
I'm about to block off all URLs that have a query string using robots.txt. They're mostly URLs with coremetrics tags and other referrer info. I figured that search engines don't need to see these as they're always better off with the original URL.
Might there be any downside to this that I need to consider?
Appreciate your help / experiences on this one.
Thanks
Jenni
-
Thanks for your suggestions. I've already got canonical tags on every page, but they're not all being adhered to and lots of URLs with query strings are still getting organic traffic.
Passing referrer info behind scenes isn't an option with Coremetrics I don't think. Is it?
Interested to know more about number 1 though. How would you do that in WMT other than blocking with robots.txt?
Thanks
-
Instead of blocking them with robots.txt (which isn't very effective), try using the canonical tag instead.
For instance, a URL like this:
http://wwww.testdomain.com/page.html?utm_source=Google&utm_medium=Banner&utm_campaign=CampaignYou could add this canonical tag in the head:
With this solution you don't have to worry about losing quality links OR having your query tracking show up in any of the major search engines.
Cheers- Kyle
-
The downside to this would be if someone linked to the page with the query string, the search engines wouldn't crawl the page and flow link juice properly to the rest of your site.
Other options:
-
Use Google and Bing WMT to ignore those parameter query strings.
-
Make sure the canoncial tag is on those pages, pointing back to the version without the query string
-
Try to pass referrer info behind the scenes if possible
-
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Google is indexing bad URLS
Hi All, The site I am working on is built on Wordpress. The plugin Revolution Slider was downloaded. While no longer utilized, it still remained on the site for some time. This plugin began creating hundreds of URLs containing nothing but code on the page. I noticed these URLs were being indexed by Google. The URLs follow the structure: www.mysite.com/wp-content/uploads/revslider/templates/this-part-changes/ I have done the following to prevent these URLs from being created & indexed: 1. Added a directive in my Htaccess to 404 all of these URLs 2. Blocked /wp-content/uploads/revslider/ in my robots.txt 3. Manually de-inedex each URL using the GSC tool 4. Deleted the plugin However, new URLs still appear in Google's index, despite being blocked by robots.txt and resolving to a 404. Can anyone suggest any next steps? I Thanks!
Technical SEO | | Tom3_150 -
Category URL Pagination where URLs don't change between pages
Hello, I am working on an e-commerce site where there are categories with multiple pages. In order to avoid pagination issues I was thinking of using rel=next and rel=prev and cannonical tags. I noticed a site where the URL doesn't change between pages, so whether you're on page 1,2, or 3 of the same category, the URL doesn't change. Would this be a cleaner way of dealing with pagination?
Technical SEO | | whiteonlySEO0 -
Is there a limit to how many URLs you can put in a robots.txt file?
We have a site that has way too many urls caused by our crawlable faceted navigation. We are trying to purge 90% of our urls from the indexes. We put no index tags on the url combinations that we do no want indexed anymore, but it is taking google way too long to find the no index tags. Meanwhile we are getting hit with excessive url warnings and have been it by Panda. Would it help speed the process of purging urls if we added the urls to the robots.txt file? Could this cause any issues for us? Could it have the opposite effect and block the crawler from finding the urls, but not purge them from the index? The list could be in excess of 100MM urls.
Technical SEO | | kcb81780 -
Special characters in URL
Will registered trademark symbol within a URL be bad? I know some special characters are unsafe (#, >, etc.) but can not find anything that mentions registered trademark. Thanks!
Technical SEO | | bonnierSEO0 -
Urls with or without .html ending
Hello, Can anyone show me some authority info on wheher links are better with or without a .html ending? Thanks is advance
Technical SEO | | sesertin0 -
Singular vs plural in urls
In keyword research for an ecommerce site, I've found that widget, singular gets a lot more searches than widgets, plural AND is much less competitive. Is it better for SEO purposes to have the URLs (and matching title tags) in the catalog as /brass-widget.html, /steel-widget.html, etc., or /brass-widgets.html, etc.? I'm worried that a) searches for widgets will pass by the singular urls but not vice versa, and b) the singular form will strike visitors as bad grammar. Any advice?
Technical SEO | | AmericanOutlets0 -
Products with discrete URLs for each color
here is the issue. i have an ecommerce site that on a category page, shows each individual color for each product sold. and there is a distinct URL for each color. each product page shares the same content, with the only potentially differentiating factor being customer reviews (not nearly enough of these to differentiate anything). so we have URLs like: www.domain.com/product-green www.domain.com/product-yellow www.domain.com/product-red and so on. i am looking for a way to consolidate these URL while still showing all colors on the category page. the first solution i am considering is using the hash tag. so we would create www.domain.com/product#green, www.domain.com/product#yellow, www.domain.com/product#red. if possible, i would set the canonical tag as www.domain.com/product. the second solution would be to use the canonical tag and keep the URLs as is. the issue i see here is that we would need to create www.domain.com/product and show that page somewhere. www.domain.com/product would the URL that the above color URLs would canonicalize to. what would be the preferred solution? or is there something else?
Technical SEO | | rakesh_patel0 -
How to handle sitemap with pages using query strings?
Hi, I'm working to optimize a site that currently has about 5K pages listed in the sitemap. There are not in face this many pages. Part of the problem is that one of the pages is a tool where each sort and filter button produces a query string URL. It seems to me inefficient to have so many items listed that are all really the same page. Not to mention wanting to avoid any duplicate content or low quality issues. How have you found it best to handle this? Should I just noindex each of the links? Canonical links? Should I manually remove the pages from the sitemap? Should I continue as is? Thanks a ton for any input you have!
Technical SEO | | 5225Marketing0