Layered navigation and hiding nav from user agent
-
I am trying to deal with the duplicate content issues presented by Magento's layered navigation feature (aka faceted navigation). I installed Amasty's Improved Navigation extension (https://amasty.com/improved-layered-navigation.html) and it offers the option to hide the layered navigation from specific user agents (e.g. Googlebot, Bingbot).
This seems like cloaking to me and I hesitate to try it, unless hiding faceted navigation from specific user agents is known to be acceptable to Google (a white hat practice). Does anyone know if this is the case?
-
Great, thanks Carson! Your insights have been very helpful. I think we'll try to make the on-page ajax solution work.
-
If you're really worried about indexation I think that's a fine solution. It's definitely easier to manage, and it'll also be easier to track pageviews in most analytics platforms. The only downside is that if someone emails or links to a category page with filters applied, the recipient won't see the filtered view, because the filter state isn't in the URL. But generally people share products and not category pages, so it's not a big deal. I'd probably go that route.
Also make sure that your category pages still update the URL when you go to page 2, or that page 2 is otherwise crawlable and indexed. You don't want products to go unindexed because the paginated category pages can't be crawled.
-
Thanks for the link! I can see how Google offers me a way to tell it how to use my site variables. It seems that between managing parameters in Webmaster Tools, using canonical links, and adding meta noindex tags on variable pages, there should be enough to keep things in order with the search engines. And I can just assume Google knows not to waste too much crawl budget on the variable pages.
I was considering one other option that would remove concerns about variables altogether. Using a different extension, I can set up Magento's layered navigation to work on the page without updating the URL. This eliminates the need for canonicals, parameters, and everything else that is more in Google's control than mine. What do you think of that as a solution?
-
Yes, the bots will crawl the pages, but they will not INDEX them.
There is a concern there, but mostly if the bots get caught in some kind of crawl trap, where they're trying out a near-infinite set of variables and getting stuck in a loop. Otherwise the spiders should understand the variables. You can actually check it in Webmaster Tools to make sure Google understands. Instructions for that are here:
https://support.google.com/webmasters/answer/6080550?hl=en
Ultimately Google will definitely not penalize you for having lots of duplicate content on URLs through variables, but it might be an issue with Googlebot not finding all your pages. You can make sure that doesn't happen by checking the indexation of your sitemap.
You could also try to block any URLs with the URL parameter in robots.txt. Make sure you get some help with the pattern syntax if you plan to do this (robots.txt uses simple wildcards such as * and $ rather than full regular expressions). My advice is that blocking the variables in robots.txt is not worth it, as Google should have no problems with the variables, especially if the canonical tags are working.
Googlebot at least is smart enough these days to know when to stop crawling variable pages, so I think there are more important on-site things to worry about. Make sure your categories are linked to and optimized, for example.
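For anyone who does want to try the robots.txt route mentioned above, here is a minimal sketch for sanity-checking wildcard patterns before they go live. It assumes Google-style matching (`*` matches any run of characters, a trailing `$` anchors the end of the URL) and ignores Allow rules and rule precedence. The `color` and `price` parameter names are hypothetical; substitute whatever your layered navigation actually appends.

```python
import re
from typing import List, Pattern


def robots_pattern_to_regex(pattern: str) -> Pattern[str]:
    """Translate a robots.txt path pattern ('*' wildcard, optional trailing '$') into a regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape everything literally, then turn the escaped '*' back into '.*'.
    body = re.escape(pattern).replace(r"\*", ".*")
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(path: str, disallow_patterns: List[str]) -> bool:
    """Return True if the path (including query string) matches any Disallow pattern."""
    return any(robots_pattern_to_regex(p).match(path) for p in disallow_patterns)


# Hypothetical filter parameters, for illustration only.
DISALLOW = ["/*?*color=", "/*?*price="]

print(is_blocked("/towels.html?color=12", DISALLOW))  # filtered URL
print(is_blocked("/towels.html?p=2", DISALLOW))       # pagination, should stay crawlable
print(is_blocked("/towels.html", DISALLOW))           # plain category page
```

Running a list of real URLs from the site through a check like this is a cheap way to confirm that pagination and plain category pages are not caught by a filter rule.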
-
This gets into an issue of bots and crawling where I am less clear. Even with canonicals, don't search engine bots crawl all of the pages produced with faceted navigation? That will easily reach 10,000+ pages on my site, which currently has a total number of pages in the low hundreds. I was under the impression I don't want to set up the faceted navigation in a way where the bots crawl through every combination of pages created by my products' attribute filters and bog the bots down in a quagmire of low-value pages. But I'm not sure if that's the case or how concerned I need to be about the bots spending their time on those pages.
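To put a rough number on how quickly filter combinations add up, here is a back-of-the-envelope sketch. It assumes each attribute is single-select and that filters can be combined freely; the attribute names and option counts are invented for illustration and are not taken from the site in question.

```python
import math

# Hypothetical attributes and how many options each one offers.
filter_options = {"color": 12, "size": 8, "brand": 20, "price_band": 6}

# Each attribute is either unset or set to one of its options, hence (n + 1).
# Subtract 1 to exclude the unfiltered category page itself.
filtered_urls_per_category = math.prod(n + 1 for n in filter_options.values()) - 1

print(filtered_urls_per_category)  # 17198
```

That count is per category, before sort orders or pagination are multiplied in, which is how a site with a few hundred real pages can expose tens of thousands of crawlable URLs.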
-
If I'm not mistaken Magento has canonical tags on category pages by default, so you might be trying to solve an issue that doesn't exist. Take a look at the page source of a filtered (faceted navigation) URL to confirm. Or you can send me the site and I'll look it over.
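If you'd rather not read the page source by eye, the check can be scripted. This is a minimal sketch using only Python's standard library; it assumes you have already saved the HTML of a filtered page as a string, and the sample markup and example.com URL are placeholders rather than real Magento output.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Placeholder markup standing in for the source of a filtered category page.
sample = """<html><head>
<title>Towels</title>
<link rel="canonical" href="https://example.com/towels.html" />
</head><body></body></html>"""

print(find_canonical(sample))  # https://example.com/towels.html
```

If the value returned for a filtered URL is the clean category URL, the canonical tags are doing their job; if it comes back as `None` or echoes the filtered URL, they are not.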