Internal search: rel=canonical vs noindex vs robots.txt
-
Hi everyone,
I have a website with a lot of internal search results pages indexed. I'm not asking whether they should be indexed; I know they should not be, according to Google's guidelines. They also create a bunch of duplicate pages, so I want to solve this problem.
The thing is, if I noindex them, the site is going to lose a non-negligible chunk of traffic: nearly 13% according to Google Analytics!
I thought of blocking them in robots.txt. This solution would not keep them out of the index, but the pages appearing in Google SERPs would then look empty (no title, no description), so their CTR would plummet and I would lose a bit of traffic too...
The last idea I had was to use a rel=canonical tag pointing to the original search page (which is empty, without results), but it would probably have the same effect as noindexing them, wouldn't it? (I've never tried it, so I'm not sure.)
Of course I did some research on the subject, but each of my findings recommended only one of the 3 methods! One even recommended noindex plus a robots.txt block, which is stupid because the noindex would then be useless: a blocked page is never crawled, so the tag is never seen.
Is there somebody who can tell me which option is the best to keep this traffic?
Thanks a million
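To illustrate the point about noindex plus a robots.txt block, here is a minimal Python sketch using the standard library's urllib.robotparser. The Disallow rule, the example.com host and the sample URLs are assumptions made up for illustration, not the site's real configuration. It only shows that a crawler honoring the rule would not fetch the search page at all, which is why a noindex tag inside that page would go unread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rule blocking internal search URLs (an assumption).
ROBOTS_TXT = """\
User-agent: *
Disallow: /searchpage
"""

def is_crawlable(url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the hypothetical robots.txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)

# Made-up URLs on a placeholder host.
search_url = "https://example.com/searchpage.htm?action=search&query=blue+widgets"
product_url = "https://example.com/blue-widget.htm"

# A blocked page is never downloaded, so a noindex tag inside it is never read.
print(is_crawlable(search_url))
print(is_crawlable(product_url))
```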
-
Yeah, normally I'd say to NOINDEX those user-generated search URLs, but since they're collecting traffic, I'd have to side with Alan - a canonical may be your best bet here. Technically, they aren't "true" duplicates, but you don't want the 1K pages in the index, you don't want to lose the traffic (which NOINDEX would do), and you don't want to kill those pages for users (which a 301 would do).
Only thing I'd add is that, if some of these pages are generating most of the traffic (e.g. 10 pages = 90% of the traffic for these internal searches), you might want to make those permanent pages, like categories in your site architecture, and then 301 the custom URLs to those permanent pages.
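As a rough way to check whether a handful of search pages carry most of the traffic, something like the Python sketch below could be run on an analytics export. The function name, the 90% threshold default and the sample numbers are all hypothetical; real figures would come from your own analytics data.

```python
def pages_covering_share(visits_by_url: dict[str, int], share: float = 0.9) -> list[str]:
    """Return the top URLs (by visits) whose combined visits reach `share` of the total."""
    total = sum(visits_by_url.values())
    if total == 0:
        return []
    selected: list[str] = []
    running = 0
    # Walk the pages from most to least visited until the share is covered.
    for url, visits in sorted(visits_by_url.items(), key=lambda kv: kv[1], reverse=True):
        selected.append(url)
        running += visits
        if running >= share * total:
            break
    return selected

# Made-up numbers, for illustration only.
sample = {
    "/searchpage-widget-blue-1.htm": 700,
    "/searchpage-widget-red-2.htm": 250,
    "/searchpage-gadget-small-3.htm": 30,
    "/searchpage-gadget-large-4.htm": 20,
}
print(pages_covering_share(sample))
```

If the returned list is short, those URLs are the candidates for becoming permanent pages.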
-
Huh, not sure since I'm not a developer (and didn't work on that website's development), but I'd say all of the above ^^. In case it's useful, here is their URL structure; there are two kinds:
- /searchpage.htm?action=search&pagenumber=xx&query=product+otherterms
So I guess they are generated when a user makes a search. They are paginated (about 15 pages generally). I can only estimate how much they duplicate each other: some are probably overlapping when there are a lot of variations of the product. There are just a few complete duplicates (when the product searched is the same with different added terms, which doesn't happen a lot in this list).
- /searchpage-searchterm-addedterm-number.htm
Those I find surprising. I don't know if they are pages generated with a fixed URL or if they are rewritten (I haven't looked at the .htaccess yet, but I will; I have a headache just thinking about reading that thing, lol).
There are about a thousand of them in all (from Google Analytics; about half of each sort, and nearly all are indexed by Google), on a website with about 12 thousand pages in total.
Maybe the traffic loss will be compensated by removing the competition between those search pages and the product pages (and rel=canonical is surely far less brutal than a noindex in that respect), but without experience in this kind of situation it's hard to make a decision...
Really appreciate you guys taking the time to help!
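For what it's worth, one way to get a rough count of the complete duplicates among the query-string URLs would be to group them by their search terms, ignoring term order and the page number. The Python sketch below assumes the search terms live in the `query` parameter as in the first URL pattern above; the function names and sample URLs are made up, and it does not handle the rewritten /searchpage-... URLs since their structure isn't known for sure.

```python
from collections import defaultdict
from urllib.parse import parse_qs, urlsplit

def search_key(url: str) -> tuple[str, ...]:
    """Reduce a query-string search URL to its sorted, lowercased search terms."""
    params = parse_qs(urlsplit(url).query)
    # parse_qs decodes "+" as a space, so split() yields the individual terms.
    terms = params.get("query", [""])[0].lower().split()
    return tuple(sorted(terms))

def group_duplicates(urls: list[str]) -> dict[tuple[str, ...], list[str]]:
    """Group URLs sharing the same set of terms; keep groups with more than one URL."""
    groups: dict[tuple[str, ...], list[str]] = defaultdict(list)
    for url in urls:
        groups[search_key(url)].append(url)
    return {key: members for key, members in groups.items() if len(members) > 1}

# Made-up sample URLs following the first pattern.
sample_urls = [
    "/searchpage.htm?action=search&pagenumber=1&query=widget+blue",
    "/searchpage.htm?action=search&pagenumber=2&query=blue+widget",
    "/searchpage.htm?action=search&pagenumber=1&query=gadget",
]
duplicates = group_duplicates(sample_urls)
print(duplicates)
```

Because the page number is ignored, paginated pages of the same search land in the same group, which may or may not be what you want.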
-
Alan's absolutely right about how canonical works, but I just want to clarify something - what about these pages is duplicated? In other words, are these regular searches (like product searches) with duplicate URLs, are these paginated searches (with page 2, 3, etc. that appear thin), or are these user-generated searches spinning out into new search pages (not exact duplicates but overlapping)? The solutions can vary a bit with the problem, and internal search is tricky.
-
Just one more point: a canonical is just a hint to the search engines, not a directive. If they think the pages should not be merged, they will ignore it, so in that way they may make the decision for you.
-
Not a lot of real duplicates; they're more alike, and the most visited are unique, so I'll keep the most important ones and just toss a few duplicates.
Thanks a lot for your help, problem solved!
-
No, not like a noindex. More like a merge.
Will it make you rank for many keywords? Not necessarily, as a page all about blue widgets is going to rank higher than a page that covers many different subjects including blue widgets.
A canonical is really for duplicate content, or very alike content.
So you have to decide what your page is: is it duplicate or alike content, or is it unique?
If the pages are unique, then do nothing and let them rank. If you think they are alike, then use a canonical. If there are only a few, then I would not worry either way.
If you decide they are unique, then I would look at making the page title unique too, maybe even the description.
-
Thanks for your answer
OK, so you're saying it will indeed act like a noindex over time.
So if one of the result pages would have ranked for a particular query, it will not rank any more, just as with a noindex, and the site will lose the 13% of traffic those pages generated...
Otherwise it would be too easy to make a page rank for the keywords used in a bunch of other pages that refer to it via rel=canonical... wouldn't it?
I'm starting to think I can't do anything... Maybe just noindex a bunch of them that cause duplicates, and leave the rest in the index.
-
Rel=canonical is the way to go. It will tell the search engines that all credit for all the different URLs goes to the original search page. Eventually only the original search page will exist in the index.
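As a minimal sketch of what that could look like, the Python below builds the rel=canonical tag for a search results URL by dropping the query string, so every variation points at the base search page. The function name and the example.com URL are assumptions for illustration; how the tag actually gets into the page head depends on the site's templates.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

def canonical_link_tag(url: str) -> str:
    """Build a rel=canonical tag pointing at the URL with its query string removed."""
    parts = urlsplit(url)
    # Keep scheme, host and path; drop the query string and fragment.
    target = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return f'<link rel="canonical" href="{escape(target, quote=True)}">'

# Made-up URL on a placeholder host.
tag = canonical_link_tag(
    "https://example.com/searchpage.htm?action=search&pagenumber=3&query=blue+widgets"
)
print(tag)
```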