Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Internal search : rel=canonical vs noindex vs robots.txt
-
Hi everyone,
I have a website with a lot of internal search results pages indexed. I'm not asking if they should be indexed or not, I know they should not according to Google's guidelines. And they make a bunch of duplicated pages so I want to solve this problem.
The thing is, if I noindex them, the site is gonna lose a non-negligible chunk of traffic : nearly 13% according to google analytics !!!
I thought of blocking them in robots.txt. This solution would not keep them out of the index. But the pages appearing in GG SERPS would then look empty (no title, no description), thus their CTR would plummet and I would lose a bit of traffic too...
The last idea I had was to use a rel=canonical tag pointing to the original search page (that is empty, without results), but it would probably have the same effect as noindexing them, wouldn't it ? (never tried so I'm not sure of this)
Of course I did some research on the subject, but each of my finding recommanded one of the 3 methods only ! One even recommanded noindex+robots.txt block which is stupid because the noindex would then be useless...
Is there somebody who can tell me which option is the best to keep this traffic ?
Thanks a million
-
Yeah, normally I'd say to NOINDEX those user-generated search URLs, but since they're collecting traffic, I'd have to side with Alan - a canonical may be your best bet here. Technically, they aren't "true" duplicates, but you don't want the 1K pages in the index, you don't want to lose the traffic (which NOINDEX would do), and you don't want to kill those pages for users (which a 301 would do).
Only thing I'd add is that, if some of these pages are generating most of the traffic (e.g. 10 pages = 90% of the traffic for these internal searches), you might want to make those permanent pages, like categories in your site architecture, and then 301 the custom URLs to those permanent pages.
-
Huh not sure since I'm not a developer (and didn't work on that website dev) but I'd say all of the above^^. If useful, here are their url structure, there's two kind :
- /searchpage.htm?action=search&pagenumber=xx&query=product+otherterms
So I guess they are generated when a user makes a search
paginated (about 15 pages generally),
and I can approximately know how much they are duplicates, I can tell some are probably overlapping when there's a lot of variations for the product. There are just a few complete duplicates (when the product searched is the same with different added terms, doesn't happen a lot in this list).
- /searchpage-searchterm-addedterm-number.htm
Those I find surprising, I don't know if they are pages generated with a fixed url, or if they are rewritten (Haven't looked at the htaccess yet, but I will, god I have a headache just thinking about reading that thing lol)
There's about a thousand of them all (from GGanalytics, about half of each sort, and nearly all are indexed by Google), on a website with about 12 thou total in pages.
Maybe the traffic loss will be compensated by the removed competition between those search pages and the product pages (and the rel=canonical is surely way less brutal than a noindex for that matter), but without experience in these kind of situations it's hard to make a decision...
Really appreciate you guys taking the time to help !
-
Alan's absolutely right about how canonical works, but I just want to clarify something - what about these pages is duplicated? In other words, are these regular searches (like product searches) with duplicate URLs, are these paginated searches (with page 2, 3, etc. that appear thin), or are these user-generated searches spinning out into new search pages (not exact duplicates but overlapping)? The solutions can vary a bit with the problem, and internal search is tricky.
-
Just one more point, a canonical is just a hint to the search engines, it is not a directive, so if they think that the pages should not be merged, they will ignore them, so in that way, they may make the decision for you
-
Not a lot of real duplicates, they're more alike, and the most visited are unique, so I'll keep the most important ones and just toss a few duplicates.
Thanks a lot for your help, problem solved !
-
no not like a noindex. more like a merge.
will it make you rank for many keywords? not necessarly, as a page all about blue widgets is going to rank higher then a page has many different subjects including blue widgets.
A canonical is really for duplicate content, or very alike content.
So you have to decide what your page is, is it duplicate or alike content, or is it unique?
if the pages are unique then do nothing, let them rank. if yopu think they are alike, then use a canonical. if there are only a few, then i would not worry either way.
if you decide they are unique, they I would look at making the page title unique also, maybe even description too.
-
Thanks for your answer
Ok you're saying indeed it will act like a noindex over time.
So if one of the result page would have ranked for a particular query, it will not rank any more, like with a noindex => it will lose the 13% of traffic it generated...
Otherwise it would be too easy to make a page rank for the keywords used in a bunch of other pages that refer to it via rel=canonical... wouldn't it ?
I'm starting to think I can't do anything... Maybe just noindex a bunch of them that cause duplicates, and leave the rest in the index.
-
Rel=canonical is tge way to go, it will tell the search results that all credit for all diffrent urls go to the original search page. eventual onl;y the original search page will exist in the index.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Dynamic Canonical Tag for Search Results Filtering Page
Hi everyone, I run a website in the travel industry where most users land on a location page (e.g. domain.com/product/location, before performing a search by selecting dates and times. This then takes them to a pre filtered dynamic search results page with options for their selected location on a separate URL (e.g. /book/results). The /book/results page can only be accessed on our website by performing a search, and URL's with search parameters from this page have never been indexed in the past. We work with some large partners who use our booking engine who have recently started linking to these pre filtered search results pages. This is not being done on a large scale and at present we only have a couple of hundred of these search results pages indexed. I could easily add a noindex or self-referencing canonical tag to the /book/results page to remove them, however it’s been suggested that adding a dynamic canonical tag to our pre filtered results pages pointing to the location page (based on the location information in the query string) could be beneficial for the SEO of our location pages. This makes sense as the partner websites that link to our /book/results page are very high authority and any way that this could be passed to our location pages (which are our most important in terms of rankings) sounds good, however I have a couple of concerns. • Is using a dynamic canonical tag in this way considered spammy / manipulative? • Whilst all the content that appears on the pre filtered /book/results page is present on the static location page where the search initiates and which the canonical tag would point to, it is presented differently and there is a lot more content on the static location page that isn’t present on the /book/results page. Is this likely to see the canonical tag being ignored / link equity not being passed as hoped, and are there greater risks to this that I should be worried about? I can’t find many examples of other sites where this has been implemented but the closest would probably be booking.com. https://www.booking.com/searchresults.it.html?label=gen173nr-1FCAEoggI46AdIM1gEaFCIAQGYARS4ARfIAQzYAQHoAQH4AQuIAgGoAgO4ArajrpcGwAIB0gIkYmUxYjNlZWMtYWQzMi00NWJmLTk5NTItNzY1MzljZTVhOTk02AIG4AIB&sid=d4030ebf4f04bb7ddcb2b04d1bade521&dest_id=-2601889&dest_type=city& Canonical points to https://www.booking.com/city/gb/london.it.html In our scenario however there is a greater difference between the content on both pages (and booking.com have a load of search results pages indexed which is not what we’re looking for) Would be great to get any feedback on this before I rule it out. Thanks!
Technical SEO | | GAnalytics1 -
Discrepancy in actual indexed pages vs search console
Hi support, I checked my search console. It said that 8344 pages from www.printcious.com/au/sitemap.xml are indexed by google. however, if i search for site:www.printcious.com/au it only returned me 79 results. See http://imgur.com/a/FUOY2 https://www.google.com/search?num=100&safe=off&biw=1366&bih=638&q=site%3Awww.printcious.com%2Fau&oq=site%3Awww.printcious.com%2Fau&gs_l=serp.3...109843.110225.0.110430.4.4.0.0.0.0.102.275.1j2.3.0....0...1c.1.64.serp..1.0.0.htlbSGrS8p8 Could you please advise why there is discrepancy? Thanks.
Technical SEO | | Printcious0 -
2 sitemaps on my robots.txt?
Hi, I thought that I just could link one sitemap from my site's robots.txt but... I may be wrong. So, I need to confirm if this kind of implementation is right or wrong: robots.txt for Magento Community and Enterprise ...
Technical SEO | | Webicultors
Sitemap: http://www.mysite.es/media/sitemap/es.xml
Sitemap: http://www.mysite.pt/media/sitemap/pt.xml Thanks in advance,0 -
Ok to internally link to pages with NOINDEX?
I manage a directory site with hundreds of thousands of indexed pages. I want to remove a significant number of these pages from the index using NOINDEX and have 2 questions about this: 1. Is NOINDEX the most effective way to remove large numbers of pages from Google's index? 2. The IA of our site means that we will have thousands of internal links pointing to these noindexed pages if we make this change. Is it a problem to link to pages with a noindex directive on them? Thanks in advance for all responses.
Technical SEO | | OMGPyrmont0 -
Are robots.txt wildcards still valid? If so, what is the proper syntax for setting this up?
I've got several URL's that I need to disallow in my robots.txt file. For example, I've got several documents that I don't want indexed and filters that are getting flagged as duplicate content. Rather than typing in thousands of URL's I was hoping that wildcards were still valid.
Technical SEO | | mkhGT0 -
Is there any value in having a blank robots.txt file?
I've read an audit where the writer recommended creating and uploading a blank robots.txt file, there was no current file in place. Is there any merit in having a blank robots.txt file? What is the minimum you would include in a basic robots.txt file?
Technical SEO | | NicDale0 -
Internal vs external blog and best way to set up
I have a client that has two domians registered - one uses www.keywordaustralia.com the other uses www.keywordaelaide.com He had already bought and used the first domain when he came to me I suggested the second as being worth buying as going for a more local keyword would be more appropriate. Now I have suggested to him that a blog would be a worthy use of the second domain and a way to build links to his site - however I am reading that as all links will be from the same site it wont be worth much in the long run and an internal blog is better as it means updated content on his site. should i use the second domain for blog, or just 301 the second domain to his first domain. Or is it viable to use the second domain as the blog and just set up an rss feed on his page ? Is there a way to have the second domain somehow 'linked' to his first domain with the blog so that google sees them as connected ? NOOBIE o_0
Technical SEO | | mamacassi0 -
Robots.txt file getting a 500 error - is this a problem?
Hello all! While doing some routine health checks on a few of our client sites, I spotted that a new client of ours - who's website was not designed built by us - is returning a 500 internal server error when I try to look at the robots.txt file. As we don't host / maintain their site, I would have to go through their head office to get this changed, which isn't a problem but I just wanted to check whether this error will actually be having a negative effect on their site / whether there's a benefit to getting this changed? Thanks in advance!
Technical SEO | | themegroup0