Similar pages: noindex or rel:canonical or disregard parameters?!
-
Hey all!
We have a hotel booking website that has search results pages per destinations (e.g. hotels in NYC is dayguest.com/nyc). Pages are also generated for destinations depending on various parameters, that can be star rating, amenities, style of the properties, etc. (e.g. dayguest.com/nyc/4stars, dayguest.com/nyc/luggagestorage, dayguest.com/nyc/luxury, etc.).
In general, all of these pages are very similar, as for example, there might be 10 hotels in NYC and all of them will offer luggage storage. Pages can be nearly identical. Come the problems of duplicate content and loss of juice by dilution.
I was wondering what was the best practice in such a situation: should I just put all pages except the most important ones (e.g. dayguest.com/nyc) as noindex? Or set it as canonical page for all variations? Or in google webmaster tool ask google to disregard the URLs for various parameters? Or do something else altogether?!
Thanks for the help!
-
Sorry, I don't think I explained (1) very well. What I mean is that you may want to gradually change the site architecture so that not all of the search options are crawlable pages. This could mean putting some filters in form variables, for example (instead of links). It could also mean making sure that certain paths always converge. There's no easy solution. This is a problem all big sites face, and it's very dependent on the platform/CMS.
With (2), a "level" could be anything. Maybe there are major cities you need to cover but everything else could stay out of the index. This really depends on your information architecture, but there's always something that's high priority and something that's low priority. If you can focus Google on the high-priority pages, it can definitely work in your favor. The trick is figuring out how to build the logic such that you can code that dynamically. I've found there's almost always an answer, but it can take some creative thinking. I definitely don't encourage doing it manually.
If the results are easy to group by city and you can code that logic, the canonical may be fine. Since the search results could be different in some cases, canonical isn't technically the best choice, but it does often work. It really depends on how different they can be, so it's a bit tricky.
-
Honestly, option 1 would be a nightmare. Imagine that we add one property in a city not covered. There are about 50 amenities, and most hotels feature most, so as much new pages generated. That would become quickly unmanageable, to handle manually.
Not sure I understand your second option. There are not several "level", only one under the "city" in which the property is. But mutliplied by several cities, they quickly become hundreds, if not thousands.
Why would it not be possible/desirable to code all such pages as canonical pages of each city?
-
Ugh - that's what I was afraid you'd say. Unfortunately, the coincidental problem can't really be easily solved with code, which makes it hard to use canonical tags. There's no good way to tell the site when to use them.
So, a couple of options:
(1) Try to gradually rework the structure so that there are less of these paths.
(2) Consider using META NOINDEX on some lower-value paths. Internal search results don't have great value for Google, so you could let the major categories/options be indexed, but the cut off a certain level (index nothing "below" it). That may be more feasible from a code standpoint.
(3) Use rel=prev/next, use unique TITLEs if possible (based on the query) and just clean things up the best you can, but leave everything indexed.
It depends a lot on your scope, structure, and your future plans. I'm not sure there's one "right" answer.
-
Ugh - that's what I was afraid you'd say. Unfortunately, the coincidental problem can't really be easily solved with code, which makes it hard to use canonical tags. There's no good way to tell the site when to use them.
So, a couple of options:
(1) Try to gradually rework the structure so that there are less of these paths.
(2) Consider using META NOINDEX on some lower-value paths. Internal search results don't have great value for Google, so you could let the major categories/options be indexed, but the cut off a certain level (index nothing "below" it). That may be more feasible from a code standpoint.
(3) Use rel=prev/next, use unique TITLEs if possible (based on the query) and just clean things up the best you can, but leave everything indexed.
It depends a lot on your scope, structure, and your future plans. I'm not sure there's one "right" answer.
-
These pages return the same results coincidentally, that's the issue... The more properties we get on board, the less likely it is that these pages will be similar. But it might take a long time to build that up, and we may never achieve it.
-
Ah, got it - yeah, I think rel=canonical would be fine there, but I'd want to understand your architecture better. Are these pages returning the same results coincidentally, or are these two URLs that basically land on the same combination of search options/filters. If it's the former, it's a lot tougher, because that's just a coincidence happening at large scale. If it's the latter, a solid canonical scheme could help a lot, but I'd also explore whether these paths are useful (or should be indexed at all). In other words, in the long term, it might be better to use one URL consistently, even if people navigate by different paths to reach it.
-
That's odd, they were supposed to be the same. And yeah, results come and go as properties are added/removed from our inventory.
The following is what I wanted to highlight:
http://www.dayguest.com/rome-dayuse/concierge
http://www.dayguest.com/rome-dayuse/air-conditioning
As you can see, the pages are identical, except that one has 5 properties and the other one has 6. Most overlap. There are so manies property "features" or "category", that some list have exactly the same list. Actually, SEOMOZ find that I have over 1700 pages with duplicate content, most being search results page with closely similar contents such as these.
Hence my issue...
-
Are they duplicates in the sense that there are currently no results? I wouldn't generally use rel=canonical on these, because the search results should (theoretically) be different. These are distinct regions and, I assume, have unique properties.
If they're just returning no results, I'd actually consider a META NOINDEX until there are results available. Otherwise, this is likely to be treated as a soft 404 by Google (not a disaster, honestly). It depends on whether results come and go or if you're just building out the site and there will be data later. If the data isn't ready, I think META NOINDEX is a good way to go. Until results are available, these pages have no search value.
-
Well, let me give you an example, look at this page: http://www.dayguest.com/milan-city-centre-dayuse?amenities=10
And this page: http://www.dayguest.com/milan-central-station-dayuse?amenities=10
Do you see what I'm talking about? The pages are identical but for the page title/description & a few words on the page.
So, you'd go for canonical?
-
The relation is more hierarchal then next/previous. Judging from the post you mentioned, canonical would be more appropriate...
-
Sorry, I'm not clear on whether these are paginated search results or actual property pages that vary only by a small amount. As @SEO5 said, if these are paginated search results, you could use rel=prev/next. It's a bit tricky to set up with search filters (you need rel=prev/next + rel=canonical).
If these are nearly identical property pages, then it depends on how they differ. If they only differ by one attribute, I'd probably lean toward the canonical tag.
-
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
[Organization schema] Which Facebook page should be put in "sameAs" if our organization has separate Facebook pages for different countries?
We operate in several countries and have this kind of domain structure:
Technical SEO | | Telsenome
example.com/us
example.com/gb
example.com/au For our schemas we've planned to add an Organization schema on our top domain, and let all pages point to it. This introduces a problem and that is that we have a separate Facebook page for every country. Should we put one Facebook page in the "sameAs" array? Or all of our Facebook pages? Or should we skip it altogether? Only one Facebook page:
{
"@type": "Organization",
"@id": "https://example.com/org/#organization",
"name": "Org name",
"url": "https://example.com/org/",
"sameAs": [
"https://www.linkedin.com/company/xxx",
"https://www.facebook.com/xxx_us"
], All Facebook pages:
{
"@type": "Organization",
"@id": "https://example.com/org/#organization",
"name": "Org name",
"url": "https://example.com/org/",
"sameAs": [
"https://www.linkedin.com/company/xxx",
"https://www.facebook.com/xxx_us"
"https://www.facebook.com/xxx_gb"
"https://www.facebook.com/xxx_au"
], Bonus question: This reasoning springs from the thought that we only should have one Organization schema? Or can we have a multiple sub organizations?0 -
how to set rel canonical on wordpress.com sites
I know how to do this with a wordpress.org site but I have a client that does not want to switch and without a plugin I am lost. any help would be greatly appreciated. Jeremy Wood
Technical SEO | | SOtBOrlando0 -
After I 301 redirect duplicate pages to my rel=canonical page, do I need to add any tags or code to the non canonical pages?
I have many duplicate pages. Some pages have 2-3 duplicates. Most of which have Uppercase and Lowercase paths (generated by Microsoft IIS). Does this implementation of 301 and rel=canonical suffice? Or is there more I could do to optimize the passing of duplicate page link juice to the canonical. THANK YOU!
Technical SEO | | PFTools0 -
Rel=Canonical Help
The site in question is www.example.com/example. The client has added a rel=canonical tag to this page as . In other words, instead of putting the tag on the pages that are not to be canonical and pointing them to this one, they are doing it backwards and putting the same URL as the canonical one as the page they are putting the tag on. They have done this with thousands of pages. I know this is incorrect, but my question is, until the issue is resolved, are these tags hurting them at all just being there?
Technical SEO | | rock220 -
Using Rel Nofollow on Duplicate Pages
Hi there, I have a rather large site that has duplicate content on many pages due to how it's being spidered by google. I was hoping I could set the internal link to this page as "nofollow." My question is that I have hundreds of other sites with backlinks to these duplicate content pages.. will this affect me negatively if I tell google not to index the duplicated pages?
Technical SEO | | trialminecraftserverfinder0 -
Categories and rel canonical
Hello everyone, I have a doubt in how to approach this problem. I have a local business website, and i want to rank this website for our main KW. So the idea is to rank the main Keyword to the home page www.sitename.com At the same time we blog every week and one of the categories is the same has the main Keyword. It makes sense because the majority of the blog posts are about it. In a certain way the homepage and the category page are competing for the same keyword. How can i approach this problem? Should i use rel canonical in the category page, pointing to the homepage? Thanks for your help.
Technical SEO | | Barbio0 -
Hreflang on non-canonical pages
Hi! I've been trying to figure out what is the best way to solve this dilemma with duplicate content and multiple languages across domains. 1 product info page 2 same product but GREEN
Technical SEO | | LarsEriksson
3 same product but RED
4 same product but YELLOW **Question: ** Since pages 2,3,4 just varies slightly I use the canonical tag to indicate they are duplicates of page 1. Now I also want to indicate there are other language versions with the_ rel="alternate" hreflang="x" _element. Should I place the _rel="alternate" hreflang="x" _on the canonical page only pointing to the canonical page with "x" language. Should I place the _rel="alternate" hreflang="x" _on all pages pointing to the canonical page with the "x" language? Should I place the _rel="alternate" hreflang="x" _on all pages and then point it to the translated page (even if it is not a canonical page) ? /Lars0 -
On-Page Report Card & Rel Canonical
Hello, I ran one of our pages through the On-Page Report Card. Among the results we are getting a lower grade due to the following "critical factor" : Appropriate Use of Rel Canonical Explanation If the canonical tag is pointing to a different URL, engines will not count this page as the reference resource and thus, it won't have an opportunity to rank. Make sure you're targeting the right page (if this isn't it, you can reset the target above) and then change the canonical tag to reference that URL. Recommendation We check to make sure that IF you use canonical URL tags, it points to the right page. If the canonical tag points to a different URL, engines will not count this page as the reference resource and thus, it won't have an opportunity to rank. If you've not made this page the rel=canonical target, change the reference to this URL. NOTE: For pages not employing canonical URL tags, this factor does not apply. This is for an e-commerce site, and the canonical links are inserted automatically by the cart software. The cart is also creating the canonical url as a relative link, not an absolute URL. In this particular case it's a self-referential link. I've read a ton on this and it seems that this should be okay (I also read that Bing might have an issue with this). Is this really an issue? If so, what is the best practice to pass this critical factor? Thanks, Paul
Technical SEO | | rwilson-seo0