Duplication, pagination and the canonical
-
Hi all, and thank you in advance for your assistance.
We have an issue of paginated pages being seen as duplicates by pro.moz crawlers.
The paginated pages do have duplicated content, but they are not duplicates of each other. Rather, they pull through a summary of the product descriptions from other landing pages on the site.
I was planning to use rel=canonical to deal with them. However, I am concerned because the paginated pages are not identical to each other, yet each features its own set of duplicate content!
We have a similar issue with pages that are not paginated but feature tabs that alter the URL parameters like so:
?st=BlueWidgets
?st=RedSocks
?st=Offers
These are being seen as duplicates of the main URL, and again all feature duplicate content pulled from elsewhere in the site, but are not duplicates of each other. Would a canonical tag be suitable here?
Many Thanks
-
The rel next prev is not for duplicated content. It just shows Google how the parts relate to the whole.
An alternative to rel next prev is the "Classic Pagination for SEO" approach, which uses noindex. It is covered in another article by Adam:
http://searchengineland.com/the-latest-greatest-on-seo-pagination-114284
If you have a duplicate issue, this would solve it as you would noindex all the duplicate pages.
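As a rough sketch of the noindex approach (not taken from Adam's article), the template for a paginated listing could choose its robots meta tag like this. The function name and the rule that only page 1 stays indexable are assumptions for illustration:

```python
def robots_meta(page_number: int) -> str:
    """Return a robots meta tag for a paginated listing page.

    Assumption for illustration: page 1 is the only page we want indexed.
    """
    if page_number < 1:
        raise ValueError("page_number must be 1 or greater")
    # "follow" keeps the links on deeper pages crawlable even though
    # the pages themselves are kept out of the index.
    content = "index, follow" if page_number == 1 else "noindex, follow"
    return f'<meta name="robots" content="{content}">'


print(robots_meta(1))
print(robots_meta(3))
```

The tag goes in the head of each listing page, so pages 2 and up drop out of the index while their product links can still be followed.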
What you need to do (and I can't do this for you) is look at all the crawl paths that you are providing to Google. As I mentioned above, you are not doing any favors to Google or to your site when you show Google an infinite number of paths to the same content. It just wastes Google's time, and you don't want to do that when Google also has to crawl the rest of the internet. If you solve this issue, you will solve your duplicate issue.
AJ Kohn just posted an article on the concept of crawl budget that talks about this. I think the article is quite good, and it explains why we need to look at noindex, nofollow, robots, canonical and rel next prev together: http://www.blindfiveyearold.com/crawl-optimization
-
Thanks CleverPhD,
That's a very interesting read by Adam Audette too, thanks.
I should say that there's no internal search, each tab has a series of duplicated 'blurbs' taken from the product's unique landing page, while the body copy remains the same across the slight variations in the URL. So with:
example.com/example/?st=BlueWidgets
example.com/example/?st=RedSocks
all of these will feature the same body copy, while the last two will have a series of small descriptions from other landing pages in the site. Would the canonical tag be appropriate in this case? We only need to index 'example.com/example'.
Also, does rel next prev take duplicate content into account? We want only the main URL indexed, as all the paginated pages feature duplicate content. There is no view-all page, however.
Many thanks
-
If I am understanding the question - I think pulling in some body copy from each search result (and not just the whole page) would be fine. I think Google will see that this is a search result and that you are pointing to other pages. You are probably going to pull in text from the title too. This is common practice in search results - heck Google does it!
If you are still concerned about the pulled-in descriptions, your option is to set up the system to hold an alternate description for each page. Use the alternate description when you pull it into your main page. It is more work, but it will eliminate this issue.
Separately, paginated pages no longer need to be canonicalized to the index page. You can use rel next and prev.
http://googlewebmastercentral.blogspot.com/2011/09/pagination-with-relnext-and-relprev.html
https://support.google.com/webmasters/answer/1663744?hl=en
It explains to Google the relationship between P1 and P2,3,4,5,n etc.
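Here is a minimal sketch of how those tags could be generated for each page in a series. The `?page=N` URL scheme and the function name are hypothetical, so adjust them to however your site builds paginated URLs:

```python
from html import escape


def pagination_links(base_url: str, page: int, total_pages: int) -> list[str]:
    """Build rel="prev"/rel="next" link tags for one page of a series.

    Hypothetical URL scheme: page 1 lives at base_url, and later pages
    at base_url + "?page=N".
    """
    if not 1 <= page <= total_pages:
        raise ValueError("page must be between 1 and total_pages")

    def url_for(n: int) -> str:
        return base_url if n == 1 else f"{base_url}?page={n}"

    links = []
    if page > 1:  # the first page has no rel="prev"
        links.append(f'<link rel="prev" href="{escape(url_for(page - 1))}">')
    if page < total_pages:  # the last page has no rel="next"
        links.append(f'<link rel="next" href="{escape(url_for(page + 1))}">')
    return links


for tag in pagination_links("http://example.com/example/", 2, 5):
    print(tag)
```

A middle page gets both tags, while the first and last pages get only one each, which is how the series is chained together.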
Beyond that, watch that you do not end up with too many sets of paginated pages that all lead to the exact same product pages. Let's say you had 1,000 widgets that were blue, red and green and were also Free, Expensive or Cheap. You would have several sets of paginated pages (one set for Blue, one each for Red, Green, Free, Cheap and Expensive, one for Red and Expensive, etc.). It gets a little crazy, as they all lead to the same set of widget product pages. You need to manage how Google crawls all of that so that your paginated category pages do not look like duplicates. Adam Audette writes great stuff on this. Look here for things to consider:
http://www.rimmkaufman.com/blog/site-search-dynamic-content-and-seo/01032013/
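To put rough numbers on the widget example, here is a quick count of how many paginated sets two filters produce. The page size of 20 is an assumption for illustration, as is the worst case where every filtered view still lists all 1,000 widgets:

```python
from itertools import product

# Facet values taken from the widget example; None means "no filter applied".
colors = [None, "Blue", "Red", "Green"]
prices = [None, "Free", "Cheap", "Expensive"]

# Assumed for illustration only: 20 products per page, and (as a worst
# case) every filtered view still lists all 1,000 widgets.
total_widgets = 1000
per_page = 20

listing_sets = list(product(colors, prices))
pages_per_set = -(-total_widgets // per_page)  # ceiling division

print(len(listing_sets))                  # number of paginated sets
print(len(listing_sets) * pages_per_set)  # upper bound on listing URLs
```

Two filters with three values each already give 16 separate sets, and under these assumptions up to 800 listing URLs, all pointing at the same 1,000 product pages.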
-
Thank you Robert, and for the helpful link.
You did read my question correctly; however, I failed to ask it entirely correctly. Just to complicate matters, I neglected to mention that there is body copy on each page, which technically will be duplicated.
It sits above the tabs and does not change, while the tabbed pages - under new URL parameters - pull in a sentence or two of product description from elsewhere (a unique landing page).
So,
?st=BlueWidgets
?st=RedSocks
?st=Offers
will all feature the same body copy and different duplicate content. For obvious reasons, we only want the search engines to index the main URL.
Any ideas?
Thanks again
-
Hi
It doesn't sound like rel=canonical is the solution, as each one of your pages might feature multiple pieces of content from various other parts of your website (if I've read your question correctly) - so which would be the canonical version of the page?
You could use Parameter Handling in Webmaster Tools to ensure Google knows what to do with your various parameters. Moz doesn't matter here, as long as search engines are aware of how to handle your pages correctly.
There's a good overview here.
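To make the parameter idea concrete, here is a small sketch of what treating `st` as a parameter to ignore amounts to. This is not the Webmaster Tools feature itself, just an illustration of the mapping, and the `IGNORED_PARAMS` set and `main_url` helper are made-up names:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Parameters assumed not to change what should be indexed. "st" comes
# from the question; extend the set for your own site.
IGNORED_PARAMS = {"st"}


def main_url(url: str) -> str:
    """Drop ignored query parameters so tab variants map to one URL."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k not in IGNORED_PARAMS]
    return urlunsplit(parts._replace(query=urlencode(kept)))


for variant in ("http://example.com/example/?st=BlueWidgets",
                "http://example.com/example/?st=RedSocks",
                "http://example.com/example/?st=Offers"):
    print(variant, "->", main_url(variant))
```

All three tab variants collapse to the main URL, which is the only one you said you want indexed.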
I hope that's helpful