Duplicate Content
-
I'm working on a site that sells appliances. There are currently thousands of "issues" with this site, many of them involving duplicate content. The product pages can be viewed in "List" or "Grid" format. As lists, they have very little in the way of content.
My understanding is that the duplicate content arises from different URLs serving the same page. For instance, the site might use a different URL when told to display 9 items than when told to display 15. This could then be solved by adding rel="canonical".
Is there a way to take a site and get a list of all possible duplicates? This would be much easier than slogging through every iteration of the options and copying down the URLs. Also, is there anything I might be missing in terms of why there is duplicate content? Thank you.
-
Essentially, you need to figure out the primary causes of duplicate content and then pick a way to handle each one. A great place to find your duplicate content is in Google Webmaster Tools under the HTML Improvements section. The "Duplicate Title Tags" report there lists pages that share a title, and those pages may well have duplicate content.
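If you would rather run the same kind of duplicate-title check on your own crawl data, here is a minimal Python sketch. It assumes you already have a mapping of URL to page title from some crawl export; the `pages` dictionary, its sample titles, and the function name are hypothetical, not a Google Webmaster Tools format.

```python
from collections import defaultdict


def group_duplicate_titles(pages):
    """Return {normalized title: [urls]} for titles shared by more than one URL.

    `pages` is a hypothetical mapping of URL -> <title> text.
    """
    by_title = defaultdict(list)
    for url, title in pages.items():
        # Collapse whitespace and lowercase so trivial differences
        # do not hide a duplicate.
        key = " ".join(title.split()).lower()
        by_title[key].append(url)
    # Keep only titles that appear on two or more URLs.
    return {t: sorted(urls) for t, urls in by_title.items() if len(urls) > 1}


# Made-up sample data for illustration only.
pages = {
    "yoursite.com/category-results": "Refrigerators | Your Site",
    "yoursite.com/category-results?items=15": "Refrigerators | Your Site",
    "yoursite.com/about": "About Us | Your Site",
}
duplicates = group_duplicate_titles(pages)
```

Each key in `duplicates` is a title, and the value is the list of URLs competing for it, which gives you a starting list of candidate duplicates to work through.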
The primary ways to take care of it will be:
- NoIndexing
- Canonicalizing
- Parameter Handling in Google Webmaster Tools
Choosing which technique you use will likely be a result of what you are technically able to implement, based on each unique challenge from the different causes of duplicate content. You likely won't be able to kill all of the duplicate content at once. I suggest handling it in chunks. For example, first tackle the Items Shown problem you reference in your question. As you mentioned, you could canonicalize it. Basically, whenever the URL reflects your Item Parameter, you could canonicalize it back to the representative URL.
e.g.: yoursite.com/category-results?items=15 --> would canonicalize to yoursite.com/category-results
Once you have the Number of Item pages out of the index, focus on the next biggest cause of duplicate content.
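As a rough illustration of canonicalizing back to the representative URL, the Python sketch below strips display-only query parameters and builds the matching tag. The parameter names `items` and `view` are assumptions for illustration (your site may name them differently), and the helper functions are hypothetical.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical display-only parameters; replace with the names your site uses.
DISPLAY_PARAMS = {"items", "view"}


def canonical_url(url):
    """Drop display-only query parameters and return the representative URL."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in DISPLAY_PARAMS
    ]
    # Rebuild the URL without the fragment and without the dropped parameters.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url):
    """Build the <link rel="canonical"> tag for the page's <head>."""
    return f'<link rel="canonical" href="{escape(canonical_url(url), quote=True)}">'
```

Parameters that actually change the content (a brand filter, for instance) are left in place, so only the display variants collapse onto one URL.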
-
Have you created a Moz campaign for the site? Mozbot crawls your site and tells you about all the duplicate content issues you may have.
To solve them, instead of checking or changing code all over the place, make the changes on the pages you already know have duplicate content issues (like in the example you gave), then let Mozbot re-crawl the site so you can see which pages still have issues to solve.
The rel canonical should point to the one page that has the most info (as you said, the list has less, so the grid is the better canonical target).
If your site uses several categories and subcategories, you should also look at the noindex tag, since that structure sometimes creates duplicate content too (subcategory products also listed in the root category). The same applies to any kind of listing, such as search results (which should be noindexed).
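A small Python sketch of that noindex rule, assuming your listing pages can be recognized by their path. The `/search` prefix and the function name are hypothetical; adjust them to your own URL structure.

```python
from urllib.parse import urlsplit

# Hypothetical path prefixes for listing pages that should stay out of the index.
NOINDEX_PREFIXES = ("/search",)


def robots_meta(url):
    """Return a robots meta tag for listing pages, or an empty string otherwise."""
    path = urlsplit(url).path
    if path.startswith(NOINDEX_PREFIXES):
        return '<meta name="robots" content="noindex">'
    return ""
```

The returned tag would go in the page's head section; pages that do not match get no tag and stay indexable.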
Hope this helps!