Am I doing enough to get rid of duplicate content?
-
I'm in the middle of a massive cleanup of old duplicate content on my site, and I'm trying to make sure I'm doing enough.
My main concern now is a large group of landing pages. For example:
http://www.boxerproperty.com/lease-office-space/office-space/dallas
http://www.boxerproperty.com/lease-office-space/executive-suites/dallas
http://www.boxerproperty.com/lease-office-space/medical-space/dallas
And these are just the tip of the iceberg. For now, I've put canonical tags on each sub-page pointing to the main market page (the second two both point to the first, http://www.boxerproperty.com/lease-office-space/office-space/dallas, for example). However, this situation exists in many other cities as well, and each city has a main page like the first one above. For instance:
http://www.boxerproperty.com/lease-office-space/office-space/atlanta
http://www.boxerproperty.com/lease-office-space/office-space/chicago
http://www.boxerproperty.com/lease-office-space/office-space/houston
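For reference, the canonical tag I've added to each sub-page looks something like this (shown here for the Dallas executive-suites page):

    <!-- In the <head> of /lease-office-space/executive-suites/dallas -->
    <link rel="canonical" href="http://www.boxerproperty.com/lease-office-space/office-space/dallas" />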
Obviously the previous SEO was pretty heavy-handed with all of these, but my question for now is: should I even bother with canonical tags pointing all of the sub-pages to the main pages (medical-space or executive-suites to office-space), or is the presence of all these pages problematic in itself? In other words, should http://www.boxerproperty.com/lease-office-space/office-space/chicago, http://www.boxerproperty.com/lease-office-space/office-space/houston, and all the others have canonical tags pointing to just one page, or should a lot of these simply be deleted?
I'm continually finding more sub-pages built on the same template, so I'm just not sure of the best way to handle all of them. Looking back in Analytics, many of these did drive significant organic traffic in the past, so I'm going to have a tough time justifying deleting a lot of them.
Any advice?
-
Heather,
I'm confused as to what the duplicate content is. The three Dallas pages you mentioned have different content. Sure, there's a decent amount that's the same from the site-wide content (nav menus, etc.), but each has different text and information about the different locations that are available. How is it duplicate?
Kurt Steinbrueck
OurChurch.Com
-
Heather,
First things first:
1. Are they still driving traffic?
2. Rel=canonical tags are supposed to be used on identical pages, or on pages whose content is a subset of the canonical version.
Those pages are very thin on content, and I certainly wouldn't leave them as they are. If they're still driving traffic, I'd keep them, but for fear of Panda, I'd 302 them to the main pages while steadily putting real content on them, and then remove the redirects as the content goes live.
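For example, assuming the site runs on Apache (an assumption on my part; translate for whatever server you're actually on), the temporary redirect could go in .htaccess like this:

    # Temporary (302) redirect while real content is being written -- remove once the page is rebuilt
    Redirect 302 /lease-office-space/executive-suites/dallas /lease-office-space/office-space/dallas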
If they're not still driving traffic, it seems to me that it wouldn't be very hard to justify their removal (or 301 redirection to their main pages). Panda is a tough penalty, and you don't want to get caught in it.
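And the permanent version of the same rule, for pages you decide to retire entirely:

    # Permanent (301) redirect for retired pages -- consolidates the old page's signals on the target
    Redirect 301 /lease-office-space/medical-space/dallas /lease-office-space/office-space/dallas

The status code matters here: a 302 tells search engines the move is temporary, which fits the fix-and-restore plan above, while a 301 signals a permanent move and passes the old page's link equity to the target.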