SEO Myth-Busters -- Isn't there a "duplicate content" penalty by another name here?
-
Where are the guy with the mustache in the funny hat and the geek when you truly need them?
So SEL (Search Engine Land) recently said there's no such thing as a "duplicate content" penalty.
http://searchengineland.com/myth-duplicate-content-penalty-259657
By the way, I'd love to get Rand, Eric, or other Mozzers (aka TAGFEE'ers) to weigh in here on this if possible.
The reason for this question is to double-check a possible "duplicate content"-type penalty (possibly by another name?) that might accrue in the following situation.
1 - Assume a domain has a 30 Domain Authority (per OSE)
2 - The site on the current domain has about 100 pages, all hand-coded. It does very well in SEO because we designed it to. The site is about 6 years old in its current incarnation, with a very simple, basically hand-coded e-commerce cart. I won't name the site for obvious reasons.
3 - Business is good. We're upgrading to a new CMS (hooray!). In doing so, we are implementing categories and faceted search, with plans to try to keep the site to under 100 new "pages" using a combination of rel=canonical and noindex. I also won't name the CMS for obvious reasons.
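For anyone weighing in, the rel=canonical / noindex combination we're planning would look roughly like this on a faceted URL (paths and parameters here are hypothetical, just to illustrate):

```html
<!-- On a faceted variant like /widgets?color=blue&sort=price -->
<head>
  <!-- Point ranking signals at the clean category URL -->
  <link rel="canonical" href="https://www.example.com/widgets" />
  <!-- Or, for variants that should never be indexed at all: -->
  <meta name="robots" content="noindex, follow" />
</head>
```

Worth noting: Google treats rel=canonical as a hint, not a directive, and Googlers have cautioned against combining noindex with a canonical on the same page (one says "drop this page," the other says "consolidate its signals"), so presumably we'd pick one treatment per page type.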
In simple terms: as the site is built out and launched in the next 60-90 days, and assuming we have 500 products and 100 categories, that yields at least 50,000 pages - and with other aspects of the faceted search, it could easily create 10X that many.
4 - In Screaming Frog tests of the DEV site, it is quite evident that there are many tens of thousands of unique URLs that are a textbook illustration of a duplicate content nightmare. Screaming Frog has even crashed while spidering, and we've discovered thousands of similar URLs on live sites using the same CMS.
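A quick back-of-envelope sketch of why the numbers get scary so fast (the facet names and value counts below are made up for illustration, not our actual setup):

```python
# Rough sketch of how faceted URLs multiply. Facet names and value
# counts are hypothetical.
from math import prod

products = 500
categories = 100

# Worst case, every product is crawlable under every category:
category_product_pages = products * categories  # 50,000 URLs

# Faceted search then multiplies each category page by its filter
# combinations. Treat each facet as "unset" or one of its values:
facets = {"color": 8, "size": 6, "brand": 20, "price_band": 5}
variants_per_category = prod(n + 1 for n in facets.values())

print(category_product_pages)        # 50000
print(variants_per_category)         # 9 * 7 * 21 * 6 = 7938
print(categories * variants_per_category)  # 793800 category-facet URLs
```

That's before pagination and sort orders, which multiply everything again - which is how "at least 50,000" becomes "easily 10X that many."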
There is no question that spiders are somehow triggering some sort of infinite page generation - and we can see that both on our DEV site as well as out in the wild (in Google's Supplemental Index).
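One mitigation we're considering for the infinite-crawl problem is simply disallowing the filter parameters in robots.txt - a sketch only, since the actual parameter names depend on what the CMS emits (these are hypothetical):

```text
# robots.txt - keep crawlers out of faceted/filter URL space
# (parameter names below are hypothetical)
User-agent: *
Disallow: /*?*color=
Disallow: /*?*size=
Disallow: /*?*sort=
Disallow: /*?*price=
```

The usual caveat applies: robots.txt stops crawling, not indexing - URLs Google already knows about can stay indexed (URL-only), and a noindex tag can't be seen on a page the bot is blocked from fetching. So this trades off against the canonical/noindex plan rather than stacking with it.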
5 - Since there is no "duplicate content penalty" - and there never was - are there other risks here caused by infinite page generation? Like burning up a theoretical "crawl budget," having the bots miss pages, or other negative consequences?
6 - Is it also possible that bumping a site that ranks well with 100 pages up to 10,000 pages or more might incur a link-juice penalty as a result of all this (honest but inadvertent) duplicate content? In other words, is inbound link juice and ranking power essentially divided by the number of pages on a site? Sure, it may be somewhat mediated by internal linking, but what are the actual big-dog issues here?
So has SEL's "duplicate content myth" truly been myth-busted in this particular situation?
???
Thanks a million!