Is Google able to determine duplicate content every day/month?
-
A while ago I talked to somebody who worked in MSN's engineering department a couple of years ago. We talked about a recent dip we had with one of our sites. We argued this could be caused by the large amount of duplicate content we have on this particular website (80%+ of the site).
Then he said, and I quote: "Google seems only to be able to determine every couple of months instead of every day if the content is actually duplicate content". I don't doubt that duplicate content is a ranking factor, but I would like to know your opinions on whether Google is only able to determine this every couple of months instead of every day.
Have you seen or heard something similar?
-
Sorting out Google's timelines is tricky these days, because they aren't the same for every process and every site. In the early days, the "Google dance" happened about once a month, and that was the whole mess (index, algo updates, etc.). Over time, index updates have gotten a lot faster, and ranking and indexation are more real-time (especially since the "Caffeine" update), but that varies wildly across sites and pages.
I think you also have to separate a couple of impacts of duplicate content. When it comes to filtering - Google excluding a piece of duplicate content from rankings (but not necessarily penalizing the site) - I don't see any evidence that this takes a couple of months. It can take Google days or weeks to re-cache any given page, and to detect a duplicate they would have to re-cache both copies, so that may realistically take a month in some cases. I strongly suspect, though, that the filter itself happens in real-time. There's no good way to store a filter for every scenario, and some filters are query-specific. Computationally, some filters almost have to happen on the fly.
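To make the "on the fly" filtering idea concrete, here is a toy sketch of near-duplicate detection using word shingles and Jaccard similarity - a standard textbook technique, not Google's actual (undisclosed) method. All names and thresholds here are illustrative assumptions:

```python
def shingles(text: str, k: int = 3) -> set:
    """Return the set of overlapping k-word shingles for a piece of text."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: |A intersect B| / |A union B|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Two pages differing by a single word
page_a = "the quick brown fox jumps over the lazy dog"
page_b = "the quick brown fox leaps over the lazy dog"

sim = jaccard(shingles(page_a), shingles(page_b))
print(f"similarity: {sim:.2f}")  # prints similarity: 0.40
```

Comparing precomputed fingerprints like these is cheap enough to run at query time, which is why a filter could plausibly be applied in real time even if re-crawling both copies takes weeks.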
On the other hand, you have updates like Panda, where duplicate content can cause something close to a penalty. Panda data was originally updated outside of the main algorithm, to the best of our knowledge, and probably about once a month. Over the year-plus since Panda 1.0 rolled out, though, that timeline seems to have accelerated. I don't think it's real-time, but it may be closer to two weeks (that's speculation, I admit).
So, the short answer is "It's complicated." I don't have any evidence to suggest that filtering duplicates takes Google months (and, actually, I have anecdotal evidence that it can happen much faster). It is possible that it could take weeks or months to see the impact of duplicates on some sites and in some situations, though.
-
Hi Donnie,
Thanks for your reply, but I was already aware of the fact that Google had/has a sandbox; I should have mentioned this in my question. I'm looking more for an answer about how, and on what basis, Google is able to determine whether pages are duplicates.
I've seen dozens of cases where our content was indexed, whether or not we linked back to the 'original' source.
Just to be sure, I also want to make clear that in all of these cases the duplicate content was published with the agreement of the original sources.
-
In the past, Google had a sandbox period before any page (content) would rank. However, now everything is instant. (I just learned this today @seomoz.)
If you release something, Google will index it as fast as possible. If that info gets duplicated, Google will only count the first copy indexed. Everyone else loses brownie points unless they link back to the main article (the first one indexed).
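The "first indexed copy wins" behavior described above can be sketched as keeping one canonical URL per content fingerprint. This is a deliberate simplification under that assumption - the function names and the use of an exact hash are hypothetical, since real systems handle near-duplicates too:

```python
import hashlib

# Maps a content fingerprint to the first URL seen with that content
canonical_by_hash: dict[str, str] = {}

def index_page(url: str, content: str) -> str:
    """Return the canonical URL credited for this content."""
    fingerprint = hashlib.sha256(content.encode()).hexdigest()
    # setdefault keeps the first URL registered for this fingerprint;
    # later duplicates are attributed to it rather than ranking on their own
    return canonical_by_hash.setdefault(fingerprint, url)

print(index_page("example.com/original", "same article text"))  # example.com/original
print(index_page("other-site.com/copy", "same article text"))   # example.com/original
```

Under this model, the second site gets no independent credit for the copied text; linking back to the original would be handled by separate attribution logic, not shown here.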