Is Google able to determine duplicate content every day/ month?
-
A while ago I talked to somebody who worked in the engineering department at MSN a couple of years ago. We talked about a recent dip on one of our sites. We argued this could be caused by the large amount of duplicate content we have on that particular website (over 80% of the site).
He then said, and I quote: "Google only seems to be able to determine every couple of months, instead of every day, whether content is actually duplicate content." I certainly don't doubt that duplicate content is a ranking factor, but I would like to hear your opinions on whether Google can really only determine this every couple of months instead of every day.
Have you seen or heard something similar?
-
Sorting out Google's timelines is tricky these days, because they aren't the same for every process and every site. In the early days, the "Google dance" happened about once a month, and that was the whole mess (index, algo updates, etc.). Over time, index updates have gotten a lot faster, and ranking and indexation are more real-time (especially since the "Caffeine" update), but that varies wildly across sites and pages.
I think you also have to separate a couple of different impacts of duplicate content. When it comes to filtering (Google excluding a piece of duplicate content from rankings, but not necessarily penalizing the site), I don't see any evidence that this takes a couple of months. It can take Google days or weeks to re-cache any given page, and to detect a duplicate they have to re-cache both copies, so that may realistically take a month in some cases. I strongly suspect, though, that the filter itself happens in real time. There's no good way to store a filter for every scenario, and some filters are query-specific. Computationally, some filters almost have to happen on the fly.
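To make the "computationally, on the fly" point concrete: near-duplicate detection is often described in the literature in terms of shingling, where documents are broken into overlapping word windows and compared by set overlap. This is a minimal illustrative sketch of that idea, not Google's actual algorithm; the function names and the 0.8 threshold are my own assumptions.

```python
def shingles(text, k=3):
    """Split text into word-level k-shingles (overlapping windows of k words)."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(max(len(words) - k + 1, 1))}

def jaccard(a, b):
    """Jaccard similarity between two shingle sets (0.0 = disjoint, 1.0 = identical)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def is_duplicate(text_a, text_b, threshold=0.8):
    """Treat two texts as duplicates if their shingle sets overlap heavily.

    The threshold is arbitrary here; a real system would tune it and use
    sketching tricks (e.g. MinHash) to avoid comparing full shingle sets.
    """
    return jaccard(shingles(text_a), shingles(text_b)) >= threshold
```

A comparison like this can run at query time against a small candidate set, which is why a filter does not need a months-long batch process the way a site-wide penalty recalculation might.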
On the other hand, you have updates like Panda, where duplicate content can cause something close to a penalty. Panda data was originally updated outside of the main algorithm, to the best of our knowledge, and probably about once a month. In the year-plus since Panda 1.0 rolled out, though, that timeline seems to have accelerated. I don't think it's real-time, but it may be closer to every two weeks (that's speculation, I admit).
So, the short answer is "it's complicated." I don't have any evidence to suggest that filtering duplicates takes Google months (and, actually, I have anecdotal evidence that it can happen much faster). It is possible that it could take weeks or months to see the impact of duplicates on some sites and in some situations, though.
-
Hi Donnie,
Thanks for your reply, but I was already aware that Google had/has a sandbox; I should have mentioned this in my question. I'm looking more for an answer about whether, and on what basis, Google is able to determine that pages are duplicates.
I've seen dozens of cases where our content was indexed, some where we linked back to the 'original' source and some where we didn't.
I also want to make clear that, in all of these cases, the duplicate content was published in agreement with the original sources.
-
In the past, Google had a sandbox period before any page (content) would rank. Now, however, everything is nearly instant. (I just learned this today @seomoz.)
If you release something, Google will index it as fast as possible. If that info gets duplicated, Google will only count the first copy indexed; everyone else loses brownie points unless they link back to the main (first-indexed) article.