Is Google able to determine duplicate content every day/month?
-
A while ago I talked to somebody who worked in MSN's engineering department a couple of years ago. We talked about a recent dip we had with one of our sites. We argued this could be caused by the large amount of duplicate content we have on this particular website (+80% of our site).
Then he said, and I quote: "Google seems only to be able to determine every couple of months instead of every day if the content is actually duplicate content". I don't doubt that duplicate content is a ranking factor, but I would like to hear your opinions on whether Google can only determine this every X months instead of every day.
Have you seen or heard something similar?
-
Sorting out Google's timelines is tricky these days, because they aren't the same for every process and every site. In the early days, the "Google dance" happened about once a month, and that was the whole mess (index, algo updates, etc.). Over time, index updates have gotten a lot faster, and ranking and indexation are more real-time (especially since the "Caffeine" update), but that varies wildly across sites and pages.
I think you also have to separate a couple of impacts of duplicate content. When it comes to filtering (Google excluding a piece of duplicate content from rankings, but not necessarily penalizing the site), I don't see any evidence that this takes a couple of months. It can take Google days or weeks to re-cache any given page, and to detect a duplicate they would have to re-cache both copies, so that may take a month in some cases, realistically. I strongly suspect, though, that the filter itself happens in real-time. There's no good way to store a filter for every scenario, and some filters are query-specific. Computationally, some filters almost have to happen on the fly.
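To make the "on the fly" idea concrete, here is a minimal sketch of how a near-duplicate filter could work over a handful of results at query time. This is not Google's actual method, which isn't public. It swaps in a common textbook technique (word shingles compared with Jaccard similarity), and the function names, the 3-word shingle size, and the 0.8 threshold are all assumptions chosen for illustration.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles in text (lowercased, punctuation dropped)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Too short to shingle: treat the whole text as one shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def filter_duplicates(pages, threshold=0.8, k=3):
    """Keep the first page of each near-duplicate group.

    pages is a list of (url, text) pairs, assumed to be in ranking order.
    The threshold and k values are illustrative assumptions, not known settings.
    """
    kept = []  # (url, shingle set) for every page that survived the filter
    for url, text in pages:
        page_shingles = shingles(text, k)
        if all(jaccard(page_shingles, other) < threshold for _, other in kept):
            kept.append((url, page_shingles))
    return [url for url, _ in kept]
```

Because the comparison only needs the text of the pages in the current result set, nothing has to be precomputed per query, which is the property the paragraph above is getting at. The stored copies of the pages still have to be fresh, though, which is where the re-caching delay comes in.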
On the other hand, you have updates like Panda, where duplicate content can cause something close to a penalty. Panda data was originally updated outside of the main algorithm, to the best of our knowledge, and probably about once/month. Over the more than a year since Panda 1.0 rolled out, though, it seems that this timeline accelerated. I don't think it's real-time, but it may be closer to 2 weeks (that's speculation, I admit).
So, the short answer is "It's complicated". I don't have any evidence to suggest that filtering duplicates takes Google months (and, actually, have anecdotal evidence that it can happen much faster). It is possible that it could take weeks or months to see the impact of duplicates on some sites and in some situations, though.
-
Hi Donnie,
Thanks for your reply, but I was already aware that Google had (or has) a sandbox. I should have mentioned this in my question. What I'm really looking for is an answer about whether, and on what basis, Google is able to determine that pages are duplicates.
I ask because I saw dozens of cases where our content was indexed, whether or not we linked back to the 'original' source.
To be clear, in all of these cases the duplicate content was published in agreement with the original sources.
-
In the past Google had a sandbox period before any page (content) would rank. However, now everything is instant (just learned this today @seomoz).
If you release something, Google will index it as fast as possible. If that info gets duplicated Google will only count the first one indexed. Everyone else loses brownie points unless they trackback/link back to the main article (first indexed).