Is Google able to detect duplicate content every day, or only every month?
-
A while ago I talked to somebody who worked in MSN's engineering department a couple of years ago. We talked about a recent dip we had with one of our sites. We argued this could be caused by the large amount of duplicate content we have on this particular website (over 80% of our site).
Then he said, and I quote: "Google seems only to be able to determine every couple of months instead of every day if the content is actually duplicate content". I don't doubt that duplicate content is a ranking factor. But I would like to know your opinions about Google only being able to determine this every couple of months instead of every day.
Have you seen or heard something similar?
-
Sorting out Google's timelines is tricky these days, because they aren't the same for every process and every site. In the early days, the "Google dance" happened about once a month, and that was the whole mess (index, algo updates, etc.). Over time, index updates have gotten a lot faster, and ranking and indexation are more real-time (especially since the "Caffeine" update), but that varies wildly across sites and pages.
I think you also have to separate a couple of impacts of duplicate content. When it comes to filtering - Google excluding a piece of duplicate content from rankings (but not necessarily penalizing the site) - I don't see any evidence that this takes a couple of months. It can take Google days or weeks to re-cache any given page, and to detect a duplicate they would have to re-cache both copies, so that may take a month in some cases, realistically. I strongly suspect, though, that the filter itself happens in real-time. There's no good way to store a filter for every scenario, and some filters are query-specific. Computationally, some filters almost have to happen on the fly.
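To make the "on the fly" idea concrete, here is a minimal sketch of how a duplicate filter could run at query time over a ranked list of results. This is purely illustrative: it uses word shingles and Jaccard similarity, a common textbook approach to near-duplicate detection, and is not a description of what Google actually does. The function names, the shingle size `k`, and the `threshold` value are all assumptions made for the example.

```python
import re


def shingles(text: str, k: int = 3) -> set[tuple[str, ...]]:
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Very short text: treat the whole text as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of intersection divided by size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def filter_duplicates(ranked_pages, threshold: float = 0.8):
    """Keep the first (highest-ranked) copy of each near-duplicate group.

    ranked_pages is a list of (url, text) pairs, already in ranking order.
    The threshold is a hypothetical cut-off, not a known Google value.
    """
    kept = []  # list of (url, shingle_set) for pages that survived the filter
    for url, text in ranked_pages:
        sig = shingles(text)
        if all(jaccard(sig, kept_sig) < threshold for _, kept_sig in kept):
            kept.append((url, sig))
    return [url for url, _ in kept]
```

Because the filter only compares the pages that came back for one query, nothing has to be stored in advance for every possible scenario; the cost is paid per query, which fits the point above about query-specific filters.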
On the other hand, you have updates like Panda, where duplicate content can cause something close to a penalty. Panda data was originally updated outside of the main algorithm, to the best of our knowledge, and probably about once a month. In the year-plus since Panda 1.0 rolled out, though, that timeline seems to have accelerated. I don't think it's real-time, but it may be closer to two weeks (that's speculation, I admit).
So, the short answer is "It's complicated".
I don't have any evidence to suggest that filtering duplicates takes Google months (and, actually, have anecdotal evidence that it can happen much faster). It is possible that it could take weeks or months to see the impact of duplicates on some sites and in some situations, though.
-
Hi Donnie,
Thanks for your reply, but I was already aware that Google had (or has) a sandbox. I should have mentioned this in my question. I'm looking more for an answer on whether, and on what basis, Google is able to determine that pages are duplicates.
I ask because I saw dozens of cases where our content was indexed regardless of whether we linked back to the 'original' source.
Just to be clear, in all of these cases the duplicate content was published with the agreement of the original sources.
-
In the past, Google had a sandbox period before any page (content) would rank. However, now everything is instant (I just learned this today @seomoz).
If you release something, Google will index it as fast as possible. If that info gets duplicated Google will only count the first one indexed. Everyone else loses brownie points unless they trackback/link back to the main article (first indexed).