Is Google able to determine duplicate content every day/month?
-
A while ago I talked to somebody who worked in MSN's engineering department a couple of years ago. We talked about a recent dip we had with one of our sites. We argued this could be caused by the large amount of duplicate content we have on this particular website (more than 80% of the site).
Then he said, and I quote: "Google seems only to be able to determine every couple of months instead of every day if the content is actually duplicate content". I don't doubt that duplicate content is a ranking factor, but I would like to know your opinions on whether Google can only determine this every couple of months instead of every day.
Have you seen or heard something similar?
-
Sorting out Google's timelines is tricky these days, because they aren't the same for every process and every site. In the early days, the "Google dance" happened about once a month, and that was the whole mess (index, algo updates, etc.). Over time, index updates have gotten a lot faster, and ranking and indexation are more real-time (especially since the "Caffeine" update), but that varies wildly across sites and pages.
I think you also have to separate a couple of impacts of duplicate content. When it comes to filtering - Google excluding a piece of duplicate content from rankings (but not necessarily penalizing the site) - I don't see any evidence that this takes a couple of months. It can take Google days or weeks to re-cache any given page, and to detect a duplicate they would have to re-cache both copies, so that may take a month in some cases, realistically. I strongly suspect, though, that the filter itself happens in real-time. There's no good way to store a filter for every scenario, and some filters are query-specific. Computationally, some filters almost have to happen on the fly.
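To make the "on the fly" idea concrete, here is a minimal sketch of how a query-time duplicate filter could work in principle. This is not Google's actual method, which isn't public; the word-shingle and Jaccard-similarity approach, the function names, and the 0.8 threshold are all assumptions chosen for illustration. The point is only that the filter runs over whatever results a given query returns, so nothing has to be stored ahead of time for every scenario.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word runs) in text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: 1.0 for identical, 0.0 for disjoint."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def filter_duplicates(ranked_results, threshold=0.8):
    """Keep the highest-ranked page from each group of near-duplicates.

    ranked_results is a list of (url, text) pairs, already in ranking
    order for one query. The threshold is a hypothetical cutoff.
    """
    kept = []  # (url, shingle set) for every result shown so far
    for url, text in ranked_results:
        sig = shingles(text)
        # Show this page only if it is not too similar to one already kept.
        if all(jaccard(sig, kept_sig) < threshold for _, kept_sig in kept):
            kept.append((url, sig))
    return [url for url, _ in kept]
```

Note that in this sketch a copy is dropped only from that one result list; nothing about the site is marked or penalized, which matches the filtering (as opposed to Panda-style) impact described above.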
On the other hand, you have updates like Panda, where duplicate content can cause something close to a penalty. Panda data was originally updated outside of the main algorithm, to the best of our knowledge, and probably about once a month. In the more than a year since Panda 1.0 rolled out, though, it seems that this timeline has accelerated. I don't think it's real-time, but it may be closer to 2 weeks (that's speculation, I admit).
So, the short answer is "It's complicated". I don't have any evidence to suggest that filtering duplicates takes Google months (and, actually, I have anecdotal evidence that it can happen much faster). It is possible that it could take weeks or months to see the impact of duplicates on some sites and in some situations, though.
-
Hi Donnie,
Thanks for your reply, but I was already aware that Google had (or has) a sandbox. I should have mentioned this in my question. What I'm really looking for is an answer on whether Google is able to determine that pages are duplicates, and on what basis.
I ask because I saw dozens of cases where our content was indexed, whether or not we linked back to the 'original' source.
Just to be sure, I also want to make clear that in all of these cases the duplicate content was used in agreement with the original sources.
-
In the past, Google had a sandbox period before any page (content) would rank. However, now everything is instant (I just learned this today from @seomoz).
If you release something, Google will index it as fast as possible. If that info gets duplicated, Google will only count the first one indexed. Everyone else loses brownie points unless they trackback/link back to the main article (the first one indexed).