Is it possible that Google may have erroneous indexing dates?
-
I am consulting someone for a problem related to copied content. Both sites in question are WordPress (self hosted) sites. The "good" site publishes a post. The "bad" site copies the post (without even removing all internal links to the "good" site) a few days after.
On both websites it is obvious the publishing date of the posts, and it is clear that the "bad" site publishes the posts days later. The content thief doesn't even bother to fake the publishing date.
The owner of the "good" site wants to have all the proofs needed before acting against the content thief. So I suggested him to also check in Google the dates the various pages were indexed using Search Tools -> Custom Range in order to have the indexing date displayed next to the search results.
For all of the copied pages the indexing dates also prove the "bad" site published the content days after the "good" site, but there are 2 exceptions for the very 2 first posts copied.
First post:
On the "good" website it was published on 30 January 2013
On the "bad" website it was published on 26 February 2013
In Google search both show up indexed on 30 January 2013!Second post:
On the "good" website it was published on 20 March 2013
On the "bad" website it was published on 10 May 2013
In Google search both show up indexed on 20 March 2013!Is it possible to be an error in the date shown in Google search results?
I also asked for help on Google Webmaster forums but there the discussion shifted to "who copied the content" and "file a DMCA complain". So I want to be sure my question is better understood here.
It is not about who published the content first or how to take down the copied content, I am just asking if anybody else noticed this strange thing with Google indexing dates.How is it possible for Google search results to display an indexing date previous to the date the article copy was published and exactly the same date that the original article was published and indexed?
-
Thanks Doug. Really an eye-opener.
-
Thanks Doug for your response. It really cleared up the questions I had about that date Google shows next to the search results.
I was not able to find official details about it, all I was able to find was different referencing as the indexing date of a page.
But I knoew here in the MOZ community there are people who really know things, that's why I asked.
So that date is just Google's estimation of the publishing date, not the date Google indexed the content!
Thanks again for taking the time to answer me!
-
Hiya Sorina,
When you use the custom date range, Google isn't listing results based on the date they were indexed. Google is using an estimated publication date.
Google tries to estimate the the publication date based on meta-data and other features of the page such as dates in the content, title and URL. The date Google first indexed the page is just one of the things that Google can use to estimate the publication date.
I also suspect that dates in any sitemap.xml files will also be taken into consideration.
But, given that even Google can't guarantee that it'll crawl and index articles on the day they've been published the crawl data may not be an accurate estimate.
Also, if the scraped content is being re-published with intact internal links (are these the full URL - do you they resolve to your original website?) then it's pretty obvious where the content came from.
Hope this help answer your question.
-
Hi Sorina,
I can tell you that the index dates shown by Google are accurate but is not the case with the Cache date sometimes as the date shown in the Cache and the copy shown in the cache don't match many times but the index dates are accurate. Send me a private message with the actual URLs under discussion and I will be able to comment with more clarity.
Best,
Devanur Rafi
-
Thank you for your response Devanur Rafi, but the "good" site doesn't have problems getting indexed.
Actually all posts on the "good" site are indexed the very same day they are published.My question was more about the indexing date shown in Google search results
How come, for a post from the "bad" site, Google is displaying an indexing date previous to the actual date the post was published on that site?!
And how come this date is exactly the same as the date Google says it indexed the post from the "good" site?
-
Hi Sorina,
This is a common thing and it all depends on a site's crawlability (how easy is it to crawl for the bot) and crawl frequency for that site. Google would have picked up that post first on the bad site and then from the good site. However, just because one or two posts were picked up late does not mean that the good site is not crawler friendly. It also depends on how far the resource is from the root. Let us take an example:
A page on a good site: abc.com/folder1/folder2/folder3/page.html
Now a bad site copies that page: xyz.com/page.html
In this case, Google might first pickup the copied page from the bad site as it is just a click away from the root which is not the case with the good site where the page is nested deep inside multiple folders.
You can also give the way back machine (archive.org) a try to find which website published the post first. Sometimes this might work out pretty well. You can also try to look at the cache dates of the posts on both the sites in Google to get some info in this regard.
Hope those help. I wish you good luck.
Best,
Devanur Rafi.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
A Google Update Happened?
I'm curious to know what us MOZ folks have to say about an update on Google. Article here: http://searchengineland.com/big-google-search-update-happening-chatter-thinks-258142 Any ideas?
Algorithm Updates | | Chenzo0 -
What are top 3 directives to prepare for a Google algorithm update?
Company's site fluctuated in keyword rankings last Friday, due to Unnamed algorithm. Our directives are on-page optimization and continual content generation. What are other directives to take?
Algorithm Updates | | ejcruz0 -
Reduced traffic from google news
I have a question for my site. Can you advice me for situation? Last 1 months my site reduced traffic from google and google news.. And i dont know why? I need your advice. Thanks. Yunus Altindag Diyarbakir/Turkey
Algorithm Updates | | maxbilgisayar0 -
Changes in Google "Site:" Search Algorithm Over Time?
I was wondering if anyone has noticed changes in how Google returns 'site:' searches over the past few years or months. I remember being able to do a search such as "site:example.com" and Google would return a list of webpages where the order may have shown the higher page rank pages (due to link building, etc) first and/or parent category pages higher up in the list of the first page (if relevant) first (as they could have higher PR naturally, anyways). It seems that these days I can hardly find quality / target pages that have higher page rank on the first page of Google's site: search results. Is this just me... or has Google perhaps purposely scrambled the SERPS somewhat for site: searches to not give away their page ranking secrets?
Algorithm Updates | | OrionGroup1 -
Organic listing & map listing on 1st page of Google
Hi, Back then, a company could get multiple listings in SERP, one in Google Maps area and a homepage or internal pages from organic search results. But lately, I've noticed that Google are now putting together the maps & organic listings. This observation has been confirmed by a couple of SEO people and I thought it made sense, but one day I stumble with this KWP "bmw dealership phoenix" and saw that www.bmwnorthscottsdale.com has separate listing for google places and organic results. Any idea how this company did this? Please see the attached image
Algorithm Updates | | ao5000000 -
Urls have dates - bad? terrible?
My URLs include dates: example.com/2009-05/post-about-something.html I know this isn't the 'best', but is there any reason to be concerned? Some panda, duplicate content, google hates date in URLs, I should know about?
Algorithm Updates | | comforteagle0 -
Domain Authority and Google keywords
Hi there, We have a domain authority of 33, one of our competitors has an authority of 10, yet they appear to list higher on many keyword searches in google. Is there a reason for this? Our site is 5 months old, and their site is over 3 yrs old. Thanks for your feedback 🙂
Algorithm Updates | | PHDAustralia680 -
Website "penalized" 3 times by Google
I have a website that I'm working with that has had the misfortune of gaining rankings/traffic on Google, then having the rankings/traffic removed...3 times! (Very little was changed on the site to gain or lose "favor" with Google, either.) Notes: Site is a mixture of high quality original content and duplicate content (vacation rental listings) When traffic crashes, we lose nearly all rankings and traffic (90+%) When traffic crashes, we lose all rankings sitewide, including those gained by our high quality, unique pages None of the "crash" dates appear to coincide with any Panda update dates We are working on adding unique content to our pages with duplicate content, but it's a long process and so far doesn't seem to have made any difference I'm confounded why Google keeps "changing its mind" about our site We have an XML sitemap, and Google keeps our site indexed pretty well, even when we lose our rankings Due to the drastic and sitewide loss of rankings, I'm assuming we are dealing with some sort of algorithmic penalty Timeline: Traffic steadily grows starting in Jan 2011 Traffic crashes on Feb 19, 2011. We assumed it was due to a pre-panda anti-scraper update, but don't know. Google sends traffic to our site on March 1, then none the next day On June 16th, I block part of the site using robots.txt (most of the section wasn't indexed anyway) On June 17th, Google starts ranking our site again. I thought it might be due to the robots.txt change, but I had just made the change a few hours ago, and Google wasn't even indexing the part of the site I blocked Traffic/rankings crash again on July 6th. No theory why. Site URL: http://www.floridaisbest.com Traffic Stats: Attached I know that we need more backlinks and less duplicate content, but I can't explain why our Google rankings are "on again, off again". I have never seen a site gain and lose all of its rankings/traffic so drastically multiple times, for no apparent reason. Any thoughts or ideas would be welcome. Thanks! t8IqB
Algorithm Updates | | AdamThompson0