Is it possible that Google may have erroneous indexing dates?
-
I am consulting someone for a problem related to copied content. Both sites in question are WordPress (self hosted) sites. The "good" site publishes a post. The "bad" site copies the post (without even removing all internal links to the "good" site) a few days after.
On both websites it is obvious the publishing date of the posts, and it is clear that the "bad" site publishes the posts days later. The content thief doesn't even bother to fake the publishing date.
The owner of the "good" site wants to have all the proofs needed before acting against the content thief. So I suggested him to also check in Google the dates the various pages were indexed using Search Tools -> Custom Range in order to have the indexing date displayed next to the search results.
For all of the copied pages the indexing dates also prove the "bad" site published the content days after the "good" site, but there are 2 exceptions for the very 2 first posts copied.
First post:
On the "good" website it was published on 30 January 2013
On the "bad" website it was published on 26 February 2013
In Google search both show up indexed on 30 January 2013!Second post:
On the "good" website it was published on 20 March 2013
On the "bad" website it was published on 10 May 2013
In Google search both show up indexed on 20 March 2013!Is it possible to be an error in the date shown in Google search results?
I also asked for help on Google Webmaster forums but there the discussion shifted to "who copied the content" and "file a DMCA complain". So I want to be sure my question is better understood here.
It is not about who published the content first or how to take down the copied content, I am just asking if anybody else noticed this strange thing with Google indexing dates.How is it possible for Google search results to display an indexing date previous to the date the article copy was published and exactly the same date that the original article was published and indexed?
-
Thanks Doug. Really an eye-opener.
-
Thanks Doug for your response. It really cleared up the questions I had about that date Google shows next to the search results.
I was not able to find official details about it, all I was able to find was different referencing as the indexing date of a page.
But I knoew here in the MOZ community there are people who really know things, that's why I asked.
So that date is just Google's estimation of the publishing date, not the date Google indexed the content!
Thanks again for taking the time to answer me!
-
Hiya Sorina,
When you use the custom date range, Google isn't listing results based on the date they were indexed. Google is using an estimated publication date.
Google tries to estimate the the publication date based on meta-data and other features of the page such as dates in the content, title and URL. The date Google first indexed the page is just one of the things that Google can use to estimate the publication date.
I also suspect that dates in any sitemap.xml files will also be taken into consideration.
But, given that even Google can't guarantee that it'll crawl and index articles on the day they've been published the crawl data may not be an accurate estimate.
Also, if the scraped content is being re-published with intact internal links (are these the full URL - do you they resolve to your original website?) then it's pretty obvious where the content came from.
Hope this help answer your question.
-
Hi Sorina,
I can tell you that the index dates shown by Google are accurate but is not the case with the Cache date sometimes as the date shown in the Cache and the copy shown in the cache don't match many times but the index dates are accurate. Send me a private message with the actual URLs under discussion and I will be able to comment with more clarity.
Best,
Devanur Rafi
-
Thank you for your response Devanur Rafi, but the "good" site doesn't have problems getting indexed.
Actually all posts on the "good" site are indexed the very same day they are published.My question was more about the indexing date shown in Google search results
How come, for a post from the "bad" site, Google is displaying an indexing date previous to the actual date the post was published on that site?!
And how come this date is exactly the same as the date Google says it indexed the post from the "good" site?
-
Hi Sorina,
This is a common thing and it all depends on a site's crawlability (how easy is it to crawl for the bot) and crawl frequency for that site. Google would have picked up that post first on the bad site and then from the good site. However, just because one or two posts were picked up late does not mean that the good site is not crawler friendly. It also depends on how far the resource is from the root. Let us take an example:
A page on a good site: abc.com/folder1/folder2/folder3/page.html
Now a bad site copies that page: xyz.com/page.html
In this case, Google might first pickup the copied page from the bad site as it is just a click away from the root which is not the case with the good site where the page is nested deep inside multiple folders.
You can also give the way back machine (archive.org) a try to find which website published the post first. Sometimes this might work out pretty well. You can also try to look at the cache dates of the posts on both the sites in Google to get some info in this regard.
Hope those help. I wish you good luck.
Best,
Devanur Rafi.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
PDFs With No Index Contribute To Page Ranks?
I have a question I'm hoping you can help me with. If I upload a PDF and add a no index under the meta robots index so that the PDF doesn't appear in search results when I send people the link to this PDF, does it still contribute to my site traffic/ranking etc? Basically we are deciding whether to put some PDFs with pricing options etc onto our website or on a google drive. We will be sending the links to potential clients. If visitors clicking on the link would still help with increasing traffic and increasing our google rank (without that PDF showing in results) we thought this might be the best solution.
Algorithm Updates | | whiterabbitnz0 -
How long for google to de-index old pages on my site?
I launched my redesigned website 4 days ago. I submitted a new site map, as well as submitted it to index in search console (google webmasters). I see that when I google my site, My new open graph settings are coming up correct. Still, a lot of my old site pages are definitely still indexed within google. How long will it take for google to drop off or "de-index" my old pages? Due to the way I restructured my website, a lot of the items are no longer available on my site. This is on purpose. I'm a graphic designer, and with the new change, I removed many old portfolio items, as well as any references to web design since I will no longer offering that service. My site is the following:
Algorithm Updates | | rubennunez
http://studio35design.com0 -
Google & Tabbed Content
Hi I wondered if anyone had a case study or more info on how Google treats content under tabs? We have an ecommerce site & I know it is common to put product content under tabs, but will Google ignore this? Becky
Algorithm Updates | | BeckyKey1 -
Google not crawling click to expand content - suggestions?
It seems like Google confirmed this week in a G+ hangout that content in click to expand content e.g. 'read more' dropdown and tabbed content scenarios will be discounted. The suggestion was if you have content it needs to be visible on page load. Here's more on it https://www.seroundtable.com/google-index-click-to-expand-19449.html and the actual hangout, circa 11 mins in https://plus.google.com/events/cjcubhctfdmckph433d00cro9as. From a UX and usability point of view having a lot of content that was otherwise tabbed or in click to expand divs can be terrible, especially on mobile. Does anyone have workable solutions or can think of examples of really great landing pages (i'm mostly thinking ecommerce) that also has a lot of visible content? Thanks Andy
Algorithm Updates | | AndyMacLean0 -
How to get indexed in yahoo and bing?
Hello all I have uploaded sitemap to bing webmaster on January 21st, 2014. However, the site has not been indexed yet. I see few pages crawled and some crawl page error but it does not show what pages have an error. Can anyone help me please on how to get this done right so i can can our website indexed in Bing and Yahoo quickly. By the way our website address is : http://www.eduniche.com
Algorithm Updates | | Eva20140 -
Fetch as Google - removes start words from Meta Title ?? Help!
Hi all, I'm experiencing some strange behaviour with Google Webmaster Tools. I noticed that some of our pages from our ecom site were missing start keywords - I created a template for meta titles that uses Manufacturer - Ref Number - Product Name - Online Shop; all trimmed under 65 chars just in case. To give you an idea, an example meta title looks like:
Algorithm Updates | | bjs2010
Weber 522053 - Electric Barbecue Q 140 Grey - Online Shop The strange behaviour is if I do a "Fetch as Google" in GWT, no problem - I can see it pulls the variables and it's ok. So I click submit to index. Then I do a google site:URL search, to see what it has indexed, and I see the meta description has changed (so I know it's working), but the meta title has been cut so it looks like this:
Electric Barbecue Q 140 Grey - Online Shop So I am confused - why would Google cut off some words at start of meta title? Even after the Fetch as Googlebot looks perfectly ok? I should point out that this method works perfect on our other pages, which are many hundreds - but it's not working on some pages for some weird reason.... Any ideas?0 -
Website dance on Google Map results and organic seo results
My website is daily showing different position on maps.google.com and for the last few days like yesterday it was on 21st position on some keyword and today it is no where and same with other keywords. Is this a Google Dance ?? what can be its period ? and what is tyhe solution to handle it ??
Algorithm Updates | | mnkpso0 -
Can AJAX implementation affect the rankings in Google Panda?
Hi there, I have the following situation with one of our job sites. We migrate the site to a new application, which is better from design point of view and also usability. For this we use a lot AJAX especially in searches. So every time a user is filtering down their search new results will be shown on the page, at the same url and with no page load. But, having this implementation. affected Bounce rate - which increased from 38% to nearly 60%, PI/visits - which are now half, at 3 and also Avg Time on Site is half that is used to be coming to 2,5 min from nearly 6 min. From Rand post, it is clearly that the content is very important in Google Panda, and all of these parameters we should consider, as it is telling the quality of the content. So, my question will be, can this site be hit by Panda updates (maybe later on) because Bounce Rate, PI/Visits and Avg Time on site, decreased in such way? At the moment we don't measure the Ajax impresion, but as I understood that we can do that though virtual pages in GA, does anyone of you have the experience how to handle this? Won't be this an artificial increase? Thanks, Irina
Algorithm Updates | | InformMedia0