Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Is it possible that Google may have erroneous indexing dates?
-
I am advising someone on a problem with copied content. Both sites in question are self-hosted WordPress sites. The "good" site publishes a post. The "bad" site copies the post a few days later (without even removing the internal links to the "good" site).
The publishing date of each post is plainly visible on both websites, and it is clear that the "bad" site publishes the posts days later. The content thief doesn't even bother to fake the publishing date.
The owner of the "good" site wants to have all the proof he needs before acting against the content thief. So I suggested that he also check in Google the dates on which the various pages were indexed, using Search Tools -> Custom Range so that the indexing date is displayed next to the search results.
For almost all of the copied pages, the indexing dates also prove that the "bad" site published the content days after the "good" site, but there are 2 exceptions: the first 2 posts that were copied.
First post:
On the "good" website it was published on 30 January 2013
On the "bad" website it was published on 26 February 2013
In Google search both show up indexed on 30 January 2013!
Second post:
On the "good" website it was published on 20 March 2013
On the "bad" website it was published on 10 May 2013
In Google search both show up indexed on 20 March 2013!
Is it possible that the date shown in Google search results is an error?
I also asked for help on the Google Webmaster forums, but there the discussion shifted to "who copied the content" and "file a DMCA complaint". So I want to be sure my question is better understood here.
It is not about who published the content first or how to take down the copied content. I am just asking whether anybody else has noticed this strange thing with Google indexing dates.
How is it possible for Google search results to display, for the copied article, an indexing date that is earlier than the date the copy was published and exactly the same as the date the original article was published and indexed?
-
Thanks Doug. Really an eye-opener.
-
Thanks Doug for your response. It really cleared up the questions I had about that date Google shows next to the search results.
I was not able to find official details about it; all I could find were various sources referring to it as the indexing date of a page.
But I knew that here in the Moz community there are people who really know things, that's why I asked.
So that date is just Google's estimation of the publishing date, not the date Google indexed the content!
Thanks again for taking the time to answer me!
-
Hiya Sorina,
When you use the custom date range, Google isn't listing results based on the date they were indexed. Google is using an estimated publication date.
Google tries to estimate the publication date based on meta-data and other features of the page such as dates in the content, title and URL. The date Google first indexed the page is just one of the things that Google can use to estimate the publication date.
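To see which date hints a page exposes, something like the rough Python sketch below can help. It is not how Google estimates dates; it only collects a few common signals. The function name `collect_date_signals` is made up for illustration, and the sketch assumes the page uses the `article:published_time` meta property, `<time datetime="...">` elements, and WordPress-style date permalinks, which a given theme may or may not do.

```python
import re
from html.parser import HTMLParser


class DateSignalParser(HTMLParser):
    """Collects date hints from <meta> and <time> tags."""

    def __init__(self):
        super().__init__()
        self.signals = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("property") == "article:published_time":
            self.signals["meta"] = attrs.get("content")
        elif tag == "time" and attrs.get("datetime"):
            # Keep only the first <time> element seen on the page.
            self.signals.setdefault("time_tag", attrs["datetime"])


def collect_date_signals(url, html):
    """Hypothetical helper: gather date hints from a page's URL and markup."""
    parser = DateSignalParser()
    parser.feed(html)
    signals = dict(parser.signals)
    # Assumption: date-based permalinks such as /2013/01/30/post-name/.
    match = re.search(r"/(\d{4})/(\d{2})/(\d{2})/", url)
    if match:
        signals["url"] = "-".join(match.groups())
    return signals
```

Running it on the same post from both sites shows whether the copy carries over date hints from the original, which may be worth knowing when the displayed dates look odd.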
I also suspect that dates in any sitemap.xml files will also be taken into consideration.
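If you want to check what a sitemap actually says, a minimal Python sketch along these lines reads the `<lastmod>` value for each URL. The function name `sitemap_lastmod` is my own, and it assumes a plain `<urlset>` sitemap in the standard sitemaps.org namespace (not a sitemap index). Keep in mind that `<lastmod>` is a last-modified date, which is not necessarily the publication date.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_lastmod(sitemap_xml):
    """Hypothetical helper: map each <loc> to its <lastmod> value (or None)."""
    root = ET.fromstring(sitemap_xml)
    dates = {}
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", default="", namespaces=SITEMAP_NS).strip()
        lastmod = url.findtext("sm:lastmod", default=None, namespaces=SITEMAP_NS)
        # <lastmod> is optional, so a missing entry is recorded as None.
        dates[loc] = lastmod.strip() if lastmod else None
    return dates
```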
But given that even Google can't guarantee that it will crawl and index articles on the day they are published, the crawl date may not be an accurate estimate.
Also, if the scraped content is being re-published with intact internal links (are these the full URLs, and do they resolve to your original website?) then it's pretty obvious where the content came from.
Hope this helps answer your question.
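Checking for those intact links can be automated. Here is a minimal Python sketch, with a made-up function name `links_to_original`, that lists the links in a copied page which resolve to the original site's host. It assumes you already have the copy's HTML saved and that you pass the original host without `www.`.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def links_to_original(copy_url, copy_html, original_host):
    """Hypothetical helper: links in the copy that point at the original host."""
    collector = LinkCollector()
    collector.feed(copy_html)
    hits = []
    for href in collector.hrefs:
        # Relative links resolve against the copy's own URL, so only
        # full URLs can still point back to the original site.
        absolute = urljoin(copy_url, href)
        host = urlparse(absolute).hostname or ""
        if host == original_host or host.endswith("." + original_host):
            hits.append(absolute)
    return hits
```

A non-empty result for a page on the "bad" site is a simple, documentable sign of where the text came from.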
-
Hi Sorina,
I can tell you that the index dates shown by Google are accurate. That is not always the case with the cache date, because the date shown in the cache and the copy shown in the cache often don't match, but the index dates are accurate. Send me a private message with the actual URLs under discussion and I will be able to comment with more clarity.
Best,
Devanur Rafi
-
Thank you for your response Devanur Rafi, but the "good" site doesn't have problems getting indexed.
Actually, all posts on the "good" site are indexed the very same day they are published.
My question was more about the indexing date shown in Google search results.
How come, for a post from the "bad" site, Google is displaying an indexing date previous to the actual date the post was published on that site?!
And how come this date is exactly the same as the date Google says it indexed the post from the "good" site?
-
Hi Sorina,
This is a common thing, and it all depends on a site's crawlability (how easy it is for the bot to crawl) and the crawl frequency for that site. Google would have picked up that post first on the bad site and then on the good site. However, just because one or two posts were picked up late does not mean that the good site is not crawler friendly. It also depends on how far the resource is from the root. Let us take an example:
A page on a good site: abc.com/folder1/folder2/folder3/page.html
Now a bad site copies that page: xyz.com/page.html
In this case, Google might first pick up the copied page from the bad site as it is just a click away from the root, which is not the case with the good site, where the page is nested deep inside multiple folders.
You can also give the Wayback Machine (archive.org) a try to find which website published the post first. Sometimes this might work out pretty well. You can also try to look at the cache dates of the posts on both sites in Google to get some info in this regard.
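To compare the two example URLs, a tiny Python sketch like this counts the folder levels in each path. The function name `path_depth` is only for illustration. Note the assumption: folder depth in the URL is just a rough proxy, because how many clicks a page is from the homepage depends on the site's internal links, not on its URL.

```python
from urllib.parse import urlparse


def path_depth(url):
    """Hypothetical helper: count the non-empty path segments in a URL."""
    path = urlparse(url).path
    return len([segment for segment in path.split("/") if segment])


# The scheme is added so that urlparse treats abc.com as the host.
print(path_depth("http://abc.com/folder1/folder2/folder3/page.html"))  # 4
print(path_depth("http://xyz.com/page.html"))  # 1
```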
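The Wayback Machine lookup can be scripted as well. The Python sketch below asks the archive's CDX server for the first capture of a URL, assuming its `url`, `output=json` and `limit` query parameters and a JSON response whose first row is a header that includes a `timestamp` column. The function names are my own. A capture date only tells you when the archive crawled the page, so it is an upper bound on the publication date, and many pages are never captured at all.

```python
import json
from datetime import datetime
from urllib.parse import urlencode
from urllib.request import urlopen

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_query_url(page_url):
    """Build a CDX query asking for a single capture row of page_url."""
    query = urlencode({"url": page_url, "output": "json", "limit": 1})
    return CDX_ENDPOINT + "?" + query


def first_capture(cdx_json_text):
    """Parse a CDX JSON response; return the first capture time or None."""
    rows = json.loads(cdx_json_text) if cdx_json_text.strip() else []
    if len(rows) < 2:  # row 0 is the header; no data row means no capture
        return None
    header, first = rows[0], rows[1]
    timestamp = first[header.index("timestamp")]
    return datetime.strptime(timestamp, "%Y%m%d%H%M%S")


def fetch_first_capture(page_url):
    """Network call: look up the first archived capture of page_url."""
    with urlopen(cdx_query_url(page_url), timeout=30) as response:
        return first_capture(response.read().decode("utf-8"))
```

Calling `fetch_first_capture` once for the post on each site gives two dates to compare, when both pages happen to be archived.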
Hope those help. I wish you good luck.
Best,
Devanur Rafi.