Google crawling 200 page site thousands of times/day. Why?
-
Hello all, I'm looking at something a bit wonky for one of the websites I manage. It's similar enough to other websites I manage (built on a template) that I'm surprised to see this issue occurring. The XML sitemap we submitted tells Google there are 229 pages on the site. Starting at the beginning of December, Google really ramped up the intensity of its crawling. At the high point, Google crawled 13,359 pages in a single day.
I mentioned I manage other similar sites; compared with them, this is a very unusual spike. There are no features like infinite scroll that auto-generate content and would cause Google some grief.
So the follow-up questions to my "why?" are "how is this affecting my SEO efforts?" and "what do I do about it?" I've never encountered this before, but I think limiting my crawl budget would be treating the symptom instead of finding the cure. Any advice is appreciated. Thanks!
*edited for grammar.
-
I have a final update for everyone! We discovered the cause of the mysterious increase in crawling. One of our partners tested out a second version of the content on the website (yes, we have two complete sets of content for every page) by swapping out the first set with the second set. The second set caused Google to reevaluate the entire website, crawl it repeatedly thousands of times for two weeks, then stop.
The result of this refresh was a jump in the rankings. We had been ranking on page one for about 15% of our targeted keywords; after the new content went in, that jumped to 71%. Only time will tell if those new rankings will stick, but for now it looks pretty good.
-
Update: after about two weeks the crawl rate returned to normal. We haven't been able to identify a cause yet.
-
It is strange. It's definitely worth looking at your access logs and analyzing the crawler data there, so you can see exactly which pages the crawler is hitting and be sure you understand the activity.
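For anyone who wants to try the access-log check, here is a minimal Python sketch. It assumes the server writes the common "combined" log format; the sample lines, dates, paths, and the `googlebot_hits` helper are all made up for illustration, and the IPs come from the range reserved for documentation. It only matches on the user-agent string, which can be spoofed, so treat the counts as a first look rather than proof.

```python
import re
from collections import Counter

# Pulls the day, request path, and user agent out of a combined-format log line.
# Assumption: your server logs in this format; adjust the pattern if it doesn't.
LOG_LINE = re.compile(
    r'\[(?P<day>[^:]+):[^\]]+\] '
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_hits(lines):
    """Count requests per (day, path) whose user agent mentions Googlebot."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and "Googlebot" in match.group("agent"):
            hits[(match.group("day"), match.group("path"))] += 1
    return hits


# Invented sample lines, only here so the sketch runs on its own.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
sample_lines = [
    f'192.0.2.1 - - [05/Dec/2016:10:00:00 +0000] "GET /about HTTP/1.1" 200 5120 "-" "{GOOGLEBOT_UA}"',
    f'192.0.2.1 - - [05/Dec/2016:10:00:07 +0000] "GET /about HTTP/1.1" 200 5120 "-" "{GOOGLEBOT_UA}"',
    f'192.0.2.1 - - [05/Dec/2016:10:01:00 +0000] "GET /blog/?tag=news HTTP/1.1" 200 2048 "-" "{GOOGLEBOT_UA}"',
    '192.0.2.9 - - [05/Dec/2016:10:02:00 +0000] "GET /contact HTTP/1.1" 200 900 "-" "Mozilla/5.0 (Windows NT 10.0)"',
]

hits = googlebot_hits(sample_lines)

# Most-crawled URLs first, so repeated hits on the same page stand out.
for (day, path), count in hits.most_common(10):
    print(day, path, count)
```

To run it against a real log, pass an open file instead of `sample_lines`, for example `googlebot_hits(open("access.log"))` (the filename is a placeholder). If one handful of URLs accounts for most of the hits, that points at a crawl trap; if the hits are spread evenly across the site, Google is probably re-evaluating everything.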
-
Well, I would be more than happy if Google visited my pages more often than once a day. We have around 100k original pages, and we see Google visiting 250k pages daily, with spikes to 350k+, which I don't consider a bad thing. As long as you're sure they're seeing the right pages, I would say it's a good thing. The crawl rate really varies from day to day for any site; sometimes you get a high rate for a while, and then it drops again when Google finds out that your site isn't creating that much fresh content anymore.
I'm curious about your idea with the sitemap priority; in my experience and to my knowledge, it doesn't change anything.
-
Yes I have, and yes there are pages that aren't listed in the sitemap and aren't supposed to be there. That's being corrected (we're considering experimenting with priority tags in the sitemap to see if it has an impact over just immediately blocking with robots.txt or meta robots). But if you factor in those pages, it still only amounts to 303 pages.
Weird, right?
-
Have you tried scanning the site with something like Screaming Frog to make sure there aren't pages that just aren't listed in the sitemap? I.e., tag or category pages, images, or other partial content pieces that are creating pages.
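Once you have a list of URLs from a crawl, comparing it against the sitemap is easy to script. This is a rough Python sketch, not a description of how Screaming Frog or any other tool works: the sitemap snippet, the `example.com` URLs, and the `sitemap_urls` and `unlisted_urls` helpers are invented for illustration. It assumes a standard sitemap using the sitemaps.org namespace and a crawler export you have already reduced to a plain list of URLs.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard XML sitemaps.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the set of <loc> URLs listed in a sitemap document."""
    root = ET.fromstring(xml_text)
    return {loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)}


def unlisted_urls(crawled, xml_text):
    """Return crawled URLs that do not appear in the sitemap, sorted."""
    return sorted(set(crawled) - sitemap_urls(xml_text))


# Invented sitemap and crawl results, only here so the sketch runs on its own.
sample_sitemap = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>
"""
sample_crawl = [
    "https://example.com/",
    "https://example.com/about",
    "https://example.com/tag/news",
    "https://example.com/category/widgets",
]

extra = unlisted_urls(sample_crawl, sample_sitemap)
for url in extra:
    print(url)
```

Anything this prints is a page a crawler can reach but the sitemap doesn't mention, which is where tag and category pages usually turn up.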