Google crawling 200 page site thousands of times/day. Why?

brettmandoes

Hello all, I'm looking at something a bit wonky for one of the websites I manage. It's similar enough to other websites I manage (built on a template) that I'm surprised to see this issue occurring. The xml sitemap submitted shows Google there are 229 pages on the site. Starting in the beginning of December Google really ramped up their intensity in crawling the site. At its high point Google crawled 13,359 pages in a single day.

I mentioned I manage other similar sites - this is a very unusual spike. There are no resources like infinite scroll that auto generates content and would cause Google some grief.

So follow up questions to my "why?" is "how is this affecting my SEO efforts?" and "what do I do about it?". I've never encountered this before, but I think limiting my crawl budget would be treating the symptom instead of finding the cure. Any advice is appreciated. Thanks!

*edited for grammar.

brettmandoes

I have a final update for everyone! We discovered the cause of the mysterious increase in crawling. One of our partners tested out a second version of the content on the website (yes, we have two complete sets of content for every page) by swapping out the first set with the second set. The second set caused Google to reevaluate the entire website, crawl it repeatedly thousands of times for two weeks, then stop.

The result of this refresh was a jump in the rankings. We were ranking on page one for about 15% of our targeted keywords and after the new content was inputted it jumped to 71%. Only time will tell if those new rankings will stick, but for now it looks pretty good.

brettmandoes

Update: after about two weeks the crawl rate returned to normal. We haven't been able to identify a cause yet.

slatronica

It is strange. It's definitely worth looking at access logs and analyzing crawler data there so you can see what pages are getting hit by the crawler just to be sure you understand the activity.

Martijn_Scheijbeler

Well I would be more then happy if Google would visit my pages more often then once a day. We have around 100k original pages and we also see them visiting 250k pages daily with uplifts to 350k+ which I don't consider to be a bad thing. As long as you're sure about the fact that they see the right pages I would say it's a good thing. The crawl rate really varies day over day for any site, sometimes you get a high rate for a while and then it drops again when Google will find out that your site isn't creating that much new fresh content anymore.

Curious about your idea with the sitemap priority, to my experience + knowledge it doesn't change anything.

brettmandoes

Yes I have, and yes there are pages that aren't listed in the sitemap and aren't supposed to be there. That's being corrected (we're considering experimenting with priority tags in the sitemap to see if it has an impact over just immediately blocking with robots.txt or meta robots). But if you factor in those pages, it still only amounts to 303 pages.

Weird, right?

slatronica

Have you tried scanning the site with something like screaming frog to make sure there aren't pages that just aren't listed in the sitemap? Ie. tag or category pages, images or other partial content pieces that are creating pages.

Welcome to the Q&A Forum

Browse the forum for helpful insights and fresh discussions about all things SEO.

Google crawling 200 page site thousands of times/day. Why?

Got a burning SEO question?

Browse Questions

Explore more categories

Related Questions

Crawling/indexing of near duplicate product pages

How does Googlebot evaluate performance/page speed on Isomorphic/Single Page Applications?

Crawled page count in Search console

Google is indexing the wrong pages

Google+ Page Question

Does Google still don't index Hashtag Links ? No chance to get a Search Result that leads directly to a section of a page? or to one of numeras Hashtag Pages in a single HTML page?

Google local pointing to Google plus page not homepage

How to stop pages being crawled from xml feed?