Google crawling 200 page site thousands of times/day. Why?
-
Hello all, I'm looking at something a bit wonky for one of the websites I manage. It's similar enough to other websites I manage (built on a template) that I'm surprised to see this issue occurring. The XML sitemap we submitted tells Google there are 229 pages on the site. Starting at the beginning of December, Google really ramped up the intensity of its crawling. At its high point, Google crawled 13,359 pages in a single day.
I mentioned I manage other similar sites, and compared with them this is a very unusual spike. There is nothing on the site, like infinite scroll, that auto-generates content and would cause Google some grief.
So the follow-up questions to my "why?" are "how is this affecting my SEO efforts?" and "what do I do about it?" I've never encountered this before, but I think limiting my crawl budget would be treating the symptom instead of finding the cure. Any advice is appreciated. Thanks!
*edited for grammar.
-
I have a final update for everyone! We discovered the cause of the mysterious increase in crawling. One of our partners tested out a second version of the content on the website (yes, we have two complete sets of content for every page) by swapping out the first set with the second set. The second set caused Google to reevaluate the entire website, crawl it repeatedly thousands of times for two weeks, then stop.
The result of this refresh was a jump in the rankings. We were ranking on page one for about 15% of our targeted keywords, and after the new content went live that jumped to 71%. Only time will tell if those new rankings will stick, but for now it looks pretty good.
-
Update: after about two weeks the crawl rate returned to normal. We haven't been able to identify a cause yet.
-
It is strange. It's definitely worth looking at access logs and analyzing crawler data there so you can see what pages are getting hit by the crawler just to be sure you understand the activity.
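If it helps, here is a minimal sketch of that kind of log check in Python. It assumes the server writes the common "combined" access log format and it only looks at the user-agent string, which can be spoofed, so treat the counts as a first pass. The sample lines, IP addresses, dates, and paths are made up for illustration.

```python
import re
from collections import Counter

# Assumed layout (combined log format):
# ip - - [time] "METHOD path HTTP/x" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_hits(lines):
    """Count requests per path for lines whose user agent mentions Googlebot."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and "Googlebot" in match.group("agent"):
            hits[match.group("path")] += 1
    return hits


# Made-up sample lines; in practice you would pass an open log file instead.
sample_log = [
    '66.249.66.1 - - [05/Dec/2020:10:00:00 +0000] "GET /page-a HTTP/1.1" 200 512 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.66.1 - - [05/Dec/2020:10:00:05 +0000] "GET /page-a HTTP/1.1" 200 512 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.9 - - [05/Dec/2020:10:00:09 +0000] "GET /page-b HTTP/1.1" 200 734 '
    '"-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/87.0"',
]

hits = googlebot_hits(sample_log)
for path, count in hits.most_common(20):
    print(count, path)
```

Sorting by count makes it easy to see whether the crawler is hammering a handful of URLs or spreading evenly across the whole site.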
-
Well, I would be more than happy if Google visited my pages more often than once a day. We have around 100k original pages and we see them visiting 250k pages daily, with spikes to 350k+, which I don't consider to be a bad thing. As long as you're sure they're seeing the right pages, I would say it's a good thing. The crawl rate really varies from day to day for any site; sometimes you get a high rate for a while, and then it drops again when Google finds out that your site isn't creating that much fresh content anymore.
I'm curious about your idea with the sitemap priority. In my experience, and to my knowledge, it doesn't change anything.
-
Yes I have, and yes there are pages that aren't listed in the sitemap and aren't supposed to be there. That's being corrected (we're considering experimenting with priority tags in the sitemap to see if it has an impact over just immediately blocking with robots.txt or meta robots). But if you factor in those pages, it still only amounts to 303 pages.
Weird, right?
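For anyone weighing the robots.txt option, a minimal sketch of checking which URLs a given set of rules would block, using Python's standard `urllib.robotparser`. The rules and URLs are hypothetical examples, not this site's actual configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules; substitute the contents of your own robots.txt.
rules = """\
User-agent: *
Disallow: /tag/
Disallow: /category/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())


def is_blocked(url, agent="Googlebot"):
    """Return True if the parsed rules disallow this agent from fetching the URL."""
    return not parser.can_fetch(agent, url)


for url in ("https://example.com/tag/news", "https://example.com/about"):
    print(url, "blocked" if is_blocked(url) else "allowed")
```

One thing to keep in mind when comparing the two approaches: robots.txt stops crawling, while a meta robots noindex tag only works if the crawler is still allowed to fetch the page and read it.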
-
Have you tried scanning the site with something like Screaming Frog to make sure there aren't pages that just aren't listed in the sitemap? For example, tag or category pages, images, or other partial content pieces that are creating pages.
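Once you have a list of crawled URLs exported from the crawler, comparing it against the sitemap is a quick job. A minimal sketch in Python, assuming a standard sitemaps.org `urlset` file; the sitemap content and URLs here are invented placeholders.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps.org sitemap files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the set of <loc> URLs listed in a sitemap document."""
    root = ET.fromstring(xml_text)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


def unlisted(crawled_urls, xml_text):
    """Return crawled URLs that do not appear in the sitemap, sorted."""
    return sorted(set(crawled_urls) - sitemap_urls(xml_text))


# Placeholder data; replace with your real sitemap and crawl export.
sitemap_xml = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/</loc></url>"
    "<url><loc>https://example.com/about</loc></url>"
    "</urlset>"
)
crawled = [
    "https://example.com/",
    "https://example.com/about",
    "https://example.com/tag/news",
]

extra_pages = unlisted(crawled, sitemap_xml)
print(extra_pages)
```

Anything this prints is a page the crawler found that the sitemap doesn't list, which is where tag, category, and image pages tend to show up.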