Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Crawl Stats Decline After Site Launch (Pages Crawled Per Day, KB Downloaded Per Day)
-
Hi all,
I have been looking into this for about a month and haven't been able to figure out what is going on with this situation. We recently did a website re-design and moved from a separate mobile site to responsive. After the launch, I immediately noticed a decline in pages crawled per day and KB downloaded per day in the crawl stats. I expected the opposite to happen as I figured Google would be crawling more pages for a while to figure out the new site. There was also an increase in time spent downloading a page. This has went back down but the pages crawled has never went back up. Some notes about the re-design:
- URLs did not change
- Mobile URLs were redirected
- Images were moved from a subdomain (images.sitename.com) to Amazon S3
- Had an immediate decline in both organic and paid traffic (roughly 20-30% for each channel)
I have not been able to find any glaring issues in search console as indexation looks good, no spike in 404s, or mobile usability issues. Just wondering if anyone has an idea or insight into what caused the drop in pages crawled? Here is the robots.txt and attaching a photo of the crawl stats.
User-agent: ShopWiki Disallow: / User-agent: deepcrawl Disallow: / User-agent: Speedy Disallow: / User-agent: SLI_Systems_Indexer Disallow: / User-agent: Yandex Disallow: / User-agent: MJ12bot Disallow: / User-agent: BrightEdge Crawler/1.0 (crawler@brightedge.com) Disallow: / User-agent: * Crawl-delay: 5 Disallow: /cart/ Disallow: /compare/ ```[fSAOL0](https://ibb.co/fSAOL0)
-
Yea that's definitely tricky. I'm assuming you haven't taken out any load balancing that was previously in place between desktop and m. meaning your server is struggling a lot more? The Page Speed Insights tool can be good info but if possible I'd have a look at that user experience index to get an idea of how other users are experiencing the site.
A next port of call could be your server logs? Do you have any other subdomains which are performing differently in search console?
In terms of getting Google to crawl more, unfortunately at this point my instinct would be to keep trying to optimise the site to make it as crawl-friendly as possible and wait for Google to start crawling more. It does look like the original spike in time spent downloading has subsided a bit but it's still higher than it was. Without doing the maths, given that pages crawled and kilobytes downloaded have dropped, the level of slowdown may have persisted and the drop in that graph could have been caused by Google easing back. I'd keep working on making the site as efficient and consistent as possible and try to get that line tracking lower as an immediate tactic.
-
Hi Robin,
Thanks a lot for the reply. A lot of good information there.
- The crawl delay has been on the site as long as I have known so it was left in place just to minimize changes
- Have not changed any of the settings in Search Console. It has remained at "Let Google optimize for my site"
- Have not received the notification for mobile first indexing
- The redirects were one to one for the mobile site. I do not believe there are any redirect chains from those.
- The desktop pages remained roughly the same size but on a mobile device, pages are slightly heavier compared to the sepatate m dot site. The separate m dot site had a lot of content stripped out and was pretty bare to be fast. We introduced more image compression than we have ever done and also deferred image loading to make the user experience as fast as possible. The site scores in the 90s on Google's page speed insights tool.
- Yes, resizing based on viewport. Content is basically the same between devices. We have some information in accordions on product detail pages on and show fewer products on the grids on mobile.
- They are not the same images files but they are actually smaller than they were previously as we were not compressing them and using different sizes in different locations to minimize page weight.
I definitely lean towards it being performance related as in the crawl stats there seems to be a correlation between time spent downloading a page and the other two stats. I just wonder how you get Google to start crawling more once the performance is fixed or if they will figure it out.
-
Hi there, thanks for posting!
Sounds like an interesting one, some questions that come to mind which I'd just like to run through to make sure we're not missing anything;
- Why do you have Crawl-delay set for all user agents? Officially it's not something Google supports but the reason for that could be the cause of this
- Have you changed any settings in search console? There is a slider for how often you want Google to crawl a site
- Have you had the Search Console notification that you're now on the mobile-first index?
- When you redirected the mobile site, was it all one-to-one redirects? Is there any possibility you've introduced redirect chains?
- After the redesign - are the pages now significantly bigger (in terms of amount of data needed to fully load the page)? Are there any very large assets that are now on every page?
- When you say responsive, is it resizing based on viewport? How much duplication has been added to the page? Is there a bunch of content that is there for mobile but not loaded unless viewed from mobile (and vice versa)?
- When you moved the images, were they the same exact image files or might they now be the full-size image files?
This is just first blush so I could be off the mark but those graphs suggest to me that Google is having to work harder to crawl your pages and, as a result, is throttling the amount of time spent on your site. If the redesign or switch to responsive involved making the pages significantly "heavier" where that could be additional JavaScript, bigger images, more content etc. that could cause that effect. If you've got any sitespeed benchmarking in place you could have a look at that to see whether things have changed. Google also uses pagespeed as a ranking factor so that could explain the traffic drop.
The other thing to bear in mind is that combining the mobile and desktop sites was essentially a migration, particularly if you were on the mobile-first index. It may be that the traffic dip is less related to the crawl rate, but I understand why we'd make the connection there.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
XML sitemap generator only crawling 20% of my site
Hi guys, I am trying to submit the most recent XML sitemap but the sitemap generator tools are only crawling about 20% of my site. The site carries around 150 pages and only 37 show up on tools like xml-sitemaps.com. My goal is to get all the important URLs we care about into the XML sitemap. How should I go about this? Thanks
Intermediate & Advanced SEO | | TyEl0 -
Is it good or bad to add noindex for empty pages, which will get content dynamically after some days
We have followers, following, friends, etc pages for each user who creates account on our website. so when new user sign up, he may have 0 followers, 0 following and 0 friends, but over period of time he can get those lists go up. we have different pages for followers, following and friends which are allowed for google to index. When user don't have any followers/following/friends, those pages looks empty and we get issue of duplicate content and description too short. so is it better that we add noindex for those pages temporarily and remove noindex tag when there are at least 2 or more people on those pages. What are side effects of adding noindex when there is no data on those page or benefits of it?
Intermediate & Advanced SEO | | swapnil120 -
[Very Urgent] More 100 "/search/adult-site-keywords" Crawl errors under Search Console
I just opened my G Search Console and was shocked to see more than 150 Not Found errors under Crawl errors. Mine is a Wordpress site (it's consistently updated too): Here's how they show up: Example 1: URL: www.example.com/search/adult-site-keyword/page2.html/feed/rss2 Linked From: http://an-adult-image-hosting.com/search/adult-site-keyword/page2.html Example 2 (this surprised me the most when I looked at the linked from data): URL: www.example.com/search/adult-site-keyword-2.html/page/3/ Linked From: www.example.com/search/adult-site-keyword-2.html/page/2/ (this is showing as if it's from our own site) http://a-spammy-adult-site.com/search/adult-site-keyword-2.html Example 3: URL: www.example.com/search/adult-site-keyword-3.html Linked From: http://an-adult-image-hosting.com/search/adult-site-keyword-3.html How do I address this issue?
Intermediate & Advanced SEO | | rmehta10 -
Crawled page count in Search console
Hi Guys, I'm working on a project (premium-hookahs.nl) where I stumble upon a situation I can’t address. Attached is a screenshot of the crawled pages in Search Console. History: Doing to technical difficulties this webshop didn’t always no index filterpages resulting in thousands of duplicated pages. In reality this webshops has less than 1000 individual pages. At this point we took the following steps to result this: Noindex filterpages. Exclude those filterspages in Search Console and robots.txt. Canonical the filterpages to the relevant categoriepages. This however didn’t result in Google crawling less pages. Although the implementation wasn’t always sound (technical problems during updates) I’m sure this setup has been the same for the last two weeks. Personally I expected a drop of crawled pages but they are still sky high. Can’t imagine Google visits this site 40 times a day. To complicate the situation: We’re running an experiment to gain positions on around 250 long term searches. A few filters will be indexed (size, color, number of hoses and flavors) and three of them can be combined. This results in around 250 extra pages. Meta titles, descriptions, h1 and texts are unique as well. Questions: - Excluding in robots.txt should result in Google not crawling those pages right? - Is this number of crawled pages normal for a website with around 1000 unique pages? - What am I missing? BxlESTT
Intermediate & Advanced SEO | | Bob_van_Biezen0 -
Ecommerce Site - Duplicate product descriptions & SKU pages
Hi I have a couple of questions regarding the best way to optimise SKU pages on a large ecommerce site. At the moment we have 2 landing pages per product - one is the primary landing page with no SKU, the other includes the SKU in the URL so our sales people & customers can find it when using the search facility on the site. The SKU landing page has a canonical pointing to the primary page as they're duplicates. Is this the best way? Or is it better to have the one page with the SKU in the URL? Also, we have loads of products with the very similar product descriptions, I am working on trying to include a unique paragraph or few sentences on these to improve the content - how dangerous is the duplicate content within your own site? I know its best to have totally unique content, but it won't be possible on a site with thousands of products and a small team. At the moment I am trying to prioritise the products to update. Thank you 🙂
Intermediate & Advanced SEO | | BeckyKey0 -
Date of page first indexed or age of a page?
Hi does anyone know any ways, tools to find when a page was first indexed/cached by Google? I remember a while back, around 2009 i had a firefox plugin which could check this, and gave you a exact date. Maybe this has changed since. I don't remember the plugin. Or any recommendations on finding the age of a page (not domain) for a website? This is for competitor research not my own website. Cheers, Paul
Intermediate & Advanced SEO | | MBASydney0 -
Why is my Crawl Report Showing Thousands of Pages that Do Not Exist?
Hi, I just downloaded a Crawl Summary Report for a client's website. I am seeing THOUSANDS of duplicate page content errors. The overwhelming majority of them look something like this: ERROR: http://www.earlyinterventionsupport.com/resources/parentingtips/development/parentingtips/development/development/development/development/development/development/parentingtips/specialneeds/default.aspx This page doesn't exist and results in a 404 page. Why are these pages showing up? How do I get rid of them? Are they endangering the health of my site as a whole? Thank you, Jenna <colgroup><col width="1051"></colgroup>
Intermediate & Advanced SEO | | JennaCMag
| |0 -
10,000+ links from one site per URL--is this hurting us?
We manage content for a partner site, and since much of their content is similar to ours, we canonicalized their content to ours. As a result, some URLs have anything from 1,000,000 inbound links / URL to 10,000+ links / URL --all from the same domain. We've noticed a 10% decline in traffic since this showed up in our webmasters account & were wondering if we should nofollow these links?
Intermediate & Advanced SEO | | nicole.healthline0