Crawl errors for pages that no longer exist
-
Hey folks,
I've been working on a site recently where I took a bunch of old, outdated pages down. In the Google Search Console "Crawl Errors" section, I've started seeing a bunch of "Not Found" errors for those pages. That makes perfect sense.
The thing that I'm confused about is that the "Linked From" list only shows a sitemap that I ALSO took down. Alternatively, some of them list other old, removed pages in the "Linked From" list.
Is there a reason that Google is trying to inform me that pages/sitemaps that don't exist are somehow still linking to other pages that don't exist? And is this ultimately something I should be concerned about?
Thanks!
-
Thanks for the question, this can definitely be annoying for webmasters!
Unfortunately, bots can't do everything in parallel. They have to take steps:
Step 1. Take List #1 of links.
Step 2. Crawl those links and build List #2.
Step 3. Crawl List #2 and build List #3... and so on.
Now, sometimes Google doesn't follow that same order. Let's say that in Step 3 it finds a bunch of pages with unique content. Maybe the next time around, it goes and checks some of those links from Step 3 without first checking whether they are still linked. Why start the crawl all the way from the beginning again when you already have a big list of URLs?
But this creates a problem. When some of the links crawled in Step 3 aren't there anymore, Google will tell you they're missing and report how it originally found them (which happened to be from a page in List #1). But what if Google hasn't rechecked that link in List #1 recently? What if you just removed it too?
Well, for a little while, at least, you will end up with errors.
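The process above can be sketched as a toy crawler (a hypothetical simplification for illustration, not Google's actual pipeline). Discovery records the "linked from" page once; a later recheck revisits known URLs directly and reports that old record, even if the referring page or sitemap has since been removed. The `fetch` function here stands in for an HTTP request and is an assumption of this sketch.

```python
from collections import deque

def crawl(seed_urls, fetch):
    """Breadth-first crawl from the seeds. `fetch(url)` is a stand-in
    returning (exists, outlinks). Returns (errors, linked_from)."""
    linked_from = {}  # url -> page where it was first discovered
    errors = []       # (missing_url, recorded_referrer) pairs
    queue = deque((u, None) for u in seed_urls)
    seen = set(seed_urls)
    while queue:
        url, referrer = queue.popleft()
        exists, outlinks = fetch(url)
        if not exists:
            errors.append((url, linked_from.get(url, referrer)))
            continue
        for link in outlinks:
            if link not in seen:
                seen.add(link)
                linked_from[link] = url  # recorded once, at discovery time
                queue.append((link, url))
    return errors, linked_from

def recrawl(known_urls, linked_from, fetch):
    """Recheck known URLs directly, without re-walking the link graph.
    The reported 'Linked From' is the *old* discovery record, which may
    itself point at a page that no longer exists."""
    errors = []
    for url in known_urls:
        exists, _ = fetch(url)
        if not exists:
            errors.append((url, linked_from.get(url)))
    return errors
```

Running `crawl` once while the sitemap and page exist, then taking both down and running `recrawl`, reproduces the confusing report: a 404 whose "Linked From" is a sitemap that is also gone.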
Now, here comes the real rub: how long will it take for Google to find and correct those messages in the crawl report? Days? Weeks? Months? Who knows. Your best bet is to mark them as fixed and force Google to keep rechecking. Eventually, it will figure things out.
TL;DR: it's a data-freshness and reporting issue that isn't your fault and isn't worth your time.
-
No - Google is just showing how slow it is at updating data in Webmaster Tools.
Don't worry - if you wait long enough they'll go away. You could also mark them as solved (do this only if you are sure that there are no links pointing to these pages - to check whether your internal linking is OK, Screaming Frog is a great tool).
Dirk