Do search engines crawl links on 404 pages?
-
I'm currently in the process of redesigning my site's 404 page. I know there's all sorts of best practices from UX standpoint but what about search engines? Since these pages are roadblocks in the crawl process, I was wondering if there's a way to help the search engine continue its crawl.
Does putting links to "recent posts" or something along those lines allow the bot to continue on its way or does the crawl stop at that point because the 404 HTTP status code is thrown in the header response?
-
Okay, thanks Alan!
-
Hi Brad
Sorry I have only just come back to you - it was late night here in the UK, but it looks like Alan has already answered your question
Have you tested your 404 page with fetch as Google in webmaster tools - you should see that it can see the links on your 404 page and as such will continue crawling them as Alan has said.
So what is a benefit to a user will also be a benefit to Google crawling your site in my opinion
-
Sorry, yes, it should crawl the links - they used to do that.
But you can prove it to yourself, by doing what I said - and then report back.
-
Yes it will continue crawling or yes it will stop the crawl?
-
Yes and you can test it by creating a page that is linked from nowhere else and then check your logs or analytics
-
Hey Matt,
Thanks for the reply. I'm aware of all the best practice stuff but thanks for sending through. It didn't quite answer my question so let me rephrase...
Will a bot follow a hyperlink (like the example below) on a 404 page or will it stop the crawl on that page (not on the whole site) because the header response code is a 404?
-
Hi Brad
Firstly it is great from a usability point of view to have a custom 404 page and I would link it to your most popular content and maybe add a search feature on the page for your site to help find the content that is missing. I have come across some nice 404s that actually have very concise sitemap in order to help the visitor navigate the site.In order to prevent Google from indexing your 404 page you need to make sure it returns an actuall 404 HTTP status code.
In order to understand how Goolgebot crawls your site I would look at the following post from Google themselves - https://support.google.com/webmasters/answer/182072?hl=en
Rather than being concerned about a 404 page having links on to keep the crawl going make sure you have an XML sitemap that you have submitted to Google via Webmaster Tools as this will help your crawl process.
Googlebot alots a set amount of time to crawling your site and it doesn't just stop crawling because it encounters a 404 error. However make sure that you monitor Google Webmaster Tools and take care of any reported 404s with 301 redirects for instance if the page has changed location. You will notice that Googlebot reports 404 erros on the days it finds them and these can often be multiple 404 errors encountered in one visit to your site by Googlebot. Keeing an eye on this and making sure you keep it updated will make your site as crawl efficient as possible which is clearly what you are after - as we all are
I thought this would also be interesting reading in relation to this - http://googlewebmastercentral.blogspot.co.uk/2011/05/do-404s-hurt-my-site.html
Hope this helps
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
301 redirect to search results page?
Hi - we just launched our redesigned website. On the previous site, we had multiple .html pages that contained links to supporting pdf documentation. On this new site, we no longer have those .html landing pages containing the links. The question came up, should we do a search on our site to gather a single link that contains all pdf links from the previous site, and set up a redirect? It's my understanding that you wouldn't want google to index a search results page on your website. Example: old site had the link http://www.oldsite.com/technical-documents.html new site, to see those same links would be like: http://www.newsite.com/resources/search?View+Results=&f[]=categories%3A196
Intermediate & Advanced SEO | | Jenny10 -
How to handle broken links to phantom pages appearing in webmaster tools
Hi,Would love to hear different experiences and thoughts on this one. We have a site that is plagued with 404's in the Webmaster Tools. A significant number of them have never existed, for instance affiliates have linked to them with the wrong URL or scraper sites have linked to them with a truncated version of the URL and an ellipsis eg; /my-nonexistent... What's the best way to handle these? If we do nothing and mark as fixed, they reappear in the broken links report. If we 301 redirect and mark as fixed they reappear. We tried 410 (gone forever) and marking as fixed; they re-appeared. We have a lot of legacy broken links and we would really like to clean up our WMT broken link profile - does anyone know of a way we can make these links to non extistent pages disappear once and for all? Many thanks in advance!
Intermediate & Advanced SEO | | dancape0 -
Are links on the page like this detrimental?
Hello, on www.ditalia.com.au are the links at the bottom of the page under: Latest Blog Posts, Most Popular Blogs, Fabric & Lace, Wedding Dresses..., useful or detrimental to SEO?
Intermediate & Advanced SEO | | infinart0 -
My warning report says I have too many on page links - 517! I can't find 50% of them but my q is about no follow
if we put 'no follow' on some of these links does that mean the search engines won't index the no follow pages even if those pages are linked to from elsewhere? no link juice will flow from the page with the (no follow) links on? Just trying to understand why my rankings have dropped so dramatically in the last 6 weeks or so since we redesigned the site, and it might be that now we have too many links on the homepage. This is the page http://www.suffolktouristguide.com/ All suggestions appreciated!
Intermediate & Advanced SEO | | SarahinSuffolk0 -
Domain Links or SubDomain Links, which is better?
Hi, I only now found out that www.domain.com and www.domain.com/ are different. Most of my external links are directed to www.domain.com/
Intermediate & Advanced SEO | | BeytzNet
Which I understand is considered the subdomain and not the domain. Should I redirect? (and if so how?)
Should I post new links only to my domain?0 -
Best way to block a search engine from crawling a link?
If we have one page on our site that is is only linked to by one other page, what is the best way to block crawler access to that page? I know we could set the link to "nofollow" and that would prevent the crawler from passing any authority, and we can set the page to "noindex" to prevent it from appearing in search results, but what is the best way to prevent the crawler from accessing that one link?
Intermediate & Advanced SEO | | nicole.healthline0 -
Maximum of 100 links on a page vs rel="nofollow"
All, I read within the SEOmoz blog that search engines consider 100 links on a page to be plenty, and we should try (where possible) to keep within the 100 limit. My question is; when a rel="nofollow" attribute is given to a link, does that link still count towards your maximum 100? Many thanks Guy
Intermediate & Advanced SEO | | Horizon0 -
How does one know where to insert the right strips of coding on the right pages for Canonical Links?
On my Website, I am the only SEO optimizer wizard person. I have to teach myself everything and I get overwhelmed a lot. I recently started using SEOMOZ and on my report it stated we had duplicate page titles and that it was bad and should be fixed quickly. So I did my research and found that I needed to use canonical links to reference one page to be indexed. However my problem lies in exactly how to add this coding to my site. I greatly appreciate any help or at least looking at this question.
Intermediate & Advanced SEO | | FrontlineMobility0