Do search engines crawl links on 404 pages?
-
I'm currently in the process of redesigning my site's 404 page. I know there are all sorts of best practices from a UX standpoint, but what about search engines? Since these pages are roadblocks in the crawl process, I was wondering if there's a way to help the search engine continue its crawl.
Does putting links to "recent posts" or something along those lines allow the bot to continue on its way, or does the crawl stop at that point because the 404 HTTP status code is thrown in the header response?
-
Okay, thanks Alan!
-
Hi Brad
Sorry I have only just come back to you - it was late at night here in the UK, but it looks like Alan has already answered your question.
Have you tested your 404 page with Fetch as Google in Webmaster Tools? You should see that it can see the links on your 404 page and, as such, will continue crawling them, as Alan has said.
So, in my opinion, what benefits a user will also benefit Google when it crawls your site.
-
Sorry, yes, it should crawl the links - they used to do that.
But you can prove it to yourself, by doing what I said - and then report back.
-
Yes it will continue crawling or yes it will stop the crawl?
-
Yes, and you can test it by creating a page that is linked only from your 404 page and nowhere else, then checking your logs or analytics.
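For the log check, something like the following Python could work. It is only a minimal sketch, assuming the server writes the common "combined" access log format. The test path `/crawl-test-orphan`, the sample log lines, and the function name `bot_hits` are all made up for illustration and are not from this thread.

```python
import re

# Combined Log Format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def bot_hits(lines, path, agent_token="Googlebot"):
    """Return (time, status) for each request to `path` whose
    user agent string contains `agent_token`."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match["path"] == path and agent_token in match["agent"]:
            hits.append((match["time"], int(match["status"])))
    return hits


# Hypothetical sample lines; in practice, read them from your access log.
SAMPLE_LOG = [
    '66.249.66.1 - - [10/Mar/2014:06:25:01 +0000] "GET /no-such-page HTTP/1.1" '
    '404 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.66.1 - - [10/Mar/2014:06:25:09 +0000] "GET /crawl-test-orphan HTTP/1.1" '
    '200 2048 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [10/Mar/2014:07:00:00 +0000] "GET /crawl-test-orphan HTTP/1.1" '
    '200 2048 "-" "Mozilla/5.0 (Windows NT 6.1)"',
]

if __name__ == "__main__":
    for when, status in bot_hits(SAMPLE_LOG, "/crawl-test-orphan"):
        print(when, status)
```

If the bot shows up requesting the orphan page, it most likely found the link on the 404 page. User agent strings can be spoofed, so treat a match as a hint unless you also verify the requesting IP (for example with a reverse DNS lookup).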
-
Hey Matt,
Thanks for the reply. I'm aware of all the best practice stuff but thanks for sending through. It didn't quite answer my question so let me rephrase...
Will a bot follow a hyperlink (like the example below) on a 404 page or will it stop the crawl on that page (not on the whole site) because the header response code is a 404?
-
Hi Brad
Firstly, it is great from a usability point of view to have a custom 404 page. I would link it to your most popular content and maybe add a site search feature on the page to help visitors find the content that is missing. I have come across some nice 404s that actually have a very concise sitemap to help the visitor navigate the site.
In order to prevent Google from indexing your 404 page, you need to make sure it returns an actual 404 HTTP status code.
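To make the status code point concrete, here is a minimal sketch using Python's standard-library `wsgiref`. It assumes a tiny WSGI app; the `PAGES` content, the link targets, and the helper name `request` are hypothetical. The idea is that the custom page carries helpful links while the response status is still a real 404 rather than a 200.

```python
from wsgiref.util import setup_testing_defaults

# Hypothetical site content, for illustration only.
PAGES = {
    "/": b"<h1>Home</h1>",
    "/popular-post": b"<h1>Popular post</h1>",
}

# Custom 404 body with links a visitor (or a bot) can follow.
NOT_FOUND_BODY = (
    b"<h1>Page not found</h1>"
    b'<ul><li><a href="/">Home</a></li>'
    b'<li><a href="/popular-post">Popular post</a></li></ul>'
)


def app(environ, start_response):
    """Serve known pages with 200; anything else gets the custom
    page together with a genuine 404 status."""
    path = environ.get("PATH_INFO", "/")
    if path in PAGES:
        status, body = "200 OK", PAGES[path]
    else:
        # The status line, not the page content, marks the URL as missing.
        status, body = "404 Not Found", NOT_FOUND_BODY
    start_response(status, [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def request(path):
    """Call the app in-process (no network) and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body


if __name__ == "__main__":
    print(request("/missing")[0])
```

A custom error page that answers with 200 is usually called a "soft 404"; checking the status line as above is a quick way to rule that out.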
In order to understand how Googlebot crawls your site, I would look at the following post from Google themselves - https://support.google.com/webmasters/answer/182072?hl=en
Rather than being concerned about a 404 page having links on it to keep the crawl going, make sure you have an XML sitemap that you have submitted to Google via Webmaster Tools, as this will help your crawl process.
Googlebot allots a set amount of time to crawling your site, and it doesn't just stop crawling because it encounters a 404 error. However, make sure that you monitor Google Webmaster Tools and take care of any reported 404s, for instance with 301 redirects if the page has changed location. You will notice that Googlebot reports 404 errors on the days it finds them, and these can often be multiple 404 errors encountered in one visit to your site by Googlebot. Keeping an eye on this and making sure you keep it updated will make your site as crawl efficient as possible, which is clearly what you are after - as we all are.
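As a rough illustration of that clean-up step, the Python sketch below splits a list of reported 404s into pages that have moved (and should 301) and pages that are genuinely gone. The `MOVED` map, the paths, and the function names are hypothetical; in practice the list of reported 404s would come from your own export.

```python
# Hypothetical map of old paths to their new locations.
MOVED = {
    "/old-pricing": "/pricing",
    "/blog/2012/launch": "/blog/launch",
}


def respond(path, moved=MOVED):
    """Return (status_code, location): 301 with the new location for
    moved pages, otherwise a plain 404 with no location."""
    target = moved.get(path)
    if target is not None:
        return 301, target
    return 404, None


def triage(reported_404s, moved=MOVED):
    """Split reported 404 paths into those that need a redirect
    and those that are truly missing."""
    to_redirect = {p: moved[p] for p in reported_404s if p in moved}
    still_missing = [p for p in reported_404s if p not in moved]
    return to_redirect, still_missing


if __name__ == "__main__":
    redirects, missing = triage(["/old-pricing", "/typo-in-url"])
    print("redirect:", redirects)
    print("leave as 404:", missing)
```

Pages that never existed or were removed on purpose can simply stay as 404s; only pages with a real new home need the redirect.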
I thought this would also be interesting reading in relation to this - http://googlewebmastercentral.blogspot.co.uk/2011/05/do-404s-hurt-my-site.html
Hope this helps