Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Do you add 404 page into robot file or just add no index tag?
-
Hi,
got different opinion on this so i wanted to double check with your comment is.
We've got /404.html page and I was wondering if you would add this page to robot text so it wouldn't be indexed or would you just add no index tag? What would be the best approach?
Thanks!
-
Hello Rubix,
Saijo gave you some great advice, but I'm concerned about the fact that you have that page in the first place, and that it produces those URL parameters. It suggests to me that instead of showing a 404 error on the contact-office.aspx page (assuming that pages doesn't exist on that URL) you are redirecting the user who tries to access that URL to the /404.html page (e.g. /404.html?aspxerrorpath=/contact-office.aspx).
Typically you want the 404 http status code to show on the URL the user is trying to unsuccessfully access. In this case instead of redirecting them to your "404 page URL" you would want to show your customized 404 message (and ensure it returns a 404 status code, use this tool) on www.yourdomain.com/contact-office.aspx.
I hope this makes sense to you. If not, feel free to ask for clarification.
-
404 are OK on your site just make sure you send the proper 404 header response for the 404 page ... Google does NOT index 404 pages ( as long as it sends the 404 header response ) , so you don't need to block them via robots.txt or meta robots.
Infact GWT warns you about these if they are able to crawl the so called 404 pages that doesn't send a 404 header response , so I think its a good idea NOT to noindex them you will get the warning if something is wrong.
Google will only index your 404 if you don't do that..they call it soft 404 : https://support.google.com/webmasters/answer/181708?hl=en
worth reading : http://moz.com/learn/seo/http-status-codes
-
Thanks Martijn,
I actually want to know what would you do for the 404 page itself. It is something like:
www.mainurl.com/404.html and for some reason this started to create some other links such as
www.mainrul.com/404.html?aspxerrorpath=/contact-office.aspx
Do you think I should add 404 page and subpages to Robot.txt ?
Thanks!
-
Hi Sida,
I would add a noindex to the page and as you also will return the 404 status code this is enough data for Google to tell not to index the page itself.
Hope this answers your question.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
When should you 410 pages instead of 404
Hi All, We have approx 6,000 - 404 pages. These are for categories etc we don't do anymore and there is not near replacement etc so basically no reason or benefit to have them at all. I can see in GWT , these are still being crawled/found and therefore taking up crawler bandwidth. Our SEO agency said we should 410 these pages?.. I am wondering what the difference is and how google treats them differently ?. Do anyone know When should you 410 pages instead of 404 ? thanks Pete
Intermediate & Advanced SEO | | PeteC120 -
Date of page first indexed or age of a page?
Hi does anyone know any ways, tools to find when a page was first indexed/cached by Google? I remember a while back, around 2009 i had a firefox plugin which could check this, and gave you a exact date. Maybe this has changed since. I don't remember the plugin. Or any recommendations on finding the age of a page (not domain) for a website? This is for competitor research not my own website. Cheers, Paul
Intermediate & Advanced SEO | | MBASydney0 -
Wordpress Tag Pages - NoIndex?
Hi there. I am using Yoast Wordpress Plugin. I just wonder if any test have been done around the effects of Index vs Noindex for Tag Pages? ( like when tagging a word relevant to an article ) Thanks 🙂 Martin
Intermediate & Advanced SEO | | s_EOgi_Bear0 -
Whats the best way to remove search indexed pages on magento?
A new client ( aqmp.com.br/ )call me yestarday and she told me since they moved on magento they droped down more than US$ 20.000 in sales revenue ( monthly)... I´ve just checked the webmaster tool and I´ve just discovered the number of crawled pages went from 3.260 to 75.000 since magento started... magento is creating lots of pages with queries like search and filters. Example: http://aqmp.com.br/acessorios/lencos.html http://aqmp.com.br/acessorios/lencos.html?mode=grid http://aqmp.com.br/acessorios/lencos.html?dir=desc&order=name Add a instruction on robots.txt is the best way to remove unnecessary pages of the search engine?
Intermediate & Advanced SEO | | SeoMartin10 -
Do 404 Pages from Broken Links Still Pass Link Equity?
Hi everyone, I've searched the Q&A section, and also Google, for about the past hour and couldn't find a clear answer on this. When inbound links point to a page that no longer exists, thus producing a 404 Error Page, is link equity/domain authority lost? We are migrating a large eCommerce website and have hundreds of pages with little to no traffic that have legacy 301 redirects pointing to their URLs. I'm trying to decide how necessary it is to keep these redirects. I'm not concerned about the page authority of the pages with little traffic...I'm concerned about overall domain authority of the site since that certainly plays a role in how the site ranks overall in Google (especially pages with no links pointing to them...perfect example is Amazon...thousands of pages with no external links that rank #1 in Google for their product name). Anyone have a clear answer? Thanks!
Intermediate & Advanced SEO | | M_D_Golden_Peak0 -
Do 404 pages pass link juice? And best practices...
Last year Google said bad links to 404 pages wouldn't hurt your site. Could that still be the case in light of recent Google updates to try and combat spammy links and negative SEO? Can links to 404 pages benefit a website and pass link juice? I'd assume at the very least that any link juice will pass through links FROM the 404 page? Many websites have great 404 pages that get linked to: http://www.opensiteexplorer.org/links?site=http%3A%2F%2Fretardzone.com%2F404 - that was the first of four I checked from the "60 Really Cool...404 Pages" that actually returned the 404 HTTP Status! So apologies if you find the word 'retard' offensive. According to Open Site Explorer it has a decent Page Authority and number of backlinks - but it doesn't show in Google's SERPs. I'd never do it, but if you have a particularly well-linked to 404 page, is there an argument for giving it 200 OK Status? Finally, what are the best practices regarding 404s and address bar links? For example, if
Intermediate & Advanced SEO | | Alex-Harford
www.examplesite.com/3rwdfs returns a 404 error, should I make that redirect to
www.examplesite.com/404 or leave it as is? Redirecting to www.examplesite.com/404 might not be user-friendly as people won't be able to correct the URL in the address bar. But if I have a great 404 page that people link to, I don't want links going to loads of random pages do I? Is either way considered best practice? If I did a 301 redirect I guess it would send the wrong signal to the crawlers? Should I use a 302 redirect, or even a 304 Not Modified redirect?1 -
Soft 404's from pages blocked by robots.txt -- cause for concern?
We're seeing soft 404 errors appear in our google webmaster tools section on pages that are blocked by robots.txt (our search result pages). Should we be concerned? Is there anything we can do about this?
Intermediate & Advanced SEO | | nicole.healthline4 -
Could you use a robots.txt file to disalow a duplicate content page from being crawled?
A website has duplicate content pages to make it easier for users to find the information from a couple spots in the site navigation. Site owner would like to keep it this way without hurting SEO. I've thought of using the robots.txt file to disallow search engines from crawling one of the pages. Would you think this is a workable/acceptable solution?
Intermediate & Advanced SEO | | gregelwell0