Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Mass 404 Checker?
-
Hi all,
I'm currently looking after a collection of old newspaper sites that have had various developments during their time. The problem is that there are so many 404 pages all over the place, and the sites are bleeding link juice everywhere, so I'm looking for a tool where I can check a lot of URLs at once.
For example, I have done a random sampling of the target URLs from an OSE report and some of them 404 (eek!), but there are too many to check manually to know which ones are still live and which ones have 404'd or are redirecting. Is there a tool anyone uses for this, or a way one of the SEOMoz tools can do this?
Also, I've asked a few people personally how to check this and they've suggested Xenu, but Xenu won't work as it only checks the current site navigation.
Thanks in advance!
-
Hi,
We are an SEO agency in Turkey called Clicksus. We can recommend deadlinkchecker.com; it is very easy to use and good.
-
Glad I was able to help!
It would be great if you could mark the answers you found helpful, and mark the question as answered if you feel you got the information you needed. That will make it even more useful for other users.
Paul
-
Wow, nice one mate, I did not know that was in the Top Pages tab. That is perfect! I'll remember to click around more often now.
I found this tool on my adventures, which was exactly what I was after: http://www.tomanthony.co.uk/tools/bulk-http-header-compare/
Also, cheers for your walkthrough. I'm still having problems with the site bleeding 404 pages, but first things first: fixing the pages that have high-quality links pointing to them.
Cheers again!
-
Sorry, one additional point, since you mentioned using Open Site Explorer...
Go to the Top Pages tab in OSE and filter the results to include only incoming links. One of the columns in that report is HTTP Status. It will tell you if the linked page's status is 404. Again, just download the full CSV, sort the resulting spreadsheet by the Status column and you'll be able to generate a list of URLs that no longer have pages associated with them to start fixing.
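If the export is too large to sort comfortably in a spreadsheet, the same filter can be scripted. This is only a minimal sketch: the column names `URL` and `HTTP Status` are assumptions about how the CSV is labelled, and the sample rows are made up, so adjust both to match the real file.

```python
import csv
import io

# Made-up sample rows; the real export's column names may differ.
SAMPLE = (
    "URL,HTTP Status\n"
    "http://example.com/a,200\n"
    "http://example.com/b,404\n"
    "http://example.com/c,301\n"
)


def urls_with_status(csv_text, wanted="404", url_col="URL", status_col="HTTP Status"):
    """Return the URLs whose status column equals `wanted`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row[url_col]
        for row in reader
        if (row.get(status_col) or "").strip() == wanted
    ]


broken = urls_with_status(SAMPLE)
print(broken)
```

For a real export, read the file's text and pass it in place of `SAMPLE`.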
Paul
-
Ollie, if I'm understanding your question correctly, the easiest place for you to start is with Google Webmaster Tools. You're looking to discover URLs of pages that used to exist on the sites, but no longer do, yes?
If you click on the Health link in the left sidebar, then click Crawl Errors, you get a page showing the different kinds of errors the Google crawler has detected. Click on the Not Found error box and you'll get a complete list of all the pages Google is aware of that can no longer be found on your site (i.e. 404s).
You can then download the whole list as a CSV and start cleaning them up from there.
This list will basically include pages that have been linked to at one time or another from other sites on the web, so while not exhaustive, it will show the ones that are most likely to still be getting traffic. For really high-value incoming links, you might even want to contact the linking site and see if you can get them to relink to the correct new page.
Alternatively, if you can access the sites' server logs, they will record all the incoming 404s with their associated URLs as well, and you can get a dump from the log files to begin creating your work list. I just find it's usually easier to get access to Webmaster Tools than to get at a client's server log files.
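For the server log route, a short script can pull out the 404s and rank them by how often they are hit. This sketch assumes the logs are in the common or combined access log format used by Apache and nginx by default; the regular expression and the function name `count_404s` are illustrative, not taken from any particular tool.

```python
import re
from collections import Counter

# Matches the quoted request line and the status code that follows it,
# e.g. "GET /old-article HTTP/1.1" 404
REQUEST_RE = re.compile(r'"[A-Z]+ (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')


def count_404s(lines):
    """Return (path, hits) pairs for requests that returned 404, most-hit first."""
    hits = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match and match.group("status") == "404":
            hits[match.group("path")] += 1
    return hits.most_common()
```

Passing an open log file to `count_404s` works because a file object iterates line by line. The paths with the most hits are a reasonable place to start the work list.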
Is that what you're looking for?
Paul
-
To be honest, I don't know anyone who has bad things to say about Screaming Frog - aside from the cost, but as you said, really worth it.
However, it is free up to a 500-page crawl limit, so perhaps give it a go?
Andy
-
Cheers Andy & Kyle
The problem with this tool is that it works similarly to Xenu, which is great for making sure your current navigation isn't causing problems.
My problem is that there are over 15k links pointing to all sorts of articles and I have no idea what's live and what's not. Running the site through that tool won't report the pages that aren't linked in the navigation anymore but are still being linked to.
For example, manually checking some of the links, I've found that the site has quite a few links from the BBC going to 404 pages. Running the site through Xenu or Screaming Frog doesn't find these pages.
Ideally, I'm after a tool I can slap a load of URLs into and it'll do a simple HTTP header check on them. The only tools I can find do 1 or 10 at a time, which would take quite a while for 15k!
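A bulk header check like this can also be scripted with nothing but the Python standard library. The following is a rough sketch rather than a finished tool: the function names, the worker count, and the status groupings in `classify` are all assumptions chosen for illustration. It sends a HEAD request per URL and deliberately does not follow redirects, so a 301 or 302 is reported as such along with its target. Some servers reject HEAD requests, in which case a GET fallback would be needed.

```python
import urllib.error
import urllib.request
from concurrent.futures import ThreadPoolExecutor


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so 3xx codes are reported as-is."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


_opener = urllib.request.build_opener(_NoRedirect)


def check_url(url, timeout=10):
    """Return (url, status, location); status is None if the request failed."""
    try:
        req = urllib.request.Request(url, method="HEAD")
        with _opener.open(req, timeout=timeout) as resp:
            return url, resp.status, None
    except urllib.error.HTTPError as err:
        # With redirects disabled, 3xx responses arrive here alongside 4xx/5xx.
        return url, err.code, err.headers.get("Location")
    except (urllib.error.URLError, OSError, ValueError):
        return url, None, None


def classify(status):
    """Group a status code into a rough bucket for the work list."""
    if status is None:
        return "error"
    if 200 <= status < 300:
        return "live"
    if 300 <= status < 400:
        return "redirect"
    if status in (404, 410):
        return "gone"
    return "other"


def check_all(urls, workers=20, checker=check_url):
    """Check many URLs concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(checker, urls))
```

To use it, load the URLs one per line from a text file, pass the list to `check_all`, and write out each URL with its status and `classify` bucket. Keeping the worker count modest avoids hammering the site being checked.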
-
Agree with Screaming Frog. It's more comprehensive than Xenu's Link Sleuth.
It costs £99 for a year but is totally worth it.
I had a few issues with Xenu taking too long to compile a report or simply crashing.
-
Xenu's Link Sleuth - it's free and will go through internal links, external links, or both. It will also show you where the 404 page is being linked from.
It can also report 302s.
-
Screaming Frog Spider does a pretty good job...
It's as simple as entering the URL and leaving it to report back when completed.
Andy