How to find all 404 deadlinks - webmaster only allows 1000 to be downloaded...
-
Hi Guys
I have a question...I am currently working on a website that was hit by a spam attack.
The website was hacked and 1000's of adult censored pages were created on the wordpress site.
The hosting company cleared all of the dubious files - but this has left 1000's of dead 404 pages.
We want to fix the dead pages but Google webmaster only shows and allows you to download 1000.
There are a lot more than 1000....does any know of any Good tools that allows you to identify all 404 pages?
Thanks, Duncan
-
The Moz crawl report will also show 404s. I sometimes find that different spiders may find different things. Between the Search Console report, Screaming Frog (great investment) and Moz, you should have a nice collection of things to fix.
-
I must second Dirk's suggestion of screaming frog, great tool and I use it daily, a license is well worth the cost. Although spider crawl of the site will only point out 404's that have are links from an existing page, so if the hosting company cleaned up the not all of these 404's will surface.
One approach I would suggest is run the current 1000 404's in GWT through Screaming frog as a manually added list, (do it in 2 batches if you have the free version), start a spreadsheet of the resulting 404's and start working through that list. Once you have the 404's mark those as fixed as GWT tools set a reminder to check back in a few days and after a few days export the new list of 1000 404's and run these through screaming frog adding the resulting list to your spreadsheet. Keep doing this until you get the 404's errors in GWT down a manageable level.
I hope that helps, good luck.
-
Probably the easiest solution is to buy a licence from Screaming Frog & to crawl your site locally. The tool can do a lot of useful stuff to audit sites and will show you not only the full list of 4xx errors but also the pages that link to them.
There is also a free version but that allows you to crawl only 500 pages - which in your case is probably not sufficient but it would allow you to see how it works.
Hope this helps,
Dirk
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
301 or 404 Question for thin content Location Pages we want to remove
Hello All, I have a Hire Website with many categories and individual location pages for each of the 70 depots we operate. However, being dynamic pages, we have thousands of thin content pages. We have decided to only concentrate on our best performing locations and get rid of the rest as its physically impossible to write unique content for all our location pages for every categories. Therefore my question is. Would it cause me problems by having to many 301's for the location pages I am going to re-direct ( i was only going to send these back to the parent category page) or should I just 404 all those location pages and at some point in the future when we are in a position to concentrate on these locations then redo them with new content ? in terms of url numbers It would affect a few thousand 301's or 404's depending on people thoughts. Also , does anyone know what percentage of thin content on a site should be acceptable ?.. I know , none is best in an ideal world but it would be easier if there we could get away with a little percentage. We have been affected by Panda , so we are trying to tidy things up as best at possible, Any advice greatly appreciated? thanks Peter
Intermediate & Advanced SEO | | PeteC120 -
Why would my ip address show up in my webmaster tools links?
I am showing thousands of links from my servers ip address. What would cause that?
Intermediate & Advanced SEO | | EcommerceSite0 -
404 errors
Hi, we have plenty of 404 errors. We just deal with those that are of the highest priority (the ones that have high page authority). We have also a lot of errors like this: http://www.weddingrings.com/www.yoy-search.com . Does it make sense to redirect those to the home page or leave them as an 404 error?
Intermediate & Advanced SEO | | alexkatalkin0 -
Should i fix 404 my errors?
We have about 250, 404 errors due to changing alot of page names throughout our site. I've read some articles saying to leave them and eventually they will go away. Normally I would do a 301 redirect. What's the best solution?
Intermediate & Advanced SEO | | JimDirectMailCoach0 -
Webmaster Tools: Total Indexed VS Ever Crawled
Ok, In WMT's under health > index status I have both total indexed and ever crawled ticked - It also looks like the data is broken up weekly. As an example say you have the following: Total Indexed: 1000 Ever Crawled: 5000 What is this say? It found 5000 pages but only indexed 1000 (20%). Thanks
Intermediate & Advanced SEO | | Bondara0 -
Rankings dropped off a cliff. Webmaster tools message: No manual spam actions found. Now what?
A week ago my rankings for http://www.top-10-dating-reviews.com (some adult content) dropped and I'm now getting now impressions. I submitted a reconsideration request as I was sure I hadn't violated any rules and today the reply was that no manual spam actions were found. Te email goes onto say there are a variety of other things that could affect rankings such as site architecture, not being able to crawl and algo changes. As far as I'm aware all these issues are fine. I'm not aware of any algo updates last weekend. my question is what can I do now? I need to get my rankings back but there's nothing wrong with my site or practices.
Intermediate & Advanced SEO | | SamCUK0 -
Why are Pages returning 404 errors not being dropped?
Our webmaster tools continues to return anywhere upwards of 750 pages that have 404 errors. These are from pages of a previous site no longer used. However this was over 1 year ago these pages were dropped along with the 301 re-directs. Why is Google not clearing these from webmaster tools but re-listing them again after 3 month cycle? Is it because external sites have links to these pages? If so should I put a 301 in place (most of these site are forums and potentially dodgy directories etc from previous poor link building programs) or ask for a manual removal?
Intermediate & Advanced SEO | | Towelsrus0 -
How to find pages hardest hit
I have been hearing that panda can penalize a website for low quality pages. I have run duplicate content check and done my best to go through the whole website. I hear many people talking about deleting hardest hit pages, or fixing hardest hit pages. My question is how can I find which pages on our website are hardest hit? Is there anyway to check a website for pages that might score low. We do have a ecom section to the website which I am concerned might be considered low quality for each product page. Any advice would be a great help.
Intermediate & Advanced SEO | | fertilityhealth0