How to find all 404 deadlinks - webmaster only allows 1000 to be downloaded...
-
Hi Guys
I have a question...I am currently working on a website that was hit by a spam attack.
The website was hacked and 1000's of adult censored pages were created on the wordpress site.
The hosting company cleared all of the dubious files - but this has left 1000's of dead 404 pages.
We want to fix the dead pages but Google webmaster only shows and allows you to download 1000.
There are a lot more than 1000....does any know of any Good tools that allows you to identify all 404 pages?
Thanks, Duncan
-
The Moz crawl report will also show 404s. I sometimes find that different spiders may find different things. Between the Search Console report, Screaming Frog (great investment) and Moz, you should have a nice collection of things to fix.
-
I must second Dirk's suggestion of screaming frog, great tool and I use it daily, a license is well worth the cost. Although spider crawl of the site will only point out 404's that have are links from an existing page, so if the hosting company cleaned up the not all of these 404's will surface.
One approach I would suggest is run the current 1000 404's in GWT through Screaming frog as a manually added list, (do it in 2 batches if you have the free version), start a spreadsheet of the resulting 404's and start working through that list. Once you have the 404's mark those as fixed as GWT tools set a reminder to check back in a few days and after a few days export the new list of 1000 404's and run these through screaming frog adding the resulting list to your spreadsheet. Keep doing this until you get the 404's errors in GWT down a manageable level.
I hope that helps, good luck.
-
Probably the easiest solution is to buy a licence from Screaming Frog & to crawl your site locally. The tool can do a lot of useful stuff to audit sites and will show you not only the full list of 4xx errors but also the pages that link to them.
There is also a free version but that allows you to crawl only 500 pages - which in your case is probably not sufficient but it would allow you to see how it works.
Hope this helps,
Dirk
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Webmaster is giving errors of Duplicate Meta Descriptions and Duplicate Title Tags
Webmaster is giving errors of Duplicate Meta Descriptions and Duplicate Title Tags after I changes the permalinks structure in wordpress. It there a quick fix for this and how damaging is the above for seo. Thanks T
Intermediate & Advanced SEO | | Taiger0 -
404's and Ecommerce - Products no longer for sale
Hi We regularly have products which are no longer sold and discontinued. As we have such a large site, webmaster tools regularly picks up new 404's. These 404 pages aren't linked to from anywhere on the site any longer, however WMT will still report them as errors. Does this affect site authority? Thank you
Intermediate & Advanced SEO | | BeckyKey0 -
Site migration - 301 or 404 for pages no longer needed?
Hi I am migrating from my old website to a new one on a different, server with a very different domain and url structure. I know it's is best to change as little as possible but I just wasn't able to do that. Many of my pages can be redirected to new urls with similar or the same content. My old site has around 400 pages. Many of these pages/urls are no longer required on the new site - should I 404 these pages or 301 them to the homepage? I have looked through a lot of info online to work this out but cant seem to find a definative answer. Thanks for this!! James
Intermediate & Advanced SEO | | Curran0 -
Why is my servers ip address showing up in Webmaster Tools?
In links to my site in Google Webmaster Tools I am showing over 28,000 links from an ip address. The ip address is the address that my server is hosted on. For example it shows 200.100.100.100/help, almost like there are two copies of my site, one under the domain name and one under the ip address. Is this bad? Or is it just showing up there and Google knows that it is the same since the ip and domain are from the same server?
Intermediate & Advanced SEO | | EcommerceSite0 -
What are Soft 404's and are they a problem
Hi, I have some old pages that were coming up in google WMT as a 404. These had links into them so i thought i'd do a 301 back to either the home page or to a relevant category or page. However these are now listed in WMT as soft 404's. I'm not sure what this means and whether google is saying it doesn't like this? Any advice welcomed.
Intermediate & Advanced SEO | | Aikijeff0 -
What are the best ways to fix 404 errors?
I recently changed the url of my main blog and now have about 100 404 errors. I did a redirect from the old url to the new one however still have errors. 1. Should I do a 301 redirect from each old blog post url to the new blog post url? 2. Should I just delete the old blog post (url) and rewrite the blog post? I"m not concerned about links to the old posts as a lot of them do not have many links.
Intermediate & Advanced SEO | | webestate0 -
Why Does Ebay Allow Internal Search Result Pages to be Indexed?
Click this Google query: https://www.google.com/search?q=les+paul+studio Notice how Google has a rich snippet for Ebay saying that it has 229 results for Ebay's internal search result page: http://screencast.com/t/SLpopIvhl69z Notice how Sam Ash's internal search result page also ranks on page 1 of Google. I've always followed the best practice of setting internal search result pages to "noindex." Previously, our company's many Magento eCommerce stores had the internal search result pages set to be "index," and Google indexed over 20,000 internal search result URLs for every single site. I advised that we change these to "noindex," and impressions from Search Queries (reported in Google Webmaster Tools) shot up on 7/24 with the Panda update on that date. Traffic didn't necessarily shoot up...but it appeared that Google liked that we got rid of all this thin/duplicate content and ranked us more (deeper than page 1, however). Even Dr. Pete advises no-indexing internal search results here: http://www.seomoz.org/blog/duplicate-content-in-a-post-panda-world So, why is Google rewarding Ebay and Sam Ash with page 1 rankings for their internal search result pages? Is it their domain authority that lets them get away with it? Could it be that noindexing internal search result pages is NOT best practice? Is the game different for eCommerce sites? Very curious what my fellow professionals think. Thanks,
Intermediate & Advanced SEO | | M_D_Golden_Peak
Dan0 -
Status Code: 404 Errors. How to fix them.
Hi, I have a question about the "4xx Staus Code" errors appearing in the Analysis Tool provided by SEOmoz. They are indicated as the worst errors for your site and must be fixed. I get this message from the good people at SEOmoz: "4xx status codes are shown when the client requests a page that cannot be accessed. This is usually the result of a bad or broken link." Ok, my question is the following. How do I fix them? Those pages are shown as "404" pages on my site...isn't that enough? How can fix the "4xx status code" errors indicated by SEOmoz? Thank you very much for your help. Sal
Intermediate & Advanced SEO | | salvyy0