Google Webmaster Tools: fixing 20,000+ crawl errors
-
Hi,
I'm trying to gather all the 404 crawl errors on my website after a recent hacking that I've been working to rectify and clean up. Webmaster Tools states that I have over 20,000 crawl errors, but I can only download a sample of 1,000. Is there any way to get the full list, instead of correcting 1,000 errors, marking them as fixed, and waiting for the next batch of 1,000 to be listed in Webmaster Tools?
The current method is quite time-consuming, and I want to take care of all the errors in one shot instead of over the course of a month.
-
You can use Screaming Frog to pinpoint where your 404s are coming from. Here's a great write-up with a few different ways to use SF for this: https://www.screamingfrog.co.uk/broken-link-checker/
Another option is Google Analytics.
- First, navigate to your All Pages report, then set primary dimension to Page Title.
- Next, go to your site and trigger a 404. Take note of the page title; it should be something like 'Page Not Found'.
- Whatever that page title is on your 404 page, enter that in the inline filtering and it'll narrow the reporting down to just 404 pages.
- Then drill down into that result and see a full list of URLs that are throwing a 404.
- Set the secondary dimension to Previous Page Path to see the page that linked to the broken page.
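If you have access to your raw server logs, they are another way around the 1,000-error export limit: every 404 the server returned is in there, along with the referrer that led to it. Here's a minimal Python sketch of that idea; the combined log format and the sample lines are assumptions, so adjust the regex to whatever format your server actually writes:

```python
import re
from collections import Counter

# Matches the Apache/Nginx "combined" log format (an assumption --
# adjust the pattern to your server's actual log format).
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[[^\]]+\] "(?:GET|POST|HEAD) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)"'
)

def find_404s(lines):
    """Return a Counter of (path, referrer) pairs that returned 404."""
    hits = Counter()
    for line in lines:
        m = LOG_LINE.search(line)
        if m and m.group("status") == "404":
            hits[(m.group("path"), m.group("referrer"))] += 1
    return hits

# Hypothetical sample lines; in practice, read your access log file.
sample = [
    '1.2.3.4 - - [01/Jan/2017:10:00:00 +0000] "GET /old-page HTTP/1.1" 404 512 "http://example.com/blog" "Mozilla/5.0"',
    '1.2.3.4 - - [01/Jan/2017:10:00:01 +0000] "GET /home HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
]
for (path, referrer), count in find_404s(sample).items():
    print(path, referrer, count)
```

Sorting the resulting Counter by count tells you which broken URLs (and which referring pages) to fix first.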
Hope that's helpful!
Related Questions
-
Google Indexing Stopped
Hello Team, A month ago, Google was indexing more than 235,000 pages; now that has dropped to 11K. I have cross-checked almost everything, including content, backlinks, and schemas. Everything looks fine except the server response time: being a heavy website (or maybe due to server issues), it has an average loading time of 4 seconds. I would also like to mention that I have been using the same server since I started working on the website, and, as said above, a month ago the indexing rate was more than 235,000 and it has now dropped to 11K with nothing changed. I have tried my best researching this, so if you have had any such experiences, please share your solutions to this problem.
Intermediate & Advanced SEO | jeffreyjohnson
-
Google Search Console Crawl Errors?
We are using Google Search Console to monitor crawl errors, and it seems Google is listing errors that are not actual errors. For instance, it shows this as "Not found": https://tapgoods.com/products/tapgoods__8_ft_plastic_tables_11_available. The page does not exist, but we cannot find any pages linking to it. The report has a "Linked From" tab, but if I look at the source of those pages, the link is not there. In this case, it shows the front page (listed twice, once for http and once for https). Also, one of the pages it shows as linking to the non-existent page above is itself a non-existent page. We marked all the errors as fixed last week, and this week they came up again; two-thirds are the same pages we marked as fixed last week. Is this an issue with Google Search Console? Are we being penalized for a non-existent issue?
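One way to sanity-check stale "Linked From" reports like this is to fetch each listed source page and check whether the link is actually still present in its HTML. A minimal Python sketch of that check; the fetching step is omitted, and the sample HTML and URLs are hypothetical:

```python
from html.parser import HTMLParser

class LinkFinder(HTMLParser):
    """Collects every href value found in a page's HTML."""
    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)

def page_links_to(html, target_url):
    """True if the HTML contains an <a> tag pointing at target_url."""
    finder = LinkFinder()
    finder.feed(html)
    return target_url in finder.hrefs

# Hypothetical "Linked From" page that no longer contains the link:
source_html = '<html><body><a href="/products/other-table">Other</a></body></html>'
broken = "https://tapgoods.com/products/tapgoods__8_ft_plastic_tables_11_available"
print(page_links_to(source_html, broken))  # prints False: the reported source no longer links to the URL
```

If the check comes back False for every reported source, the "Linked From" data is stale and the error can reasonably be marked as fixed.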
Intermediate & Advanced SEO | TapGoods
-
Breadcrumbs not displaying on Google
Hello, We have set breadcrumbs on some of our pages (example: https://www.globecar.com/en/car-rental/locations/canada/qc/montreal/airport-yul) for testing purposes, and for some reason they are still not showing up on Google: http://screencast.com/t/BSHQqkP69r6F. Yet when I test the page with the Google Structured Data Testing Tool, all is good: http://screencast.com/t/Fzlz3zae. Any ideas? Thanks, Karim
Intermediate & Advanced SEO | GlobeCar
-
Why is my site not getting crawled by Google?
Hi Moz Community, I have an escort directory website that is built with Ajax. We basically followed all the recommendations, like implementing the escaped-fragment code so Google would be able to see the content. The problem is, whenever I submit my sitemap in Google Webmaster Tools, it always shows 700 URLs submitted and only 12 static pages indexed. I did a site: query and only a handful of pages were indexed. Does it have anything to do with my site being on HTTPS and not on HTTP? My site is under HTTPS and all my content is Ajax-based. Thanks
Intermediate & Advanced SEO | en-gageinc
-
Getting a Sitemap for a Subdomain into Webmaster Tools
We have a subdomain that is a Wordpress blog, and it takes days, sometimes weeks, for most posts to be indexed. We are using the Yoast plugin for SEO, which creates the sitemap.xml file. The problem is that the sitemap.xml file is located at blog.gallerydirect.com/sitemap.xml, and Webmaster Tools will only allow the insertion of the sitemap as a directory under the gallerydirect.com account. Right now, we have the sitemap listed in the robots.txt file, but I really don't know if Google is finding and parsing the sitemap. As far as I can tell, I have three options, and I'd like to get thoughts on which of the three is the best choice (that is, unless there's an option I haven't thought of): 1. Create a separate Webmaster Tools account for the blog. 2. Copy the blog's sitemap.xml file from blog.gallerydirect.com/sitemap.xml to the main web server, list it as something like gallerydirect.com/blogsitemap.xml, then notify Webmaster Tools of the new sitemap on the gallerydirect.com account. 3. Do an .htaccess redirect on the blog server, such as RewriteRule ^sitemap\.xml$ http://gallerydirect.com/blogsitemap_index.xml [R=301,L], then notify Webmaster Tools of the new blog sitemap in the gallerydirect.com account. Suggestions on what would be the best approach to be sure that Google is finding and indexing the blog ASAP?
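For option 3, note that a RewriteRule pointing at an absolute URL on another host triggers a redirect anyway, but it's clearer to make the redirect explicit. A hedged sketch of what the blog server's .htaccess might look like; the paths and filenames are taken from the question and may need adjusting:

```apache
# On blog.gallerydirect.com -- redirect sitemap requests to the main domain
RewriteEngine On
RewriteRule ^sitemap\.xml$ http://gallerydirect.com/blogsitemap_index.xml [R=301,L]
```

The [R=301] flag makes it a permanent redirect and [L] stops further rule processing for the request.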
Intermediate & Advanced SEO | sbaylor
-
Disavow Tool - WWW or Not?
Hi All, Just a quick question... A shady domain linking to my website is indexed in Google for both example.com and www.example.com. If I want to disavow the entire domain, do I need to submit both: domain:www.example.com and domain:example.com, or just: domain:example.com? Cheers!
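For what it's worth, Google's domain: directive is generally understood to cover the registered domain and its subdomains (including www), so the bare-domain line alone should suffice; listing both is harmless if you want to be safe. A hypothetical disavow file:

```text
# Hypothetical disavow.txt -- one "domain:" line per domain to ignore.
# domain:example.com is generally understood to also cover subdomains
# such as www.example.com, but listing both is harmless:
domain:example.com
domain:www.example.com
```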
Intermediate & Advanced SEO | Carlos-R
-
Who is beating you on Google (after Penguin)?
Hi,
After about a month of Penguin and one update, I'm starting to notice an annoying pattern in who is beating me in the rankings on Google, and I was wondering if anybody else has noticed this. The sites that are beating me fall, almost without exception, into these two categories: 1) Super sites that have little or nothing to do with the service I am offering. It is not their homepages that are beating me; in almost all cases, they are simply pages hidden in their forums where somebody mentioned, in passing, something relating to what I do. 2) Nobodies: sites that have absolutely no links back to them and look like they were made by a five-year-old. Has anybody else noticed this? I am just wondering whether what I see applies only to my sites or is a pattern across the web. Does this mean that for small sites to rank, it is now all about on-page SEO? If it's all about on-page, that is great: much easier than link building. But I want to make sure others see the same thing before dedicating a lot of time to overhauling my sites and creating new content. Thanks!
Intermediate & Advanced SEO | rayvensoft
-
How to remove an entire subdomain from the Google index with URL removal tool?
Does anyone have clear instructions for how to do this? Do we need to set up a separate GWT account for each subdomain? I've tried using the URL removal tool, but it will only allow me to remove URLs indexed under my domain (i.e. domain.com, not subdomain.domain.com). Any help would be much appreciated!
Intermediate & Advanced SEO | nicole.healthline