Massive Increase in 404 Errors in GWT
-
Last June, we transitioned our site to the Magento platform. When we did so, we naturally got an increase in 404 errors for URLs that were not redirected (for a variety of reasons: we hadn't carried the product for years, Google no longer got the same string when it did a "search" on the site, etc.). We knew these would be there and were completely fine with them.
We also got many 404s due to the way Magento had implemented their site map (putting in products that were not visible to customers, including all the different file paths to get to a product even though we use a flat structure, etc.). These were frustrating but we did custom work on the site map and let Google resolve those many, many 404s on its own.
Sure enough, a few months went by and GWT started to clear out the 404s. All the poor, nonexistent links from the site map and missing links from the old site - they started disappearing from the crawl notices and we slowly went from some 20k 404s to 4k 404s. Still a lot, but we were getting there.
Then, in the last 2 weeks, all of those links started showing up again in GWT and reporting as 404s. Now we have 38k 404s (way more than ever reported). I confirmed that these bad links are not showing up in our site map or anything and I'm really not sure how Google found these again.
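For anyone wanting to repeat the site map check, here is a rough sketch of one way to do it in Python. It assumes the reported 404 URLs are already in a plain list and that the site map follows the standard sitemaps.org format; the function names and the example.com URLs are made up for illustration and are not from our actual setup.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, List, Set

# Namespace used by standard sitemaps.org site maps.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(sitemap_xml: str) -> Set[str]:
    """Return every <loc> URL found in a site map XML string."""
    root = ET.fromstring(sitemap_xml)
    return {
        loc.text.strip()
        for loc in root.iter(SITEMAP_NS + "loc")
        if loc.text
    }


def reported_urls_in_sitemap(reported_404s: Iterable[str], sitemap_xml: str) -> List[str]:
    """Return the reported 404 URLs that are still listed in the site map."""
    listed = sitemap_urls(sitemap_xml)
    return sorted(url for url in set(reported_404s) if url in listed)


# Hypothetical example data, for illustration only.
EXAMPLE_SITEMAP = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>http://www.example.com/widget-a</loc></url>"
    "<url><loc>http://www.example.com/widget-b</loc></url>"
    "</urlset>"
)

if __name__ == "__main__":
    reported = [
        "http://www.example.com/widget-b",
        "http://www.example.com/discontinued-product",
    ]
    # Any URL printed here is a reported 404 that the site map still lists.
    print(reported_urls_in_sitemap(reported, EXAMPLE_SITEMAP))
```

An empty result means none of the reported URLs are coming from the site map, which is what I saw.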
I know, in general, these 404s don't hurt our site. But it just seems so odd. Is there any chance Google bots just randomly crawled a big ol' list of outdated links it hadn't tried for a while? And does anyone have any advice for clearing them out?
-
I'm just cynical enough to suspect this may be a byproduct of Google Webmaster Tools' recent inbound link meltdown. Huge numbers of GWT users are reporting that their inbound link reports have basically lost most of their links.
What if, in dealing with the problem, Google has gone back to an older version of the links database, which might recover more of the recent links, but also pull back a whack of those links it already discounted?
This is pure speculation on my part, but there's been so much volatility on Google's link reporting recently that I can't say I trust the data as far as I can toss it at the moment.
Have you tried a similar comparison to the data shown in Bing Webmaster Tools?
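If it helps, that comparison can be as simple as a set difference. The sketch below assumes you have already pulled each tool's reported 404 URLs into a plain Python list (how you export them is up to you); the function name and the sample URLs are hypothetical.

```python
from typing import Dict, Iterable, List


def compare_404_reports(gwt_404s: Iterable[str], bing_404s: Iterable[str]) -> Dict[str, List[str]]:
    """Split reported 404 URLs into those seen by both tools and by only one."""
    gwt = {url.strip() for url in gwt_404s}
    bing = {url.strip() for url in bing_404s}
    return {
        "both": sorted(gwt & bing),
        "gwt_only": sorted(gwt - bing),
        "bing_only": sorted(bing - gwt),
    }


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    gwt_list = ["http://www.example.com/old-a", "http://www.example.com/old-b"]
    bing_list = ["http://www.example.com/old-b"]
    for group, urls in compare_404_reports(gwt_list, bing_list).items():
        print(group, len(urls))
```

A large "gwt_only" group would at least be consistent with the problem being on Google's reporting side rather than something on the site, though it would not prove it.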
I'm sure I read of others encountering what you're talking about recently. Will see if I can find the references in case they found anything.
Paul