Webmaster Tools keeps showing old 404 errors but doesn't show a "Linked From" URL. Why is that?
-
Hello Moz Community.
I have a question about 404 crawl errors in Webmaster Tools. A while ago we had an internal linking problem: a loop was generating malformed links on the fly. The error was identified and fixed at the time, but before the fix Google managed to index a lot of those malformed pages. We now see in our Webmaster Tools account that some of these URLs still appear as 404s, even though the issue is resolved and no internal links point to any of them. What confuses us even more is that Webmaster Tools shows nothing in the "Linked From" tab, where it usually lists sources for this type of error. Could these URLs still be in Google's cache or memory? We're not really sure what this means.
If anyone has an idea of what these errors showing up now means, we would really appreciate the help. Thanks.
-
Hi Jane, thanks for the follow-up. Every time we see errors showing up in WMT (mainly 404s) we remove the URLs right away, and we do indeed see the error count going down every 4-5 days (under HTML Improvements).
I'm just surprised at how long it takes Google to actually drop 404s from its index when the URL removal tool isn't used. I know that the higher the PR, the more often Google is likely to crawl and the faster these 404s get removed, I guess, but still.
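Before deciding between waiting and the removal tool, it can help to confirm that the stale URLs really do still return a 404 (and not, say, a soft 200). Here's a rough sketch of such a check; the paths are made up, and a tiny in-process server stands in for the real site so the example runs offline — in practice you'd point check_status at your own domain.

```python
# Hypothetical sanity check: confirm a list of old URLs currently returns 404.
# A small in-process HTTP server simulates the site so the sketch is runnable
# offline; replace it with your real host in practice.
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class FakeSite(BaseHTTPRequestHandler):
    REMOVED = {"/old-loop-page-1", "/old-loop-page-2"}  # made-up paths

    def do_GET(self):
        # Removed pages answer 404; everything else answers 200.
        self.send_response(404 if self.path in self.REMOVED else 200)
        self.end_headers()

    def log_message(self, *args):  # keep the demo output quiet
        pass

def check_status(host, port, path):
    """Fetch a path and return the raw HTTP status code."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    conn.request("GET", path)
    status = conn.getresponse().status
    conn.close()
    return status

server = ThreadingHTTPServer(("127.0.0.1", 0), FakeSite)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

statuses = {p: check_status("127.0.0.1", port, p)
            for p in ["/old-loop-page-1", "/current-page"]}
print(statuses)
server.shutdown()
```

If a supposedly removed URL comes back 200, Google has a legitimate reason to keep it indexed, which is worth ruling out first.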
-
Hi again,
Four months seems abnormally long, but it could have something to do with how many 404s there are - 400 is pretty high. Is this number at least going down every few weeks in WMT?
Cheers,
Jane
-
Hi Jane, we solved the cause of these errors more than four months ago at this point. There is no path to these URLs anymore, but they keep showing up, so it's taking Google quite a while to clean up. We estimate there are about 400 more of these 404 errors, so we still have some way to go, I guess.
-
Hi,
How long have these errors been appearing since you fixed the issue? It could be a case of Google looking for URLs on the site that it has seen in the past, even though there is no path to them anymore. With the pathway gone, it should eventually stop looking, but I'm curious how long ago the issue was fixed.
-
I hate to speculate on anything involving SEO, but I've always taken those 404s to be visits Google was able to grab data for. If Webmaster Tools can capture the data for a visit to a 404, it will report it.
What led me to this admittedly shaky assumption was how similar those 404s were to existing pages - as if someone tried to type in a URL and got it wrong, or deleted part of it and hit "enter".
Take the info for what it's worth, which isn't fact, just an idea to get you rolling.
-
I've had those too and they are quite annoying (I love seeing 0 errors, hehe). I just mark them as fixed and hope they don't show up again (they usually stop appearing after doing that once or twice).
If anyone has any other insight into this, please share!
Related Questions
-
Google Webmaster Tools is saying "Sitemap contains urls which are blocked by robots.txt" after HTTPS move...
Hi Everyone, I really don't see anything wrong with our robots.txt file after our HTTPS move that just happened, but Google says all URLs are blocked. The only change I know we need to make is changing the sitemap URL to https. Anything you all see wrong with this robots.txt file?

# robots.txt
#
# This file is to prevent the crawling and indexing of certain parts
# of your site by web crawlers and spiders run by sites like Yahoo!
# and Google. By telling these "robots" where not to go on your site,
# you save bandwidth and server resources.
#
# This file will be ignored unless it is at the root of your host:
# Used:    http://example.com/robots.txt
# Ignored: http://example.com/site/robots.txt
#
# For more information about the robots.txt standard, see:
# http://www.robotstxt.org/wc/robots.html
# For syntax checking, see:
# http://www.sxw.org.uk/computing/robots/check.html

# Website Sitemap
Sitemap: http://www.bestpricenutrition.com/sitemap.xml

# Crawlers Setup
User-agent: *

# Allowable Index
Allow: /*?p=
Allow: /index.php/blog/
Allow: /catalog/seo_sitemap/category/

# Directories
Disallow: /404/
Disallow: /app/
Disallow: /cgi-bin/
Disallow: /downloader/
Disallow: /includes/
Disallow: /lib/
Disallow: /magento/
Disallow: /pkginfo/
Disallow: /report/
Disallow: /stats/
Disallow: /var/

# Paths (clean URLs)
Disallow: /index.php/
Disallow: /catalog/product_compare/
Disallow: /catalog/category/view/
Disallow: /catalog/product/view/
Disallow: /catalogsearch/
Disallow: /checkout/
Disallow: /control/
Disallow: /contacts/
Disallow: /customer/
Disallow: /customize/
Disallow: /newsletter/
Disallow: /poll/
Disallow: /review/
Disallow: /sendfriend/
Disallow: /tag/
Disallow: /wishlist/
Disallow: /aitmanufacturers/index/view/
Disallow: /blog/tag/
Disallow: /advancedreviews/abuse/reportajax/
Disallow: /advancedreviews/ajaxproduct/
Disallow: /advancedreviews/proscons/checkbyproscons/
Disallow: /catalog/product/gallery/
Disallow: /productquestions/index/ajaxform/

# Files
Disallow: /cron.php
Disallow: /cron.sh
Disallow: /error_log
Disallow: /install.php
Disallow: /LICENSE.html
Disallow: /LICENSE.txt
Disallow: /LICENSE_AFL.txt
Disallow: /STATUS.txt

# Paths (no clean URLs)
Disallow: /*.php$
Disallow: /*?SID=
disallow: /*?cat=
disallow: /*?price=
disallow: /*?flavor=
disallow: /*?dir=
disallow: /*?mode=
disallow: /*?list=
disallow: /*?limit=5
disallow: /*?limit=10
disallow: /*?limit=15
disallow: /*?limit=20
disallow: /*?limit=250

Technical SEO | vetofunk
-
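A quick aside on the robots.txt question above: the non-wildcard rules can be sanity-checked offline with Python's built-in parser. Note that urllib.robotparser only does simple prefix matching and ignores Google-style wildcards such as /*?p=, so this sketch uses a hand-picked subset of the quoted rules; the domain is just the one from the question.

```python
# Offline check of a few prefix rules from the robots.txt quoted above.
# urllib.robotparser does NOT understand Google's * and $ wildcards, so
# wildcard lines are omitted; this only approximates Googlebot's reading.
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Allow: /index.php/blog/
Disallow: /checkout/
Disallow: /customer/
Disallow: /index.php/
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

base = "http://www.bestpricenutrition.com"
for path in ["/some-product.html", "/checkout/cart/", "/index.php/blog/post/"]:
    verdict = "allowed" if rp.can_fetch("*", base + path) else "blocked"
    print(path, "->", verdict)
```

Running the sitemap's URLs through a check like this (or Google's own robots.txt tester, which does handle wildcards) should show quickly whether the file itself is the problem or whether Google is reading a stale copy after the HTTPS move.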
Old URLs Appearing in SERPs
Thirteen months ago we removed a large number of non-corporate URLs from our web server. We created 301 redirects, and in some cases we simply removed the content as there was no place to redirect to. Unfortunately, all these pages still appear in Google's SERPs (not Bing's), both the 301'd pages and the pages we removed without redirecting. When you click on the redirected pages in the SERPs, you do get redirected, so we have ruled out any problems with the 301s. We have already resubmitted our XML sitemap, and when we run a crawl using Screaming Frog we do not see any of these old pages being linked to on our domain. We have a few different approaches we're considering to get Google to remove these pages from the SERPs and would welcome your input:
1. Remove the 301 redirect entirely so that visits to those pages return a 404 (much easier) or a 410 (would require some setup/configuration via WordPress). This of course means that anyone visiting those URLs won't be forwarded along, but Google may not drop those redirects from the SERPs otherwise.
2. Request that Google temporarily block those pages (done via GWMT), which lasts for 90 days.
3. Update robots.txt to block access to the redirecting directories.
Thank you.
Rosemary
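For what it's worth, option 1 boils down to a small routing decision: keep a 301 only where a real destination exists, and answer 410 Gone (a stronger "permanently removed" signal than 404) for everything retired. A minimal sketch of that logic, with made-up paths:

```python
# Illustrative routing decision for option 1 above. The paths and the
# redirect mapping are invented for the example, not the poster's site.

REDIRECTS = {"/old-pricing": "/pricing"}            # pages with a real new home
RETIRED = {"/old-press-release", "/old-campaign"}   # no replacement exists

def respond(path):
    """Return (status, location) for a requested path."""
    if path in REDIRECTS:
        return 301, REDIRECTS[path]
    if path in RETIRED:
        return 410, None  # 410 Gone: the page was removed deliberately
    return 200, None

for p in ["/old-pricing", "/old-press-release", "/pricing"]:
    print(p, respond(p))
```

In WordPress this mapping would typically live in a redirect plugin or the server config rather than application code, but the decision table is the same.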
Technical SEO | | RosemaryB3 -
Do links that point to an old URL retain value if we have the correct redirects?
I've recently taken over SEO for my company. There are a lot of old links that point to our old URL (www.examplecountry.com changed to www.examplewhatwedo.com). We have the correct redirects in place, and Open Site Explorer shows many of the links pointing to the old site even though I'm inputting the new URL. I just want to put my mind at rest that any value these links have doesn't get lost due to the URL change. Unfortunately a lot of them have the old URL as the anchor text... which I guess will decrease their quality? Thanks!
Technical SEO | | MarbellaSurferDude0 -
Rel="nofollow" for All Links on a Site that Charges for Advertising
If I run a site that charges other companies for listing their products, running banner advertisements, white paper downloads, etc. does it make sense to "no follow" all of their links on my site? For example: they receive a profile page, product pages and are allowed to post press releases. Should all of their links on these pages be "no follow"? It seems like a gray area to me because the explicit advertisements will definitely be "no followed" and they are not buying links, but buying exposure. However, I still don't know the common practice for links from other parts of their "package". Thanks
Technical SEO | | zazo0 -
Why won't the Moz plug-in's "Analyze Page" tool read data on a BigCommerce site?
We love our new BigCommerce site; just curious as to what the hang-up is.
Technical SEO | | spalmer0 -
Keyword rankings improve but traffic doesn't
I am working on a couple of SEO projects and have noticed over the past couple of months that the keyword rankings have improved immensely, with most of them now in the top 10 on Google, but traffic to the website still doesn't improve much. Can somebody explain the possible reasons behind this, and what I can do to attract more traffic?
Technical SEO | | KS__0 -
Can someone break down 'page level link metrics' for me?
Sorry for another basic question - can someone define page-level link metrics for me?
Technical SEO | | Benj250 -
404 errors on a 301'd page
I currently have a site that, when run through a sitemap tool (Screaming Frog or Xenu), returns a 404 error on a number of pages. The pages are indexed in Google, and when visited they do 301 to the correct page. Why would the sitemap tool be giving me a different result? Is it not reading the pages correctly?
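One hedged guess at the "different result" mystery: some link checkers (Xenu, for example) issue HEAD requests, while browsers issue GET, and a misconfigured server can answer the two methods differently. The toy server below reproduces that mismatch; it is purely illustrative, not the poster's actual setup.

```python
# A toy server that answers GET with a 301 but HEAD with a 404 - one
# plausible way a crawler and a browser see different status codes for
# the same URL. Entirely illustrative; not the poster's configuration.
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class QuirkySite(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(301)
        self.send_header("Location", "/new-page")
        self.end_headers()

    def do_HEAD(self):
        self.send_response(404)  # buggy: a well-behaved server would match do_GET
        self.end_headers()

    def log_message(self, *args):  # keep the demo output quiet
        pass

def status_for(method, port):
    """Request /old-page with the given method and return the status code."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request(method, "/old-page")
    status = conn.getresponse().status
    conn.close()
    return status

server = ThreadingHTTPServer(("127.0.0.1", 0), QuirkySite)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

results = {m: status_for(m, port) for m in ("GET", "HEAD")}
print(results)
server.shutdown()
```

Comparing the tool's configured request method (and user-agent) against what a browser sends is a cheap first diagnostic before blaming the tool.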
Technical SEO | | EAOM0