Weird 404 Errors in Webmaster Tools
-
Hi,
During a regular check of Webmaster Tools, I noticed a sudden increase in the number of "Not found (404)" errors. I have been looking into them and noticed something weird going on.
There are well over 100 pages with 404 errors. The funny thing is, none of the URLs are correct. For example, if the actual URL is something like www.domain.com/latest-reviews , the 404 error points to a non-existent URL like www.domain.com/latest-re And when I checked where they were linked from, they all came from spammy sites.
Does anyone know what could be causing these links? Why would anyone link on purpose to a non-existent page?
cheers,
-
I have a similar problem: dozens of 404 errors in Webmaster Tools like these:
http://domain.ru/ka...tino-akcia-trexkomnatnaja
http://domain.ru/Sa...e-novosti-za-oktyabr-2012
And there are no links to these pages from anywhere. It's a strange situation, because I have lots of pages with URLs of different lengths, but not all of them come up with errors.
-
Thanks. I have actually been adding 301 redirects but didn't want to spend too much time on it. Some of the URLs were not even hyperlinked. They were just plain text, and Google still treated them as links.
-
Thanks. I've already got canonical tags in place, so I guess I don't have to do anything.
-
Hi,
Comparing the URLs you gave, it seems someone has posted shortened versions of your URLs. Some websites shorten the actual URL and use the shortened form as the anchor text.
An example is http://www.seomoz.org/q/wei.. where the link itself still points to the correct page. But some less experienced users just copy the anchor text and post it in blog posts or other places, because that anchor text looks like a URL.
It can also happen because of some other site's activity.
In any case, 404 Not Found errors like these will not affect your ranking, so you do not have to worry about this problem. I also suggest reading the help document about 404 errors.
However, I can see another problem that could come out of this kind of activity: you may get traffic through a URL with a suffix that you did not create. For example, a URL like this
www.domain.com/latest-reviews/?refferer=some_reffer
can cause a duplicate content issue. So I strongly recommend adding a rel=canonical URL to your pages.
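To make the rel=canonical suggestion concrete, here is a minimal Python sketch of how a page could work out its canonical URL. The function names `canonical_url` and `canonical_link_tag` are made up for this illustration, and the sketch assumes that no query parameter on the site changes the page content. If some parameters do (pagination, for example), they would need to be kept.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Return the URL without its query string or fragment.

    Assumption: query parameters never change the content of the page.
    """
    parts = urlsplit(url)
    # Treat "/latest-reviews/" and "/latest-reviews" as the same page.
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


def canonical_link_tag(url: str) -> str:
    """Build the <link> element to place in the page's <head>."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'


if __name__ == "__main__":
    print(canonical_link_tag(
        "http://www.domain.com/latest-reviews/?refferer=some_reffer"
    ))
```

With a tag like this on the page, the variant with the unwanted suffix and the clean URL both declare the same canonical address.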
Regards
Prasad
-
Google is finding text URLs on sites that limit the number of characters displayed. It's a Google crawl problem.
SiteX refers to your article, http://yourdomain.com/blog/austin/steve-rides-to-the-alamo, but hits a character limit of, say, 40 characters, so it prints the URL as "http://yourdomain.com/blog/austin/steve" while still linking it correctly. Even with a correct link, Google will read the text and crawl the URL as printed, not as linked. The same thing happens when the URL is not linked at all and is just shortened plain text.
To sum it up... Google's got a problem, and scraper sites that chop up URLs are feeding the bots crap. If, however, the linking domain is a good one and you'd like to take advantage of this little error, you can create a redirect rule on your website that sends the 404 URL to the correct page.
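If you want to build those redirect rules without checking each 404 by hand, something along these lines might help. This is only a rough Python sketch: `build_redirects` and the sample paths are hypothetical, and it assumes the 404 URLs were cut off at the end (it does not handle URLs shortened in the middle). A truncated path is mapped to a real page only when exactly one real page starts with it.

```python
def build_redirects(not_found_paths, valid_paths):
    """Map each truncated 404 path to the single valid path it is a prefix of."""
    redirects = {}
    for broken in not_found_paths:
        # Drop trailing dots or an ellipsis left over from the shortened text.
        prefix = broken.rstrip(".\u2026")
        if not prefix:
            continue
        matches = [p for p in valid_paths if p.startswith(prefix) and p != prefix]
        # Redirect only when the match is unambiguous; otherwise keep the 404.
        if len(matches) == 1:
            redirects[broken] = matches[0]
    return redirects


if __name__ == "__main__":
    valid = [
        "/latest-reviews",
        "/latest-news",
        "/blog/austin/steve-rides-to-the-alamo",
    ]
    not_found = ["/latest-re", "/blog/austin/steve", "/latest-", "/missing"]
    for old, new in build_redirects(not_found, valid).items():
        print(f"{old} -> {new}")
```

Each resulting pair can then be turned into a 301 redirect in whatever way your server or CMS supports. Ambiguous prefixes such as "/latest-" are deliberately skipped rather than guessed.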