Weird 404 Errors
-
Hi All,
Although my Moz error scans have been pretty clean for a while, a law firm site I manage recently cropped up with 80+ 404 errors since the last scan.
I'm a little baffled as the url it shows being returned looks like this:
http://www.yoursite.com/%0Ahttp://www.yoursite.com/resource.html
For some reason it seems to be initiating a query to call the root domain twice before the actual resource.
I installed ModX Revolution 2.2.6-PL on the site in question, and am hoping a canonical plugin I just started using will take care of these.
Has this happened to anyone else? What did you do to solve the issue?
Thanks for your time and any tips!
-
Hey Dan,
I had kind of assumed that it might be a false alarm from the Moz scan. I typically use Xenu to check for broken links periodically and it hasn't shown any.
Thanks for the tip!
-
Hi David
I would recommend cross-checking this with Screaming Frog and/or Webmaster Tools. It's really only a concern if search engine spiders and users are hitting these 404s. I have seen cases where Moz's crawler hits 404s that Google and/or users do not.
If you get the errors in Screaming Frog or Webmaster Tools as well - here's what you need to do to fix them.
- Go to the source page of one of the broken links.
- View the HTML source.
- Do a control-F and search for the broken link (have it copied to your clipboard)
- Determine where in the code it's coming from.
- Then you can probably debug it from there.
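If searching the source by hand gets tedious, the steps above could be scripted. This is only a rough sketch using Python's standard library: `find_suspicious_links`, `looks_malformed`, and the sample markup are hypothetical names and data made up for illustration, and the check only covers the patterns discussed in this thread (leading or trailing whitespace, or an encoded line feed, carriage return, tab, or space at the start of the value).

```python
from html.parser import HTMLParser

# Percent-encoded whitespace that should never start a link target.
ENCODED_PREFIXES = ("%0a", "%0d", "%09", "%20")


def looks_malformed(value):
    """Return True if a link value starts or ends with stray whitespace,
    or starts with percent-encoded whitespace such as %0A."""
    if value != value.strip():
        return True
    return value.lower().startswith(ENCODED_PREFIXES)


class LinkAuditor(HTMLParser):
    """Collects (line, tag, value) for every suspicious href or src."""

    def __init__(self):
        super().__init__()
        self.suspicious = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value and looks_malformed(value):
                line, _column = self.getpos()
                self.suspicious.append((line, tag, value))


def find_suspicious_links(html):
    auditor = LinkAuditor()
    auditor.feed(html)
    return auditor.suspicious


# Hypothetical page source; paste the real HTML of the source page instead.
SAMPLE_HTML = """<html><body>
<a href="http://www.yoursite.com/ok.html">fine</a>
<a href="%0Ahttp://www.yoursite.com/resource.html">encoded line feed</a>
<a href="
http://www.yoursite.com/other.html">literal line feed</a>
</body></html>"""

for line, tag, value in find_suspicious_links(SAMPLE_HTML):
    print(f"line {line}: <{tag}> has a suspicious link value {value!r}")
```

The reported line number points at the tag in the HTML source, which should help trace it back to the template or plugin that generated it.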
Let us know if that gets you there.
Thanks!
-Dan
-
Hi David, I apologize for the delayed reply. I'm going to check with some other Associates and see if they can help troubleshoot. In the meantime, please let us know if you have any updates. Thanks! (Christy)
-
Hey Christy,
Nope, never did figure out an answer!
I took a break for a while but now I'm back trying to divine a solution.
Re: Dana's suggestion, I did make sure that our canonical plugin was using absolute URLs, but it looks as though that did not solve the issue.
Not sure how to locate a potential PHP glitch that might be causing this... any pointers?
-
Hi David, were you able to resolve this issue?
-
Yes, I have seen this problem before. Bradley and Michael are both correct in that it had something to do with relative versus absolute URLs. In our case, it was being caused because we had relative URLs in all of our canonical tags. As soon as we fixed them to absolute URLs the strange looking 404 errors went away. Hope that helps!
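For anyone who wants to check their own pages for this, here is a minimal sketch of how relative canonical tags could be detected with Python's standard library. The names `CanonicalFinder`, `is_absolute`, and `relative_canonicals` are invented for this example, and it assumes you already have the page's HTML as a string.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens:
            self.canonicals.append(attr.get("href") or "")


def is_absolute(href):
    """An absolute URL here means it has an http(s) scheme and a host."""
    parts = urlsplit(href)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def relative_canonicals(html):
    """Return canonical hrefs that are not absolute URLs."""
    finder = CanonicalFinder()
    finder.feed(html)
    return [href for href in finder.canonicals if not is_absolute(href)]


# Hypothetical markup: the first tag is relative, the second is absolute.
print(relative_canonicals('<link rel="canonical" href="resource.html">'))
print(relative_canonicals(
    '<link rel="canonical" href="http://www.yoursite.com/resource.html">'
))
```

Any href that shows up in the returned list is a candidate for rewriting as a full URL.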
-
Bradley's response is spot on. Coincidentally, I manage a large site in the legal area that has had errors like yours, although the error isn't law related! As Bradley implies, this is typically the unintended result of code in the CMS that is spitting out a stray line feed. I'm guessing that what you're seeing is related to PHP rather than an error in user input. Typically, white space entered into user-editable areas gets stripped. When you have actual code like this inserted, it's the result of some line of PHP someone edited and saved without realizing the effect. I've had this happen before with RSS feeds, where one little glitch put a forward slash at the end of a URL and joined it to the beginning of another. Good luck with finding the solution, which shouldn't be too tough.
-
%0A is the line feed character, so it looks like your CMS may be spitting out links that browsers and crawlers interpret as relative links.
If your link appears like this:
%0Ahttp://www.yoursite.com/resource.html
The link will be interpreted as relative and result in the link that you found on your Moz error report.
It's probably a problem with how the CMS is spitting out the href attribute, but it's hard to say without knowing more information.
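The relative-link interpretation described above can be reproduced with Python's `urllib.parse`. This is just an illustration under assumptions: the page URL is a made-up example, and Python's `urljoin` is standing in for whatever resolution logic a given crawler actually uses. It may normalize the result slightly differently (for example by collapsing the inner `//`), so the exact string can differ from what Moz reports.

```python
from urllib.parse import urljoin, urlsplit

# Hypothetical page that contains the bad link.
page_url = "http://www.yoursite.com/index.html"

# The href as the CMS emitted it, with an encoded line feed in front.
href = "%0Ahttp://www.yoursite.com/resource.html"

# Because "%" cannot appear in a URL scheme, no scheme is detected,
# so the whole value is treated as a relative path.
has_scheme = urlsplit(href).scheme != ""
print("scheme detected:", has_scheme)

# Resolving the "relative" href against the page URL glues it onto the root.
resolved = urljoin(page_url, href)
print("resolved to:", resolved)
```

The resolved address starts with the site root followed by `%0Ahttp:`, which is the same doubled-domain shape as the 404s in the crawl report.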