Weird 404 Errors
-
Hi All,
Although my Moz error scans have been pretty clean for a while, a law firm site I manage recently cropped up with 80+ 404 errors since the last scan.
I'm a little baffled as the url it shows being returned looks like this:
http://www.yoursite.com/ http://www.yoursite.com/resource.html
For some reason it seems to be initiating a query to call the root domain twice before the actual resource.
I installed ModX Revolution 2.2.6-PL on the site in question, and am hoping a canonical plugin I just started using will take care of these.
Has this happened to anyone else? What did you do to solve the issue?
Thanks for your time and any tips!
-
Hey Dan,
I had kind of assumed that it might be a false alarm from the Moz scan. I typically use Xenu to check for broken links periodically and it hasn't shown any.
Thanks for the tip!
-
Hi David
I would recommend cross checking this with Screaming Frog and/or Webmaster Tools. It's only a concern really if the web spiders and users are experiencing these 404's. I have seen it happen where Moz's crawler may hit 404's that Google and/or users do not.
If you get the errors in Screaming Frog or Webmaster Tools as well - here's what you need to do to fix them.
- Go to the source page of one of the broken links.
- View the HTML source.
- Do a control-F and search for the broken link (have it copied to your clipboard)
- Determine where in the code it's coming from.
- Then you can probably debug it from there.
Let us know if that gets you there.
Thanks!
-Dan
-
Hi David, I apologize for the delayed reply. I'm going to check with some other Associates and see if they can help trouble-shoot. In the meantime, please let us know if you have any updates. Thanks! (Christy)
-
Hey Christy,
Nope, Never did figure out an answer!
I took a break for a while but now I'm back trying to divine a solution.
Re: Dana's suggestion, I did make sure that our canonical plugin was using absolute urls, but it looks as though that did not solve the issue.
Not sure how to locate a potential PHP glitch that might be causing this... any pointers?
-
Hi David, were you able to resolve this issue?
-
Yes, I have seen this problem before. Bradley and Michael are both correct in that it had something to do with relative versus absolute URLs. In our case, it was being caused because we had relative URLs in all of our canonical tags. As soon as we fixed them to absolute URLs the strange looking 404 errors went away. Hope that helps!
-
Bradley's response is spot on. I coincidentally manage a large site in the legal area that has had errors like yours although the error isn't law related! As Bradley implies, this is typically the unintended result of code that is spitting out the unintended line feed in the CMS. I'm guessing that what you're seeing is probably the result of something related to PHP rather than an error in user input. Typically entering white space into user editable areas will result in it being stripped. When you have actual code like this inserted, it's the result of some line of PHP someone edited and saved without realizing the effect. I've had this happen before with RSS feeds where one little glitch will put a forward slash to the end of a URL and connect the beginning of another. Good luck with finding the solution, which shouldn't bee too tough.
-
%0A is the line feed character, so it looks like your CMS may be spitting out links that browsers and crawls interpret as relative links.
If your link appears like this:
[The link will be interpreted as relative and result in the link that you found on your Moz error report.
It's probably a problem with how the CMS is spitting out the href attribute, but it's hard to say without knowing more information.](%0Ahttp://www.yoursite.com/resource.html)
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How do I fix 5xx errors on pdf's?
Hi all, I'm having a huge (and very frustrating) issue-- I have 437 pages with 5xx errors and have no idea how to go about fixing them. All of them are links to pdf's, and when I click on the link on the Site Crawl page, it opens just fine. I've been trying to find the answer to how to fix this issue for weeks with no luck at all. I tried linking the pdf's to their parent page, tried renaming the links to be more concise and crawler friendly, and I've searched all types of articles involving pdf's and 5xx errors, all with no luck. Does anyone have any idea why these pages are getting this error and how I can fix them? Huge thanks in advance! Daniela S.
Moz Pro | | laurelleafep0 -
How to remove 404 pages wordpress
I used the crawl tool and it return a 404 error for several pages that I no longer have published in Wordpress. They must still be on the server somewhere? Do you know how to remove them? I think they are not a file on the server like an html file since Wordpress uses databases? I figure that getting rid of the 404 errors will improve SEO is this correct? Thanks, David
Moz Pro | | DJDavid0 -
Site Crawl Error
In moz crawling error this message is appears: MOST COMMON ISSUES 1Search Engine Blocked by robots.txt Error Code 612: Error response for robots.txt i asked help staff but they crawled again and nothing changed. there's only robots.XML (not TXT) in root of my webpage it contains: User-agent: *
Moz Pro | | nopsts
Allow: /
Allow: /sitemap.htm anyone please help me? thank you0 -
Having 1 page crawl error on 2 sites
Help! A few weeks back, my dev team did some "changes" (that I don't know anything about), but ever since then, my Moz crawl has only shown one page for either http://betamerica.com or http://fanex.com. Moz service was helpful in talking about a redirect loop that existed, and I asked my team to fix it, which it looks to me like they have. Still, 1 page. I used SEO Book's spider tool and it also only sees 1 page, and sees the sites as http://https://betamerica.com (for example), which is just weird. I don't know enough about HT Access or server stuff to figure out what's going on, so if someone can help me figure that out, I'd appreciate it.
Moz Pro | | BetAmerica0 -
Warnings, Notices, and Errors- don't know how to correct these
I have been watching my Notices, Warnings and Errors increase since I added a blog to our WordPress site. Is this effecting our SEO? We now have the following: 2 4XX errors. 1 is for a page that we changed the title and nav for in mid March. And one for a page we removed. The nav on the site is working as far as I can see. This seems like a cache issue, but who knows? 20 warnings for “missing meta description tag”. These are all blog archive and author pages. Some have resulted from pagination and are “Part 2, Part 3, Part 4” etc. Others are the first page for authors. And there is one called “new page” that I can’t locate in our Pages admin and have no idea what it is. 5 warnings for “title element too long”. These are also archive pages that have the blog name and so are pages I can’t access through the admin to control page title plus “part 2’s and so on. 71 Notices for “Rel Cononical”. The rel cononicals are all being generated automatically and are for pages of all sorts. Some are for a content pages within the site, a bunch are blog posts, and archive pages for date, blog category and pagination archive pages 6 are 301’s. These are split between blog pagination, author and a couple of site content pages- contact and portfolio. Can’t imagine why these are here. 8 meta-robot nofollow. These are blog articles but only some of the posts. Don’t know why we are generating this for some and not all. And half of them are for the exact same page so there are really only 4 originals on this list. The others are dupes. 8 Blocked my meta-robots. And are also for the same 4 blog posts but duplicated twice each. We use All in One SEO. There is an option to use noindex for archives, categories that I do not have enabled. And also to autogenerate descriptions which I do not have enabled. I wasn’t concerned about these at first, but I read these (below) questions yesterday, and think I'd better do something as these are mounting up. I’m wondering if I should be asking our team for some code changes but not sure what exactly would be best. http://www.seomoz.org/q/pages-i-dont-want-customers-to-see http://www.robotstxt.org/meta.html Our site is http://www.fateyes.com Thanks so much for any assistance on this!
Moz Pro | | gfiedel0 -
Dot Net Nuke generating long URL showing up as crawl errors!
Since early July a DotNetNuke site is generating long urls that are showing in campaigns as crawl errors: long url, duplicate content, duplicate page title. URL: http://www.wakefieldpetvet.com/Home/tabid/223/ctl/SendPassword/Default.aspx?returnurl=%2F Is this a problem with DNN or a nuance to be ignored? Can it be controlled? Google webmaster tools shows no crawl errors like this.
Moz Pro | | EricSchmidt0 -
54 new 404 errors on my website?
Hi There In the latest report I have 54 404-errors. All last week, previously I had 2 404s that I fixed. In report say: Title404 : ErrorMeta DescriptionTraceback (most recent call last): File "build/bdist.linux- x86_64/egg/downpour/init.py", line 391, in _error failure.raiseException() File "/usr/local/lib/python2.7/site- packages/twisted/python/failure.py", line 370, in raiseException raise self.type, self.value, self.tb Error: 404 Not FoundMeta RobotsNot present/emptyMeta RefreshNot present/empty Are these normal 404 errors I have to look at and fix? Or is this some script that running on my server and causing errors? In general - what should I do to fix this? Thanks Dean
Moz Pro | | Passanger880 -
Company Name in Page Title creating thousands of "Duplicate Page Title" errors
I am new, and I just got back my crawl results (after a week or more). The first thing I noticed is that the "duplicate page title" is in the thousands, my urls and page titles are different. The only thing I can see is that our company name is at appended to the name of every title. I did search and found one other person with this problem, but no answer was given. Can anyone offer some advice? This doesn't seem right... Thanks,
Moz Pro | | AoyamaJPN0