4xx (not found) errors seem spurious, caused by a "\" added to the URL

GPN

Hi SEOmoz folks

We're getting a lot of 404 (not found) errors in our weekly crawl.

However, the weird thing is that the URLs in question all have the same issue.

Each one is a valid URL with a backslash ("\") appended. URL-encoded, that shows up as an extra %5C at the end of the URL.

Even weirder, we do not have any such URLs in our (WordPress-based) website.

Any insight on how to get rid of this issue?

Thanks

GPN

No, Google Webmaster Tools does not list an error here.

It's indeed an SEOmoz bug. Ryan, thanks for trying though!

RyanKent

My request is for a real link that I can click on and view the page.

In most cases where someone has described an issue to me, a key piece of information was either left out or missed. If you cannot share that information, I understand. In the interest of being helpful, I wanted to ask.

It is entirely possible this is a crawler issue, but it is also possible the crawler is functioning perfectly and Google's crawler will produce the same result. That is my concern.

GPN

Well, actually I did already. The example I gave above is exactly that, only I replaced the real URL with "URL".

In a bit more detail, the referring page is actually URL1, and this page contains the JavaScript

item = '<a href=\'URL2\'>text</a>';

which produces 404 errors for URL2 in the SEOmoz crawl report.

RyanKent

It is entirely possible the issue is with the SEOmoz crawler. I would like to see it improved as well.

I am concerned the root issue may actually be with your site. Would you be willing to share an example of a link which is flagged in your report along with the referring page?

GPN

Thanks for the tips. After drilling down on the referrer, this looks like an SEOmoz bug.

We are using a WordPress plugin called "collapsing archives" which creates LEGAL archive links with a JavaScript snippet like this:

item = '<a href=\'URL\'>text</a>';

As you can see, this is totally legal JavaScript. But it seems SEOmoz is scanning the JavaScript without interpreting it, picking up the escaped quotation mark (\') after the URL, and treating its backslash as an additional \ at the end of the URL.
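To illustrate what I think is happening, here is a minimal sketch. The plugin's exact markup, the example.com URL, and both function names are my own assumptions, and the "naive" function is only a guess at how a crawler could misread the source, not SEOmoz's actual code.

```javascript
// Hypothetical raw page source. String.raw keeps the backslashes exactly as
// they would appear in the HTML file; example.com stands in for the real URL.
const rawSource = String.raw`item = '<a href=\'http://example.com/2011/05/\'>text</a>';`;

// Assumed crawler behavior: read the href value up to the next quote
// character without interpreting JavaScript escape sequences.
function naiveExtract(source) {
  const match = source.match(/href=\\?'([^']*)'/);
  return match ? match[1] : null;
}

// Interpreting approach: resolve \' and \\ the way a JavaScript engine would
// when evaluating the string literal, then read the href value.
function interpretedExtract(source) {
  const unescaped = source.replace(/\\(['\\])/g, "$1");
  const match = unescaped.match(/href='([^']*)'/);
  return match ? match[1] : null;
}

const naiveUrl = naiveExtract(rawSource);
const realUrl = interpretedExtract(rawSource);

console.log(naiveUrl);            // URL with a trailing backslash
console.log(encodeURI(naiveUrl)); // the same URL ending in %5C
console.log(realUrl);             // the URL the browser actually links to
```

The backslash belongs to the \' escape, not to the URL, so a scan that stops at the quote keeps it as the last character of the link.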

Since the plugin is behaving legally and works well - we want to keep using it. What's the chance that SEOmoz will fix the bug?

RyanKent

Many people do not realize that adding a backslash character changes the URL. A server can actually present a different web page for the URL with the trailing backslash.

A common cause of the problem is linking. If you check your weekly crawl report, there will be a column called Referrer. That is the source of the link. Check the referring page and find the link. Fix the link (i.e. remove the trailing backslash) and the problem will go away on the next crawl. Of course, you want to determine how the link appeared and ensure it doesn't happen again.
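If the report has many rows, the check above can be scripted. This is only a rough sketch: the field names "url" and "referrer", the sample rows, and both function names are hypothetical and do not reflect the actual export format of the crawl report.

```javascript
// Hypothetical crawl-report rows; the field names and URLs are assumptions.
const rows = [
  { url: "http://example.com/2011/05/%5C", referrer: "http://example.com/archives/" },
  { url: "http://example.com/about/", referrer: "http://example.com/" },
  { url: "http://example.com/2011/06/\\", referrer: "http://example.com/archives/" },
];

// True when the URL ends in a literal backslash or its percent-encoded form.
function hasTrailingBackslash(url) {
  return /(\\|%5C)$/i.test(url);
}

// Group the affected URLs by the page that links to them.
function groupByReferrer(reportRows) {
  const groups = new Map();
  for (const { url, referrer } of reportRows) {
    if (!hasTrailingBackslash(url)) continue;
    if (!groups.has(referrer)) groups.set(referrer, []);
    groups.get(referrer).push(url);
  }
  return groups;
}

const suspects = groupByReferrer(rows);
for (const [referrer, urls] of suspects) {
  console.log(`${referrer}: ${urls.length} URL(s) with a trailing backslash`);
}
```

If nearly all flagged URLs share one referrer, that page is the place to look for the link or script that produces them.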

DanHill

If I had to guess, I'd look into any JavaScript on the page that is perhaps adding or pointing to the URL with the backslash.
