404 from a 404 that 301s

MikeRoberts

I must be missing something or skipping a step or lacking proper levels of caffeine.

Under my High Priority warnings I have a handful of 404s that are there on purpose, but I'm not sure how Moz is finding them. When I check the referrer info, each 404 is linked to from a different URL that used to be a 404 and is now a 301 (due to the craziness of our system and what was easiest for the coders when fixing a different problem ages ago). Basically, if a user types a non-existent model number into the URL, a specific 404 comes up. While the 404 error is "site.com/product/?model=abc123", the referrer is "site.com/product?model=abc123" (or, more simply, one slash is missing). I can't see how Moz is finding the referrer, so I can't figure out how to make Moz stop crawling it. I actually have the same problem in Google WMT for the same group of 404s.

What am I just not seeing that will fix this?

Ali_Sadr

Let me know if it works, Mike. There is actually a third possibility:

Some page(s) might generate a dynamic URL only when visited by a browser or search agent. If that's the case, you can set up event tracking on your website in conjunction with Google Analytics and track the referrer:

_gaq.push(['_trackEvent', 'Error', '404', 'page: ' + document.location.pathname + document.location.search + ' ref: ' + document.referrer ]);

After you collect some data (submit your website to Google WMT or wait for the next Moz visit), you can export it and run your filter.

The alternative to this method could be one of the following two:

Enable verbose debug/log mode on your programming platform and collect the logs for further processing. You can run a small Python script to find the RegEx pattern. I advise setting up a demo copy of your website on a subdomain and running this experiment there. You can then submit the demo subdomain to Google Webmaster Tools and wait for the crawlers.
Reconfigure your web server logging (httpd.conf if using Apache) to log more details. Make sure you switch back to the normal logging configuration afterwards to avoid excessive storage consumption.
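
As a rough idea of what that small Python script could look like, here is a minimal sketch. It assumes your access log uses the Apache "combined" format (which records the referrer and user agent) and that the 404s follow the /product/?model= pattern from the question. The regular expressions, the function name and the sample lines are all made up for illustration, so adjust them to your own logs.

```python
import re
from collections import Counter

# Assumed Apache "combined" format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical pattern for the model-number URLs described in the question.
MODEL_URL = re.compile(r'^/product/?\?model=')


def referrers_for_404s(lines):
    """Count (url, referer) pairs for 404 responses on model URLs."""
    hits = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if not m or m.group("status") != "404":
            continue  # skip unparsable lines and non-404 responses
        if MODEL_URL.match(m.group("url")):
            hits[(m.group("url"), m.group("referer"))] += 1
    return hits


# Made-up sample lines; in practice, pass an open log file instead.
sample = [
    '203.0.113.5 - - [10/Oct/2014:13:55:36 +0000] "GET /product/?model=abc123 HTTP/1.1" 404 512 "http://site.com/product?model=abc123" "rogerbot"',
    '203.0.113.5 - - [10/Oct/2014:13:55:35 +0000] "GET /product?model=abc123 HTTP/1.1" 301 0 "-" "rogerbot"',
    '198.51.100.7 - - [10/Oct/2014:14:01:02 +0000] "GET /about HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
]

for (url, referer), count in referrers_for_404s(sample).most_common():
    print(count, url, "<-", referer)
```

To run it against a real log, something like `referrers_for_404s(open("access.log"))` should work, since the function only needs an iterable of lines. The referrers that show up most often are the first places to look.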

Good luck,

Ali

MikeRoberts

I had done about half of that... I'll take a look at all of it and try again tomorrow following your suggestions and see if I can figure it out then. Thanks.

Ali_Sadr

Hi Mike,

Hope all is well. There are two things that might have caused this confusion. Either you have some outdated links somewhere on your website that lead to the custom 404 page, or some external link is pointing to your website with a wrong URL or a missing product. To find the link (and there is definitely one, because a crawler has to hit a link in order to crawl a URL), you can use a tool like Ahrefs link analysis to see what is pointing where. Export the results to Excel and filter them with a RegEx built from the 404-generating pattern you already have from Moz or Google WMT. Find one and you'll know where they are coming from and how to fix them. If there are not many, you can write custom redirects in your htaccess. If there are many, though, htaccess could slow down your website, and the best way would be a back-end redirect, either custom coded or through a plugin, depending on your platform. I would proceed as follows:

Start from the error_logs in your web server logs and match them with the WMT and Moz reports.
Download the CSV and import it into Excel or a program of your choice.
Filter based on the pattern.
Match the results with where you found the link through Ahrefs.
And voilà, now you know exactly how to clean them up.
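
If you would rather script the filter-and-match steps than do them in Excel, a minimal Python sketch could look like the one below. The column names ("URL", "Source URL", "Target URL"), the pattern and the sample rows are assumptions for illustration only; real Moz, WMT or Ahrefs exports will have their own headers. The sketch also assumes the only difference between the linked URL and the 404 URL is the slash before the query string, as described in the question.

```python
import csv
import io
import re

# Hypothetical pattern for the 404-generating URLs (with or without the slash).
PATTERN = re.compile(r'/product/?\?model=', re.IGNORECASE)


def filter_rows(csv_text, url_column):
    """Return the rows whose url_column matches PATTERN."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if PATTERN.search(row.get(url_column) or "")]


def normalize(url):
    """Drop the slash before the query string so both URL variants compare equal."""
    return url.replace("/?", "?")


def match_sources(crawl_rows, link_rows):
    """Map each broken URL to the pages that the link export says point at it."""
    sources = {}
    for link in link_rows:
        key = normalize(link["Target URL"])
        sources.setdefault(key, set()).add(link["Source URL"])
    return {
        row["URL"]: sorted(sources.get(normalize(row["URL"]), []))
        for row in crawl_rows
    }


# Made-up exports; in practice, read the downloaded CSV files instead.
crawl_csv = """URL,Status
http://site.com/product/?model=abc123,404
http://site.com/about,200
"""
links_csv = """Source URL,Target URL
http://example.org/old-review,http://site.com/product?model=abc123
http://example.org/home,http://site.com/about
"""

crawl_rows = filter_rows(crawl_csv, "URL")
link_rows = filter_rows(links_csv, "Target URL")
report = match_sources(crawl_rows, link_rows)
print(report)
```

Any broken URL that comes back with an empty list has no known linking page in the export, which suggests looking at the server logs instead.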

Hope this helps, Mike,

Have a nice day,

Ali
