404 from a 404 that 301s
-
I must be missing something or skipping a step or lacking proper levels of caffeine.
Under my High Priority warnings I have a handful of 404s which are like that on purpose but I'm not sure how Moz is finding them. When I check the referrer info, the 404 is being linked to from a different 404 which is now a 301 (due to craziness of our system and what was easiest for the coders to fix a different problem ages ago). Basically, if a user decides to type in a non-existent model number into the URL there is a specific 404 that comes up. While the 404 error is "site.com/product/?model=abc123" the referrer is "site.com/product?model=abc123" (or more simply, one slash is missing). I can't see how Moz is finding the referrer so I can't figure out how to make Moz stop crawling it. I actually have the same problem in Google WMT for the same group of 404s.
What am I just not seeing that will fix this?
-
Let me know if it works, Mike. There is actually a third possibility:
Some page(s) might generate a dynamic URL only when visited by a browser or search agent. If that's the case, you can set up event tracking on your website in conjunction with Google Analytics and track the referrer:
_gaq.push(['_trackEvent', 'Error', '404', 'page: ' + document.location.pathname + document.location.search + ' ref: ' + document.referrer ]);
After you collect some data (submit your website to Google WMT or wait for the next Moz visit), you can export it and run your filter.
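As a rough sketch of the "export and run your filter" step: the tracking call above builds each event label as 'page: ' + path + query + ' ref: ' + referrer, so an exported label can be split back into its two parts. The function name parse_404_label is made up for illustration, and this assumes the label arrives in the export exactly as the snippet wrote it.

```python
import re

# Matches labels built by the tracking snippet:
#   'page: ' + pathname + search + ' ref: ' + referrer
# The non-greedy page group stops at the first ' ref: ' separator.
LABEL = re.compile(r'^page: (?P<page>.*?) ref: (?P<ref>.*)$')


def parse_404_label(label):
    """Return (page, referrer) from an exported event label, or None if it does not fit."""
    match = LABEL.match(label)
    if match is None:
        return None
    return match.group("page"), match.group("ref")
```

An empty referrer (a direct hit, or a crawler that sends none) comes back as an empty string, which is worth counting separately from real referring pages.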
The alternative to this method could be one of the following two:
- Enable extreme debug/log mode on your programming platform and collect logs for further processing. You can run a small Python script to find the RegEx pattern. I advise setting up a demo copy of your website on a subdomain and running this experiment there. You can then submit the demo subdomain to Google Webmaster Tools and wait for the crawlers.
- Reconfigure your webserver logging (httpd.conf if using Apache) to log more details. Make sure you switch back to the normal logging configuration afterwards to avoid excessive storage consumption.
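For the "small Python script" idea, here is a minimal sketch. It assumes the server writes Apache's "combined" log format (which includes the referrer) and that the problem URLs look like /product/?model=... as in the question. The names COMBINED_LINE, MODEL_URL and referrers_for_404s are made up for illustration, so adjust the pattern to your own URLs.

```python
import re
from collections import Counter

# Apache "combined" format:
#   host ident user [time] "METHOD path PROTO" status bytes "referer" "user-agent"
COMBINED_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Assumed pattern for the model URLs, with or without the slash before '?'
MODEL_URL = re.compile(r'^/product/?\?model=')


def referrers_for_404s(lines):
    """Count (path, referrer) pairs for 404 responses on model URLs."""
    counts = Counter()
    for line in lines:
        match = COMBINED_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in combined format
        if match.group("status") == "404" and MODEL_URL.match(match.group("path")):
            counts[(match.group("path"), match.group("referer"))] += 1
    return counts
```

To run it against a real log, pass an open file handle for wherever your server writes its access log, then look at counts.most_common() to see which referrers send crawlers to the 404s most often.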
Good luck,
Ali
-
I had done about half of that... I'll take a look at all of it and try again tomorrow following your suggestions and see if I can figure it out then. Thanks.
-
Hi Mike,
Hope all is well. There are two things that might have caused this confusion: either you have some outdated links somewhere on your website that lead to the custom 404 page, or some external link is pointing to your website with a wrong URL or a missing product. To find the link (I say so because a crawler has to hit a link in order to crawl it, so there is definitely one), you can use a tool like Ahrefs link analysis to see what is pointing where. Export the results to Excel and filter them with a RegEx built from a 404-generating pattern you already have from Moz or Google WMT. Find one and you'll know where they are coming from and how to fix them. If there are not many, you can write custom redirects in your htaccess. If there are many, though, htaccess could slow down your website, and the better way would be a back-end redirect, either custom coded or through a plugin for your platform. I would start from:
- My error_logs in the webserver logs, matched against the WMT and Moz reports
- Download the CSV and import it into Excel or a program of your choice
- Filter based on the pattern
- Match it with where you've found the link through Ahrefs
- And voilà, now you know exactly how to clean them up
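If you would rather do the filter step in a script than in Excel, something along these lines could work. It is only a sketch: the column names "URL", "Referrer" and "Status Code" are assumptions, so check the header row of your actual Moz, WMT or Ahrefs export and pass the real names in. The function name group_404s_by_referrer is made up for illustration.

```python
import csv
import io
import re
from collections import defaultdict


def group_404s_by_referrer(csv_text, pattern,
                           url_col="URL", ref_col="Referrer",
                           status_col="Status Code"):
    """Group 404 URLs matching `pattern` by the page that links to them.

    The default column names are assumptions; use the ones in your export.
    """
    url_re = re.compile(pattern)
    by_referrer = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if (row.get(status_col) or "").strip() != "404":
            continue  # only interested in 404 rows
        url = row.get(url_col) or ""
        if url_re.search(url):
            by_referrer[row.get(ref_col) or ""].append(url)
    return dict(by_referrer)
```

Called with the text of the export and a pattern such as r"/product/?\?model=", it returns a dictionary keyed by referring page, so each key is a place to go and fix (or redirect) a link.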
Hope this helps Mike,
Have a nice day,
Ali