20 x '400' errors in site but URLs work fine in browser...
-
Hi, I have a new client set up in SEOmoz and the crawl completed this morning. I am picking up 20 x '400' errors, but the pages listed in the crawl report load fine. Any ideas?
example -
-
Most major robots obey crawl delays. You could check your errors in Google Webmaster Tools to see if your site is serving a lot of error pages when Google crawls.
I suspect Google is pretty smart about slowing down its crawl rate when it encounters too many errors, so it's probably safe to not include a crawl delay for Google.
-
Sorry, one last question.
Do I need to add a similar delay for Googlebot, or is this issue specifically a rogerbot problem?
Thanks
-
Fantastic, thanks, Cyrus and Tampa. That saved me many more hours of head-scratching!
-
Hi Justin,
Sometimes when rogerbot crawls a site, the servers and/or the content management system can get overwhelmed if roger is going too fast, and this causes your site to deliver error pages as roger crawls.
If the problem persists, you might consider installing a crawl delay for roger in your robots.txt file. It would look something like this:
User-agent: rogerbot
Crawl-delay: 5
This would cause the SEOmoz crawlers to wait 5 seconds before fetching each page. Then, if the problem still persists, feel free to contact the help team at help@seomoz.org
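If you want to sanity-check the rule before deploying it, here is a minimal sketch using Python's standard `urllib.robotparser`. It only shows how one standards-following parser reads the two lines above; it says nothing about how rogerbot itself parses the file. The helper name `crawl_delay_for` and the user agent `somebot` are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The two-line rule suggested above, as it would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: rogerbot
Crawl-delay: 5
"""


def crawl_delay_for(robots_txt: str, user_agent: str):
    """Return the Crawl-delay (in seconds) that applies to user_agent, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


# rogerbot is matched by the rule; any other agent has no delay configured.
roger_delay = crawl_delay_for(ROBOTS_TXT, "rogerbot")
other_delay = crawl_delay_for(ROBOTS_TXT, "somebot")  # hypothetical agent name
print(roger_delay, other_delay)
```

Because the rule is scoped to `User-agent: rogerbot`, other crawlers are unaffected by it.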
Hope this helps! Best of luck with your SEO!
-
Thanks Tampa SEO, good advice.
Interestingly, the URL listed in SEOmoz is as follows:
www.morethansport.co.uk/brand/adidas?sortDirection=ascending&sortField=Price&category=sport and leisure
But when I look at the link in the referring page it is as follows:
/brand/adidas?sortDirection=ascending&sortField=Price&category=sport%20and%20leisure
Notice the "%20" in place of the spaces.
The actual URL is the one listed in SEOmoz, but even if I copy and paste the %20 version, the browser removes the '%20' and the page loads fine.
I still can't get the site to throw up a 400.
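For anyone comparing the two forms of the URL, a small sketch with Python's standard `urllib.parse` shows that they are the same value, just percent-encoded or not. The second half is only an assumption about one way a 400 could arise, not something confirmed in this thread: if a client sent the space literally, the HTTP request line would no longer have the usual three space-separated parts. The helper `request_line` is hypothetical.

```python
from urllib.parse import quote, unquote

PATH = "/brand/adidas?sortDirection=ascending&sortField=Price&category="

raw_value = "sport and leisure"        # as listed in the crawl report
encoded_value = quote(raw_value)       # percent-encoded form, as in the page link
decoded_value = unquote(encoded_value) # decoding gives back the original value


def request_line(path_and_query: str) -> str:
    """Build a bare HTTP/1.1 request line (hypothetical helper for illustration)."""
    return f"GET {path_and_query} HTTP/1.1"


# A request line is expected to be: method, target, version (3 parts).
raw_parts = request_line(PATH + raw_value).split(" ")
encoded_parts = request_line(PATH + encoded_value).split(" ")

print(encoded_value)
print(len(raw_parts), len(encoded_parts))
```

Browsers typically encode the space before sending the request, which would be consistent with the page loading fine when tested by hand.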
-
Just ran the example link that you provided through two independent HTTP response code checkers, and both are giving me a 200 response, i.e. the site is OK.
This question has been asked before on here; you're definitely not the first person to run into the issue.
One way to diagnose what's going on is to dig a little deeper into the crawl report that SEOmoz generated. Download the CSV file and look at the referring link, i.e. the page on which Roger found the link. Then go to that page and check whether your CMS is doing anything weird with the way it outputs the links that you create. I recall someone back in December who had the same issue and eventually resolved it by noticing that his CMS put all sorts of weird slashes (i.e. /.../...) into the link.
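If the CSV is large, a short script can pull out just the failing URLs and the pages that link to them. This is a rough sketch only: the column names `URL`, `HTTP Status Code` and `Referrer`, and the sample rows, are assumptions made for illustration, so check them against the header row of your actual export.

```python
import csv
import io

# Stand-in for the downloaded crawl report; the column names are assumed.
SAMPLE_CSV = """\
URL,HTTP Status Code,Referrer
http://example.com/a,200,http://example.com/
http://example.com/b c,400,http://example.com/list
"""


def referrers_of_errors(csv_text: str, status: str = "400"):
    """Return (url, referrer) pairs for rows whose status code equals status."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (row["URL"], row["Referrer"])
        for row in reader
        if row["HTTP Status Code"].strip() == status
    ]


error_rows = referrers_of_errors(SAMPLE_CSV)
for url, referrer in error_rows:
    print(f"{url} was linked from {referrer}")
```

For a real file, read it with `open(path, newline="", encoding="utf-8").read()` and pass the text in, then visit each referring page to see how the link is written in the HTML.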
Good luck!