20 x '400' errors on site, but URLs work fine in browser...
-
Hi, I have a new client set up in SEOmoz and the crawl completed this morning... I am picking up 20 x '400' errors, but the pages listed in the crawl report load fine... any ideas?
example -
-
Most major robots obey crawl delays. You could check your errors in Google Webmaster Tools to see if your site is serving a lot of error pages when Google crawls.
I suspect Google is pretty smart about slowing down its crawl rate when it encounters too many errors, so it's probably safe not to include a crawl delay for Google.
-
Sorry, one last question.
Do I need to add a similar delay for Googlebot, or is this issue specifically a rogerbot problem?
Thanks
-
Fantastic, thanks, Cyrus and Tampa, you saved me many more hours of head-scratching!!!
-
Hi Justin,
Sometimes when rogerbot crawls a site, the servers and/or the content management system can get overwhelmed if roger is going too fast, and this causes your site to deliver error pages as roger crawls.
If the problem persists, you might consider installing a crawl delay for roger in your robots.txt file. It would look something like this:
User-agent: rogerbot
Crawl-delay: 5

This would cause the SEOmoz crawlers to wait 5 seconds before fetching each page. Then, if the problem still persists, feel free to contact the help team at help@seomoz.org.
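If you want to sanity-check that the directive parses the way you intend before relying on it, here's a quick sketch using Python's built-in robots.txt parser (Python 3.6+; this is just a generic illustration I'm adding, not an SEOmoz tool):

from urllib.robotparser import RobotFileParser

# Parse the two-line stanza from above and ask what crawl delay
# applies to rogerbot; this should print 5.
rp = RobotFileParser()
rp.parse("User-agent: rogerbot\nCrawl-delay: 5".splitlines())
print(rp.crawl_delay("rogerbot"))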
Hope this helps! Best of luck with your SEO!
-
Thanks Tampa SEO, good advice.
Interestingly, the URL listed in SEOmoz is as follows:
www.morethansport.co.uk/brand/adidas?sortDirection=ascending&sortField=Price&category=sport and leisure
But when I look at the link in the referring page it is as follows:
/brand/adidas?sortDirection=ascending&sortField=Price&category=sport%20and%20leisure
notice the "%" symbol instead of the spaces.
The actual URL is the one listed in SEOmoz but even if I copy and paste the % version, the browser removed the '%' and the page loads fine.
I still can't get the site to throw-up a 400.
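That behaviour is expected: "%20" is just the percent-encoded form of a space, so the two URLs are equivalent once decoding is applied. A minimal Python sketch using the path from this thread:

from urllib.parse import quote, unquote

raw = "/brand/adidas?sortDirection=ascending&sortField=Price&category=sport and leisure"
encoded = "/brand/adidas?sortDirection=ascending&sortField=Price&category=sport%20and%20leisure"

# Decoding the encoded form recovers the version with literal spaces...
assert unquote(encoded) == raw
# ...and encoding the spaces (leaving the URL delimiters alone) gives it back.
assert quote(raw, safe="/?&=") == encoded
print("the two forms are equivalent")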
-
Just ran the example link that you provided through two independent HTTP response code checkers, and both are giving me a 200 response, i.e. the site is OK.
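If you'd rather verify it yourself without a third-party checker, a quick sketch in Python (assuming the requests library is installed; the URL is the one quoted earlier in this thread):

import requests

# Request the URL exactly as it appears in the crawl report and print
# the status code the server returns (200 = OK, 400 = Bad Request).
url = ("http://www.morethansport.co.uk/brand/adidas"
       "?sortDirection=ascending&sortField=Price&category=sport%20and%20leisure")
response = requests.get(url, timeout=10)
print(response.status_code)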
This question has been asked before on here; you're definitely not the first person to run into the issue.
One way to diagnose what's going on is to dig a little deeper into the crawl report that SEOmoz generated: download the CSV file and look at the referring link, i.e. the page on which Roger found the link. Then go to that page and check whether your CMS is doing anything weird with the way it outputs the links you create. I recall someone back in December having the same issue; he eventually resolved it by noticing that his CMS put all sorts of weird slashes (i.e. /.../...) into the links.
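To pull the affected rows out of that export quickly, something like the following works (the column names here are my guesses; check them against the actual header row of the CSV you download):

import csv

# Print each URL that returned a 400 along with the page it was found on.
# "URL", "Referrer", and "HTTP Status Code" are assumed column names.
with open("crawl_report.csv", newline="") as f:
    for row in csv.DictReader(f):
        if row.get("HTTP Status Code") == "400":
            print(row.get("URL"), "-- found on:", row.get("Referrer"))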
Good luck!