20 x '400' errors on site, but URLs work fine in browser...
-
Hi, I have a new client set up in SEOmoz and the crawl completed this morning... I am picking up 20 x '400' errors, but the pages listed in the crawl report load fine in the browser... any ideas?
example -
-
Most major robots obey the Crawl-delay directive (Googlebot is a notable exception; its crawl rate is set in Google Webmaster Tools instead). You could check your errors in Google Webmaster Tools to see if your site is serving a lot of error pages when Google crawls.
I suspect Google is pretty smart about slowing down its crawl rate when it encounters too many errors, so it's probably safe not to include a crawl delay for Google.
-
Sorry, one last question.
Do I need to add a similar delay for Googlebot, or is this issue specific to rogerbot?
Thanks
-
Fantastic, thanks Cyrus and Tampa, you've saved me many more hours of head-scratching!
-
Hi Justin,
Sometimes when rogerbot crawls a site, the server and/or the content management system can get overwhelmed if roger is going too fast, and this causes your site to deliver error pages as roger crawls.
If the problem persists, you might consider adding a crawl delay for roger in your robots.txt file. It would look something like this:
User-agent: rogerbot
Crawl-delay: 5

This would cause the SEOmoz crawler to wait 5 seconds before fetching each page. Then, if the problem still persists, feel free to contact the help team at help@seomoz.org.
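If you want to sanity-check what your robots.txt actually serves once the directive is live, Python's standard-library robotparser (3.6+) can read the delay back. A minimal sketch, assuming the robots.txt sits at the site discussed in this thread:

from urllib import robotparser

# Fetch and parse the live robots.txt, then ask what delay each agent sees.
rp = robotparser.RobotFileParser()
rp.set_url("http://www.morethansport.co.uk/robots.txt")
rp.read()

print(rp.crawl_delay("rogerbot"))   # -> 5 once the directive above is in place
print(rp.crawl_delay("Googlebot"))  # -> None unless a delay (or a * default) covers Google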
Hope this helps! Best of luck with your SEO!
-
Thanks Tampa SEO, good advice.
Interestingly, the URL listed in SEOmoz is as follows:
www.morethansport.co.uk/brand/adidas?sortDirection=ascending&sortField=Price&category=sport and leisure
But when I look at the link in the referring page it is as follows:
/brand/adidas?sortDirection=ascending&sortField=Price&category=sport%20and%20leisure
Notice the '%20' encoding in place of the spaces.
The actual URL is the one listed in SEOmoz, but even if I copy and paste the %20 version, the browser decodes it back to spaces and the page loads fine.
I still can't get the site to throw up a 400.
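For reference, the two forms are the same URL once percent-encoding is applied; a quick round trip with Python's standard library shows why the browser treats them interchangeably:

from urllib.parse import quote, unquote

raw = "/brand/adidas?sortDirection=ascending&sortField=Price&category=sport and leisure"

# A literal space is not valid in a URL, so user agents encode it as %20.
encoded = quote(raw, safe="/?&=")
print(encoded)
# -> /brand/adidas?sortDirection=ascending&sortField=Price&category=sport%20and%20leisure

# Browsers decode %20 back to a space for display, which is why both
# versions appear to load fine when pasted into the address bar.
print(unquote(encoded))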
-
Just ran the example link that you provided through two independent HTTP response code checkers, and both are giving me a 200 response, i.e. the site is OK.
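If you'd rather reproduce that check yourself than rely on a third-party tool, here is a minimal sketch using Python's standard library (the URL is the one quoted earlier in the thread; the User-Agent string is just a placeholder):

import urllib.request
from urllib.error import HTTPError

url = ("http://www.morethansport.co.uk/brand/adidas"
       "?sortDirection=ascending&sortField=Price"
       "&category=sport%20and%20leisure")

req = urllib.request.Request(url, headers={"User-Agent": "status-check/1.0"})
try:
    with urllib.request.urlopen(req) as resp:
        print(resp.status)  # 200 means the page is fine
except HTTPError as err:
    print(err.code)         # 4xx/5xx responses raise HTTPError and land here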
This question has been asked before on here; you're definitely not the first person to run into the issue.
One way to diagnose what's going on is to dig a little deeper into the crawl report that SEOmoz generated. Download the CSV file and look at the referring link, i.e. the page on which Roger found the link. Then go to that page and check whether your CMS is doing anything weird with the way it outputs the links you create. I recall someone back in December having the same issue; he eventually resolved it by noticing that his CMS put all sorts of weird slashes (i.e. /.../...) into the links.
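If the report is large, a few lines of Python will pull out just the 4xx rows and their referrers. Note that the file name and column headers below are assumptions, so check the header row of your own export first:

import csv

# Hypothetical column names -- adjust to match your CSV's header row.
with open("crawl_report.csv", newline="") as f:
    for row in csv.DictReader(f):
        if row["HTTP Status Code"].startswith("4"):
            print(row["URL"], "found on:", row["Referrer"])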
Good luck!