Significant Google crawl errors
-
We've got a site that, like clockwork, keeps encountering server errors when Google crawls it. Since the end of last year it will go a week just fine, then have two straight weeks of a 70%-100% error rate when Google tries to crawl it. During this time you can still type in the URL and reach the site, but spider simulators return a 404 error. Just this morning we had another error message; I did a fetch and resubmit, and magically it's back. We moved the site to GoDaddy in January because the previous host (Tronics) kept getting hacked. It's built in plain HTML, so I'm wondering if it's something in the code, maybe?
-
This is the URL error list in Webmaster Tools:

| # | URL | Response code | Detected |
|---|-----|---------------|----------|
| 1 | Forms/Camp.pdf | 404 | 7/9/13 |
| 2 | sportsinsurance.php | 404 | 5/2/13 |
| 3 | Forms/Waiver.pdf | 404 | 7/2/13 |
| 4 | metro/index.htm | 404 | 6/21/13 |
| 5 | Forms/Camp_Tournament_Application.pdf | 404 | 7/9/13 |
| 6 | Forms/Spectator.pdf | 404 | 7/9/13 |
| 7 | Forms/Boxing.pdf | 404 | 5/6/13 |
| 8 | sports-camp-insurance.html | 404 | 6/16/13 |
| 9 | forms/T.C.S._ | 404 | 7/3/13 |
| 10 | Camp | 404 | 6/14/13 |
| 11 | Forms/Sports.pdf | 404 | 4/21/13 |
| 12 | pages/clients.html | 404 | 4/15/13 |
http://www.campteam.com/: Googlebot can't access your site (July 10, 2013)
Over the last 24 hours, Googlebot encountered 13 errors while attempting to connect to your site. Your site's overall connection failure rate is 72.2%.
I've got 23 of these messages going back to November 2012.
It tells me that no robots.txt fetch issues or DNS issues were encountered; according to Google, all of the errors are related to server connectivity.
-
I see that your site is handling 404 errors fine at the moment. Hrmm. Could you copy and paste the crawl error URLs you're getting from Webmaster Tools? Thanks!
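While you gather those, one quick check is to request the homepage yourself with a Googlebot-style user agent and compare it to a normal browser-style request; intermittent connectivity errors for Googlebot while the site loads fine in a browser often point at the host throttling or blocking bot traffic rather than at your HTML. A rough sketch in Python (purely illustrative, using the URL from your report):

```python
import requests

URL = "http://www.campteam.com/"  # the site from the GWT message above

# Compare what a browser-style request and a Googlebot-style request get back.
USER_AGENTS = {
    "browser": "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36",
    "googlebot": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
}

for name, ua in USER_AGENTS.items():
    try:
        resp = requests.get(URL, headers={"User-Agent": ua}, timeout=10)
        print(f"{name:10s} -> HTTP {resp.status_code}, {len(resp.content)} bytes")
    except requests.RequestException as exc:
        # Timeouts and connection resets here mirror the "server connectivity"
        # errors that Webmaster Tools is reporting.
        print(f"{name:10s} -> request failed: {exc}")
```

If the Googlebot-style request fails or returns a different status while the browser-style one succeeds, the problem is in the server or firewall configuration, not in the HTML.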
BTW, I noticed that you have a duplicate content issue: both the www and non-www versions of your URLs resolve. You should add the following code to your .htaccess file to 301-redirect everything to the non-www version.
```apache
# in your .htaccess file
RewriteEngine On
RewriteCond %{HTTP_HOST} !^my-domain\.com$ [NC]
RewriteRule ^(.*)$ http://my-domain.com/$1 [R=301,L]
```
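Once that's in place, it's worth confirming that the www host now answers with a single 301 pointing at the bare domain. A minimal check, again with the placeholder hostname from the snippet above:

```python
import requests

# Placeholder hostname, matching the .htaccess example above.
resp = requests.get("http://www.my-domain.com/", allow_redirects=False, timeout=10)

print(resp.status_code)              # expect 301
print(resp.headers.get("Location"))  # expect http://my-domain.com/
```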
Related Questions
-
How does Google handle fractions in titles?
Which is better practice, using 1/2" or ½"? The keyword research suggests people search for "1 2" with the space being the "/". How does Google handle fractions? Would ½ be the same as 1/2?
Intermediate & Advanced SEO | Choice2
-
Google Adsbot crawling order confirmation pages?
Hi, we have had roughly 1,000+ requests per 24 hours from Google AdsBot to our confirmation pages. This generates an error, as the confirmation page cannot be viewed after closing or by anyone who didn't complete the order.

How is AdsBot finding pages to crawl that are not linked anywhere on the site, in the sitemap, or anywhere else? Is there any harm in a Google crawler receiving a higher percentage of errors, even though the pages are not supposed to be requested? Is there anything we can do to prevent the errors for the benefit of our network team, and what are the possible risks of any measures we take?

This bot seems to be for evaluating the quality of landing pages used for AdWords, so why is it trying to access confirmation pages when they have not been set for any of our adverts? We included "Disallow: /confirmation" in the robots.txt, but it has continued to request these pages, generating a 403 and an error in the log files, so it seems AdsBot doesn't follow robots.txt. Thanks in advance for any help, Sam
Intermediate & Advanced SEO | seoeuroflorist0
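A side note on the robots.txt point in the question above: Google documents that its AdsBot crawlers ignore the global `User-agent: *` group, so a disallow generally has to name AdsBot-Google explicitly. A small standard-library sketch of what that rule set looks like and how it evaluates (domain and paths are illustrative):

```python
from urllib import robotparser

# Illustrative robots.txt: the confirmation path is disallowed both in the
# global group and in a group that names AdsBot-Google explicitly.
ROBOTS_TXT = """\
User-agent: *
Disallow: /confirmation

User-agent: AdsBot-Google
Disallow: /confirmation
"""

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

for agent in ("Googlebot", "AdsBot-Google"):
    allowed = rp.can_fetch(agent, "http://example.com/confirmation/12345")
    print(f"{agent}: allowed to fetch a confirmation page? {allowed}")
```

robotparser simply applies the file as written; whether a particular Google crawler honours the wildcard group is a behavioural question on Google's side, which is why the explicit AdsBot-Google group is the usual recommendation.
-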
Does Google see this as duplicate content?
I'm working on a site that has too many pages in Google's index, as shown by a simple count via a site: search, for example: site:http://www.mozquestionexample.com

I ended up getting a full list of these pages, and it shows pages that have supposedly been excluded from the index via GWT URL parameters and/or canonicalization. For instance, the list of indexed pages shows:

1. http://www.mozquestionexample.com/cool-stuff
2. http://www.mozquestionexample.com/cool-stuff?page=2
3. http://www.mozquestionexample.com?page=3
4. http://www.mozquestionexample.com?mq_source=q-and-a
5. http://www.mozquestionexample.com?type=productss&sort=1date

Example #1 above is the one true page for search and the one that all the canonicals reference. Examples #2 and #3 shouldn't be in the index because the canonical points to URL #1. Example #4 shouldn't be in the index because it's just a tracking (source) parameter that, again, doesn't change the page, and the canonical points to #1. Example #5 shouldn't be in the index because it's excluded in parameters as not affecting page content and the canonical is in place.

Should I worry about these multiple URLs for the same page, and if so, what should I do about it? Thanks... Darcy
Intermediate & Advanced SEO | 945010
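For a situation like the one above, before worrying about the index counts it can help to confirm that each indexed variant really serves the canonical you expect. A rough sketch that fetches a few of the example URLs and prints the rel="canonical" each one declares (the URLs are the question's placeholders, and the regex is deliberately naive):

```python
import re
import requests

# Placeholder URLs from the question above.
urls = [
    "http://www.mozquestionexample.com/cool-stuff",
    "http://www.mozquestionexample.com/cool-stuff?page=2",
    "http://www.mozquestionexample.com?mq_source=q-and-a",
]

# Naive pattern: assumes rel comes before href inside the <link> tag.
canonical_re = re.compile(
    r'<link[^>]+rel=["\']canonical["\'][^>]+href=["\']([^"\']+)["\']',
    re.IGNORECASE,
)

for url in urls:
    html = requests.get(url, timeout=10).text
    match = canonical_re.search(html)
    print(url, "->", match.group(1) if match else "no canonical tag found")
```

If a variant returns no canonical, or one pointing at itself, that would explain why it lingers in the index.
-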
Link from Google.com
Hi guys, I've just seen a website get a link from Google's Webmaster Tools rich snippet testing tool. Basically, they've linked to a results page for their own website test. Here's an example of what this would look like for a result on my website: http://www.google.com/webmasters/tools/richsnippets?q=https%3A%2F%2Fwww.impression.co.uk There's a meta nofollow, but I just wondered what everyone's take is on trust, etc., passing down? (Don't worry, I'm not encouraging people to go out spamming links to results pages!) Looking forward to some interesting responses!
Intermediate & Advanced SEO | tomcraig860
-
Dropped from Google?
My website www.weddingphotojournalist.co.uk appears to have been penalised by Google. I ranked fairly well for a number of venue-related searches from my blog posts; generally I'd find myself somewhere on page one or towards the top of page two. However, recently I found I am nowhere to be seen for these venue searches. I still appear if I search for my name, business name, and keywords in my domain name. A quick check of Yahoo and I found I am ranking very well; it is only Google who seem to have dropped me. I looked at Google Webmaster Tools and there are no messages or clues as to what has happened. However, it does show my traffic dropping off a cliff edge on the 19th of July, from 850 impressions to around 60 to 70 per day. I haven't made any changes to my website recently and hadn't added any new content in July. I haven't added any new inbound links either, and a search for inbound links does not show anything suspicious. Can anyone shed any light on why this might happen?
Intermediate & Advanced SEO | weddingphotojournalist0
-
Stop Google crawling a site at set times
Hi All, I know I can use robots.txt to block Google from pages on my site, but is there a way to stop Google crawling my site at set times of the day? Or to request that they crawl at other times? Thanks, Sean
Intermediate & Advanced SEO | ske110
-
Moving Code for Faster Crawl Through?
What are the best practices for moving code into other folders to help speed up crawling for bots? We once moved some JavaScript at an SEO's suggestion and the site suddenly looked like crap until we undid the changes. How do you figure out what code should be consolidated? What code do you use to indicate what has been moved and to where?
Intermediate & Advanced SEO | siteoptimized0
-
Does Google penalize for having a bunch of Error 404s?
If a site removes thousands of pages in one day, without any redirects, is there reason to think Google will penalize the site for this? I have thousands of subcategory index pages. I've figured out a way to reduce the number, but it won't be easy to put in redirects for the ones I'm deleting. They will just disappear. There's no link juice issue. These pages are only linked internally, and indexed in Google. Nobody else links to them. Does anyone think it would be better to remove the pages gradually over time instead of all at once? Thanks!
Intermediate & Advanced SEO | Interesting.com0