Google can't access/crawl my site!
-
Hi
I'm dealing with this problem for a few days. In fact i didn't realize it was this serious until today when i saw most of my site "de-indexed" and losing most of the rankings.
[URL Errors: 1st photo]
8/21/14 there were only 42 errors but in 8/22/14 this number went to 272 and it just keeps going up.
The site i'm talking about is gazetaexpress.com (media news, custom cms) with lot's of pages.
After i did some research i came to the conclusion that the problem is to the firewall, who might have blocked google bots from accessing the site. But the server administrator is saying that this isn't true and no google bots have been blocked.
Also when i go to WMT, and try to Fetch as Google the site, this is what i get:
[Fetch as Google: 2nd photo]
From more than 60 tries, 2-3 times it showed Complete (and this only to homepage, never to articles).
What can be the problem? Can i get Google to crawl properly my site and is there a chance that i will lose my previous rankings?
Thanks a lot
Granit -
What did you do specifically to mitigate the problem? You can PM me, if you would like.
-
This applies to the guy from Albania.
Oh, this IS the guy from Albania. Never mind.
-
Great, thanks for letting us know what happened with this!
-
Hi all
Just wanted to let you know that we fixed the problem. We disabled CloudFlare which we found out was blocking Google bots. More about this issue can be found at: https://support.cloudflare.com/hc/en-us/articles/200169806-I-m-getting-Google-Crawler-Errors-What-should-I-do-
-
Hi Travis, thank you for your time.
Great for your friend, I also suggest to visit Kosovo someday, you will have great time here, for sure
Back to the issue:
Here is an interesting issue that is happening with the crawler.
Our own cms uses htaccess for rewrite purposes. I created 2 new files that are independent from CMS and tried to fetch them with WMT, and it worked like a charm.
These 2 independent files are:
www.gazetaexpress.com/test_manaferra.php
www.gazetaexpress.com/xhezidja.php
Then, I created an ajax page with our CMS, which contains only plain text, tried to fetch it by WMT and strangely enough it didn't work. To make sure that the .htaccess file is not affecting this behavior, I deleted the htaccess and tried to fetch it, but it didn't worked.
The ajax page is: www.gazetaexpress.com/page/xhezidja/?pageSEO=false
The site works perfectly for humans which access it via the browser.
I'm more than confused now!
-
A friend of mine just got back from Kosovo. It was the last stop on a tour of the Balkans. He had a pretty good time. Moving along...
I crawled about 12K URLs and hit almost 90 Internal Server Errors (500). It's probably not your core problem, but it's something to look at. Here are a few examples:
http://www.gazetaexpress.com/blihet/?search_category_id=1&searchFilter=1
http://www.gazetaexpress.com/shitet/?category_id=134&searchFilter=1
http://www.gazetaexpress.com/me-qera/?category_id=131&searchFilter=1
There was one actual page that threw a 500 at the time of crawl:
http://www.gazetaexpress.com/mistere/edhe-kesaj-i-thuhet-veze-22591/
The edhe kesaj page now resolves fine. (I'm not even going to pretend to understand or write Albanian.)
So there may be some issues with the server or hosting. If you haven't already, try this troubleshooter from Cloudflare.
-
Ah OK - well keep us updated with what you find. Someone else will chip in with other info if they have some
-Andy
-
We are suspecting that CloudFlare might be causing these troubles. We are trying everything, in the meantime i'm looking here to see if anyone has any similar experience or an idea for solution.
As for warnings, the only warning we had was the one last week (8/23/14) saying that Google bot can't acces our site:
Over the last 24 hours, Googlebot encountered 316 errors while attempting to connect to your site. Your site's overall connection failure rate is 7.5%.
-Granit
-
It doesn't look like a firewall, as I can crawl it with Screaming Frog. However, the server logs will be able to answer that one for you.
Without looking in depth, I'm not seeing anything that stands out to me - do you think that there have been changes to the server that could cause issues? What firewall is the server running? Also, if there were errors in crawling the site, you would see a warning about this.
-Andy
-
In mid-march website changed it's CMS but i don't think that could be the reason because until this week everything was working perfectly. I don't think it could have been compromised too. I'm still suspecting it could be the firewall blocking bots from crawling the site, but the server administrator couldn't find any evidence of this.
-
Hi Granit,
Has any work been done to the site in the last 2-3 months? Have you had any warnings in webmaster tools at all? I did once see a strange problem where Google wasn't crawling a site correctly because it had been compromised, but after checking, there is nothing like this on yours.
-Andy
-
No prb. Thanks a lot for your time. Let just hope that someone in the community will help with a solution
-
Unfortunately, I don't have a quick answer for you. Looking forward to seeing what other community members have to say on this one!
-
I'm looking at the http version in GWT
-
If I do a site:gazetaexpress.com in Google, I get some results that are http, and some results that are https. The https ones say there is an SSL connection error.
Are you looking at the http or https version in GWT?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Why doesn't my website crawl by Google?
Hi mozzers and members, I am having issues, why my website: http://profilecosmeticsurgery.com/ crawl by Google? let me share more clearly when this starts happening. A month or around 45 days back our website is being indexed and crawled quite well without any issues with having .html extension pages with static built website.
Intermediate & Advanced SEO | | SEOOOOOoooooooo
We finally thought to change to .php version and make whole website and its pages to be treated dynamically.
Once we changed all changes, thereafter this issues started. It has been more than 45 days, our website isn't being crawled since then. I didn't know what are the things preventing this to? Please help. Thanks in Advance Capture1.PNG0 -
Sites still rank who don't seem like they should. Why?
So you've been MOZing and SEOing for years and we're all convinced of the 10x factor when it comes to content and ranking for certain search terms... right? So what do you do when some older sites that don't even produce content dominate the first page of a very important search term? They're home pages with very little content and have clearly all dabbled in pre Panda SEO. Surely people are still seeing this and wondering why?
Intermediate & Advanced SEO | | wearehappymedia0 -
Site not indexed in Google UK
This site was moved to a new host by the client a month back and is still not indexed in Google UK if you search for the site directly. www.loftconversionswestsussex.com Webmaster tools shows that 55 pages have been crawled and no errors have been detected. The client also tried the "Fetch as Google Bot" tactic in GWT as well as running a PPC campaign and the site is still not appearing in Google. Any thoughts please? Cheers, SEO5..
Intermediate & Advanced SEO | | SEO5Team0 -
Does Google still don't index Hashtag Links ? No chance to get a Search Result that leads directly to a section of a page? or to one of numeras Hashtag Pages in a single HTML page?
Does Google still don't index Hashtag Links ? No chance to get a Search Result that leads directly to a section of a page? or to one of numeras Hashtag Pages in a single HTML page? If I have 4 or 5 different hashtag link section pages , consolidated into one HTML Page, no chance to get one of the Hashtag Pages to appear as a search result? like, if under one Single Page Travel Guide I have two essential sections: #Attractions #Visa no chance to direct search queries for Visa directly to the Hashtag Link Section of #Visa? Thanks for any help
Intermediate & Advanced SEO | | Muhammad_Jabali0 -
Organic Google Sitelinks - can I edit?
A client just contacted me saying a competitor is threatening legal action threatened against a page description in a Google sitelink (just double checked and its in the serps results too). I checked the site and the content doesn't appear on that page or anywhere else on the site. I also added a Meta description to that page to see if that will have an effect. The page is the home page and I don't really want to demote it. Is there anything else I can/should do?
Intermediate & Advanced SEO | | agua0 -
Is there anyway to recover my site's rankings?
My site has been top 3 for 'speed dating' on Google.co.uk since about 2003 and it went to below top 50 for a lot of it's main keywords shortly after 27 Oct 2012. I did a re-submission request and was told there was 'no manual spam action'. My conclusions is I was dropped by Google because of poor quality links I've gained over 10+ years. I have a Domain Authority of 40, a regular blog http://bit.ly/oKyi88, a KLOUT of 42, user reviews and quality content. Since Oct 2012 I've done some technical improvements and managed to get a few questionable links removed. I've continued blogging reguarly and got more active on Twitter. I've seen no improvement and my traffic is 80% down on last year. It would be great to be able to produce content that others want to link to but I've not had much success from that in over 10 years of trying and I've not seen many others in my sector, with small budgets having much success. Is there anything I can do to regain favour with Google?
Intermediate & Advanced SEO | | benners0 -
Google & Bing not indexing a Joomla Site properly....
Can someone explain the following to me please. The background: I launched a new website - new domain with no history. I added the domain to my Bing webmaster tools account, verified the domain and submitted the XML sitemap at the same time. I added the domain to my Google analytics account and link webmaster tools and verified the domain - I was NOT asked to submit the sitemap or anything. The site has only 10 pages. The situation: The site shows up in bing when I search using site:www.domain.com - Pages indexed:- 1 (the home page) The site shows up in google when I search using site:www.domain.com - Pages indexed:- 30 Please note Google found 30 pages - the sitemap and site only has 10 pages - I have found out due to the way the site has been built that there are "hidden" pages i.e. A page displaying half of a page as it is made up using element in Joomla. My questions:- 1. Why does Bing find 1 page and Google find 30 - surely Bing should at least find the 10 pages of the site as it has the sitemap? (I suspect I know the answer but I want other peoples input). 2. Why does Google find these hidden elements - Whats the best way to sort this - controllnig the htaccess or robots.txt OR have the programmer look into how Joomla works more to stop this happening. 3. Any Joomla experts out there had the same experience with "hidden" pages showing when you type site:www.domain.com into Google. I will look forward to your input! 🙂
Intermediate & Advanced SEO | | JohnW-UK0 -
How a press release can help with Google serp?
Hi, After publishing a press release, if that press release is on top position on Google News for the keyword, how will it effect the SERP for that website?
Intermediate & Advanced SEO | | purplar0