Google can't access/crawl my site!
-
Hi
I've been dealing with this problem for a few days. In fact, I didn't realize how serious it was until today, when I saw that most of my site had been de-indexed and had lost most of its rankings.
[URL Errors: 1st photo]
On 8/21/14 there were only 42 errors, but on 8/22/14 the number jumped to 272, and it just keeps going up.
The site I'm talking about is gazetaexpress.com (media news, custom CMS), which has lots of pages.
After some research, I came to the conclusion that the problem is the firewall, which might be blocking Googlebot from accessing the site. But the server administrator says that this isn't true and that no Google bots have been blocked.
Also, when I go to WMT and try Fetch as Google on the site, this is what I get:
[Fetch as Google: 2nd photo]
Out of more than 60 tries, it showed Complete only 2-3 times (and only for the homepage, never for articles).
What could the problem be? Can I get Google to crawl my site properly, and is there a chance that I will lose my previous rankings?
Thanks a lot
Granit
-
What did you do specifically to mitigate the problem? You can PM me, if you would like.
-
This applies to the guy from Albania.
Oh, this IS the guy from Albania. Never mind.
-
Great, thanks for letting us know what happened with this!
-
Hi all
Just wanted to let you know that we fixed the problem. We disabled CloudFlare, which we found out was blocking Google's bots. More about this issue can be found at: https://support.cloudflare.com/hc/en-us/articles/200169806-I-m-getting-Google-Crawler-Errors-What-should-I-do-
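For anyone who needs to tell genuine Googlebot traffic apart from impostors before allowing it through a proxy or firewall, here is a minimal Python sketch of the reverse-then-forward DNS check that Google documents for verifying its crawlers. The function name `is_verified_googlebot`, the injectable lookup parameters, and the stub hostnames and IPs in the demo are assumptions made for illustration. It is not what CloudFlare does internally, and as written it only handles IPv4.

```python
import socket

# Hostname suffixes used for the check (per Google's crawler verification guidance)
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")

def is_verified_googlebot(ip,
                          reverse_lookup=socket.gethostbyaddr,
                          forward_lookup=socket.gethostbyname_ex):
    """Reverse-resolve ip, check the domain, then forward-resolve to confirm."""
    try:
        hostname = reverse_lookup(ip)[0]
    except OSError:
        return False  # no reverse DNS record
    if not hostname.endswith(GOOGLE_SUFFIXES):
        return False  # not a Google-owned hostname
    try:
        addresses = forward_lookup(hostname)[2]
    except OSError:
        return False  # hostname does not resolve
    return ip in addresses  # forward lookup must map back to the same IP

# Demo with stub lookups (made-up values, no network access needed)
def fake_reverse(ip):
    return ("crawl-66-249-66-1.googlebot.com", [], [ip])

def fake_forward(hostname):
    return (hostname, [], ["66.249.66.1"])

genuine = is_verified_googlebot("66.249.66.1", fake_reverse, fake_forward)
impostor = is_verified_googlebot(
    "203.0.113.7", lambda ip: ("host.example.com", [], [ip]), fake_forward
)
print(genuine, impostor)
```

Calling `is_verified_googlebot(ip)` without the stub arguments performs real DNS lookups, so results then depend on the network the script runs on.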
-
Hi Travis, thank you for your time.
Good for your friend. I suggest you visit Kosovo someday too; you will have a great time here, for sure.
Back to the issue:
Here is an interesting issue that is happening with the crawler.
Our own CMS uses .htaccess for URL rewriting. I created 2 new files that are independent of the CMS and tried to fetch them with WMT, and it worked like a charm.
These 2 independent files are:
www.gazetaexpress.com/test_manaferra.php
www.gazetaexpress.com/xhezidja.php
Then I created an AJAX page with our CMS that contains only plain text and tried to fetch it with WMT, and strangely enough it didn't work. To make sure that the .htaccess file was not affecting this behavior, I deleted the .htaccess file and tried to fetch the page again, but it still didn't work.
The AJAX page is: www.gazetaexpress.com/page/xhezidja/?pageSEO=false
The site works perfectly for humans who access it via a browser.
I'm more than confused now!
-
A friend of mine just got back from Kosovo. It was the last stop on a tour of the Balkans. He had a pretty good time. Moving along...
I crawled about 12K URLs and hit almost 90 Internal Server Errors (500). It's probably not your core problem, but it's something to look at. Here are a few examples:
http://www.gazetaexpress.com/blihet/?search_category_id=1&searchFilter=1
http://www.gazetaexpress.com/shitet/?category_id=134&searchFilter=1
http://www.gazetaexpress.com/me-qera/?category_id=131&searchFilter=1
There was one actual page that threw a 500 at the time of crawl:
http://www.gazetaexpress.com/mistere/edhe-kesaj-i-thuhet-veze-22591/
The edhe kesaj page now resolves fine. (I'm not even going to pretend to understand or write Albanian.)
So there may be some issues with the server or hosting. If you haven't already, try this troubleshooter from Cloudflare.
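A minimal sketch, in Python, of how a list of URLs could be re-checked for 500-range responses like the ones mentioned above. The function names (`fetch_status`, `find_server_errors`), the user-agent string, and the example.com URLs with their canned statuses are assumptions made for illustration; they are not taken from the crawl described in this post. The demo uses a stub instead of real requests, so it says nothing about the site's actual state.

```python
import urllib.error
import urllib.request

def fetch_status(url, timeout=10):
    """Return the HTTP status code for a GET request, or None if the connection fails."""
    request = urllib.request.Request(url, headers={"User-Agent": "status-check-sketch"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # 4xx/5xx responses are raised as HTTPError
    except urllib.error.URLError:
        return None  # DNS failure, refused connection, timeout, etc.

def find_server_errors(urls, fetch=fetch_status):
    """Return (url, status) pairs whose status is in the 5xx range."""
    failures = []
    for url in urls:
        status = fetch(url)
        if status is not None and 500 <= status <= 599:
            failures.append((url, status))
    return failures

# Demo with canned statuses (hypothetical URLs, no network access needed)
canned = {
    "http://example.com/ok": 200,
    "http://example.com/broken": 500,
}
server_errors = find_server_errors(canned, fetch=canned.get)
print(server_errors)
```

Dropping the `fetch=` argument makes the function issue real requests, which is the mode that would matter for a re-crawl. Intermittent 500s may need several passes to show up.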
-
Ah OK - well keep us updated with what you find. Someone else will chip in with other info if they have some
-Andy
-
We suspect that CloudFlare might be causing these troubles. We are trying everything; in the meantime, I'm looking here to see if anyone has had a similar experience or has an idea for a solution.
As for warnings, the only warning we had was the one last week (8/23/14) saying that Googlebot can't access our site:
Over the last 24 hours, Googlebot encountered 316 errors while attempting to connect to your site. Your site's overall connection failure rate is 7.5%.
-Granit
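As a rough back-of-the-envelope reading of that warning, the two figures together hint at how many connection attempts Googlebot made in the 24 hours. This assumes the failure rate is simply failed connections divided by total attempts, which the warning does not state explicitly, so treat the result as an estimate only.

```python
errors = 316          # connection errors reported in the warning
failure_rate = 0.075  # 7.5% overall connection failure rate from the warning

# Assumption: failure_rate = errors / total connection attempts
estimated_attempts = errors / failure_rate
estimated_successes = estimated_attempts - errors

print(round(estimated_attempts), round(estimated_successes))
```

Under that assumption Googlebot made roughly 4,200 connection attempts that day, and most of them succeeded, which is consistent with an intermittent block rather than a total one.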
-
It doesn't look like a firewall, as I can crawl it with Screaming Frog. However, the server logs will be able to answer that one for you.
Without looking in depth, I'm not seeing anything that stands out to me - do you think that there have been changes to the server that could cause issues? What firewall is the server running? Also, if there were errors in crawling the site, you would see a warning about this.
-Andy
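The point about server logs can be sketched in Python as a tally of the status codes the server returned to requests identifying themselves as Googlebot. This assumes the Apache/Nginx "combined" log format; the pattern, the function name `googlebot_status_counts`, and the sample lines are made up for illustration and would need adjusting for a different log layout.

```python
import re
from collections import Counter

# Assumes the "combined" access log format; adjust for other formats.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

def googlebot_status_counts(lines):
    """Count HTTP status codes for requests whose user agent mentions Googlebot."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and "googlebot" in match.group("agent").lower():
            counts[match.group("status")] += 1
    return counts

# Made-up sample lines, for illustration only
sample = [
    '66.249.66.1 - - [23/Aug/2014:10:00:00 +0000] "GET / HTTP/1.1" 200 5120 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.66.1 - - [23/Aug/2014:10:00:05 +0000] "GET /page/ HTTP/1.1" 403 312 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [23/Aug/2014:10:00:09 +0000] "GET / HTTP/1.1" 200 5120 '
    '"-" "Mozilla/5.0 (Windows NT 6.1)"',
]
counts = googlebot_status_counts(sample)
print(counts)
```

One caveat: if a proxy or firewall in front of the server rejects a request, that request may never reach the origin log at all, so a sudden drop in Googlebot lines can be as informative as a spike in 403s or 5xx codes.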
-
In mid-March the website changed its CMS, but I don't think that could be the reason, because until this week everything was working perfectly. I don't think it could have been compromised either. I still suspect the firewall could be blocking bots from crawling the site, but the server administrator couldn't find any evidence of this.
-
Hi Granit,
Has any work been done to the site in the last 2-3 months? Have you had any warnings in webmaster tools at all? I did once see a strange problem where Google wasn't crawling a site correctly because it had been compromised, but after checking, there is nothing like this on yours.
-Andy
-
No problem. Thanks a lot for your time. Let's just hope that someone in the community will help with a solution.
-
Unfortunately, I don't have a quick answer for you. Looking forward to seeing what other community members have to say on this one!
-
I'm looking at the http version in GWT
-
If I do a site:gazetaexpress.com in Google, I get some results that are http, and some results that are https. The https ones say there is an SSL connection error.
Are you looking at the http or https version in GWT?