Google can't access/crawl my site!
-
Hi
I've been dealing with this problem for a few days. In fact, I didn't realize how serious it was until today, when I saw that most of my site had been de-indexed and had lost most of its rankings.
[URL Errors: 1st photo]
On 8/21/14 there were only 42 errors, but on 8/22/14 the number jumped to 272, and it just keeps going up.
The site I'm talking about is gazetaexpress.com (a news media site on a custom CMS) with lots of pages.
After doing some research, I concluded that the problem is the firewall, which might have blocked Googlebot from accessing the site. But the server administrator says this isn't true and that no Google bots have been blocked.
Also, when I go to WMT and try Fetch as Google on the site, this is what I get:
[Fetch as Google: 2nd photo]
Out of more than 60 tries, only 2-3 showed Complete (and only for the homepage, never for articles).
What could the problem be? Can I get Google to crawl my site properly, and is there a chance that I will lose my previous rankings?
Thanks a lot
Granit -
What did you do specifically to mitigate the problem? You can PM me, if you would like.
-
This applies to the guy from Albania.
Oh, this IS the guy from Albania. Never mind.
-
Great, thanks for letting us know what happened with this!
-
Hi all
Just wanted to let you know that we fixed the problem. We disabled CloudFlare which we found out was blocking Google bots. More about this issue can be found at: https://support.cloudflare.com/hc/en-us/articles/200169806-I-m-getting-Google-Crawler-Errors-What-should-I-do-
-
Hi Travis, thank you for your time.
Great for your friend. I also suggest you visit Kosovo someday; you will have a great time here, for sure.
Back to the issue:
Here is an interesting issue that is happening with the crawler.
Our own CMS uses .htaccess for URL rewriting. I created 2 new files that are independent of the CMS and tried to fetch them with WMT, and it worked like a charm.
These 2 independent files are:
www.gazetaexpress.com/test_manaferra.php
www.gazetaexpress.com/xhezidja.php
Then I created an AJAX page with our CMS, containing only plain text, and tried to fetch it with WMT. Strangely enough, it didn't work. To make sure that the .htaccess file was not affecting this behavior, I deleted the .htaccess file and tried to fetch the page again, but it still didn't work.
The AJAX page is: www.gazetaexpress.com/page/xhezidja/?pageSEO=false
The site works perfectly for humans who access it via a browser.
I'm more than confused now!
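
One rough way to narrow this down from outside WMT is to request the same URL twice, once with a browser user agent and once with Googlebot's user agent string, and compare the status codes. The Python sketch below assumes nothing about the real server; the function names (`build_request`, `fetch_status`, `compare_user_agents`) and the browser user agent string are made up for illustration. It only imitates the user agent header, so a block keyed on Google's IP addresses rather than on the header would not show up here.

```python
import urllib.error
import urllib.request

# Illustrative user agent strings; the browser one is just an example.
BROWSER_UA = "Mozilla/5.0 (Windows NT 6.1; rv:31.0) Gecko/20100101 Firefox/31.0"
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

def build_request(url, user_agent):
    """Create a GET request that sends the given User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})

def fetch_status(url, user_agent, timeout=10):
    """Return the HTTP status code the server answers with for this user agent."""
    try:
        with urllib.request.urlopen(build_request(url, user_agent), timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # 4xx/5xx responses are raised as exceptions

def compare_user_agents(url):
    """Fetch the same URL as a browser and as Googlebot; return both status codes."""
    return {
        "browser": fetch_status(url, BROWSER_UA),
        "googlebot": fetch_status(url, GOOGLEBOT_UA),
    }
```

Calling `compare_user_agents("http://www.gazetaexpress.com/page/xhezidja/?pageSEO=false")` would return a dictionary with the two status codes. Connection-level failures (timeouts, resets) are not caught and surface as `urllib.error.URLError`, which is itself useful information when the symptom is a connection failure.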
-
A friend of mine just got back from Kosovo. It was the last stop on a tour of the Balkans. He had a pretty good time. Moving along...
I crawled about 12K URLs and hit almost 90 Internal Server Errors (500). It's probably not your core problem, but it's something to look at. Here are a few examples:
http://www.gazetaexpress.com/blihet/?search_category_id=1&searchFilter=1
http://www.gazetaexpress.com/shitet/?category_id=134&searchFilter=1
http://www.gazetaexpress.com/me-qera/?category_id=131&searchFilter=1
There was one actual page that threw a 500 at the time of crawl:
http://www.gazetaexpress.com/mistere/edhe-kesaj-i-thuhet-veze-22591/
The edhe kesaj page now resolves fine. (I'm not even going to pretend to understand or write Albanian.)
So there may be some issues with the server or hosting. If you haven't already, try this troubleshooter from Cloudflare.
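
To see whether the 500s cluster in one part of the site, the crawl export can be grouped by the first path segment. This is a small Python sketch, assuming the crawl results are available as (URL, status) pairs; the function name `errors_by_section` is made up, and the homepage row with status 200 is a hypothetical filler entry, not a crawl result.

```python
from collections import Counter
from urllib.parse import urlsplit

def errors_by_section(results):
    """Count 5xx responses per first path segment from (url, status) pairs."""
    counts = Counter()
    for url, status in results:
        if 500 <= status <= 599:
            segments = [part for part in urlsplit(url).path.split("/") if part]
            counts[segments[0] if segments else "/"] += 1
    return counts

CRAWL_RESULTS = [
    ("http://www.gazetaexpress.com/blihet/?search_category_id=1&searchFilter=1", 500),
    ("http://www.gazetaexpress.com/shitet/?category_id=134&searchFilter=1", 500),
    ("http://www.gazetaexpress.com/me-qera/?category_id=131&searchFilter=1", 500),
    ("http://www.gazetaexpress.com/mistere/edhe-kesaj-i-thuhet-veze-22591/", 500),
    ("http://www.gazetaexpress.com/", 200),  # hypothetical row for contrast
]

for section, count in errors_by_section(CRAWL_RESULTS).items():
    print(section, count)
```

With only the four example URLs, each section shows up once; on the full export of roughly 90 errors, a lopsided count would point at a specific template or query handler.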
-
Ah OK - well keep us updated with what you find. Someone else will chip in with other info if they have some
-Andy
-
We suspect that CloudFlare might be causing these problems. We are trying everything; in the meantime, I'm looking here to see if anyone has had a similar experience or has an idea for a solution.
As for warnings, the only warning we had was the one last week (8/23/14) saying that Googlebot can't access our site:
Over the last 24 hours, Googlebot encountered 316 errors while attempting to connect to your site. Your site's overall connection failure rate is 7.5%.
-Granit
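
For a sense of scale, the two figures in that warning can be combined into a rough estimate of how many connections Googlebot attempted. This Python sketch assumes the 7.5% rate is simply errors divided by attempts over the same 24 hours, which the message does not actually state; `implied_attempts` is a made-up helper name.

```python
def implied_attempts(errors, failure_rate):
    """Estimate total connection attempts from an error count and a failure rate."""
    return round(errors / failure_rate)

print(implied_attempts(316, 0.075))  # prints 4213
```

Under that assumption, roughly 4,200 attempts were made and about 3,900 of them succeeded, so the failures would be intermittent rather than a total block.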
-
It doesn't look like a firewall, as I can crawl it with Screaming Frog. However, the server logs will be able to answer that one for you.
Without looking in depth, I'm not seeing anything that stands out to me - do you think that there have been changes to the server that could cause issues? What firewall is the server running? Also, if there were errors in crawling the site, you would see a warning about this.
-Andy
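
As a starting point for the log check suggested above, here is a minimal Python sketch that tallies HTTP status codes for requests whose user agent mentions Googlebot. It assumes the server writes the Apache/Nginx "combined" log format, which the thread does not confirm; the function name `googlebot_status_counts` and the sample lines are invented for illustration.

```python
import re
from collections import Counter

# Picks the request, status, and user-agent fields out of a "combined" log line.
# Assumption: the server logs in this format; adjust the pattern otherwise.
LINE_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

def googlebot_status_counts(lines):
    """Count HTTP status codes for requests whose user agent mentions Googlebot."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match and "googlebot" in match.group("agent").lower():
            counts[match.group("status")] += 1
    return counts

# Hypothetical sample lines, not taken from the real server.
SAMPLE_LOG = [
    '66.249.66.1 - - [23/Aug/2014:10:00:00 +0000] "GET / HTTP/1.1" 200 5120 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.66.1 - - [23/Aug/2014:10:00:05 +0000] "GET /page/ HTTP/1.1" 403 310 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [23/Aug/2014:10:00:09 +0000] "GET / HTTP/1.1" 200 5120 '
    '"-" "Mozilla/5.0 (Windows NT 6.1) Firefox/31.0"',
]

print(googlebot_status_counts(SAMPLE_LOG))
```

For the sample lines this counts one 200 and one 403 for Googlebot. If something in front of the server (a proxy or CDN, for instance) were rejecting requests, they might never reach these logs, so a sudden drop in Googlebot lines could be as telling as a spike in errors.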
-
In mid-March the website changed its CMS, but I don't think that could be the reason, because until this week everything was working perfectly. I don't think it could have been compromised either. I still suspect the firewall could be blocking bots from crawling the site, but the server administrator couldn't find any evidence of this.
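
If the firewall keeps a list of blocked or rate-limited IP addresses, one way to look for evidence is to check whether any of them really belong to Google. Google documents a reverse-then-forward DNS check for this: the IP should reverse-resolve to a hostname under googlebot.com or google.com, and that hostname should resolve back to the same IP. The Python sketch below follows that idea; the function names are made up, and `socket.gethostbyname_ex` only covers IPv4, so treat it as a starting point rather than a complete verifier.

```python
import socket

# Domains Google documents for genuine Googlebot reverse-DNS hostnames.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")

def looks_like_google_host(hostname):
    """Return True if the hostname ends with one of the Google crawler domains."""
    return hostname.lower().rstrip(".").endswith(GOOGLE_SUFFIXES)

def is_verified_googlebot(ip_address):
    """Reverse-resolve the IP, check the domain, then forward-resolve to confirm."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip_address)
    except OSError:
        return False  # no reverse DNS record, or the lookup failed
    if not looks_like_google_host(hostname):
        return False
    try:
        _name, _aliases, forward_ips = socket.gethostbyname_ex(hostname)
    except OSError:
        return False
    return ip_address in forward_ips
```

Running `is_verified_googlebot` over the IPs in the firewall's block list (it needs working DNS) would show whether any genuine Googlebot addresses ended up there.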
-
Hi Granit,
Has any work been done to the site in the last 2-3 months? Have you had any warnings in webmaster tools at all? I did once see a strange problem where Google wasn't crawling a site correctly because it had been compromised, but after checking, there is nothing like this on yours.
-Andy
-
No problem. Thanks a lot for your time. Let's just hope that someone in the community will help with a solution.
-
Unfortunately, I don't have a quick answer for you. Looking forward to seeing what other community members have to say on this one!
-
I'm looking at the http version in GWT
-
If I do a site:gazetaexpress.com in Google, I get some results that are http, and some results that are https. The https ones say there is an SSL connection error.
Are you looking at the http or https version in GWT?