Google can't access/crawl my site!
-
Hi
I've been dealing with this problem for a few days. In fact, I didn't realize it was this serious until today, when I saw that most of my site had been "de-indexed" and had lost most of its rankings.
[URL Errors: 1st photo]
On 8/21/14 there were only 42 errors, but on 8/22/14 the number went to 272, and it just keeps going up.
The site I'm talking about is gazetaexpress.com (news media, custom CMS), with lots of pages.
After doing some research, I came to the conclusion that the problem is the firewall, which might have blocked Google bots from accessing the site. But the server administrator says this isn't true and that no Google bots have been blocked.
Also, when I go to WMT and try to Fetch as Google, this is what I get:
[Fetch as Google: 2nd photo]
Out of more than 60 tries, only 2-3 showed Complete (and only for the homepage, never for articles).
What can the problem be? Can I get Google to crawl my site properly, and is there a chance that I will lose my previous rankings?
Thanks a lot
Granit
-
What did you do specifically to mitigate the problem? You can PM me, if you would like.
-
This applies to the guy from Albania.
Oh, this IS the guy from Albania. Never mind.
-
Great, thanks for letting us know what happened with this!
-
Hi all
Just wanted to let you know that we fixed the problem. We disabled CloudFlare, which we found out was blocking Google bots. More about this issue can be found at: https://support.cloudflare.com/hc/en-us/articles/200169806-I-m-getting-Google-Crawler-Errors-What-should-I-do-
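For anyone who would rather whitelist the real Googlebot than switch a security layer off completely: Google's documented way to confirm a crawler is a reverse DNS lookup on the requesting IP, a check that the host name ends in googlebot.com or google.com, and then a forward lookup to confirm the name maps back to the same IP. Below is a minimal Python sketch of that check. The function name `is_verified_googlebot` and the injectable lookup parameters are my own for illustration, and this is not a description of what CloudFlare does internally.

```python
import socket

# Host name suffixes Google documents for Googlebot's reverse DNS records.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")

def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Reverse-resolve the IP, check the host name, then forward-resolve it back.

    The two lookup parameters are hypothetical hooks so the logic can be
    exercised without network access; by default the socket module is used
    (gethostbyname_ex is IPv4 only).
    """
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse_lookup(ip)
        if not host.lower().endswith(GOOGLE_SUFFIXES):
            return False
        # The forward lookup must include the original IP, otherwise the
        # reverse record could simply have been faked.
        return ip in forward_lookup(host)
    except OSError:
        # Covers socket.herror and socket.gaierror (no DNS record, timeout, ...).
        return False
```

Calling `is_verified_googlebot(ip)` on an address taken from your access log performs the real lookups. An address that sends a Googlebot user agent but fails this check is most likely an impostor, so blocking it should not affect crawling.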
-
Hi Travis, thank you for your time.
Glad your friend enjoyed it. I'd suggest you visit Kosovo someday too; you will have a great time here, for sure.
Back to the issue:
Here is an interesting issue that is happening with the crawler.
Our own CMS uses .htaccess for rewrite purposes. I created 2 new files that are independent of the CMS and tried to fetch them with WMT, and it worked like a charm.
These 2 independent files are:
www.gazetaexpress.com/test_manaferra.php
www.gazetaexpress.com/xhezidja.php
Then I created an AJAX page with our CMS, containing only plain text, and tried to fetch it with WMT. Strangely enough, it didn't work. To make sure that the .htaccess file was not affecting this behavior, I deleted the .htaccess and tried to fetch the page again, but it still didn't work.
The AJAX page is: www.gazetaexpress.com/page/xhezidja/?pageSEO=false
The site works perfectly for humans who access it via a browser.
I'm more than confused now!
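One way to narrow down a "works in the browser, fails for Google" situation is to request the same URL twice, once with a browser user agent and once with Googlebot's, and compare the results. Here is a rough Python sketch using only the standard library. The function names and the browser user-agent string are my own choices for illustration. Note that it only catches rules keyed on the user agent: a block based on Google's IP ranges will not show up when the request comes from your own machine.

```python
import urllib.error
import urllib.request

# Example user-agent strings; the browser one is an arbitrary choice.
BROWSER_UA = "Mozilla/5.0 (Windows NT 6.1; rv:31.0) Gecko/20100101 Firefox/31.0"
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

def fetch_status(url, user_agent, timeout=10):
    """Return the HTTP status code for url, or an error string if the connection fails."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with a 4xx/5xx status
    except (urllib.error.URLError, OSError) as err:
        return f"failed: {err}"  # no usable HTTP response at all

def compare_user_agents(url, fetch=fetch_status):
    """Fetch the same URL as a browser and as Googlebot and report both outcomes."""
    return {
        "browser": fetch(url, BROWSER_UA),
        "googlebot": fetch(url, GOOGLEBOT_UA),
    }
```

For example, `compare_user_agents("http://www.gazetaexpress.com/page/xhezidja/?pageSEO=false")` returns a small dict with both outcomes. If the two differ, something between the client and the CMS is treating the Googlebot user agent differently.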
-
A friend of mine just got back from Kosovo. It was the last stop on a tour of the Balkans. He had a pretty good time. Moving along...
I crawled about 12K URLs and hit almost 90 Internal Server Errors (500). It's probably not your core problem, but it's something to look at. Here are a few examples:
http://www.gazetaexpress.com/blihet/?search_category_id=1&searchFilter=1
http://www.gazetaexpress.com/shitet/?category_id=134&searchFilter=1
http://www.gazetaexpress.com/me-qera/?category_id=131&searchFilter=1
There was one actual page that threw a 500 at the time of crawl:
http://www.gazetaexpress.com/mistere/edhe-kesaj-i-thuhet-veze-22591/
The edhe kesaj page now resolves fine. (I'm not even going to pretend to understand or write Albanian.)
So there may be some issues with the server or hosting. If you haven't already, try this troubleshooter from Cloudflare.
-
Ah OK - well keep us updated with what you find. Someone else will chip in with other info if they have some
-Andy
-
We suspect that CloudFlare might be causing these troubles. We are trying everything; in the meantime, I'm looking here to see if anyone has had a similar experience or has an idea for a solution.
As for warnings, the only one we had was last week (8/23/14), saying that Googlebot can't access our site:
Over the last 24 hours, Googlebot encountered 316 errors while attempting to connect to your site. Your site's overall connection failure rate is 7.5%.
-Granit
-
It doesn't look like a firewall, as I can crawl it with Screaming Frog. However, the server logs will be able to answer that one for you.
Without looking in depth, I'm not seeing anything that stands out to me - do you think that there have been changes to the server that could cause issues? What firewall is the server running? Also, if there were errors in crawling the site, you would see a warning about this.
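To make the server-log check concrete, here is a minimal Python sketch that counts the status codes returned to requests identifying themselves as Googlebot. It assumes the log uses the common Apache/Nginx "combined" format, which may not match your server. The function name `googlebot_status_counts` and the sample lines are made up for illustration and are not real data from the site.

```python
import re
from collections import Counter

# Assumes the "combined" access log format; adjust the pattern if your
# server logs in a different layout.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

def googlebot_status_counts(lines):
    """Count HTTP status codes for requests whose user agent contains 'Googlebot'."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match and "googlebot" in match.group("agent").lower():
            counts[match.group("status")] += 1
    return counts

# Made-up sample lines for illustration only.
sample = [
    '66.249.66.1 - - [23/Aug/2014:10:00:00 +0000] "GET / HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.66.1 - - [23/Aug/2014:10:00:05 +0000] "GET /page/ HTTP/1.1" 403 312 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.9 - - [23/Aug/2014:10:00:07 +0000] "GET / HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (Windows NT 6.1) Firefox/31.0"',
]
print(googlebot_status_counts(sample))
```

If Googlebot requests show up with 403 or 5xx codes, the web server or application is answering them badly. If they hardly show up at all while Google reports connection failures, the requests are probably being stopped before they reach the web server, for example by a firewall or a proxy in front of it. Keep in mind that the user agent can be spoofed, so this is a first pass rather than proof.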
-Andy
-
In mid-March the website changed its CMS, but I don't think that could be the reason, because until this week everything was working perfectly. I don't think it could have been compromised either. I still suspect the firewall could be blocking bots from crawling the site, but the server administrator couldn't find any evidence of this.
-
Hi Granit,
Has any work been done to the site in the last 2-3 months? Have you had any warnings in webmaster tools at all? I did once see a strange problem where Google wasn't crawling a site correctly because it had been compromised, but after checking, there is nothing like this on yours.
-Andy
-
No problem. Thanks a lot for your time. Let's just hope that someone in the community will help with a solution.
-
Unfortunately, I don't have a quick answer for you. Looking forward to seeing what other community members have to say on this one!
-
I'm looking at the http version in GWT
-
If I do a site:gazetaexpress.com in Google, I get some results that are http, and some results that are https. The https ones say there is an SSL connection error.
Are you looking at the http or https version in GWT?