Google can't access/crawl my site!
-
Hi
I'm dealing with this problem for a few days. In fact i didn't realize it was this serious until today when i saw most of my site "de-indexed" and losing most of the rankings.
[URL Errors: 1st photo]
8/21/14 there were only 42 errors but in 8/22/14 this number went to 272 and it just keeps going up.
The site i'm talking about is gazetaexpress.com (media news, custom cms) with lot's of pages.
After i did some research i came to the conclusion that the problem is to the firewall, who might have blocked google bots from accessing the site. But the server administrator is saying that this isn't true and no google bots have been blocked.
Also when i go to WMT, and try to Fetch as Google the site, this is what i get:
[Fetch as Google: 2nd photo]
From more than 60 tries, 2-3 times it showed Complete (and this only to homepage, never to articles).
What can be the problem? Can i get Google to crawl properly my site and is there a chance that i will lose my previous rankings?
Thanks a lot
Granit -
What did you do specifically to mitigate the problem? You can PM me, if you would like.
-
This applies to the guy from Albania.
Oh, this IS the guy from Albania. Never mind.
-
Great, thanks for letting us know what happened with this!
-
Hi all
Just wanted to let you know that we fixed the problem. We disabled CloudFlare which we found out was blocking Google bots. More about this issue can be found at: https://support.cloudflare.com/hc/en-us/articles/200169806-I-m-getting-Google-Crawler-Errors-What-should-I-do-
-
Hi Travis, thank you for your time.
Great for your friend, I also suggest to visit Kosovo someday, you will have great time here, for sure
Back to the issue:
Here is an interesting issue that is happening with the crawler.
Our own cms uses htaccess for rewrite purposes. I created 2 new files that are independent from CMS and tried to fetch them with WMT, and it worked like a charm.
These 2 independent files are:
www.gazetaexpress.com/test_manaferra.php
www.gazetaexpress.com/xhezidja.php
Then, I created an ajax page with our CMS, which contains only plain text, tried to fetch it by WMT and strangely enough it didn't work. To make sure that the .htaccess file is not affecting this behavior, I deleted the htaccess and tried to fetch it, but it didn't worked.
The ajax page is: www.gazetaexpress.com/page/xhezidja/?pageSEO=false
The site works perfectly for humans which access it via the browser.
I'm more than confused now!
-
A friend of mine just got back from Kosovo. It was the last stop on a tour of the Balkans. He had a pretty good time. Moving along...
I crawled about 12K URLs and hit almost 90 Internal Server Errors (500). It's probably not your core problem, but it's something to look at. Here are a few examples:
http://www.gazetaexpress.com/blihet/?search_category_id=1&searchFilter=1
http://www.gazetaexpress.com/shitet/?category_id=134&searchFilter=1
http://www.gazetaexpress.com/me-qera/?category_id=131&searchFilter=1
There was one actual page that threw a 500 at the time of crawl:
http://www.gazetaexpress.com/mistere/edhe-kesaj-i-thuhet-veze-22591/
The edhe kesaj page now resolves fine. (I'm not even going to pretend to understand or write Albanian.)
So there may be some issues with the server or hosting. If you haven't already, try this troubleshooter from Cloudflare.
-
Ah OK - well keep us updated with what you find. Someone else will chip in with other info if they have some
-Andy
-
We are suspecting that CloudFlare might be causing these troubles. We are trying everything, in the meantime i'm looking here to see if anyone has any similar experience or an idea for solution.
As for warnings, the only warning we had was the one last week (8/23/14) saying that Google bot can't acces our site:
Over the last 24 hours, Googlebot encountered 316 errors while attempting to connect to your site. Your site's overall connection failure rate is 7.5%.
-Granit
-
It doesn't look like a firewall, as I can crawl it with Screaming Frog. However, the server logs will be able to answer that one for you.
Without looking in depth, I'm not seeing anything that stands out to me - do you think that there have been changes to the server that could cause issues? What firewall is the server running? Also, if there were errors in crawling the site, you would see a warning about this.
-Andy
-
In mid-march website changed it's CMS but i don't think that could be the reason because until this week everything was working perfectly. I don't think it could have been compromised too. I'm still suspecting it could be the firewall blocking bots from crawling the site, but the server administrator couldn't find any evidence of this.
-
Hi Granit,
Has any work been done to the site in the last 2-3 months? Have you had any warnings in webmaster tools at all? I did once see a strange problem where Google wasn't crawling a site correctly because it had been compromised, but after checking, there is nothing like this on yours.
-Andy
-
No prb. Thanks a lot for your time. Let just hope that someone in the community will help with a solution
-
Unfortunately, I don't have a quick answer for you. Looking forward to seeing what other community members have to say on this one!
-
I'm looking at the http version in GWT
-
If I do a site:gazetaexpress.com in Google, I get some results that are http, and some results that are https. The https ones say there is an SSL connection error.
Are you looking at the http or https version in GWT?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
ScreamingFrog won't crawl my site.
Hey guys, My site is Netspiren.dk and when I use a tool like Screaming Frog or Integrity, it only crawls my homepage and menu's - not product-pages. Examples
Intermediate & Advanced SEO | | FrederikTrovatten22
A menu: http://www.netspiren.dk/pl/Helse-Kosttilskud-Blandingsolie_57699.aspx
A product: http://www.netspiren.dk/pi/All-Omega-3-6-9-180-kapsler_1412956_57699.aspx Is it because the products are being loaded in Javascript?
What's your recommendation? All best,
Fred.0 -
Google is ranking the wrong page and I don't know why?
I have an E-Commerce store and to make things easy, let's say I am selling shoes. There is: Category named 'Shoes' and 3 products 'Sport shoes', 'Hiking shoes' and 'Dancing shoes' My problem: For the keyword 'Shoes' Google is showing the product result 'Sport shoes'. This makes no sense from user perspective. (It's like searching for 'iPhone' and getting a result for 'iPhone 4s' instead of a general overview.) Now what are the specifics of my category page (Which I want Google to rank): It has more external links with higher quality It has more internal links It has much higher page authority It has useful text to guide the user for the keyword It is a category instead of a product All this given, I just don't know how I can signal Google that this page makes sense to show in SERPs? Hope you can help with this!
Intermediate & Advanced SEO | | soralsokal0 -
Google didn't indexed my domain.
I bought *out.com more than 1 year, google bot even don't come, then I put the domain to the domain parking. what can I do? I want google index me.
Intermediate & Advanced SEO | | Yue0 -
Why isn't google indexing our site?
Hi, We have majorly redesigned our site. Is is not a big site it is a SaaS site so has the typical structure, Landing, Features, Pricing, Sign Up, Contact Us etc... The main part of the site is after login so out of google's reach. Since the new release a month ago, google has indexed some pages, mainly the blog, which is brand new, it has reindexed a few of the original pages I am guessing this as if I click cached on a site: search it shows the new site. All new pages (of which there are 2) are totally missed. One is HTTP and one HTTPS, does HTTPS make a difference. I have submitted the site via webmaster tools and it says "URL and linked pages submitted to index" but a site: search doesn't bring all the pages? What is going on here please? What are we missing? We just want google to recognise the old site has gone and ALL the new site is here ready and waiting for it. Thanks Andrew
Intermediate & Advanced SEO | | Studio330 -
Why did this website disappear from Google's SERPs?
For the first several months this website, WEBSITE, ranked well in Google for several local search terms like, "Columbia MO spinal decompression" and "Columbia, MO car accident therapy." Recently the website has completely disappeared from Google's SEPRs. It does not even exist when I copy and paste full paragraphs into Google's search bar. The website still ranks fine in Bing and Yahoo, but something happened that caused it to be removed from Google. Beside for optimizing the meta data, adding headers, alt tags, and all of the typical on-page SEO stuff, we did create a guest post for a relevant, local blog. Here is the post: Guest Post. The post's content is 100% unique. I realize the post has way to many internal/external links, which we definitely did not recommend, but can anyone find a reason why this website was removed from Google's SERPs? And possibly how we should go about getting it back into Google's SERPs? Thanks in advance for any help.
Intermediate & Advanced SEO | | VentaMarketing0 -
How to remove an entire site from Google?
Hi people, I have a site with around 2.000 urls indexed in google, and 10 subdomains indexed too, which I want to remove entirely, to set up a new web. Which is the best way to do it? Regards!
Intermediate & Advanced SEO | | SeoExpertos0 -
What's the best way to manage content that is shared on two sites and keep both sites in search results?
I manage two sites that share some content. Currently we do not use a cross-domain canonical URL and allow both sites to be fully indexed. For business reasons, we want both sites to appear in results and need both to accumulate PR and other SEO/Social metrics. How can I manage the threat of duplicate content and still make sure business needs are met?
Intermediate & Advanced SEO | | BostonWright0 -
Blog/Shop/Forum site structure - are we right to make these changes?
We run a fairly large online community with a popular blog and Europe's largest online shop for drift-specific motor sport parts and our website has been around since 2004 I believe. Since it was launched, the blog (or previous CMS system) has been at the domain root, the forums have been located at /forum and the shop at /shop (or similar) but we have decided to move things around a bit and would like some comments as to whether we are doing the right thing or if you would make any addition or different changes to us. Currently the entire website gets around 3m page views per month from 500,000 visitors, but this is split roughly 75% to the forums, 10% to the shop and 15% to the blog (but remember the blog is at the root so anyone who visits our homepage "visits" the blog). We plan to move the shop to the domain root (since the shop provides the income for the business - surely it should be the 1st thing visitors see?), the blog from root to /blog and the forums will stay where they are at /forum. We have read Steven Macdonald's post here, and have taken notes to help minimize traffic loss and disruption to our army of users and hopefully avoid too many penalties from Google and plan to: 301 redirect old URLs to new ones where they have changed. Submit new site maps to search engines. Update old links where we have control (such as forums where we are paid traders etc.). Send out a newsletter to our subscribers. Update our forum members. Fix errors via WMT before and after the re-structure. Should we be taking this opportunity to actually set each of the three sections of the site to it's own sub domain? Our thoughts are that if we are disrupting things, it's surely best to have lots of disruption once rather than a little bit of disruption several times over a 3-6 month period? OSE shows us to have roughly 1500 inbound links to /shop, 2100 to /forum and 4800 to the root / - if we proceed with our plan and put 301 redirects in place this seems to be the best plan to retain the value of these links but if we were to switch to sub domains would the 301s lose most of the link values due to them being on "different" domains? Any help, advise or suggestions are very welcome but comments from experience are what we are seeking ideally! Thanks Jay
Intermediate & Advanced SEO | | DWJames0