We got an answer from JohnMu, Webmaster Trends Analyst at Google. The cause of the crawling, as we found out, is the filters, which produce infinite URL variations (one of our developers was asleep at the wheel); we will correct this. Disallowing the filter URLs in robots.txt is advised as the quickest fix to stop the mega-crawling. This case will be used for further research because of the disproportionate capacity usage. You're right that Google will initially crawl everything, but they don't want Googlebot's crawling to look like a mini DDoS attack.
Olaf
@Olaf
Job Title: Digital Marketeer
Company: Perplex Digital B.V.
Favorite Thing about SEO
content
Latest posts made by Olaf
-
RE: Googlebot on steroids... Why?
-
RE: Googlebot on steroids... Why?
Thanks for your help!
I think you're probably right. The initial crawl must be complete if Google wants to put everything into the right perspective. But we manage and host more than 300 sites, including large A-brand sites, and even on those sites I have not seen this kind of volume before.
The server logs show the same number of requests again last night (day five). I will keep you posted if this continues after the weekend.
-
RE: Googlebot on steroids... Why?
Mmm, is that correct? I thought that the amount of resources Google puts into crawling your (new) website also depends on its authority. 9 million URLs, for four days now... That seems like a lot for such a small website...
-
Googlebot on steroids... Why?
We launched a new website (www.gelderlandgroep.com). The site contains 500 pages, but some pages (like https://www.gelderlandgroep.com/collectie/) contain filters, so there are a lot of possible URL parameters. Last week we noticed a tremendous amount of traffic (25 GB!!) and CPU usage on the server. A typical line from the server log:
2017-12-04 16:11:57 W3SVC66 IIS14 83.219.93.171 GET /collectie model=6511,6901,7780,7830,2105-illusion&ontwerper=henk-vos,foklab 443 - 66.249.76.153 HTTP/1.1 Mozilla/5.0+(Linux;+Android+6.0.1;+Nexus+5X+Build/MMB29P)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/41.0.2272.96+Mobile+Safari/537.36+(compatible;+Googlebot/2.1;++http://www.google.com/bot.html) - - www.gelderlandgroep.com 200 0 0 9445 501 312
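To see how many distinct filter combinations Googlebot is requesting, a log line like the one above can be split into named fields and counted. This is only a minimal sketch: the field order in `FIELDS` is an assumption read off the sample line (IIS lets you choose which W3C fields are logged, so check the `#Fields:` header of your own log files), and the function names are made up for illustration.

```python
from collections import Counter
from urllib.parse import parse_qs

# ASSUMPTION: field order inferred from the sample line; verify it against
# the "#Fields:" header in your own IIS log files.
FIELDS = [
    "date", "time", "s-sitename", "s-computername", "s-ip", "cs-method",
    "cs-uri-stem", "cs-uri-query", "s-port", "cs-username", "c-ip",
    "cs-version", "cs(User-Agent)", "cs(Cookie)", "cs(Referer)", "cs-host",
    "sc-status", "sc-substatus", "sc-win32-status", "sc-bytes", "cs-bytes",
    "time-taken",
]


def parse_line(line):
    """Split one log line into a dict; return None for comments or odd lines."""
    if line.startswith("#"):
        return None
    parts = line.split()
    if len(parts) != len(FIELDS):
        return None
    return dict(zip(FIELDS, parts))


def googlebot_url_counts(lines):
    """Count requests per distinct (path, query) whose user agent says Googlebot."""
    hits = Counter()
    for line in lines:
        rec = parse_line(line)
        if rec and "Googlebot" in rec["cs(User-Agent)"]:
            hits[(rec["cs-uri-stem"], rec["cs-uri-query"])] += 1
    return hits


SAMPLE = (
    "2017-12-04 16:11:57 W3SVC66 IIS14 83.219.93.171 GET /collectie "
    "model=6511,6901,7780,7830,2105-illusion&ontwerper=henk-vos,foklab 443 - "
    "66.249.76.153 HTTP/1.1 Mozilla/5.0+(Linux;+Android+6.0.1;+Nexus+5X+Build/MMB29P)"
    "+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/41.0.2272.96+Mobile+Safari/537.36"
    "+(compatible;+Googlebot/2.1;++http://www.google.com/bot.html) - - "
    "www.gelderlandgroep.com 200 0 0 9445 501 312"
)

record = parse_line(SAMPLE)
filters = parse_qs(record["cs-uri-query"])  # filter name -> list of raw values
counts = googlebot_url_counts([SAMPLE])
```

Running `googlebot_url_counts` over a whole log file and looking at `len(counts)` gives the number of distinct URL and parameter combinations that were requested. Note that the user agent string alone does not prove the request came from Google.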
We found out that "Googlebot" was firing many, many requests. First we ran an nslookup on the IP address, and it really does appear to be Googlebot.
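The nslookup check can also be scripted. The sketch below does a reverse lookup on the IP, checks that the hostname ends in googlebot.com or google.com, and then does a forward lookup to confirm the hostname resolves back to the same IP. The function names are hypothetical, and the lookup functions are passed in as parameters only so the logic can be tried without network access.

```python
import socket

# Hostname suffixes accepted as Google crawlers in this sketch.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_google_hostname(hostname):
    """True if the hostname ends in one of the accepted Google domains."""
    return hostname.rstrip(".").lower().endswith(GOOGLE_SUFFIXES)


def verify_googlebot(ip, reverse=socket.gethostbyaddr,
                     forward=socket.gethostbyname_ex):
    """Reverse lookup, suffix check, then forward lookup back to the same IP."""
    try:
        hostname = reverse(ip)[0]
        if not is_google_hostname(hostname):
            return False
        # The forward lookup must list the original IP, otherwise the
        # reverse DNS record could simply be spoofed.
        return ip in forward(hostname)[2]
    except OSError:
        # Covers socket.herror and socket.gaierror (lookup failures).
        return False
```

Calling `verify_googlebot("66.249.76.153")` with the default arguments performs real DNS lookups, so the result depends on the network the script runs on.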
Second, we visited Google Search Console and I was really surprised... Googlebot on steroids? Googlebot requested 922,565 different URLs, one for every filter/parameter combination on the site. Why? The sitemap.xml contains 500 URLs... The authority of the site isn't very high, and there is no other signal that this is a special website... Why so many "Google resources"?
Of course we will exclude the parameters in Search Console, but I have never seen this much Googlebot activity for a small website like this before! Does anybody have any clue?
Regards Olaf
-
RE: Domain Sub folder tracking from Google webmaster tool
Hey Cristopher,
Did you check Analytics? How many organic search visitors did you actually receive? It will take some time before you see data if the numbers are very small.
-
RE: Street Address Not Appearing on Business Google+ Page
Did you check your listings on other local listing sites? Google may want to verify your NAP information against other local listings. You can use getlisted.org. Please make sure you accurately enter the name, address and phone number of your business in all the local directories. More importantly, make sure that all the citations list your name, address and phone number in exactly the same way across all the directories (including your own website).
-
RE: Analytics: Goal Tracking
I don't think so; cross-domain tracking is sometimes very hard... I'm not sure, but what about using search type "regular expression" with the goal "shop.gardio.*"? Would that work for you?
If you just want to know the % of visitors who went to shop.domain.com, Enhanced Link Attribution may also be interesting: http://support.google.com/analytics/answer/2558867?hl=en&ref_topic=2558810
-
RE: URL Structure for Multilingual Site With Two Major Locations
We prefer the approach you suggest, and we would also translate the menu names:
domain.com/location-1 – to target English visitors
domain.com/es/establecimiento-1 – to target Spanish visitors
-
RE: Analytics: Goal Tracking
Hi Sven,
This article will help you implement cross (sub)domain tracking: https://developers.google.com/analytics/devguides/collection/gajs/gaTrackingSite
Once this is implemented correctly, you can use a single goal URL.
-
RE: Analytics: Goal Tracking
Maybe this article will help you: http://www.ericmobley.net/guide-to-tracking-multiple-subdomains-in-google-analytics/
Best posts made by Olaf
-
RE: Websites on same c class IP address
A "C" block is determined by your IP address.
For example, 190.245.111.001 is a standard IP address. Writing it as AAA.BBB.CCC.DDD, the C-block is the AAA.BBB.CCC part, so it covers the addresses AAA.BBB.CCC.001-254.
So these are within the same C-class:
190.245.111.001
190.245.111.230
And these are in different C-classes:
190.245.111.001
190.245.222.001
Google may assume that sites hosted in different C-blocks are more likely to be from different people.
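The same comparison can be done in code by treating the C-block as the /24 network that contains the address. This is a small sketch with made-up function names; the addresses are written without leading zeros because recent versions of Python's `ipaddress` module reject padded octets such as `001`.

```python
import ipaddress


def c_block(ip):
    """Return the /24 network (the "C-block") that contains an IPv4 address."""
    # strict=False lets us pass a host address instead of the network address.
    return ipaddress.ip_network(f"{ip}/24", strict=False)


def same_c_block(ip_a, ip_b):
    """True if both addresses share their first three octets."""
    return c_block(ip_a) == c_block(ip_b)


print(same_c_block("190.245.111.1", "190.245.111.230"))  # same C-block
print(same_c_block("190.245.111.1", "190.245.222.1"))    # different C-blocks
```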
-
RE: Google analytics - help
To track the results of your online campaigns in Analytics, you can add Analytics campaign parameters to your links, for example 'utm_campaign=summer2011'. When visitors reach your site via a link that contains "utm_campaign=summer2011", Analytics will attribute the visit to this (self-defined) campaign.
Google Analytics stores the type of referral information in a cookie. The expiration date for the cookie is set as 6 months into the future.
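Tagging links by hand is error-prone when a URL already has a query string, so a small helper can do it. This is only a sketch: `add_campaign` and the example.com URL are made up for illustration, and `utm_source` and `utm_medium` are the other standard campaign parameters that are usually set alongside `utm_campaign`.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_campaign(url, campaign, source=None, medium=None):
    """Append utm_* campaign parameters to a URL, keeping its existing query."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("utm_campaign", campaign))
    if source:
        query.append(("utm_source", source))
    if medium:
        query.append(("utm_medium", medium))
    return urlunsplit(parts._replace(query=urlencode(query)))


tagged = add_campaign("https://www.example.com/page?x=1", "summer2011")
print(tagged)
```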
-
RE: Is there a tool to upload multiple URLs and gather statistics and page rank?
You can use majesticseo.com if you want bulk backlink information. The tool also gives you what they call ACRank (something like PageRank) and the Alexa ranking.
-
CNAME instead of A-record: seo problem?
The supplier of our hosted CMS hosts a few hundred websites on its server. For every domain name pointing to a website we need to use the same IP address in the DNS A record. They are now asking us to delete the A record for all the websites and add a CNAME record instead, so they can route the traffic via a company like Verisign in case of a DDoS attack.
A lot of the websites rank well. Will there be an SEO problem when we start using CNAMEs instead of A records?
Thanks, Olaf
-
RE: Adwords Keyword Research - Impressions, CTR
What's a top keyword? That's the question... It's not always the keyword with the most traffic.
You can use the AdWords Keyword Tool (for example) to find related search queries and their search volume. But even when there is a lot of traffic, it's not certain that it's a 'good' keyword for you.
When you are running an AdWords campaign you can find the search queries under "Tab Keywords -> See Search terms". These search queries are also in Google Analytics, but with information about conversion, time on site, etc. So you can use AdWords + Google Analytics to find 'converting' search queries. After that you can start optimising (SEO) for these search queries.
See also http://www.seomoz.org/ugc/advanced-seo-keyword-research-tips-and-ideas-14216 for a lot of information.
-
RE: Is there a tool to upload multiple URLs and gather statistics and page rank?
No, sorry... we subscribed to a gold plan and are able to upload 300 URLs. That's only seven copy-paste actions.
Internetmarketeer and director of Perplex Internetmarketing B.V.