Blocking Google from telemetry requests
-
At Magnet.me we track which items people view in order to optimize our recommendations. To do so, we fire POST requests to our backends every few seconds once enough user-initiated actions have happened (scrolling, for example). To keep bots from distorting the statistics, we ignore their values server-side.
Our internal logging shows that Googlebot also performs these POST requests during its JavaScript crawling. Over a 7-day period, that amounts to around 800k POST requests. As we ignore that data anyway, and the volume is considerable, we are considering reducing these requests for bots.
We have a few questions about this, though:
1. Do these requests count towards crawl budgets?
2. If they do, and we want to prevent this from happening, what would be the preferred option: preventing the request in the frontend code, or blocking the request with a robots.txt line?
The second question arises because an in-app block for the request could lead to different behaviour for users and bots, and maybe Google could penalize that as cloaking. The latter is slightly less convenient from a development perspective, as all logic is spread throughout the application.
I'm aware one should not cloak, or make pages appear differently to search engine crawlers. However, these requests do not change anything in the page's behaviour; they purely send some anonymous data so we can improve future recommendations.
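For reference, the frontend option from question 2 could look roughly like the sketch below. This is only an illustration under assumptions: the bot pattern, the `TelemetryEvent` shape, and the function names are hypothetical and not Magnet.me's actual code, and user agent matching is a heuristic rather than a reliable bot check.

```typescript
// Hypothetical pattern for illustration; real bot lists are longer and change over time.
const BOT_PATTERN = /googlebot|bingbot|crawler|spider/i;

/** Returns true when the user agent string looks like a known crawler. */
function isLikelyBot(userAgent: string): boolean {
  return BOT_PATTERN.test(userAgent);
}

// Hypothetical event shape; the original post does not describe the payload.
type TelemetryEvent = { itemId: string; action: string };

/**
 * Sends a batch of events unless the batch is empty or the client looks like a bot.
 * `send` is injected so the sketch does not assume a specific HTTP client.
 * Returns true if the batch was handed to `send`.
 */
function flushTelemetry(
  userAgent: string,
  events: TelemetryEvent[],
  send: (events: TelemetryEvent[]) => void,
): boolean {
  if (events.length === 0 || isLikelyBot(userAgent)) {
    return false;
  }
  send(events);
  return true;
}
```

In a browser, `userAgent` would typically be `navigator.userAgent`, and `send` would wrap whatever POST call the application already uses. The page itself renders identically for users and bots; only the background request is skipped.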
-
Hi Rogier,
- Yes, these requests usually count towards your crawl budget, as Googlebot accounts for this per request.
- It obviously depends on how your request is set up, but otherwise I would advise going with the robots.txt exclusion that you're already leaning towards.
Hope this helps!
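As a rough illustration of the robots.txt exclusion, assuming the telemetry POSTs go to a hypothetical path such as `/api/telemetry` (the thread does not name the real endpoint), the rule could look like this:

```
# Hypothetical endpoint path; replace with the actual telemetry URL prefix.
User-agent: Googlebot
Disallow: /api/telemetry
```

Using `User-agent: *` instead would apply the same rule to every crawler that honours robots.txt. Because the rule lives outside the application, the frontend code stays the same for users and bots.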