Prevent Google from crawling Ajax
-
With Google figuring out how to make Ajax and JS more searchable/indexable, I am curious to hear thoughts or techniques on how to prevent this.
Here's my situation: we have a page that we never want to be crawled or indexed. Currently we use the noindex/nofollow directives, but due to technical changes to our site, the way this content will be delivered (if it is ever displayed) means we will no longer be able to block it from search that way. The business has also decided not to list the file in robots.txt because of the sensitivity of the content. Basically, this content doesn't exist unless something super important happens, and even if something super important happens, we do not want Google to know of its existence.
Since the dev team is planning on using Ajax/JS to pull in this content if the business turns it on, the concern is that it will be on the homepage and Google could index it. So the questions I was asked are: if Google can and does index it, how long could that piece of content appear in the SERPs? And can we block Google from crawling and indexing this section of content on the homepage?
Sorry for the vagueness of this question; it's very sensitive in nature and I am trying to avoid too many specifics. I am able to discuss this in a more private way if necessary.
Thanks!
-
Toby, thanks for the suggestion! I believe that this will help accomplish what we need. My dev gave the "oh S, I should've thought of that" response.
-
You may find that you have to wrap the code that gets called when the Ajax request fires in something that checks the user agent. E.g. if you're making an Ajax request to a PHP script in order to return data, you could wrap that PHP code in something like this (please excuse the pseudocode):
<?php
// $knownagents is an array of known spider user agent strings.
if (in_array($_SERVER['HTTP_USER_AGENT'], $knownagents)) {
    // Known web spider or blocked agent: return nothing.
    return "";
} else {
    // Not a known spider, so continue.
}
?>
That's very generalised, but you get the idea. I put a short list together in JSON format a while back; you can find it here if it's of any use: https://www.source-control.co.uk/knownspiders/spiders.php
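One practical wrinkle with the check above: in_array only matches when the whole user agent string is identical to a list entry, and real user agent strings usually carry extra detail (version numbers, URLs). A minimal sketch of a substring-based check follows. The function name isKnownSpider, the variable names, and the three sample tokens are made up for illustration, and it assumes the spider list decodes to a flat JSON array of strings, which may not be how the list linked above is actually structured.

```php
<?php
// Hypothetical helper: returns true if the user agent contains any known spider token.
// Assumes $knownAgents is a flat array of strings such as "Googlebot".
function isKnownSpider(string $userAgent, array $knownAgents): bool
{
    foreach ($knownAgents as $token) {
        // Case-insensitive substring match; empty tokens are skipped.
        if ($token !== '' && stripos($userAgent, $token) !== false) {
            return true;
        }
    }
    return false;
}

// Assumed list format: a JSON array of strings (the real list may differ).
$knownAgents = json_decode('["Googlebot", "bingbot", "Slurp"]', true);

// The header can be missing, so fall back to an empty string.
$userAgent = $_SERVER['HTTP_USER_AGENT'] ?? '';

if (isKnownSpider($userAgent, $knownAgents)) {
    // Known spider: send an empty response and stop.
    exit;
}

// Not a known spider, so build and return the Ajax payload here.
```

Matching on short tokens rather than full strings keeps the list small, at the cost of the occasional false positive if a token happens to appear in an ordinary browser's user agent.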
PM me if you need any more specific help than that with development. Hopefully someone else will have a slightly easier way of dealing with this though, heh.