Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Prevent Google from crawling Ajax
-
With Google figuring out how to make Ajax and JS more searchable/indexable, I am curious on thoughts or techniques to prevent this.
Here's my Situation, we have a page that we do not ever want to be indexed/crawled or other. Currently we have the nofollow/noindex command, but due to technical changes for our site the method in which this information is being implemented if it is ever displayed it will not have the ability to block the content from search. It is also the decision of the business to not list the file in robots.txt due to the sensitivity of the content. Basically, this content doesn't exist unless something super important happens, and even if something super important happens, we do not want Google to know of its existence.
Since the Dev team is planning on using Ajax/JS to pull in this content if the business turns it on, the concern is that it will be on the homepage and Google could index it. So the questions that I was asked; if Google can/does index, how long would that piece of content potentially appear in the SERPs? Can we block Google from caring about and indexing this section of content on the homepage?
Sorry for the vagueness of this question, it's very sensitive in nature and I am trying to avoid too many specifics. I am able to discuss this in a more private way if necessary.
Thanks!
-
Toby, thanks for the suggestion! I believe that this will help accomplish what we need. My Dev gave the "oh S" I should've thought of that response.
-
You may find that you have to wrap the code that gets called when Ajax fires in something to catch the user agent. I.e. if your making an Ajax request to a php script in order to return data, you could wrap that php code in something like this (please excuse the Sudo code):
if(in_array($_SERVER['HTTP_USER_AGENT'], $knownagents){
//known webspider, or blocked agent, return nothing.
return "";
} else {
//not a known spider so continue.
}
?>
Thats very generalised but you get the idea. I put a short list together in JSON format a while back, you can find it here if its of any use: https://www.source-control.co.uk/knownspiders/spiders.php
PM me if you need any more specific help than that with development, hopefully someone else will have a slightly easier way of dealing with this though heh
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Google indexing pages from chrome history ?
We have pages that are not linked from site yet they are indexed in Google. It could be possible if Google got these pages from browser. Does Google takes data from chrome?
Intermediate & Advanced SEO | | vivekrathore0 -
Pages are Indexed but not Cached by Google. Why?
Here's an example: I get a 404 error for this: http://webcache.googleusercontent.com/search?q=cache:http://www.qjamba.com/restaurants-coupons/ferguson/mo/all But a search for qjamba restaurant coupons gives a clear result as does this: site:http://www.qjamba.com/restaurants-coupons/ferguson/mo/all What is going on? How can this page be indexed but not in the Google cache? I should make clear that the page is not showing up with any kind of error in webmaster tools, and Google has been crawling pages just fine. This particular page was fetched by Google yesterday with no problems, and even crawled again twice today by Google Yet, no cache.
Intermediate & Advanced SEO | | friendoffood2 -
Google crawling different content--ever ok?
Here are a couple of scenarios I'm encountering where Google will crawl different content than my users on initial visit to the site--and which I think should be ok. Of course, it is normally NOT ok, I'm here to find out if Google is flexible enough to allow these situations: 1. My mobile friendly site has users select a city, and then it displays the location options div which includes an explanation for why they may want to have the program use their gps location. The user must choose the gps, the entire city, or he can enter a zip code, or choose a suburb of the city, which then goes to the link chosen. OTOH it is programmed so that if it is a Google bot it doesn't get just a meaningless 'choose further' page, but rather the crawler sees the page of results for the entire city (as you would expect from the url), So basically the program defaults for the entire city results for google bot, but for for the user it first gives him the initial ability to choose gps. 2. A user comes to mysite.com/gps-loc/city/results The site, seeing the literal words 'gps-loc' in the url goes out and fetches the gps for his location and returns results dependent on his location. If Googlebot comes to that url then there is no way the program will return the same results because the program wouldn't be able to get the same long latitude as that user. So, what do you think? Are these scenarios a concern for getting penalized by Google? Thanks, Ted
Intermediate & Advanced SEO | | friendoffood0 -
Number of images on Google?
Hello here, In the past I was able to find out pretty easily how many images from my website are indexed by Google and inside the Google image search index. But as today looks like Google is not giving you any numbers, it just lists the indexed images. I use the advanced image search, by defining my domain name for the "site or domain" field: http://www.google.com/advanced_image_search and then Google returns all the images coming from my website. Is there any way to know the actual number of images indexed? Any ideas are very welcome! Thank you in advance.
Intermediate & Advanced SEO | | fablau1 -
How to find all indexed pages in Google?
Hi, We have an ecommerce site with around 4000 real pages. But our index count is at 47,000 pages in Google Webmaster Tools. How can I get a list of all pages indexed of our domain? trying to locate the duplicate content. Doing a "site:www.mydomain.com" only returns up to 676 results... Any ideas? Thanks, Ben
Intermediate & Advanced SEO | | bjs20100 -
Google places keyword variations
Hi all, I have a site that is ranking #1 in Google Places for its main <city><keyword>search... but it does not rank for any of its basic keyword variations, which I find very confusing.</keyword></city> ie (just an example) Chicago Caterer (ranked #1 in google places)
Intermediate & Advanced SEO | | x2264983x
Chicago Caterers (not ranked in google places)
Chicago Catering (not ranked in google places)
Chicago Catering Company (not ranked in google places)
Chicago Catering Companies (etc..) How can I secure a google places ranking for these simple keyword variations? Do I build links to the google plus page using that anchor text? Do I get citations that contain that keyword somewhere on the page? Do I optimize for these keyword variations on the actual website itself? (not the places listing). Obviously I don't stuff these keywords into the google places listing. Any help would be much appreciated!0 -
How does Google know if a backlink is good or not?
Hi, What does Google look at when assessing a backlink? How important is it to get a backlink from a website with relevant content? Ex: 1. Domain/Page Auth 80, website is not relevant. Does not use any of the words in your target term in any area of the website. 2. Domain/Page Auth 40, website is relevant. Uses the words in your target term multiple times across website. Which website example would benefit your SERP's more if you gained a backlink? (and if you can say, how much more would it benefit - low, medium, high).
Intermediate & Advanced SEO | | activitysuper0 -
Google News URL Structure
Hi there folks I am looking for some guidance on Google News URLs. We are restructuring the site. A main traffic driver will be the traffic we get from Google News. Most large publishers use: www.site.com/news/12345/this-is-the-title/ Others use www.example.com/news/celebrity/12345/this-is-the-title/ etc. www.example.com/news/celebrity-news/12345/this-is-the-title/ www.example.com/celebrity-news/12345/this-is-the-title/ (Celebrity is a channel on Google News so should we try and follow that format?) www.example.com/news/celebrity-news/this-is-the-title/12345/ www.example.com/news/celebrity-news/this-is-the-title-12345/ (unique ID no at the end and part of the title URL) www.example.com/news/celebrity-news/celebrity-name/this-is-the-title-12345/ Others include the date. So as you can see there are so many combinations and there doesnt seem to be any unity across news sites for this format. Have you any advice on how to structure these URLs? Particularly if we want to been seen as an authority on the following topics: fashion, hair, beauty, and celebrity news - in particular "celebrity name" So should the celebrity news section be www.example.com/news/celebrity-news/celebrity-name/this-is-the-title-12345/ or what? This is for a completely new site build. Thanks Barry
Intermediate & Advanced SEO | | Deepti_C0