Block Googlebot from submit button
-
Hi,
I have a website where many searches are made by the googlebot on our internal engine. We can make noindex on result page, but we want to stop the bot to call the ajax search button - GET form (because it pass a request to an external API with associate fees).
So, we want to stop crawling the form button, without noindex the search page itself. The "nofollow" tag don't seems to apply on button's submit.
Any suggestion?
-
Hey Olivier,
You could detect the user agent and hide the button. The difference isn't substantial enough to be called cloaking.
Or you could make the button not actually a button tag, but another tag with that traps clicks with a JS event. I'm not sure Google's headless browser is smart enough to automate that. I would try this first and if it doesn't work switch to the user agent detection idea.
Let us know how it goes!
-Mike
-
-
Can always do it in a programme the bot's can't use or hide it behind a log in field etc.
I also give you the following for consumption :
http://moz.com/blog/12-ways-to-keep-your-content-hidden-from-the-search-engines
Good luck!
-
Hi Bernard
Are you able to provide a link to the web form containing the submit button?
Peter
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How Submit to different Countries
Hey There, One of My client Needs To rank his site in multiple Location , like He is in Australia But wants To Promote his Site in Canada, india, USA,So What are the things i can do For Rank in Others Country. Please any Expert Help. Thanx
Intermediate & Advanced SEO | | nupuriepl0 -
How to Submit My new Website in All Search Engines
Hello Everyone, Can Any body help to suggest Good software, or Any other to easily Submit my website , to All Search Engines ? ? Any expert Can help please, Thanx in Advance
Intermediate & Advanced SEO | | falguniinnovative0 -
Whole site blocked by robots in webmaster tools
My URL is: www.wheretobuybeauty.com.auThis new site has been re-crawled over last 2 weeks, and in webmaster tools index status the following is displayed:Indexed 50,000 pagesblocked by robots 69,000Search query 'site:wheretobuybeauty.com.au' returns 55,000 pagesHowever, all pages in the site do appear to be blocked and over the 2 weeks, the google search query site traffic declined from significant to zero (proving this is in fact the case ).This is a Linux php site and has the following: 55,000 URLs in sitemap.xml submitted successfully to webmaster toolsrobots.txt file existed but did not have any entries to allow or disallow URLs - today I have removed robots.txt file completely URL re-direction within Linux .htaccess file - there are many rows within this complex set of re-directions. Developer has double checked this file and found that it is valid.I have read everything that google and other sources have on this topic and this does not help. Also checked webmaster crawl errors, crawl stats, malware and there is no problem there related to this issue.Is this a duplicate content issue - this is a price comparison site where approx half the products have duplicate product descriptions - duplicated because they are obtained from the suppliers through an XML data file. The suppliers have the descriptions from the files in their own sites.Help!!
Intermediate & Advanced SEO | | rrogers0 -
Manipulate Googlebot
**Problem: I have found something wierd on the server log as below. the googlebot visit the folders and files which do not exist at all. there is no photo folder on the server, but googlebot visit the files inside the photo folder and return 404 error. ** I wonder if it is SEO hacking attempts, and how can someone manage to Manipulate Googlebot. ================================================== **66.249.71.200 - - [22/Aug/2012:02:31:53 -0400] "GET /robots.txt HTTP/1.0" 200 2255 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" ** **66.249.71.25 - - [22/Aug/2012:02:36:55 -0400] "GET /photo/pic24.html HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" 66.249.71.26 - - [22/Aug/2012:02:37:03 -0400] "GET /photo/pic20.html HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" 66.249.71.200 - - [22/Aug/2012:02:37:11 -0400] "GET /photo/pic22.html HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" 66.249.71.200 - - [22/Aug/2012:02:37:28 -0400] "GET /photo/pic19.html HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" 66.249.71.26 - - [22/Aug/2012:02:37:36 -0400] "GET /photo/pic17.html HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" 66.249.71.200 - - [22/Aug/2012:02:37:44 -0400] "GET /photo/pic21.html HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" **
Intermediate & Advanced SEO | | semer1 -
Does using robots.txt to block pages decrease search traffic?
I know you can use robots.txt to tell search engines not to spend their resources crawling certain pages. So, if you have a section of your website that is good content, but is never updated, and you want the search engines to index new content faster, would it work to block the good, un-changed content with robots.txt? Would this content loose any search traffic if it were blocked by robots.txt? Does anyone have any available case studies?
Intermediate & Advanced SEO | | nicole.healthline0 -
Googlebot crawling partial URLs
Hi guys, I've checked my email this morning and I've got a number of 404 errors over the weekend where Google has tried to crawl some of my existing pages but not found the full URL. Instead of hitting 'domain.com/folder/complete-pagename.php' it's hit 'domain.com/folder/comp'. This is definitely Googlebot/2.1; http://www.google.com/bot.html (66.249.72.53) but I can't find where it would have found only the partial URL. It certainly wasn't on the domain it's crawling and I can't find any links from external sites pointing to us with the incorrect URL. GoogleBot is doing the same thing across a single domain but in different sub-folders. Having checked Webmaster Tools there aren't any hard 404s and the soft ones aren't related and haven't occured since August. I'm really confused as to how this is happening.. Thanks!
Intermediate & Advanced SEO | | panini0 -
Googlebot + Meta-Refresh
Quick question, can Googlebot (or other search engines) follow meta refresh tags? Does it work anything like a 301 in terms of passing value to the new page?
Intermediate & Advanced SEO | | kchandler1