Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
How to stop Search Bot from crawling through a submit button
-
On our website http://www.thefutureminders.com/, we have three form fields that have three pull downs for Month, Day, and year. This is creating duplicate pages while indexing. How do we tell the search Bot to index the page but not crawl through the submit button?
Thanks
Naren
-
Hi Dan
What is happening is this - since we have all the months [12], all the dates [31] and years[1921 through 2011] in the form fields, the robot seems to be taking these incrementally and then using the submit button. After the submit button, user is presented with a registration page. While we do want the search to index the rest of the page and the crawl through the rest of the page links we do not want it to crawl through that submit button. I hope I am making sense.
Naren
-
The advantage of blocking a page from being indexed via a meta tag is it is less likely to have unexpected consequences. I've often seen in the past cases where an incorrectly modified robots.txt file leads to a site being blocked by accident.
-
Hi
To my knowledge, you don't stop it from crawling through the button (like a nofollowed link), rather you block the robot at the page it ends up on after clicking submit.
Say the user hits submit and it takes them to mydomain.com/confirm.html On that page you'll want to add;
....if you want it to NOT index the page but follow the links on it.
or
...if you want it to NOT index and NOT follow the links on that page.
Its advised that its better to do this with the meta tag than in robots.txt.
Hopefully I've understood the question correctly!
-Dan
-
Block the pages/folders you do not wish to be indexed with robots.txt file:
User-agent: * Disallow: /folder1/ Disallow: /folder2/OR you can add canonical tags to the other pages which are creating duplicate content.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Duplicate, submitted URL not selected as canonical
Hi all, A number of our pages have dropped out of search rankings. It seems they are being marked as "Duplicate, submitted URL not selected as canonical" However, the page Google is choosing as the canonical is totally different - different headings, titles, metadata, content on the page. We are completely mystified as to why this is happening. If anyone can shed any light, it would be hugely appreciated! Example URL is this one:
Technical SEO | | Eric_S
https://www.vouchedfor.co.uk/IFA-financial-advisor-mortgage/london Which Google seems to think is a duplicate of this: https://www.vouchedfor.co.uk/solicitor/london0 -
Indexed, but not shown in search result
Hi all We face this problem for www.residentiebosrand.be, which is well programmed, added to Google Search Console and indexed. Web pages are shown in Google for site:www.residentiebosrand.be. Website has been online for 7 weeks, but still no search results. Could you guys look at the update below? Thanks!
Technical SEO | | conversal0 -
Spam URL'S in search results
We built a new website for a client. When I do 'site:clientswebsite.com' in Google it shows some of the real, recently submitted pages. But it also shows many pages of spam url results, like this 'clientswebsite.com/gockumamaso/22753.htm' - all of which then go to the sites 404 page. They have page titles and meta descriptions in Chinese or Japanese too. Some of the urls are of real pages, and link to the correct page, despite having the same Chinese page titles and descriptions in the SERPS. When I went to remove all the spammy urls in Search Console (it only allowed me to temporarily hide them), a whole load of new ones popped up in the SERPS after a day or two. The site files itself are all fine, with no errors in the server logs. All the usual stuff...robots.txt, sitemap etc seems ok and the proper pages have all been requested for indexing and are slowly appearing. The spammy ones continue though. What is going on and how can I fix it?
Technical SEO | | Digital-Murph0 -
Do you need a canonical tag for search and filter pages?
Hi Moz Community, We've been implementing new canonical tags for our category pages but I have a question about pages that are found via search and our filtering options. Would we still need a canonical tag for pages that show up in search + a filter option if it only lists one page of items? Example below. www.uncommongoods.com/search.html/find/?q=dog&exclusive=1 Thanks!
Technical SEO | | znotes0 -
Expired domain 404 crawl error
I recently purchased a Expired domain from auction and after I started my new site on it, I am noticing 500+ "not found" errors in Google Webmaster Tools, which are generating from the previous owner's contents.Should I use a redirection plugin to redirect those non-exist posts to any new post(s) of my site? or I should use a 301 redirect? or I should leave them just as it is without taking further action? Please advise.
Technical SEO | | Taswirh1 -
Mobile SERPS: how to optimize for call button
Hi, I have 2 questions about the "call" button on mobile google serps when doing a business name search: -since when is this button available in SERPS -is there anything specific you can do to actually have google display that call button (schema.org, ...) Kind regards Pieter
Technical SEO | | TruvoDirectories0 -
Does Google Bot accept Cookies
I am working with a per page results refinement that stores a cookie on the users computer and then keeps that same per page as the user goes around the site. I was just wondering if that was true for Google bot or Bing bot as well. Will they keep the cookie or would they not be able to accept it. I just want to know as I dont want different urls created if they can keep the cookie. Thanks!
Technical SEO | | Gordian0 -
Crawling image folders / crawl allowance
We recently removed /img and /imgp from our robots.txt file thus allowing googlebot to crawl our image folders. Not sure why we had these blocked in the first place, but we opened them up in response to an email from Google Product Search about not being able to crawl images - which can/has hurt our traffic from Google Shopping. My question is: will allowing Google to crawl our image files eat up our 'crawl allowance'? We wouldn't want Google to not crawl/index certain pages, and ding our organic traffic, because more of our allotted crawl bandwidth is getting chewed up crawling image files. Outside of the non-detailed crawl stat graphs from Webmaster Tools, what's the best way to check how frequently/ deeply our site is getting crawled? Thanks all!
Technical SEO | | evoNick0