How to prevent Google from crawling our product filter?
-
Hi All,
We have a crawler problem on one of our sites, www.sneakerskoopjeonline.nl.
On this site, visitors can specify criteria to filter the available products. These filters are passed as HTTP GET parameters, so the number of possible filter URLs is virtually limitless.
To prevent duplicate content and an insane number of pages in the search indices, our software automatically adds noindex, nofollow and noarchive directives to these filter result pages. However, those directives only keep the pages out of the index; we have not found a way to tell crawlers (Google in particular) not to crawl these URLs in the first place.
We've already changed the on-page filter from HTML to JavaScript, hoping this would cause the crawler to ignore it. However, it seems that Googlebot executes the JavaScript and crawls the generated URLs anyway.
What can we do to prevent Google from crawling all the filter options?
Thanks in advance for the help.
Kind regards,
Gerwin
-
The following has been added to our robots.txt. Now let's wait and see the results:
User-agent: *
Disallow: /admin/
Disallow: /?
Allow: /?product_date=&product_date2=*
Disallow: /?product_date=&product_date2=&
To check that the robots.txt works, I found a handy website.
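A rule set like this can also be sanity-checked locally. Below is a minimal Python sketch, assuming Google-style matching: `*` matches any run of characters, a trailing `$` anchors the end, the longest matching pattern wins, and Allow wins a tie. The helper names and the sample paths (`/admin/login`, the `nike` value) are made up for illustration, and a real crawler may behave differently in edge cases.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex anchored at the start.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_allowed(rules, path):
    """Return True if `path` (path plus query string) may be crawled.

    Assumed precedence: the longest matching pattern wins, Allow wins a tie,
    and a path that matches no rule is allowed.
    """
    best_len = -1
    allowed = True
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            is_allow = directive == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len = len(pattern)
                allowed = is_allow
    return allowed


# The rules from the post above, for "User-agent: *".
RULES = [
    ("disallow", "/admin/"),
    ("disallow", "/?"),
    ("allow", "/?product_date=&product_date2=*"),
    ("disallow", "/?product_date=&product_date2=&"),
]

if __name__ == "__main__":
    # Hypothetical sample paths, chosen only to exercise each rule.
    for path in [
        "/admin/login",
        "/?product_brand=nike",
        "/?product_date=&product_date2=",
        "/herensneakers?product_brand=nike",
    ]:
        print(path, "->", "allowed" if is_allowed(RULES, path) else "blocked")
```

In this sketch, a filter URL under a category path such as /herensneakers?product_brand=nike comes out as allowed, because `Disallow: /?` only matches a query string that directly follows the root `/`.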
-
The URL looks like this:
http://www.sneakerskoopjeonline.nl/herensneakers?product_brand=
So would just adding the following do the trick?
User-agent: *
Disallow: /*?product_brand
Most important is that herensneakers itself is still crawled, indexed and followed.
-
I would use your robots.txt file to prevent Google from crawling those specific query strings and pages. In Google Webmaster Tools you can see the information Google has on your site and any issues it has found; you can also specify robots.txt information there. That would be the best route, as Google obeys what is in the robots.txt file. If you want more information about robots.txt, go here.
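For the rule proposed earlier in the thread, `Disallow: /*?product_brand`, here is a rough Python sketch of which URLs it would cover. It assumes `*` matches any run of characters and that rules are matched against the path plus query string; it ignores the `$` anchor. The function names and the `product_size` parameter are hypothetical, used only to show what happens when `product_brand` is not the first parameter.

```python
import re
from urllib.parse import urlsplit


def robots_path(url):
    """Return the path plus query string, the part robots.txt rules match against."""
    parts = urlsplit(url)
    path = parts.path or "/"
    return path + ("?" + parts.query if parts.query else "")


def blocked_by(pattern, url):
    """Check one Disallow pattern against a URL ('*' wildcard only, no '$')."""
    regex = "^" + ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.match(regex, robots_path(url)) is not None


RULE = "/*?product_brand"
BASE = "http://www.sneakerskoopjeonline.nl/herensneakers"

if __name__ == "__main__":
    for url in [
        BASE,                                        # the category page itself
        BASE + "?product_brand=",                    # filter URL from the thread
        BASE + "?product_size=42&product_brand=x",   # hypothetical parameter order
    ]:
        print(url, "->", "blocked" if blocked_by(RULE, url) else "crawlable")
```

Under these assumptions the category page stays crawlable and the `?product_brand=` URL is blocked, but a filter URL where another parameter comes first is not matched, since the pattern needs the literal text `?product_brand`. If the filters can appear in any order, a broader pattern may be needed.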