How to prevent Google from crawling our product filter?

footsteps

Hi All,

We have a crawler problem on one of our sites, www.sneakerskoopjeonline.nl.

On this site, visitors can specify criteria to filter the available products. These filters are passed as HTTP GET parameters. The number of possible filter URLs is virtually limitless.

To prevent duplicate content, or an insane number of pages in the search indices, our software automatically adds noindex, nofollow and noarchive directives to these filter result pages. However, we have not found a way to tell crawlers (Google in particular) to skip these URLs altogether.

We’ve already changed the on-page filter from HTML to JavaScript, hoping this would cause the crawler to ignore it. However, it seems that Googlebot executes the JavaScript and crawls the generated URLs anyway.

What can we do to prevent Google from crawling all the filter options?

Thanks in advance for the help.

Kind regards,

Gerwin

footsteps

We have added the following to our robots.txt. Now let's wait and see the results:

User-agent: *
Disallow: /admin/
Disallow: /?
Allow: /?product_date=&product_date2=*
Disallow: /?product_date=&product_date2=&

To check that the robots.txt works as intended, I found a handy website:

http://phpweby.com/services/robots
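A rule set like the one above can also be sanity-checked locally. The following is only a rough sketch, assuming the precedence described in RFC 9309 and in Google's robots.txt documentation: the longest matching pattern wins, and an Allow rule wins a tie. The function names and the sample paths (`/admin/login`, `product_brand=nike`) are made up for illustration, and the sketch is no substitute for a real robots.txt tester.

```python
import re

def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_allowed(rules, path: str) -> bool:
    """Apply the assumed precedence: longest pattern wins, Allow wins ties."""
    best_len = -1
    allowed = True  # no matching rule means the path may be crawled
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and kind == "allow"):
                best_len = length
                allowed = kind == "allow"
    return allowed

# The rules from the robots.txt above, for "User-agent: *".
RULES = [
    ("disallow", "/admin/"),
    ("disallow", "/?"),
    ("allow", "/?product_date=&product_date2=*"),
    ("disallow", "/?product_date=&product_date2=&"),
]

# Hypothetical sample paths (path plus query string).
for path in [
    "/",
    "/admin/login",
    "/?product_brand=nike",
    "/?product_date=&product_date2=",
    "/?product_date=&product_date2=&product_brand=nike",
]:
    print(path, "->", "allowed" if is_allowed(RULES, path) else "blocked")
```

Under these assumptions, the Allow rule and the last Disallow rule are both 31 characters long, so the sketch reports the final sample path as allowed: the tie goes to Allow. If that URL is supposed to be blocked, the Disallow pattern probably needs to be longer than the Allow pattern.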

footsteps

The URL looks like this:

http://www.sneakerskoopjeonline.nl/herensneakers?product_brand=

So just adding:

User-agent: *
Disallow: /*?product_brand

should do the trick?
Most important is that herensneakers itself is still crawled, indexed and followed.
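One way to check this before deploying is a small script. This is a minimal sketch, assuming Google-style wildcard matching where `*` stands for any run of characters and the pattern is matched from the start of the path plus query string. The helper names and the `product_color` parameter are hypothetical, and the `$` end anchor is left out for brevity.

```python
import re
from urllib.parse import urlsplit

def robots_target(url: str) -> str:
    """Return the path plus query string, the part robots.txt patterns apply to."""
    parts = urlsplit(url)
    return parts.path + (f"?{parts.query}" if parts.query else "")

def matches(pattern: str, target: str) -> bool:
    """Check whether a robots.txt pattern matches; '*' matches any run of characters."""
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.match(regex, target) is not None

PATTERN = "/*?product_brand"
BASE = "http://www.sneakerskoopjeonline.nl/herensneakers"

# The category page itself: no query string, so the pattern should not match.
category_blocked = matches(PATTERN, robots_target(BASE))

# The filter URL from the post.
filter_blocked = matches(PATTERN, robots_target(BASE + "?product_brand="))

# Hypothetical URL where product_brand is not the first parameter.
second_param_blocked = matches(
    PATTERN, robots_target(BASE + "?product_color=red&product_brand=nike")
)

print(category_blocked, filter_blocked, second_param_blocked)
```

In this sketch herensneakers itself is not matched while the filter URL is. The pattern only catches URLs where product_brand comes directly after the `?`, so if filters can appear in any order, something broader such as `Disallow: /*product_brand=` may be worth considering.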

alexhoug

I would use your robots.txt file to prevent them from crawling the specific strings / pages. In Google Webmaster Tools you can see all the information Google has on your site, along with any issues, and you can also specify robots.txt information there. That would be the best route, as Google obeys what is in the robots.txt file. If you want more information about robots.txt, go here.
