Welcome to the Q&A Forum

footsteps

We have added the following to our robots.txt. Now let's wait and see the results.

User-agent: *
Disallow: /admin/
Disallow: /?
Allow: /?product_date=&product_date2=*
Disallow: /?product_date=&product_date2=&

To check that the robots.txt works as intended, I found a handy website:

http://phpweby.com/services/robots
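
To get a rough idea of what these rules do before the crawl data comes in, they can also be evaluated locally. The sketch below assumes the precedence described in RFC 9309 and in Google's robots.txt documentation (the longest matching pattern wins, and Allow wins a tie), treats `*` as a wildcard, and ignores `$` anchors. The names `matches` and `is_allowed` and the sample paths are illustrative only; they do not come from any library or from the site's logs.

```python
import re

# Rules from the post above, as (directive, path pattern) pairs.
RULES = [
    ("disallow", "/admin/"),
    ("disallow", "/?"),
    ("allow", "/?product_date=&product_date2=*"),
    ("disallow", "/?product_date=&product_date2=&"),
]

def matches(pattern: str, path: str) -> bool:
    """Prefix match where '*' stands for any run of characters."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.match(regex, path) is not None

def is_allowed(path: str) -> bool:
    """Longest matching pattern wins; on equal length, allow wins."""
    best_length, allowed = -1, True  # no matching rule means allowed
    for directive, pattern in RULES:
        if not matches(pattern, path):
            continue
        allow = directive == "allow"
        if len(pattern) > best_length or (len(pattern) == best_length and allow):
            best_length, allowed = len(pattern), allow
    return allowed

# Hypothetical sample paths (path plus query string, no host).
for path in ["/", "/admin/login", "/?page=2", "/?product_date=&product_date2="]:
    print(path, "->", "allowed" if is_allowed(path) else "blocked")
```

Under these assumptions the last Disallow line has the same length as the Allow line, so the sketch resolves that tie in favour of Allow. Whether a given crawler does the same is worth confirming with a tester like the one linked above.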

footsteps

The URL looks like this:

http://www.sneakerskoopjeonline.nl/herensneakers?product_brand=

So just adding:

User-agent: *
Disallow: /*?product_brand

should do the trick?
Most importantly, herensneakers itself should still be crawled, indexed and followed.
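
One way to sanity-check a wildcard rule like this locally is to translate it into a regular expression. The sketch below assumes Google-style matching, where `*` matches any run of characters, a trailing `$` anchors the end of the URL, and a rule otherwise matches as a prefix. `rule_matches` is a hypothetical helper written for this post, not part of any library.

```python
import re

def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches a URL path plus query."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape every literal piece, then join the pieces with '.*' for each '*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start, so the rule behaves as a prefix match.
    return re.match(regex, path) is not None

rule = "/*?product_brand"
print(rule_matches(rule, "/herensneakers?product_brand="))  # filter URL
print(rule_matches(rule, "/herensneakers"))                 # category page
```

Note that under this kind of matching the rule only catches URLs where `product_brand` is the first query parameter. A URL such as `/herensneakers?product_size=42&product_brand=nike` (parameter name made up for illustration) would not match, so a broader pattern may be needed if filters can be combined.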

footsteps

Hi All,

We have a crawler problem on one of our sites, www.sneakerskoopjeonline.nl.

On this site, visitors can specify criteria to filter the available products. These filters are passed as HTTP GET parameters, so the number of possible filter URLs is virtually limitless.

To prevent duplicate content and an unmanageable number of pages in the search indices, our software automatically adds noindex, nofollow and noarchive directives to these filter result pages. However, we have not found a way to tell crawlers (Google in particular) to stay away from these URLs altogether.
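
For illustration, the directives mentioned above could be generated roughly as follows. This is only a sketch, not our actual software: the function name `robots_meta_tag` is made up, and it assumes that any URL carrying a query parameter is a filter result page.

```python
from urllib.parse import urlsplit, parse_qsl

def robots_meta_tag(url: str) -> str:
    """Return a robots meta tag for filter result pages, or '' for normal pages."""
    query = urlsplit(url).query
    # Assumption: any query parameter, even one with an empty value, marks a filter page.
    has_filter = bool(parse_qsl(query, keep_blank_values=True))
    if has_filter:
        return '<meta name="robots" content="noindex, nofollow, noarchive">'
    return ""

print(robots_meta_tag("http://www.sneakerskoopjeonline.nl/herensneakers?product_brand="))
print(robots_meta_tag("http://www.sneakerskoopjeonline.nl/herensneakers"))
```

A crawler has to fetch a page before it can read a tag like this, so the tag controls indexing but does not by itself stop the page from being crawled.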

We’ve already changed the on-page filter from HTML to JavaScript, hoping this would cause the crawler to ignore it. However, it seems that Googlebot executes the JavaScript and crawls the generated URLs anyway.

What can we do to prevent Google from crawling all the filter options?

Thanks in advance for the help.

Kind regards,

Gerwin
