How to prevent Google from crawling our product filter?
-
Hi All,
We have a crawler problem on one of our sites www.sneakerskoopjeonline.nl.
On this site, visitors can specify criteria to filter available products. These filters are passed as http/get arguments. The number of possible filter urls is virtually limitless.
In order to prevent duplicate content, or an insane amount of pages in the search indices, our software automatically adds noindex, nofollow and noarchive directives to these filter result pages. However, we’re unable to explain to crawlers (Google in particular) to ignore these urls.
We’ve already changed the on page filter html to javascript, hoping this would cause the crawler to ignore it. However, it seems that Googlebot executes the javascript and crawls the generated urls anyway.
What can we do to prevent Google from crawling all the filter options?
Thanks in advance for the help.
Kind regards,
Gerwin
-
The following is added to our robots.txt .. now lets wait and see the results
User-agent: * Disallow: /admin/
Disallow: /?
Allow /?product_date=&product_date2=*
Disallow /?product_date=&product_date2=&To check the working of the robots.txt i found a handy website;
-
The url looks like this;
http://www.sneakerskoopjeonline.nl/herensneakers?product_brand=
So just adding;
User-agent: *
Disallow: /*?product_brandShould do the trick?
Most important is that herensneakers itself should be indexed, followed and crawled -
I would use your robots.txt file to prevent them from crawling the specific strings / pages. Go into your Google Webmaster Tools and you can see all the information Google has on your site and any issues, you can also specify robots.txt information in there. That would be the best route as Google is obedient with what is on the robots.txt file. If you want more information about robots.txt, go here.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Google Update
My rank has dropped quite a lot this past week and I can see from the Moz tools that there is an unconfirmed Google update responsible. Is there any information from Moz on this?
Intermediate & Advanced SEO | | moon-boots0 -
Reclaiming Ranking positions in Google
We have a website we are working on that was ranking well in Google but since having a hosting upgrade has completely dropped in rankings. When a hosting upgrade was made, the developer added an incorrect robots.txt file that restricted the site from being found, hence resulting in lost rankings. We have since sorted out that issue so the robots.txt is OK. However, ranking results have yet to be reclaimed. We are unsure why these rankings haven't rebounded back, as it has been a while now. The site is https://www.brightonpanelworks.com.au. We have since also attempted to add a sitemap however to help the site be better crawled and to regain rankings, however, it appears that sitemap generators are having problems creating a sitemap for this site and we are not sure why. And we are not sure whether this may relate to why Google has not picked up on pages and ranking results have not be restored. If you have any ideas as to how we can reclaim rankings to the strong positions they were in previously, that would be much appreciated. We believe we may be missing something here that is not allowing webpages to be picked up and ranked by Google.
Intermediate & Advanced SEO | | Gavo0 -
Crawl Test Question
Good Morning, I am just looking for a little bit of advice, I ran a crawl report on our website www.swiftcomm.co.uk. I have resolved most of the issues myself, however I have two questions;- Screenshot image http://imgur.com/VlFEiZ2 Highlighted blue, we have two homepages www.swiftcomm.co.uk and www.swiftcomm.co.uk/ both are set with a Rel-Canonical Target of www.swiftcomm.co.uk/. Will this cause me any SEO issues and or other potential issue? If this may cause an issue how would I go about resolving? Highlighted yellow, Our contact and referral-form are showing as duplicate title and meta description. Both of these pages have separate title and meta desc which it does seem to be detecting. If I search the page in google it returns the correct title and meta desc. The only common denominator behind these pages is that both have php pages behind them for the contact form. Do you think that the moz crawl may be detecting the php page over the html? Could this be cause any issues when search engines crawl the site? Kind Regards Jonathan Mack VlFEiZ2
Intermediate & Advanced SEO | | JMack9860 -
Google Webmaster Tools Parameters
We have several large ecommerce websites, and we've added some tracking parameters to GWT for google to ignore. All pages are correctly canonicaled. Google has been ignoring the parameters and the canonicals, and still ranks many parametered pages for us. Has anyone run into this?
Intermediate & Advanced SEO | | AMHC0 -
Google news and Yoast News
Hi, I have a blog, I want to send my blog to Google news with the plugin "Yoast news".
Intermediate & Advanced SEO | | JohnPalmer
If I'll change the meta-title and ill keep the title of the post as is, for example:
Meta-title (yoast) - TEXT for Search engines | My Brand name
Post tilte - for users - TExT For Users and BlaBla there is a problem? the title of the page and the title of the meta should be same for Google NEWS?0 -
Does Google check Whois
Hello everyone, I own quite a lot of website active in the same niche and sometimes targeting the same keywords, these sites are hosted at different IP's. But they all have the same Whois details, i was wondering if Google checks the Whois-data? And if it affects the serp's? Regards, Yannick
Intermediate & Advanced SEO | | iwebdevnl0 -
Duplicate Content on Product Pages
I'm getting a lot of duplicate content errors on my ecommerce site www.outdoormegastore.co.uk mainly centered around product pages. The products are completely different in terms of the title, meta data, product descriptions and images (with alt tags)but SEOmoz is still identifying them as duplicates and we've noticed a significant drop in google ranking lately. Admittedly the product descriptions are a little bit thin but I don't understand why the pages would be viewed as duplicates and therefore can be ranked lower? The content is definitely unique too. As an example these three pages have been identified as being duplicates of each other. http://www.outdoormegastore.co.uk/regatta-landtrek-25l-rucksack.html http://www.outdoormegastore.co.uk/canyon-bryce-adult-cycling-helmet-9045.html http://www.outdoormegastore.co.uk/outwell-minnesota-6-carpet-for-green-07-08-tent.html
Intermediate & Advanced SEO | | gavinhoman0 -
Does Google punish sites for Backlinks?
Here is Matt Cutts video, for those of you who have not seen it already. http://www.youtube.com/watch?v=f4dAWb5jUws (Very Short) In this Video Matt explains that Google does not look at backlinks. Many link spamming sites have detected, there have been many website receiving warning messages in their Google web tools to deindex these links, etc.. My theory is that Google will not punish sites for backlinks. However, they manually check for "link farming sites" and warn anyone affiliated with them, just in case these links were built from a competitor. This way they can eliminate all the "Bad Link Farm" sites and not hurt anyone who does not deserve to be hurt. Google is not going to give us all their information to rank, they dont want us to rank. They want us to PPC. However, they do want to have the best SERPs available. I call it Google juggling! Thoughts?
Intermediate & Advanced SEO | | SEODinosaur0