Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
How to prevent Google from crawling our product filter?
-
Hi All,
We have a crawler problem on one of our sites www.sneakerskoopjeonline.nl.
On this site, visitors can specify criteria to filter available products. These filters are passed as http/get arguments. The number of possible filter urls is virtually limitless.
In order to prevent duplicate content, or an insane amount of pages in the search indices, our software automatically adds noindex, nofollow and noarchive directives to these filter result pages. However, we’re unable to explain to crawlers (Google in particular) to ignore these urls.
We’ve already changed the on page filter html to javascript, hoping this would cause the crawler to ignore it. However, it seems that Googlebot executes the javascript and crawls the generated urls anyway.
What can we do to prevent Google from crawling all the filter options?
Thanks in advance for the help.
Kind regards,
Gerwin
-
The following is added to our robots.txt .. now lets wait and see the results
User-agent: * Disallow: /admin/
Disallow: /?
Allow /?product_date=&product_date2=*
Disallow /?product_date=&product_date2=&To check the working of the robots.txt i found a handy website;
-
The url looks like this;
http://www.sneakerskoopjeonline.nl/herensneakers?product_brand=
So just adding;
User-agent: *
Disallow: /*?product_brandShould do the trick?
Most important is that herensneakers itself should be indexed, followed and crawled -
I would use your robots.txt file to prevent them from crawling the specific strings / pages. Go into your Google Webmaster Tools and you can see all the information Google has on your site and any issues, you can also specify robots.txt information in there. That would be the best route as Google is obedient with what is on the robots.txt file. If you want more information about robots.txt, go here.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Will Google View Using Google Translate As Duplicate?
If I have a page in English, which exist on 100 other websites, we have a case where my website has duplicate content. What if I use Google Translate to translate the page from English to Japanese, as the only website doing this translation will my page get credit for producing original content? Or, will Google view my page as duplicate content, because Google can tell it is translated from an original English page, which runs on 100+ different websites, since Google Translate is Google's own software?
Intermediate & Advanced SEO | | khi50 -
Page position dropped on Google
Hey Guys, My web designer has recommended this forum to use, the reason being: my google position has been dropped from page 1 to page 10 in the last week. The site is weloveschoolsigns.co.uk, but our main business site is textstyles.co.uk the school signs are a product of text styles. I have been told off my SEO company, that because I have changed the school logo to the text styles logo, Google have penalised me for it, and dropped us from page 1 for numerous keywords, to page 10 or more. They have also said that duplicate content within the school site http://www.weloveschoolsigns.co.uk/school-signs-made-easy/ has also a contributed to the drop in positions. (this content is not on the textstyles site) Lastly they said, that having the same telephone number is a definate no no. They said that I have been penalised, because google see the above as trying to monopolise on the market. I don’t know if all this is true, as the SEO is way above my head, but they have quoted me £1250 to repair all the errors, when the site only cost £750. They have also mentioned that because of the above changes, the main text styles site will also be punished. Any thoughts on this matter would be much appreciated as I don't know whether to pay them to crack on, or accept the new positions. Either way I'm very confused. Thanks Thomas
Intermediate & Advanced SEO | | TextStylesUK0 -
Number of images on Google?
Hello here, In the past I was able to find out pretty easily how many images from my website are indexed by Google and inside the Google image search index. But as today looks like Google is not giving you any numbers, it just lists the indexed images. I use the advanced image search, by defining my domain name for the "site or domain" field: http://www.google.com/advanced_image_search and then Google returns all the images coming from my website. Is there any way to know the actual number of images indexed? Any ideas are very welcome! Thank you in advance.
Intermediate & Advanced SEO | | fablau1 -
Google is displaying wrong address
I have a client whose Google Places listing is not showing correctly. We have control of the page, and have the address verified by postcard. Yet when we view the listing it shows a totally different address that is miles away and on a totally different street. We have relogged into manage the business listing and all of the info is correct. We dragged the marker and submitted it to them that they had things wrong and left a note with the right address. Why would this happen and how can we fix it? Right now they rank highly but with a blatantly wrong address.
Intermediate & Advanced SEO | | Atomicx0 -
Multiple Authors Google + Authorship
Hello, I took a look through past questions but can't seem to find a definitive answer on setting up Google + Authorship credit (for multiple authors) using a Wordpress blog. Has anyone had experience setting this up? Or could you recommend solid reading/research? I took a look at a couple of Wordpress plug in's but just found them very confusing (so did our IT contact who will ultimately be setting up code for this.) Any direction or advice is appreciated.
Intermediate & Advanced SEO | | SEOSponge0 -
How does Google know if a backlink is good or not?
Hi, What does Google look at when assessing a backlink? How important is it to get a backlink from a website with relevant content? Ex: 1. Domain/Page Auth 80, website is not relevant. Does not use any of the words in your target term in any area of the website. 2. Domain/Page Auth 40, website is relevant. Uses the words in your target term multiple times across website. Which website example would benefit your SERP's more if you gained a backlink? (and if you can say, how much more would it benefit - low, medium, high).
Intermediate & Advanced SEO | | activitysuper0 -
Indexed Pages in Google, How do I find Out?
Is there a way to get a list of pages that google has indexed? Is there some software that can do this? I do not have access to webmaster tools, so hoping there is another way to do this. Would be great if I could also see if the indexed page is a 404 or other Thanks for your help, sorry if its basic question 😞
Intermediate & Advanced SEO | | JohnPeters0 -
Url structure for multiple search filters applied to products
We have a product catalog with several hundred similar products. Our list of products allows you apply filters to hone your search, so that in fact there are over 150,000 different individual searches you could come up with on this page. Some of these searches are relevant to our SEO strategy, but most are not. Right now (for the most part) we save the state of each search with the fragment of the URL, or in other words in a way that isn't indexed by the search engines. The URL (without hashes) ranks very well in Google for our one main keyword. At the moment, Google doesn't recognize the variety of content possible on this page. An example is: http://www.example.com/main-keyword.html#style=vintage&color=blue&season=spring We're moving towards a more indexable URL structure and one that could potentially save the state of all 150,000 searches in a way that Google could read. An example would be: http://www.example.com/main-keyword/vintage/blue/spring/ I worry, though, that giving so many options in our URL will confuse Google and make a lot of duplicate content. After all, we only have a few hundred products and inevitably many of the searches will look pretty similar. Also, I worry about losing ground on the main http://www.example.com/main-keyword.html page, when it's ranking so well at the moment. So I guess the questions are: Is there such a think as having URLs be too specific? Should we noindex or set rel=canonical on the pages whose keywords are nested too deep? Will our main keyword's page suffer when it has to share all the inbound links with these other, more specific searches?
Intermediate & Advanced SEO | | boxcarpress0