Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Block session id URLs with robots.txt
-
Hi,
I would like to block all URLs with the parameter '?filter=' from being crawled by including them in the robots.txt.
Which directive should I use:
User-agent: *
Disallow: ?filter=or
User-agent: *
Disallow: /?filter=In other words, is the forward slash in the beginning of the disallow directive necessary?
Thanks!
-
Hi Martijn,
Thanks for the answer. Regarding the forward slash in the beginning, is it necessary to use this?
In the robots text from Zalando for example, you can see that they don't use it for a lot of filters.
-
Uhh, that's not what the requester is looking for and could actually cause tons of problems if you would apply this on a site that you're unaware of. I would always go with the most limiting robots.txt that you can and in this case, I would go with: /?filter=
-
Hi,
The following should suffice as it will black any URL with a "?" in it
User-agent: * Disallow: /*?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Does google ignore ? in url?
Hi Guys, Have a site which ends ?v=6cc98ba2045f for all its URLs. Example: https://domain.com/products/cashmere/robes/?v=6cc98ba2045f Just wondering does Google ignore what is after the ?. Also any ideas what that is? Cheers.
Intermediate & Advanced SEO | | CarolynSC0 -
Help with facet URLs in Magento
Hi Guys, Wondering if I can get some technical help here... We have our site britishbraces.co.uk , built in Magento. As per eCommerce sites, we have paginated pages throughout. These have rel=next/prev implemented but not correctly ( as it is not in is it in ) - this fix is in process. Our canonicals are currently incorrect as far as I believe, as even when content is filtered, the canonical takes you back to the first page URL. For example, http://www.britishbraces.co.uk/braces/x-style.html?ajaxcatalog=true&brand=380&max=51.19&min=31.19 Canonical to... http://www.britishbraces.co.uk/braces/x-style.html Which I understand to be incorrect. As I want the coloured filtered pages to be indexed ( due to search volume for colour related queries ), but I don't want the price filtered pages to be indexed - I am unsure how to implement the solution? As I understand, because rel=next/prev implemented ( with no View All page ), the rel=canonical is not necessary as Google understands page 1 is the first page in the series. Therefore, once a user has filtered by colour, there should then be a canonical pointing to the coloured filter URL? ( e.g. /product/black ) But when a user filters by price, there should be noindex on those URLs ? Or can this be blocked in robots.txt prior? My head is a little confused here and I know we have an issue because our amount of indexed pages is increasing day by day but to no solution of the facet urls. Can anybody help - apologies in advance if I have confused the matter. Thanks
Intermediate & Advanced SEO | | HappyJackJr0 -
Will disallowing URL's in the robots.txt file stop those URL's being indexed by Google
I found a lot of duplicate title tags showing in Google Webmaster Tools. When I visited the URL's that these duplicates belonged to, I found that they were just images from a gallery that we didn't particularly want Google to index. There is no benefit to the end user in these image pages being indexed in Google. Our developer has told us that these urls are created by a module and are not "real" pages in the CMS. They would like to add the following to our robots.txt file Disallow: /catalog/product/gallery/ QUESTION: If the these pages are already indexed by Google, will this adjustment to the robots.txt file help to remove the pages from the index? We don't want these pages to be found.
Intermediate & Advanced SEO | | andyheath0 -
What do you add to your robots.txt on your ecommerce sites?
We're looking at expanding our robots.txt, we currently don't have the ability to noindex/nofollow. We're thinking about adding the following: Checkout Basket Then possibly: Price Theme Sortby other misc filters. What do you include?
Intermediate & Advanced SEO | | ThomasHarvey0 -
Product or Shop in URL
What do you think is better for seo and for sale, I am using woo-ecommerce for health products website. websitename.com/product/keyword OR websitename.com/shop/keyword
Intermediate & Advanced SEO | | MasonBaker0 -
301 Redirection and apostrophes in URLs
Hi I am experiencing trouble getting any redirects with apostrophes in the URLs to 301 redirect in order to eliminate 404 errors. I have tried replacing the instance of the apostrophe in the source URL field to %27 and variations of this but to no avail. The site is a wordpress site (the old URLS are legacies from the old Business Catalyst site) and I am using the redirection plug in. I have gone into some detail with a helpful soul here http://wordpress.org/support/topic/how-to-deal-with-apostrophes-in-source-url but unfortunately to no result. If anyone has any idea how to solve this puzzle I would be grateful for the help. Example: http://www.tesselaars.com/blog/Inside_Flowers/post/Online_Marketing_for_Florists_Part_1%E2%80%93_A_Website_You_Won%27t_Regret/
Intermediate & Advanced SEO | | Seamoose0 -
Strange URLs, how do I fix this?
I've just check Majestic and have seen around 50 links coming from one of my other sites. The links all look like this: http://www.dwww.mysite.com
Intermediate & Advanced SEO | | JohnPeters
http://www.eee.mysite.com
http://www.w.mysite.com The site these links are coming from is a html site. Any ideas whats going on or a way to get rid of these urls? When I visit the strange URLs such as http://www.dwww.mysite.com, it shows the home page of http://www.mysite.com. Is there a way to redirect anything like this back to the home page?0 -
Where to put a page ID in a URL?
Hello, My company is going to change URLs to example.com/category or example.com/product. When we will change the URLs to product or category pages somehow we have to check whether the requested page is from category table in DB or from products table (this gives much speed to page load time). So we have to choose how to make the different product and category pages.
Intermediate & Advanced SEO | | komeksimas
Programmers said that we need to insert id to URL. So the question is: Which is the better way to place an id to an URL? example.com/product-name?id=111 example.com/product-name/111 example.com/product_name-111 Or maybe we should use some other punctuation mark to separate id from product name? p.s. I have read Dynamic URLs vs. static URLs by Google and it still didn't answered which is the best for all of the pages. Somehow others solve this problem by typing only the names to the URL, but could anyone tell what that technology should be?0