Best practice for disallowing URLs with robots.txt
-
Hi Everybody,
We are currently trying to tidy up the crawl errors that appear when we crawl the site. At first glance we were very worried, to say the least: 17,000+ errors. On closer inspection of the report, we found that the majority of these errors are caused by bad URLs featuring:
- Currency - For example: "directory/currency/switch/currency/GBP/uenc/aHR0cDovL2NlbnR1cnlzYWZldHkuY29tL3dvcmt3ZWFyP3ByaWNlPTUwLSZzdGFuZGFyZHM9NzEx/"
- Color - For example: "?color=91"
- Price - For example: "?price=650-700"
- Order - For example: "?dir=desc&order=most_popular"
- Page - For example: "?p=1&standards=704"
- Login - For example: "customer/account/login/referer/aHR0cDovL2NlbnR1cnlzYWZldHkuY29tL2NhdGFsb2cvcHJvZHVjdC92aWV3L2lkLzQ1ODczLyNyZXZpZXctZm9ybQ,,/"
My question, as a novice with robots.txt: what would be the best practice for stopping URLs like these from being crawled?
Any advice would be appreciated!
-
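If you are looking to disallow URL parameters, you could use something like the following as a convention.
Disallow: /*? blocks every URL that contains a query string (a plain Disallow: /? only matches query strings on the homepage, because rules match from the start of the path). To be more precise, add one rule per parameter, for example Disallow: /*?dir=, Disallow: /*?order= and Disallow: /*?p=, plus the /*&dir= style variants for when the parameter is not the first one in the query string. The * wildcard is supported by Google and Bing, though not necessarily by every crawler. There have been a few Moz questions of this type over the last few years, if you do decide to block the parameters.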
Also try to ensure that the product pages you have listed are well canonicalised and point to the original product. A good review of how to do this can be found here. In most cases this will be enough to remove any indexation/duplicate issues.
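To spot-check the canonical tags, something like the rough Python sketch below could be run against the HTML of a filtered page. It only uses the standard library; the example.com URLs and the sample markup are made up for illustration, and fetching the page is left out on purpose.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> in a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Only <link> tags matter, and only the first canonical is kept.
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels:
            self.canonical = attr.get("href")


def find_canonical(html: str):
    """Return the canonical URL declared in the HTML, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical markup for a filtered URL such as /some-category?color=91
sample_page = (
    "<html><head>"
    '<link rel="canonical" href="https://example.com/some-category">'
    "</head><body></body></html>"
)

print(find_canonical(sample_page))
```

If the filtered page reports the clean category or product URL as its canonical, the tag is doing its job; a missing value or a self-referencing parameter URL is worth a closer look.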
-
First, I assume you have Webmaster Tools set up?
It has a robots.txt tester tool that you can use to try out different parameters and make sure you get the syntax right. For example, color would be blocked by Disallow: /*?color= (rules match by prefix, so no trailing * is needed), and the other parameters follow more or less the same format.
If you are confused, I highly recommend reading through Moz's robots.txt best practices guide before you make any changes. Be sure to test everything in Webmaster Tools (Search Console) > robots.txt Tester.
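If it helps to sanity-check candidate rules before pasting them into the tester, here is a minimal Python sketch of Google-style pattern matching (* matches any run of characters, a trailing $ anchors the end of the URL). Everything in it is an assumption for illustration: the example.com host, the /some-category path, the shortened encoded segments, and the rule list itself, which also assumes the store lives at the site root. It only handles Disallow lines and ignores Allow rules and rule precedence, so treat the Search Console tester as the real authority.

```python
import re
from urllib.parse import urlsplit


def rule_to_regex(rule: str) -> re.Pattern:
    """Turn a robots.txt path pattern into a regex ('*' wildcard, optional trailing '$')."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces and join them with '.*' where the rule had '*'.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(url: str, disallow_rules: list[str]) -> bool:
    """True if the URL's path plus query string matches any Disallow pattern."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # re.match anchors at the start, which mirrors robots.txt prefix matching.
    return any(rule_to_regex(rule).match(target) for rule in disallow_rules)


# Hypothetical rule set covering the parameter types from the question.
DISALLOW_RULES = [
    "/*?color=", "/*&color=",
    "/*?price=", "/*&price=",
    "/*?dir=", "/*&dir=",
    "/*?order=", "/*&order=",
    "/*?p=", "/*&p=",
    "/directory/currency/switch/",
    "/customer/account/login/",
]

# Hypothetical URLs modelled on the examples in the question.
SAMPLE_URLS = [
    "https://example.com/some-category?color=91",
    "https://example.com/some-category?price=650-700",
    "https://example.com/some-category?dir=desc&order=most_popular",
    "https://example.com/some-category?p=1&standards=704",
    "https://example.com/directory/currency/switch/currency/GBP/uenc/abc/",
    "https://example.com/customer/account/login/referer/abc/",
    "https://example.com/some-category",
    "https://example.com/some-category?step=2",
]

for url in SAMPLE_URLS:
    status = "blocked" if is_blocked(url, DISALLOW_RULES) else "allowed"
    print(f"{status:8} {url}")
```

Calling is_blocked(url, ["/?"]) on the same samples is a quick way to see that a rule starting with /? only catches query strings on the homepage, which is why the patterns above begin with /*.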
Let me know if you run into any problems.