Robots.txt: Syntax URL to disallow
-
Did someone ever experience some "collateral damages" when it's about "disallowing" some URLs?
Some old URLs are still present on our website and while we are "cleaning" them off the site (which takes time), I would like to to avoid their indexation through the robots.txt file.
The old URLs syntax is "/brand//13" while the new ones are "/brand/samsung/13." (note that there is 2 slash on the URL after the word "brand")
Do I risk to erase from the SERPs the new good URLs if I add to the robots.txt file the line "Disallow: /brand//" ?
I don't think so, but thank you to everyone who will be able to help me to clear this out
-
You could inadvertently block /brand/ altogether. Just because you use a // doesn't mean Google follows the same rules when crawling.
-
"I wouldn't risk telling a spider to ignore /brand// because it might have adverse results."
Which adverse results could be expected?
-
(because of the 404 error pages being constantly found in our pages)
Think of it this way:
Which is better? Re-routing traffic when it's congested or putting up a road block to back up even more traffic?Yes, it's more work to do the 301 redirects but if you have "pages being constantly found" you should give instructions to spiders to take the different path.
Now, if you are talking about an error such as:
/brand//samsung/13 SHOULD go to
/brand/samsung/13
Then you could EASILY solve this with HTACCESS redirects. I wouldn't risk telling a spider to ignore /brand// because it might have adverse results. -
Hi guys,
Thank you for your answers
I understand (and agree) with your SEO point of view (301 redirection) but I should have mentioned that these old URLs are leading to a 404 error page for a long time now, we are not considering anymore their SEO strength anymore...
My goal right now is to find a quick and simple way to tell search engines to not consider this type of old URLs (because of the 404 error pages being constantly found in our pages) : doing the 301 redirection to the right page would be a bit more complex at the moment.
So: do you think there is a risk that the second slash won't be "considered" in the robots.txt about the "disallow" line I want to add ? (= do search engines will stop to crawl URLs like "/brand/samsung/13" if I add the line "Disallow: /brand//" ?)
-
I'll further what Highland and Alex Chan are telling you. If you are using Apache (Linux) then you can redirect your old site links using a 301 redirect and .htaccess which is a very powerful tool. Otherwise, if you are using a IIS server, web.config is what you want to use.
A really good resource for .htassess is CSS-Tricks: http://css-tricks.com/snippets/htaccess/301-redirects/
-
Yup like Highland mentioned, using your robots.txt for this isn't a good idea. The robots.txt file isn't guaranteed to work anyway. The only sure fire way to get it working is to move all the URLs from the old structure to the new one, then 301 all the old URLs into the new URLs. The 301 minimizes loss to your SEO.
-
You really don't need a robots for that. I would either 301 the old URL (preferred) or have the old URL return a 404. Both will cause the old URL to be removed from the index. A robots nofollow simply leaves it up but tells the robots not to crawl it.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Robots.txt wildcards - the devs had a disagreement - which is correct?
Hi – the lead website developer was assuming that this wildcard: Disallow: /shirts/?* would block URLs including a ? within this directory, and all the subdirectories of this directory that included a “?” The second developer suggested that this wildcard would only block URLs featuring a ? that come immediately after /shirts/ - for example: /shirts?minprice=10&maxprice=20 BUT argued that this robots.txt directive would not block URLS featuring a ? in sub directories - e.g. /shirts/blue?mprice=100&maxp=20 So which of the developers is correct? Beyond that, I assumed that the ? should feature a * on each side of it – for example - /? - to work as intended above? Am I correct in assuming that?
Intermediate & Advanced SEO | | McTaggart0 -
Keyword in URL - SEO impact
Hi, We don't have most important keyword of our industry in our domain or sub-domain. How important it is to have keyword in website URL? Most of our competitors pages with "keyword" urls been listing in SERP. What is back-links role in this scenarion? And which URL have more advantage? keyword in sub-domain or page with keyword. Like for "seo" keyword..... seo.example.com or example.com/seo
Intermediate & Advanced SEO | | vtmoz0 -
Client wants to remove mobile URLs from their sitemap to avoid indexing issues. However this will require SEVERAL billing hours. Is having both mobile/desktop URLs in a sitemap really that detrimental to search indexing?
We had an enterprise client ask to remove mobile URLs from their sitemaps. For their website both desktop & mobile URLs are combined into one sitemap. Their website has a mobile template (not a responsive website) and is configured properly via Google's "separate URL" guidelines. Our client is referencing a statement made from John Mueller that having both mobile & desktop sitemaps can be problematic for indexing. Here is the article https://www.seroundtable.com/google-mobile-sitemaps-20137.html
Intermediate & Advanced SEO | | RosemaryB
We would be happy to remove the mobile URLs from their sitemap. However this will unfortunately take several billing hours for our development team to implement and QA. This will end up costing our client a great deal of money when the task is completed. Is it worth it to remove the mobile URLs from their main website to be in adherence to John Mueller's advice? We don't believe these extra mobile URLs are harming their search indexing. However we can't find any sources to explain otherwise. Any advice would be appreciated. Thx.0 -
Robots.txt Disallowed Pages and Still Indexed
Alright, I am pretty sure I know the answer is "Nothing more I can do here." but I just wanted to double check. It relates to the robots.txt file and that pesky "A description for this result is not available because of this site's robots.txt". Typically people want the URL indexed and the normal Meta Description to be displayed but I don't want the link there at all. I purposefully am trying to robots that stuff outta there.
Intermediate & Advanced SEO | | DRSearchEngOpt
My question is, has anybody tried to get a page taken out of the Index and had this happen; URL still there but pesky robots.txt message for meta description? Were you able to get the URL to no longer show up or did you just live with this? Thanks folks, you are always great!0 -
Optimal URLs for SEO and UX
We are considering restructuring the URL scheme on one of the websites we maintain. We have a few options. Currently news article URLs are as follows:
Intermediate & Advanced SEO | | Peter264
http://domain.com/news/1234/article-title-name/ Download section URLs are as follows:
http://domain.com/downloads/files/1234/file-title-of-download-here/ Forum URLS:
http://forum.domain.com/forum/topic/1234/title-of-forum-topic-here/ We feel that these are a bit too long for both SEO and user experience. We want to remove as many directories from the URLs as possible. From experience, what do you recommend changing for the example URLs above? We have some ideas below...and we need to keep the ID in the URLs...however I know this is a little frustrating. Some ideas we have for news articles:
http://domain.com/news/article-title-shorter-1234
http://domain.com/article-title-shorter-n1234 Some ideas for the download pages:
http://domain.com/downloads/file-title-shorter-d1234
http://domain.com/downloads/files/file-title-shorter-1234
http://domain.com/file-title-shorter-d1234 Some ideas for the forum URLs:
http://forum.domain.com/topic-title-shorter-t1234
http://forum.domain.com/topic/topic-title-shorter-1234 What do you think of these suggestions? Any other URL ideas? Recommended URL length? The purpose of is question was to find the perfect URLs for the site we are working on; your thoughts, suggestions and tips are very much appreciated.0 -
Should I change wordpress urls?
Should I change my wordpress permalinks to include the keyword? For examples at the minute my url is http://www.musicliveuk.com/home/wedding-singer. Is it better to be http://www.musicliveuk.com/live-bands/wedding-singer. 'home' is not relevant so surely 'live-bands' would be better? If I change the urls won't I lose 'link juice' as external links will all point to a url that no longer exists? Or will wordpress automatically redirect the old url to the new one? Finally, if I should change the url as described how do I do it on wordpress? I can only see how to edit the last bit of the url and not the middle bit.
Intermediate & Advanced SEO | | SamCUK0 -
Exact keyword URL or not?
Hi all, I have a quick question about the proper use of permalinks. Let's say that I have a website about sports and I want to create an internal page dedicated to shoes. I know that the keyword "shoe" has 15.000 monthly visits, while the keyword "shoes" has 1.000 monthly visits. How do I have to name the internal page? http://www.example.com/shoe or http://www.example.com/shoes (with a final 's')? I would think that by naming the URL http://www.example.com/shoes, the search engine would consider that page for the keywords "shoe" and "shoes", but I am not sure about it. Should I create a URL that only focuses on one specific keyword ("shoe", in this example) or a URL that may encompass more than one keyword ("shoe" and "shoes")? I hope this is clear. Thank you for your time and help. All best, Sal
Intermediate & Advanced SEO | | salvyy0 -
Effect of URL change on Website
Hello we are developers and we have just created a new webpage for a client of us. The problem is that we can not replace the old one by the new one, cause our client has developed over 15 satellite pages that calls directly to the code of the old page. If we completly remove the old page we will make those 15 pages go down. Those pages are working over domains specially register for SEO reasons. For example Main page is www.euroair.es Satellite page is www.aireacondicionadodaikin.com Satellite page has pretty good ranking for search term "aire acondicionado daikin" As I told you, we have a new page but we can not make the page work over root domain. So we thought we could make it work over www.euroair.es/es, and make a redirection 301 of homepage and another important inner pages. We chose "/es" folder because it seems like a language folder, but we are not very sure of the effects of pages working on that folder instead of working on root directory. What do you think? Is this matter important or doesn't? Thanks
Intermediate & Advanced SEO | | teconsite.com0