Welcome to the Q&A Forum

Browse the forum for helpful insights and fresh discussions about all things SEO.

Your browser does not seem to support JavaScript. As a result, your viewing experience will be diminished, and you have been placed in read-only mode.

Please download a browser that supports JavaScript, or enable it if it's disabled (i.e. NoScript).

Moz Q&A is closed.

After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.

Robots.txt: Can you put a /* wildcard in the middle of a URL?

Intermediate & Advanced SEO

1028

IHSwebsite last edited by

We have noticed that Google is indexing the language/country directory versions of directories we have disallowed in our robots.txt.

For example:

Disallow: /images/ is blocked just fine

However, once you add our /en/uk/ directory in front of it, there are dozens of pages indexed.

The question is: Can I put a wildcard in the middle of the string, ex. /en/*/images/, or do I need to list out every single country for every language in the robots file. Anyone know of any workarounds?
1 Reply Last reply
Reply Quote 0
irvingw last edited by

Yes, wildcards work, thank god.
1 Reply Last reply
Reply Quote 1

Got a burning SEO question?

Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.

Start my free trial

Browse Questions

View

From

Sorted by

With category

Explore more categories

Related Questions

Robots.txt & Disallow: /*? Question!

Hi, I have a site where they have: Disallow: /*? Problem is we need the following indexed: ?utm_source=google_shopping What would the best solution be? I have read: User-agent: *
Allow: ?utm_source=google_shopping
Disallow: /*? Any ideas?
Intermediate & Advanced SEO | | vetofunk

0
Mass URL changes and redirecting those old URLS to the new. What is SEO Risk and best practices?

Hello good people of the MOZ community, I am looking to do a mass edit of URLS on content pages within our sites. The way these were initially setup was to be unique by having the date in the URL which was a few years ago and can make evergreen content now seem dated. The new URLS would follow a better folder path style naming convention and would be way better URLS overall. Some examples of the **old **URLS would be https://www.inlineskates.com/Buying-Guide-for-Inline-Skates/buying-guide-9-17-2012,default,pg.html
https://www.inlineskates.com/Buying-Guide-for-Kids-Inline-Skates/buying-guide-11-13-2012,default,pg.html
https://www.inlineskates.com/Buying-Guide-for-Inline-Hockey-Skates/buying-guide-9-3-2012,default,pg.html
https://www.inlineskates.com/Buying-Guide-for-Aggressive-Skates/buying-guide-7-19-2012,default,pg.html The new URLS would look like this which would be a great improvement https://www.inlineskates.com/Learn/Buying-Guide-for-Inline-Skates,default,pg.html
https://www.inlineskates.com/Learn/Buying-Guide-for-Kids-Inline-Skates,default,pg.html
https://www.inlineskates.com/Learn/Buying-Guide-for-Inline-Hockey-Skates,default,pg.html
https://www.inlineskates.com/Learn/Buying-Guide-for-Aggressive-Skates,default,pg.html My worry is that we do rank fairly well organically for some of the content and don't want to anger the google machine. The way I would be doing the process would be to edit the URLS to the new layout, then do the redirect for them and push live. Is there a great SEO risk to doing this?
Is there a way to do a mass "Fetch as googlebot" to reindex these if I do say 50 a day? I only see the ability to do 1 URL at a time in the webmaster backend.
Is there anything else I am missing? I believe this change would overall be good in the long run but do not want to take a huge hit initially by doing something incorrectly. This would be done on 5- to a couple hundred links across various sites I manage. Thanks in advance,
Chris Gorski
Intermediate & Advanced SEO | | kirin44355

0
Redirect wordpress from /%post_id%/%postname%/ to /blog/%postname%/

Hi what is the code to redirect wordpress blog from site.com/%post_id%/%postname%/ to site.com/blog/%postname%/ We are moving the site to a new server and new url structure. Thanks in advance
Intermediate & Advanced SEO | | Taiger

0
[Very Urgent] More 100 "/search/adult-site-keywords" Crawl errors under Search Console

I just opened my G Search Console and was shocked to see more than 150 Not Found errors under Crawl errors. Mine is a Wordpress site (it's consistently updated too): Here's how they show up: Example 1: URL: www.example.com/search/adult-site-keyword/page2.html/feed/rss2 Linked From: http://an-adult-image-hosting.com/search/adult-site-keyword/page2.html Example 2 (this surprised me the most when I looked at the linked from data): URL: www.example.com/search/adult-site-keyword-2.html/page/3/ Linked From: www.example.com/search/adult-site-keyword-2.html/page/2/ (this is showing as if it's from our own site) http://a-spammy-adult-site.com/search/adult-site-keyword-2.html Example 3: URL: www.example.com/search/adult-site-keyword-3.html Linked From: http://an-adult-image-hosting.com/search/adult-site-keyword-3.html How do I address this issue?
Intermediate & Advanced SEO | | rmehta1

0
What do you add to your robots.txt on your ecommerce sites?

We're looking at expanding our robots.txt, we currently don't have the ability to noindex/nofollow. We're thinking about adding the following: Checkout Basket Then possibly: Price Theme Sortby other misc filters. What do you include?
Intermediate & Advanced SEO | | ThomasHarvey

0
Should I include URLs that are 301'd or only include 200 status URLs in my sitemap.xml?

I'm not sure if I should be including old URLs (content) that are being redirected (301) to new URLs (content) in my sitemap.xml. Does anyone know if it is best to include or leave out 301ed URLs in a xml sitemap?
Intermediate & Advanced SEO | | Jonathan.Smith

0
Wildcarding Robots.txt for Particular Word in URL

Hey All, So I know that this isn't a standard robots.txt, I'm aware of how to block or wildcard certain folders but I'm wondering whether it's possible to block all URL's with a certain word in it? We have a client that was hacked a year ago and now they want us to help remove some of the pages that were being autogenerated with the word "viagra" in it. I saw this article and tried implementing it https://builtvisible.com/wildcards-in-robots-txt/ and it seems that I've been able to remove some of the URL's (although I can't confirm yet until I do a full pull of the SERPs on the domain). However, when I test certain URL's inside of WMT it still says that they are allowed which makes me think that it's not working fully or working at all. In this case these are the lines I've added to the robots.txt Disallow: /*&viagra Disallow: /*&Viagra I know I have the solution of individually requesting URL's to be removed from the index but I want to see if anybody has every had success with wildcarding URL's with a certain word in their robots.txt? The individual URL route could be very tedious. Thanks! Jon
Intermediate & Advanced SEO | | EvansHunt

0
Weird 404 URL Problem - domain name being placed at end of urls

Hey there. For some reason when doing crawl tests I'm finding pages with the domain name being tacked on the end and causing 404 errors.
For example: http://domainname.com/page-name/http://domainname.com This is happening to all pages, posts and even category type 1. Site is in Wordpress
2. Using Yoast SEO plugin Any suggestions? Thanks!
Intermediate & Advanced SEO | | Jay328

0

Welcome to the Q&A Forum

Browse the forum for helpful insights and fresh discussions about all things SEO.

Moz Q&A is closed.

Robots.txt: Can you put a /* wildcard in the middle of a URL?

Got a burning SEO question?

Browse Questions

Explore more categories

Related Questions

Robots.txt & Disallow: /*? Question!

Mass URL changes and redirecting those old URLS to the new. What is SEO Risk and best practices?

Redirect wordpress from /%post_id%/%postname%/ to /blog/%postname%/

[Very Urgent] More 100 "/search/adult-site-keywords" Crawl errors under Search Console

What do you add to your robots.txt on your ecommerce sites?

Should I include URLs that are 301'd or only include 200 status URLs in my sitemap.xml?

Wildcarding Robots.txt for Particular Word in URL

Weird 404 URL Problem - domain name being placed at end of urls

Products

Moz Solutions

Free SEO Tools

Resources

About Moz

Why Moz

Get Involved