Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Robots.txt: Can you put a /* wildcard in the middle of a URL?
-
We have noticed that Google is indexing the language/country versions of directories that we have disallowed in our robots.txt.
For example:
Disallow: /images/ blocks that directory just fine.
However, once our /en/uk/ directory is added in front of it (/en/uk/images/), there are dozens of pages indexed.
The question is: can I put a wildcard in the middle of the string, e.g. /en/*/images/, or do I need to list every single country for every language in the robots.txt file? Does anyone know of any workarounds?
-
Yes, wildcards work in the middle of a path, thank god: Disallow: /en/*/images/ is a valid rule for Googlebot.
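To check a rule like this before deploying it, here is a minimal Python sketch of wildcard matching. It assumes the matching behaviour documented by Google and in RFC 9309: rules are prefix matches, * matches any run of characters, and a trailing $ anchors the end of the URL. The function names and the sample paths are made up for illustration, and other crawlers may not treat wildcards the same way.

```python
import re

def rule_to_regex(rule):
    """Translate a robots.txt path rule into a compiled regex.

    '*' becomes '.*', a trailing '$' anchors the end of the path,
    and every other character is matched literally.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    body = ".*".join(re.escape(piece) for piece in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_disallowed(rule, path):
    """Return True if the rule matches the path as a prefix."""
    return rule_to_regex(rule).match(path) is not None

if __name__ == "__main__":
    # Hypothetical paths, chosen to mirror the question above.
    paths = [
        "/images/logo.png",
        "/en/uk/images/logo.png",
        "/fr/fr/images/logo.png",
    ]
    for rule in ["/images/", "/en/*/images/", "/*/images/"]:
        for path in paths:
            print(rule, path, is_disallowed(rule, path))
```

Under these assumptions, /images/ only covers the root directory, /en/*/images/ covers every country under /en/, and /*/images/ covers every language and country prefix but not the root /images/ itself, so both rules would be needed.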
Related Questions
-
Looking for opinions on structuring meta title tags/page title/menu title/H1
Hi everyone, I am hoping a few of you can share your opinions. I have been having conversations (okay, healthy debates) about how to write and structure the meta title tag and how to complement it with the H1, page title, and menu name. To help explain the thought processes I will use a pretend keyword. How about "screwdriver"?
Case (I made this up): we are redesigning a website for a construction tools manufacturing company (pretend name: ABC Tools) targeting OEMs who are interested in purchasing large quantities of tools. The product categories (to become main menu items) are Screwdrivers, Nails, Drills, and Hammers. (Bear with me, this is just an example I am making up on the fly.)
Circling back to screwdrivers: let's say we have one landing page (a primary category page, in the main menu) listing products and great details about screwdrivers. The focus keywords are screwdriver manufacturer, screwdriver supplier, and construction screwdrivers.
Below are the questions being debated. If you are willing, how would you address these questions? And can you explain WHY?
QUESTION ONE: How would you structure the meta title tag? (Feel free to write one of your own.)
Screwdriver Manufacturer - Construction Screwdriver | ABC Tools
ABC Tools - US-based Screwdriver Manufacturer Supplier Near You
High-Quality Screwdrivers for Construction with ABC Tools
QUESTION TWO: How would you write the H1 on the page? Would it match the meta tag? Or would you write something different using the primary keyword?
QUESTION THREE: Remembering this is not a blog post (it is a primary landing page linked to the main navigation), what would the menu title be? (Remember the product categories above are how the main menu items are bucketed.)
Screwdrivers
Screwdriver Manufacturer
Typically in WordPress, the H1 and the menu title are auto-populated using the page title (not the title tag)...
So, if we use Screwdrivers as the page title but we want the H1 to match the meta title tag, would we manually change the H1? Or have the page title and title tag match, but manually change the menu item?
Intermediate & Advanced SEO | | Brenda.Haines1 -
Disallow: /jobs/? is this stopping the SERPs from indexing job posts
Hi,
I was wondering what this would be used for, as it's in the robots.txt of a recruitment agency website that posts jobs. Should it be removed?
Disallow: /jobs/?
Disallow: /jobs/page/*/
Thanks in advance.
James
Intermediate & Advanced SEO | | JamesHancocks1 -
If my website does not have a robot.txt file, does it hurt my website ranking?
After a site audit, I found out that my website doesn't have a robot.txt. Does it hurt my website rankings? One more thing: when I type mywebsite.com/robot.txt, it automatically redirects to the homepage. Please help!
Intermediate & Advanced SEO | | binhlai0 -
Replace dynamic parameter URLs with static Landing Page URL - faceted navigation
Hi there, I've got a quick question regarding faceted navigation. If a specific filter (facet) seems to be quite popular with visitors, does it make sense to replace a dynamic URL, e.g. http://www.domain.com/pants.html?a_type=239, with a static, more SEO-friendly URL, e.g. http://www.domain.com/pants/levis-pants.html, by creating a proper landing page for it? I know that it is nearly impossible to replace all variations of these parameter URLs with static ones, but does it generally make sense to do this for the most popular facets chosen by visitors? Or does this cause any issues? Any help is much appreciated. Thanks a lot in advance.
Intermediate & Advanced SEO | | ennovators0 -
Robots.txt - Do I block Bots from crawling the non-www version if I use www.site.com ?
My site is set up at http://www.site.com and I have the non-www version redirected to the www version in the .htaccess file. My question is: what should my robots.txt file look like for the non-www site? Do you block robots from crawling the site like this, or do you leave it blank?
User-agent: *
Disallow: /
Sitemap: http://www.morganlindsayphotography.com/sitemap.xml
Sitemap: http://www.morganlindsayphotography.com/video-sitemap.xml
Intermediate & Advanced SEO | | morg454540 -
Why is /home used in this company's home URL?
I'm working with a company that has chosen a home URL with /home latched on. Very strange indeed. Has anybody else come across this kind of homepage URL "decision" in the past? I can't see why on earth anybody would do this! Perhaps simply a logic-defying decision?
Intermediate & Advanced SEO | | McTaggart0 -
"noindex, follow" or "robots.txt" for thin content pages
Does anyone have any testing evidence on which is better to use for pages with thin content that are nevertheless important to keep on a website? I am referring to content shared across multiple websites (such as e-commerce, real estate, etc.). Imagine a website with 300 high-quality pages indexed and 5,000 thin product-type pages, which are pages that would not generate relevant search traffic. The question is: does the interlinking value achieved by "noindex, follow" outweigh the negative of Google having to crawl all those "noindex" pages? With robots.txt, Google's crawling focuses on just the important pages that are indexed, and that may give rankings a boost. Any experiments with insight into this would be great. I do get the story about "make the pages unique", "get customer reviews and comments", etc., but the above question is the important one here.
Intermediate & Advanced SEO | | khi50 -
Overly-Dynamic URL
Hi, we have over 5,000 pages showing under the Overly-Dynamic URL error. Our ecommerce site uses Ajax and we have several different filters, like Size, Color, and Brand, and we therefore have many different URLs, like:
http://www.dellamoda.com/Designer-Pumps.html?sort=price&sort_direction=1&use_selected_filter=Y
http://www.dellamoda.com/Designer-Accessories.html?sort=title&use_selected_filter=Y&view=all
http://www.dellamoda.com/designer-handbags.html?use_selected_filter=Y&option=manufacturer%3A&page3
Could we use the robots.txt file to disallow these from showing as duplicate content? And do we need to put the whole URL in there, like:
Disallow: /*?sort=price&sort_direction=1&use_selected_filter=Y
If not, how far into the URL should be disallowed? So far we have added the following to our robots.txt:
Disallow: /?sort=title
Disallow: /?use_selected_filter=Y
Disallow: /?sort=price
Disallow: /?clearall=Y
Just not sure if they are correct. Any help would be greatly appreciated.
Thank you, Kami
Intermediate & Advanced SEO | | dellamoda2
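As a rough way to sanity-check the rules in the Overly-Dynamic URL question, the Python sketch below compares the rules as posted with wildcard variants. It assumes Google-style matching, where a rule is a prefix match against the path plus query string and * matches any run of characters. The helper name blocked is hypothetical, and the /* variants are suggestions for comparison, not rules taken from the thread.

```python
import re

def blocked(rule, path_and_query):
    """Prefix-match a robots.txt rule, treating '*' as a wildcard."""
    regex = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.match(regex, path_and_query) is not None

# Path and query string of one URL quoted in the question.
url = "/Designer-Accessories.html?sort=title&use_selected_filter=Y&view=all"

# Rule as posted, then a wildcard variant of the same rule.
for rule in ["/?sort=title", "/*?sort=title"]:
    print(rule, blocked(rule, url))
```

Under these assumptions, a rule such as /?sort=title only matches when the query string directly follows the site root, so it misses filter URLs on category pages. The /* form matches the parameter after any path, which also means the whole URL does not need to be listed.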