Robots.txt: Can you put a /* wildcard in the middle of a URL?
-
We have noticed that Google is indexing the language/country directory versions of directories we have disallowed in our robots.txt.
For example:
Disallow: /images/ is blocked just fine
However, once you add our /en/uk/ directory in front of it, there are dozens of pages indexed.
The question is: can I put a wildcard in the middle of the string, e.g. /en/*/images/, or do I need to list out every single country for every language in the robots file? Does anyone know of any workarounds?
-
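Yes, wildcards work, thank god. Google treats * in a robots.txt rule as matching any sequence of characters, so Disallow: /en/*/images/ covers every country directory under /en/, and Disallow: /*/*/images/ covers every language as well.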
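To see why /en/*/images/ catches the country directories while the plain /images/ rule does not, here is a minimal Python sketch of wildcard matching. It assumes Google-style semantics (* matches any run of characters, a trailing $ anchors the end of the path, and rules otherwise match as prefixes from the start of the path). The function names robots_pattern_to_regex and is_blocked are made up for illustration and are not part of any library.

```python
import re


def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    Assumption: Google-style matching, where '*' matches any run of
    characters and a trailing '$' anchors the end of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    return re.compile(regex)


def is_blocked(path: str, disallow_pattern: str) -> bool:
    """Return True if the path matches the Disallow pattern as a prefix."""
    # re.match anchors at the start of the path, which gives prefix matching.
    return robots_pattern_to_regex(disallow_pattern).match(path) is not None


if __name__ == "__main__":
    # The plain rule only matches paths that start with /images/.
    print(is_blocked("/images/logo.png", "/images/"))              # True
    print(is_blocked("/en/uk/images/logo.png", "/images/"))        # False
    # A wildcard in the middle covers every country under /en/.
    print(is_blocked("/en/uk/images/logo.png", "/en/*/images/"))   # True
    # Two wildcards cover every language and country pair.
    print(is_blocked("/fr/ca/images/logo.png", "/*/*/images/"))    # True
```

The second check is the situation described in the question: /images/ is a prefix rule, so it never applies to /en/uk/images/. Treat this only as a way to reason about patterns; the robots.txt testing tools provided by the search engines remain the authority on how a given crawler reads your file.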
Related Questions
-
How to 301 Redirect /page.php to /page, after a RewriteRule has already made /page.php accessible by /page (Getting errors)
A site has its URLs with php extensions, like this: example.com/page.php. I used the following rewrite to remove the extension so that the page can now be accessed from example.com/page:
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^(.*)$ $1.php [L]
It works great. I can access it via the example.com/page URL. However, the problem is the page can still be accessed from example.com/page.php. Because I have external links going to the page, I want to 301 redirect example.com/page.php to example.com/page. I've tried this a couple of ways but I get redirect loops or 500 internal server errors. Is there a way to have both? Remove the extension and 301 the .php to no extension? By the way, if it matters, page.php is an actual file in the root directory (not created through another rewrite or URI routing). I'm hoping I can do this, and not just throw an example.com/page canonical tag on the page. Thanks!
Intermediate & Advanced SEO | rcseo
-
Robots.txt
Hi all, Happy New Year! I want to block certain pages on our site as they are being flagged (according to my Moz Crawl Report) as duplicate content when in fact that isn't strictly true; it is more to do with the problems faced when using a CMS...
Here are some examples of the pages I want to block, each followed by what I believe to be the correct robots.txt entry:
http://www.XYZ.com/forum/index.php?app=core&module=search&do=viewNewContent&search_app=members&search_app_filters[forums][searchInKey]=&period=today&userMode=&followedItemsOnly=
Disallow: /forum/index.php?app=core&module=search
http://www.XYZ.com/forum/index.php?app=core&module=reports&rcom=gallery&imageId=980&ctyp=image
Disallow: /forum/index.php?app=core&module=reports
http://www.XYZ.com/forum/index.php?app=forums&module=post&section=post&do=reply_post&f=146&t=741&qpid=13308
Disallow: /forum/index.php?app=forums&module=post
http://www.XYZ.com/forum/gallery/sizes/182-promenade/small/
http://www.XYZ.com/forum/gallery/sizes/182-promenade/large/
Disallow: /forum/gallery/sizes/
Any help / advice would be much appreciated. Many thanks, Andy
Intermediate & Advanced SEO | TomKing
-
URL Optimisation Dilemma
First of all, I fully appreciate that I may be over-analysing this, so feel free to say so if you think I’m going overboard on this one.
I’m currently trying to optimise the URLs for a group of new pages that we have recently launched. I would usually err on the side of leaving the URLs as they are so that any incoming links are not diluted through the 301 redirect. In this case, however, there are very few links to these pages, so I don’t think that changing URLs will harm them. My main question is between short URLs vs. long URLs (I have already read Dr. Pete’s post on this).
Note: the URLs I have listed below are not the actual URLs, but very similar examples that I have created.
The URLs currently exist in a similar format to the example below:
http://www.company.com/products/dlm/hire-ca
My first response was that we could put a few descriptive keywords in the URL, with something like the following:
http://www.company/products/debt-lifecycle-management/hire-collection-agents
I’m worried, though, that the URL will get too long for any pages sitting under this. As a compromise, I am considering the following:
http://www.company/products/dlm/hire-collection-agents
My feeling is that the second approach will give the best balance between having the keywords for the products and trying to ensure good user experience. My only concern is whether the /dlm/ category page would suffer slightly, but this would have ‘debt-lifecycle-management’ in the title tag.
Does this sound like a good approach to people? Or do you think I’m being a little obsessive about this? Any help would be appreciated 🙂
Intermediate & Advanced SEO | RG_SEO
-
301 forwarding old urls to new urls - when should you update sitemap?
Hello Mozzers, If you are amending your URLs and 301ing the old ones to new URLs, when in the process should you update your sitemap to reflect the new URLs? I have heard some suggest you should submit a new sitemap alongside the old sitemap to support indexing of the new URLs, but I've no idea whether that advice is valid or not. Thanks in advance, Luke
Intermediate & Advanced SEO | McTaggart
-
New Website Look/Structure - Should I Redirect or Update Pages w/ Quality Inbound Links
This question is regarding an ecommerce website that I hand-wrote (in HTML) in 1997. It was one of the first click-and-buy websites, with a cart/admin system that I also developed. After all this time, the old plain HTML look just doesn't cut it. I just updated to XHTML with a very modern look, and I believe the structured data will index better. All products and current category pages will keep the identical URLs from the old version. I decided to go with the switch after a manual penalty, which has since been removed... I figured now is the time to update.
My big question is that over the years, a lot of my backlinks came from products/news that are either no longer relevant or just not available. The pages do exist, but can only be found from the outbound link source. For SEO purposes, I have thought of a few things I can do but can't decide which one is the best choice. Any insight or suggestions would be awesome!
1. Redirect the old link to the most relevant page in my current catalog.
2. Add my new header/footer to the old page (this will add a navigation bar w/ brands/cats/etc).
3. Simply add a nice new image to the top of these pages linking home and update any broken/irrelevant links. I was also considering adding just the very top 2 inches of my header (logo, search box, phone, address).
*Note: some of these pages do receive some traffic. Nothing huge, but considering the 50+ pages, it adds up.
Intermediate & Advanced SEO | Southbay_Carnivorous_Plants
-
Can anyone explain this?
On Sunday 26th May, for about 40 minutes, we had about 25-30 direct visits from San Jose (we are a UK site). During this time our rankings increased dramatically and then as soon as the direct visits disappeared, our rankings went back to how they were prior to them visiting the site.
Intermediate & Advanced SEO | Jonnygeeuk
-
Best url structure
I am making a new site for a company that services many cities. I was thinking of a URL structure like this: website.com/keyword1-keyword2-keyword3/cityname1-cityname2-cityname3-cityname4-cityname5. Will this be the best approach to optimize the site for the keyword plus 5 different cities, as long as I keep the total URL length under the SEOmoz recommended 115 characters? Or would it be better to build separate pages for each city, rewording the main services to try to avoid duplicate content?
Intermediate & Advanced SEO | jlane9
-
How important is it to canonicalize mobile URLs to desktop URLs?
I know many SEOs prefer a stylesheet and single URL, but if you use m.domain.com, do you canonicalize to your desktop URLs?
Intermediate & Advanced SEO | nicole.healthline