Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Using 2 wildcards in the robots.txt file
-
I have a URL string which I don't want to be indexed. it includes the characters _Q1 ni the middle of the string.
So in the robots.txt can I use 2 wildcards in the string to take out all of the URLs with that in it? So something like /_Q1. Will that pickup and block every URL with those characters in the string?
Also, this is not directly of the root, but in a secondary directory, so .com/.../_Q1. So do I have to format the robots.txt as //_Q1* as it will be in the second folder or just using /_Q1 will pickup everything no matter what folder it is on?
Thanks.
-
I'm not 100% positive, however it does make sense to use it this way.
User-agent: *
Disallow: /*_Q1$
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Translating meta tags using WPML and AIO SEO
Having a heck of a time finding info on this one... We're working on a multilingual website which uses WPML. I've used the All in One SEO plugin to customize meta data (title, description, etc). These strings do not appear in the list of translations in WPML. Does anyone have any experience with this setup? How do you enable WPML to translate meta data set via the AIO plugin? Thanks!
Intermediate & Advanced SEO | | jonmc0 -
If my website do not have a robot.txt file, does it hurt my website ranking?
After a site audit, I find out that my website don't have a robot.txt. Does it hurt my website rankings? One more thing, when I type mywebsite.com/robot.txt, it automatically redirect to the homepage. Please help!
Intermediate & Advanced SEO | | binhlai0 -
Using hreflang for international pages - is this how you do it?
My client is trying to achieve a global presence in select countries, and then track traffic from their international pages in Google Analytics. The content for the international pages is pretty much the same as for USA pages, but the form and a few other details are different due to how product licensing has to be set up. I don’t want to risk losing ranking for existing USA pages due to issues like duplicate content etc. What is the best way to approach this? This is my first foray into this and I’ve been scanning the MOZ topics but a number of the conversations are going over my head,so suggestions will need to be pretty simple 🙂 Is it a case of adding hreflang code to each page and creating different URLs for tracking. For example:
Intermediate & Advanced SEO | | Caro-O
URL for USA: https://company.com/en-US/products/product-name/
URL for Canada: https://company.com/en-ca/products/product-name /
URL for German Language Content: https://company.com/de/products/product-name /
URL for rest of the world: https://company.com/en/products/product-name /1 -
Large robots.txt file
We're looking at potentially creating a robots.txt with 1450 lines in it. This will remove 100k+ pages from the crawl that are all old pages (I know, the ideal would be to delete/noindex but not viable unfortunately) Now the issue i'm thinking is that a large robots.txt will either stop the robots.txt from being followed or will slow our crawl rate down. Does anybody have any experience with a robots.txt of that size?
Intermediate & Advanced SEO | | ThomasHarvey0 -
Using disavow tool for 404s
Hey Community, Got a question about the disavow tool for you. My site is getting thousands of 404 errors from old blog/coupon/you name it sites linking to our old URL structure (which used underscores and ended in .jsp). It seems like the webmasters of these sites aren't answering back or haven't updated their sites in ages so it's returning 404 errors. If I disavow these domains and/or links will it clear out these 404 errors in Google? I read the GWT help page on it, but it didn't seem to answer this question. Feel free to ask any questions that may help you understand the issue more. Thanks for your help,
Intermediate & Advanced SEO | | IceIcebaby
-Reed0 -
Duplicate Content From Indexing of non- File Extension Page
Google somehow has indexed a page of mine without the .html extension. so they indexed www.samplepage.com/page, so I am showing duplicate content because Google also see's www.samplepage.com/page.html How can I force google or bing or whoever to only index and see the page including the .html extension? I know people are saying not to use the file extension on pages, but I want to, so please anybody...HELP!!!
Intermediate & Advanced SEO | | WebbyNabler0 -
Sitemaps. When compressed do you use the .gz file format or the (untidy looking, IMHO) .xml.gz format?
When submitting compressed sitemaps to Google I normally use the a file named sitemap.gz A customer is banging on that his web guy says that sitemap.xml.gz is a better format. Google spiders sitemap.gz just fine and in Webmaster Tools everything looks OK... Interested to know other SEOmoz Pro's preferences here and also to check I haven't made an error that is going to bite me in the ass soon! Over to you.
Intermediate & Advanced SEO | | NoisyLittleMonkey0 -
All page files in root? Or to use directories?
We have thousands of pages on our website; news articles, forum topics, download pages... etc - and at present they all reside in the root of the domain /. For example: /aosta-valley-i6816.html
Intermediate & Advanced SEO | | Peter264
/flight-sim-concorde-d1101.html
/what-is-best-addon-t3360.html We are considering moving over to a new URL system where we use directories. For example, the above URLs would be the following: /images/aosta-valley-i6816.html
/downloads/flight-sim-concorde-d1101.html
/forums/what-is-best-addon-t3360.html Would we have any benefit in using directories for SEO purposes? Would our current system perhaps mean too many files in the root / flagging as spammy? Would it be even better to use the following system which removes file endings completely and suggests each page is a directory: /images/aosta-valley/6816/
/downloads/flight-sim-concorde/1101/
/forums/what-is-best-addon/3360/ If so, what would be better: /images/aosta-valley/6816/ or /images/6816/aosta-valley/ Just looking for some clarity to our problem! Thank you for your help guys!0