Question about Syntax in Robots.txt
-
So if I want to block any URL that contains a particular parameter from being indexed, what is the best way to put this in the robots.txt file?
Currently I have:
Disallow: /attachment_id
where "attachment_id" is the parameter. The problem is that I still see these URLs indexed, and the rule has been in the robots.txt for over a month now. I am wondering if I should just do
Disallow: attachment_id
or
Disallow: attachment_id=
but I figured I would ask you guys first.
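To spell out the variants as I understand them (the matching rules are exactly what I'm unsure about, so treat these comments as my assumptions):

User-agent: *
Disallow: /attachment_id     # plain prefix rule - only blocks paths that literally begin with /attachment_id
Disallow: attachment_id      # probably ignored, since Disallow paths are supposed to start with /
Disallow: /*attachment_id=   # wildcard form I've seen suggested; Googlebot supports *, so this should match the parameter anywhere in the URL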
Thanks!
-
That's excellent, Chris.
Use the Remove Page function as well - it might help speed things up for you.
-Andy
-
I don't know how, but I completely forgot I could just pop those URLs into GWT and see whether they were blocked - and sure enough, Google says they are. I guess this is just a matter of waiting... Thanks much!
-
I have previously looked into both of those documents, and the issue remains that they don't exactly address how best to block parameters. I could do this through GWT, but I'm just curious about the correct and preferred syntax for robots.txt as well. I guess I could look at sites like Amazon or other big sites to see what the common practices are. Thanks though!
-
The problem is that I still see these URLs indexed, and the rule has been in the robots.txt for over a month now.
It can take Google some time to drop pages from the index - a disallow rule stops crawling, but URLs that were already indexed can linger until Google processes the change.
The best way to test whether this has worked is to hop into Webmaster Tools and use the robots.txt tester. If it shows the required pages as blocked, then you know it's just a case of waiting. You can also remove pages from within Webmaster Tools, although this isn't immediate.
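If you want to sanity-check a rule outside of Webmaster Tools, here's a minimal sketch using Python's standard-library robotparser. One big caveat: it implements the original robots.txt spec and does not understand Googlebot's * wildcard extension, so it's only reliable for plain prefix rules like Disallow: /attachment_id - for wildcard rules, Google's own tester is the authority. The domain below is a placeholder.

import urllib.robotparser

# Point the parser at the site's live robots.txt
# ("example.com" is a placeholder - swap in the real domain).
rp = urllib.robotparser.RobotFileParser()
rp.set_url("https://example.com/robots.txt")
rp.read()

# can_fetch() returns False when a Disallow rule matches the URL
# for the given user-agent ("*" = any crawler).
print(rp.can_fetch("*", "https://example.com/page?attachment_id=123"))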
-Andy
-
Hi there
Take a look at Google's resource on robots.txt, as well as Moz's. You can get all the information you need there. You can also let Google know which URLs to exclude from its crawls via Search Console.
Hope this helps! Good luck!
-
I'm not a robots.txt expert by a long shot, but I found this article - a little dated, but it explained things in terms I could understand.
https://sanzon.wordpress.com/2008/04/29/advanced-usage-of-robotstxt-w-querystrings/
There is also a feature in Google Webmaster Tools called URL Parameters that lets you control how URLs with given parameters are handled, for all sorts of reasons such as avoiding duplicate content. I haven't used it myself, but it may be worth looking into.
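For what it's worth, the query-string pattern that article describes would look something like this - a sketch only, assuming attachment_id is the parameter in question and relying on the * wildcard extension that Googlebot honours (crawlers that only follow the original spec will ignore it):

User-agent: *
# Block any URL whose path or query string contains "attachment_id="
Disallow: /*attachment_id=

# Examples of URLs this should match:
#   /gallery?attachment_id=42
#   /post/123/?attachment_id=7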