Question about Syntax in Robots.txt
-
So if I want to block any URL that contains a particular parameter from being indexed, what is the best way to put this in the robots.txt file?
Currently I have:
Disallow: /attachment_id
Where "attachment_id" is the parameter. The problem is I still see these URLs indexed, and this has been in the robots.txt for over a month now. I am wondering if I should just do
Disallow: attachment_id or Disallow: attachment_id= but figured I would ask you guys first.
Thanks!
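For reference, Google's crawler supports * wildcards in robots.txt, so a rule that blocks a query-string parameter anywhere in a URL is usually written along these lines (a sketch only; "attachment_id" is taken from the question above, and the exact rules should be verified against your own URLs):

```
User-agent: *
Disallow: /*?attachment_id=
Disallow: /*&attachment_id=
```

The second line catches URLs where the parameter is not the first one in the query string. Note that wildcard support is a Google/Bing extension; a plain Disallow path is otherwise treated as a simple prefix match.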
-
That's excellent Chris.
Use the Remove Page function as well - it might help speed things up for you.
-Andy
-
I don't know how, but I completely forgot I could just pop those URLs into GWT and see if they were blocked or not, and sure enough, Google says they are. I guess this is just a matter of waiting. Thanks much!
-
I have previously looked into both of those documents, and the issue remains that they don't exactly address how best to block parameters. I could do this through GWT, but I'm just curious about the correct and preferred syntax for the robots.txt as well. I guess I could just look at sites like Amazon or other big sites to see what the common practices are. Thanks though!
-
The problem is I still see these URLs indexed, and this has been in the robots.txt for over a month now. I am wondering if I should just do
It can take Google some time to remove pages from the index.
The best way to test if this has worked is hop into Webmaster Tools and use the Test Robots.txt function. If it has blocked the required pages, then you know it's just a case of waiting - you can also remove pages from within Webmaster Tools as well, although this isn't immediate.
-Andy
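The Webmaster Tools check Andy describes can also be approximated locally with Python's standard-library robots.txt parser. This is only a sketch with hypothetical URLs, and note an important caveat: urllib.robotparser implements the original prefix-matching spec and does not understand Google's * wildcards, so it only approximates how Googlebot reads the file.

```python
# Rough local check of whether a robots.txt rule blocks a URL.
# The rule and the example.com URLs are illustrative, not from a real site.
from urllib import robotparser

rp = robotparser.RobotFileParser()
# parse() accepts the file's lines directly, so no network fetch is needed.
rp.parse([
    "User-agent: *",
    "Disallow: /attachment_id",  # prefix match: blocks any path starting with this
])

print(rp.can_fetch("*", "http://example.com/attachment_id=99"))  # → False (blocked)
print(rp.can_fetch("*", "http://example.com/gallery"))           # → True (allowed)
```

Because this parser matches plain prefixes only, it is a sanity check on basic rules, not a substitute for Google's own tester when wildcard directives are involved.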
-
Hi there
Take a look at Google's resource on robots.txt, as well as Moz's. You can find all the information you need there. You can also tell Google which URLs to exclude from its crawls via Search Console.
Hope this helps! Good luck!
-
I'm not a robots.txt expert by a long shot, but I found this, which is a little dated, but which explained it to me in terms I could understand.
https://sanzon.wordpress.com/2008/04/29/advanced-usage-of-robotstxt-w-querystrings/
There is also a feature in Google Webmaster Tools called URL Parameters that lets you block URLs with set parameters for all sorts of reasons, such as avoiding duplicate content. I haven't used it myself, but it may be worth looking into.