Robots.txt advice
-
Hey Guys,
Have you ever seen code like this in a robots.txt? I have never seen a Noindex rule in a robots.txt file before. Have you?
user-agent: AhrefsBot
User-agent: trovitBot
User-agent: Nutch
User-agent: Baiduspider
Disallow: /
User-agent: *
Disallow: /WebServices/
Disallow: /*?notfound=
Disallow: /?list=
Noindex: /?*list=
Noindex: /local/
Disallow: /local/
Noindex: /handle/
Disallow: /handle/
Noindex: /Handle/
Disallow: /Handle/
Noindex: /localsites/
Disallow: /localsites/
Noindex: /search/
Disallow: /search/
Noindex: /Search/
Disallow: /Search/
Disallow: ?
Any pointers?
-
I've never seen this, and I doubt it's of any use, as it isn't one of the statements any search engine recommends. I don't think it would have any impact on what search engine robots look at, since it's not a directive in the robots.txt documentation.
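For what it's worth, here is a quick way to check how one standards-following parser treats such a line. This is only a minimal sketch using Python's standard-library `urllib.robotparser`; the shortened rule set, the `ExampleBot` agent name, the example.com URLs and the `is_allowed` helper are all made up for illustration. It shows what this one parser does, not how any particular search engine behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical, shortened rule set: /local/ has only a Noindex line,
# /search/ has only a Disallow line.
ROBOTS_TXT = """\
User-agent: *
Noindex: /local/
Disallow: /search/
"""


def is_allowed(robots_txt: str, url: str, agent: str = "ExampleBot") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines; directives it does not
    # recognise (such as Noindex) are skipped.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # The Noindex line does not restrict crawling in this parser.
    print(is_allowed(ROBOTS_TXT, "https://example.com/local/page"))
    # The Disallow line does.
    print(is_allowed(ROBOTS_TXT, "https://example.com/search/page"))
```

In this parser the Noindex line is simply dropped, so only the Disallow rule changes the result.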
-
The best I could find was this:
"Unlike disallowed pages, noindexed pages don’t end up in the index and therefore won’t show in search results. Combine both in robots.txt to optimise your crawl efficiency: the noindex will stop the page showing in search results, and the disallow will stop it being crawled"
From: https://www.deepcrawl.com/blog/best-practice/robots-txt-noindex-the-best-kept-secret-in-seo/