Does Google respect User-agent rules in robots.txt?

lzhao

We want to use an inline linking tool (LinkSmart) to cross-link a few key content types on our online news site.

LinkSmart uses a bot to establish the linking.

The issue: There are millions of pages on our site that we don't want LinkSmart to spider and process for cross linking.

LinkSmart suggested setting a noindex tag on the pages we don't want them to process, and that we target the rule to their specific user agent.

I have concerns. We don't want to inadvertently block search engine access to those millions of pages. I've seen Googlebot ignore nofollow rules set at the page level. Does it ever arbitrarily obey rules that it has been directed to ignore?

Can you quantify the level of risk in setting user-agent-specific nofollow tags on pages we want search engines to crawl, but that we want LinkSmart to ignore?

RyanKent

"Does Google respect User-agent rules in robots.txt?"

Yes

"I've seen Googlebot ignore nofollow rules set at the page level."

Google honors nofollow rules set at the page level. The issue is that there may be other links, on your site or elsewhere on the web, that point to the same URLs, and Google will find and follow those.

Robots.txt should be the last resort for blocking pages. You should not block a page with robots.txt unless you have exhausted all other options. A more appropriate method of keeping a page out of the index is the noindex tag. If you use the tag appropriately, Google will honor it.
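To make the "use the tag appropriately" point concrete, here is a minimal sketch of how a noindex tag can be scoped to one crawler through the meta tag's name attribute. Google documents that it reads meta tags named "robots" and "googlebot"; the "linksmartbot" token below is purely an assumption for illustration, and whether LinkSmart's bot reads a tag under that name is something to confirm with the vendor. The helper directives_for is also hypothetical.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collects the content of every <meta name="..." content="..."> tag."""

    def __init__(self):
        super().__init__()
        self.metas = {}  # lowercased meta name -> list of content strings

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").strip().lower()
        content = attr.get("content") or ""
        if name and content:
            self.metas.setdefault(name, []).append(content)


def directives_for(html: str, crawler_token: str) -> set[str]:
    """Return the directives a crawler sees: the generic "robots" tag
    plus any tag named after the crawler's own token."""
    parser = MetaCollector()
    parser.feed(html)
    found = set()
    for name in ("robots", crawler_token.lower()):
        for content in parser.metas.get(name, []):
            found.update(d.strip().lower() for d in content.split(",") if d.strip())
    return found


# Hypothetical page: the noindex rule is addressed only to "linksmartbot"
# (an assumed token), not to the generic "robots" name.
PAGE = (
    '<html><head>'
    '<meta name="linksmartbot" content="noindex">'
    '<meta name="description" content="Local news">'
    '</head><body></body></html>'
)

print(directives_for(PAGE, "LinkSmartBot"))
print(directives_for(PAGE, "Googlebot"))
```

With the tag named after the tool's bot rather than "robots", Googlebot has no directive addressed to it on this page, so the risk comes down to whether the tag name is spelled and scoped correctly.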

JamesNorquay

Hi,

I would advise blocking the directories that contain the files in robots.txt, rather than adding noindex tags to specific pages.

However, a blanket block would stop these pages from being indexed by Google and other search engines, as well as by the LinkSmart software you are referring to.

The thing is, if you add a noindex tag or a robots.txt block to pages without scoping it to a specific user agent, it will block all search engines too.

So yes, there is some risk involved, and you have to do things carefully in this area.
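One way to be careful here is to test the rules before deploying them. The following is a rough sketch using Python's standard urllib.robotparser; the "LinkSmartBot" token, the /archive/ directory, and the example.com URLs are all assumptions for illustration, and Python's parser only approximates how any real crawler reads the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: "LinkSmartBot" is an assumed user-agent token and
# "/archive/" an assumed directory. Use the token the vendor actually documents.
ROBOTS_TXT = """\
User-agent: LinkSmartBot
Disallow: /archive/

User-agent: *
Disallow:
"""


def allowed(user_agent: str, url: str) -> bool:
    """Return True if the rules above let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


archive_url = "https://example.com/archive/2011/story.html"
news_url = "https://example.com/news/story.html"

for agent in ("LinkSmartBot", "Googlebot"):
    for url in (archive_url, news_url):
        print(agent, url, allowed(agent, url))
```

In this sketch only the named bot is kept out of /archive/, while every other crawler falls through to the empty Disallow rule and stays unblocked.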

Kind Regards,

James.
