Does Google respect User-agent rules in robots.txt?
-
We want to use an inline linking tool (LinkSmart) to cross link between a few key content types on our online news site.
LinkSmart uses a bot to establish the linking.
The issue: There are millions of pages on our site that we don't want LinkSmart to spider and process for cross linking.
LinkSmart suggested setting a noindex tag on the pages we don't want them to process, and that we target the rule to their specific user agent.
I have concerns. We don't want to inadvertently block search engine access to those millions of pages. I've seen googlebot ignore nofollow rules set at the page level. Does it ever arbitrarily obey rules that it's been directed to ignore?
Can you quantify the level of risk in setting user-agent-specific nofollow tags on pages we want search engines to crawl, but that we want LinkSmart to ignore?
-
Does Google respect User-agent rules in robots.txt?
Yes
I've seen googlebot ignore nofollow rules set at the page level.
Google honors the nofollow rules set at the page level. The issue is there may be other links on your site or elsewhere on the web that Google will find and follow those links.
Robots.txt is the absolute last means to use for blocking pages. You should not block a page with robots.txt unless you have exhausted all other options. A more appropriate method of keeping a page out of the index is the noindex tag. If you use the tag appropriately, Google will honor the tag.
-
Hi,
I would advise to block the directories which the files sit in in robots.txt, over adding no index tags to specific pages.
Yet then this would also leave these pages to not be indexed by Google, other search engines and also this Link Smart software you are referring to.
The thing is if you add a no index tag or if you add a robots .txt block to pages it will also block all search engines too.
So yes their is some risk involved, you have to do things carefully around this area.
Kind Regards,
James.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Will Google Count Links Loaded from JavaScript Files After the Page Loads
Hi, I have a simple question. If I want to put an image with a link to another site like a banner ad on my page, but do not want it counted by Google. Can I simply load the link and banner using jQuery onload from a separate .js file? The ideal result would be for Google to index a script tag instead of a link.
On-Page Optimization | | CopBlaster.com1 -
My Site's Name Not Ranking in Google
Hey all, I've seen a few posts like this. But I wanted to start a new thread in hopes I may find the underlying issue. I've had my site: http://www.ctrl-alt-success.com for about 2 years. Recently I've started really adding a lot of content to it. (about 2-3 posts a week). I get zero organic views which is fine as I know it's still in the beginning. But here's my main question. If I type "ctrl-alt-success" into google. I get some site that shows up. "ctrlaltsuccess.com" I've been looking at this issue forever. That site has been "coming soon" for nearly 2 years. lol My site doesn't even show up on the first 10 pages of google. However in Bing and Yahoo it ranks on the first page. What could my site be doing wrong that it's not even ranking for the exact domain name? Keep in mind, if I google "ctrl-alt-success.com" my site comes up fine. Any help would be appreciated, thanks!
On-Page Optimization | | Ctrl-Alt-Success0 -
Need suggestion: Should the user profile link be disallowed in robots.txt
I maintain a myBB based forum here. The user profile links look something like this http://www.learnqtp.com/forums/User-Ankur Now in my GWT, I can see many 404 errors for user profile links. This is primarily because we have tight control over spam and auto-profiles generated by bots. Either our moderators or our spam control software delete such spammy member profiles on a periodic basis but by then Google indexes those profiles. I am wondering, would it be a good idea to disallow User profiles links using robots.txt? Something like Disallow: /forums/User-*
On-Page Optimization | | AnkurJ0 -
Why is this site 1st in Google??
Hello The site www.woodensigns.net is 1st in google for the keyword "wooden Signs". All seo indicators are poor except keyword in url; wich, i thought, was not a + for google anymore. Could someone help me to understand here? Thank you Emmanuel
On-Page Optimization | | manu450 -
Does Google look at page design
Hi everybody, At the moment i'm creating several webshops and websites with the same layout, so visitors can recognize the websites are from the same company. But i was wondering: Does google look at the layout of a webpage that it's not a copy of another website? This because loads of website have the same wordpress/joomla templates etc, or doesn't this effect rankingpositions? Thank you,
On-Page Optimization | | iwebdevnl0 -
What URL Should I use in Google Place Page?
Alright, I have a client that has 1 website and 14 locations. We want to create place pages for each of their locations but my question is which URL should I put in the place page and why? I can put in the root domain into each place page, or should I put in the URL that lands on the actual location on the root. example: domain.com/location1 Thanks!
On-Page Optimization | | tcseopro0 -
Getting pages indexed by Google
Hi SEOMoz, I relaunched a site back in February of this year (www.uniquip.com) with about 1 million URL's. Right now I'm seeing that Google is not going past 110k indexed URL's (based on sitemaps). Do you have any tips on what I can do to make the site more likeable by Google and get more indexed URL's? All the the part pages can be browsed to by going to: http://www.uniquip.com/product-line-card/suppliers/sw-a/p-1 I've tried to make the content as unique as possible by adding random testimonials and random "related part numbers" see here: http://www.uniquip.com/id/246172/electronic-components/infineon/microcontrollers-mcu/sabc161pilfca Do I need to wait more time and be more patient with Google? It just seems like I'm only getting a few thousand URL's per day at the most. Would it help me if I implemented a breadcrumb on all part pages? Thanks, -Carlos
On-Page Optimization | | caneja0