Does Google respect User-agent rules in robots.txt?
-
We want to use an inline linking tool (LinkSmart) to cross-link between a few key content types on our online news site.
LinkSmart uses a bot to establish the linking.
The issue: There are millions of pages on our site that we don't want LinkSmart to spider and process for cross linking.
LinkSmart suggested setting a noindex tag on the pages we don't want them to process, and that we target the rule to their specific user agent.
I have concerns. We don't want to inadvertently block search engine access to those millions of pages. I've seen Googlebot ignore nofollow rules set at the page level. Does it ever arbitrarily obey rules that it's been directed to ignore, such as rules addressed to a different user agent?
Can you quantify the level of risk in setting user-agent-specific nofollow tags on pages we want search engines to crawl, but that we want LinkSmart to ignore?
-
Does Google respect User-agent rules in robots.txt?
Yes
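As a rough illustration of how user-agent groups in robots.txt are scoped, the sketch below uses Python's standard urllib.robotparser to evaluate a sample file. The token LinkSmartBot, the /archive/ directory, and the example.com URLs are assumptions made up for this example; check LinkSmart's documentation for the real user-agent token. The parser only models a compliant crawler, so this is not a guarantee of Googlebot's behavior, but it shows the intended semantics: a group addressed to one agent does not apply to the others.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: "LinkSmartBot" and "/archive/" are placeholders,
# not values confirmed by LinkSmart or by the original question.
ROBOTS_TXT = """\
User-agent: LinkSmartBot
Disallow: /archive/

User-agent: *
Allow: /
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a compliant crawler with this user agent may fetch url."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines; no network access.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("LinkSmartBot", "Googlebot"):
        for path in ("/archive/story.html", "/news/story.html"):
            url = "https://example.com" + path
            print(agent, path, allowed(agent, url))
```

In this sample only the LinkSmartBot group carries the Disallow rule, so every other crawler falls through to the `User-agent: *` group and keeps full access.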
I've seen googlebot ignore nofollow rules set at the page level.
Google honors nofollow rules set at the page level. The issue is that there may be other links to the same pages, on your site or elsewhere on the web, and Google will find and follow those.
Robots.txt should be the absolute last resort for blocking pages. You should not block a page with robots.txt unless you have exhausted all other options. A more appropriate method of keeping a page out of the index is the noindex tag. If you use the tag appropriately, Google will honor it.
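For the page-level approach, a robots meta tag can be addressed to a single crawler by putting that crawler's token in the name attribute instead of the generic "robots". The sketch below, using only Python's standard html.parser, shows how a compliant crawler might work out which directives apply to it. The token "linksmartbot" is a hypothetical placeholder, and whether LinkSmart's bot actually reads a tag addressed this way is something to confirm with them. Merging the generic and crawler-specific directives into one set is a simplification made for this example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the content of <meta name=... content=...> tags by name."""

    def __init__(self):
        super().__init__()
        self.directives = {}  # e.g. {"robots": {"noindex", "nofollow"}}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        name = attr.get("name", "").strip().lower()
        content = attr.get("content", "")
        if not name or not content:
            return
        values = {v.strip().lower() for v in content.split(",") if v.strip()}
        self.directives.setdefault(name, set()).update(values)


def directives_for(html: str, crawler_token: str) -> set:
    """Directives a compliant crawler with this token would apply."""
    parser = RobotsMetaParser()
    parser.feed(html)
    generic = parser.directives.get("robots", set())
    specific = parser.directives.get(crawler_token.lower(), set())
    return generic | specific


# Hypothetical page: the tag is addressed to "linksmartbot" only.
PAGE = (
    '<html><head><meta name="linksmartbot" content="noindex">'
    "</head><body></body></html>"
)

if __name__ == "__main__":
    print(directives_for(PAGE, "linksmartbot"))
    print(directives_for(PAGE, "googlebot"))
```

Under these assumptions the tag gives the hypothetical LinkSmart token a noindex directive and gives Googlebot nothing, since the page carries no generic "robots" tag.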
-
Hi,
I would advise blocking the directories that contain these files in robots.txt rather than adding noindex tags to individual pages.
Be aware, though, that a rule not limited to one user agent would keep these pages away from Google and other search engines as well as from the LinkSmart software you mention.
A noindex tag or a robots.txt block that is addressed to all crawlers blocks all search engines too.
So yes, there is some risk involved, and you have to handle this area carefully.
Kind Regards,
James.