Does Google respect User-agent rules in robots.txt?
-
We want to use an inline linking tool (LinkSmart) to cross-link between a few key content types on our online news site.
LinkSmart uses a bot to establish the linking.
The issue: There are millions of pages on our site that we don't want LinkSmart to spider and process for cross linking.
LinkSmart suggested setting a noindex tag on the pages we don't want them to process, and that we target the rule to their specific user agent.
I have concerns. We don't want to inadvertently block search engine access to those millions of pages. I've seen Googlebot ignore nofollow rules set at the page level. Does it ever arbitrarily obey rules that it's been directed to ignore?
Can you quantify the level of risk in setting user-agent-specific nofollow tags on pages we want search engines to crawl, but that we want LinkSmart to ignore?
-
Does Google respect User-agent rules in robots.txt?
Yes
I've seen Googlebot ignore nofollow rules set at the page level.
Google honors nofollow rules set at the page level. The issue is that other links, on your site or elsewhere on the web, may point to the same pages, and Google will find and follow those links.
Robots.txt is the absolute last means to use for blocking pages. You should not block a page with robots.txt unless you have exhausted all other options. A more appropriate method of keeping a page out of the index is the noindex tag. If you use the tag appropriately, Google will honor the tag.
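To illustrate how user-agent-specific rules in robots.txt are scoped, here is a minimal Python sketch using the standard library's urllib.robotparser. The crawler token "LinkSmartBot", the /archive/ directory, and the example.com URLs are assumptions made up for illustration; check LinkSmart's documentation for the token their bot actually sends. This only shows how a standards-following parser reads the file, not how any particular crawler behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt. "LinkSmartBot" and "/archive/" are assumed names,
# not values taken from LinkSmart's documentation.
ROBOTS_TXT = """\
User-agent: LinkSmartBot
Disallow: /archive/

User-agent: *
Disallow:
"""


def can_crawl(user_agent: str, url: str) -> bool:
    """Return True if the rules above allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://www.example.com/archive/story-1.html"
    for agent in ("LinkSmartBot", "Googlebot"):
        print(agent, can_crawl(agent, url))
```

Under these rules the group naming LinkSmartBot applies only to that bot, while every other crawler falls through to the `*` group, where an empty Disallow permits everything.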
-
Hi,
I would advise blocking the directories that contain these files in robots.txt, rather than adding noindex tags to specific pages.
However, this would also keep Google and other search engines away from these pages, not just the LinkSmart software you are referring to.
The thing is, a noindex tag or a robots.txt block that is not scoped to a specific user agent applies to every crawler, search engines included.
So yes, there is some risk involved; you have to do things carefully in this area.
Kind Regards,
James.
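For the page-level approach LinkSmart suggested, a robots meta tag can name one crawler instead of the generic "robots", for example <meta name="linksmartbot" content="noindex">. The Python sketch below works out which directives a given crawler token would see on a page. The token "linksmartbot" is hypothetical, and whether LinkSmart's bot reads such a tag is an assumption to confirm with them. Merging generic and crawler-specific directives as a simple union is a simplification of how real crawlers resolve conflicts.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the content tokens of <meta> tags, keyed by lowercased name."""

    def __init__(self):
        super().__init__()
        self.directives = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").strip().lower()
        content = attr.get("content") or ""
        if name:
            tokens = {t.strip().lower() for t in content.split(",") if t.strip()}
            self.directives.setdefault(name, set()).update(tokens)


def directives_for(html: str, crawler_token: str) -> set:
    """Directives addressed to every crawler plus those naming crawler_token."""
    parser = RobotsMetaParser()
    parser.feed(html)
    generic = parser.directives.get("robots", set())
    specific = parser.directives.get(crawler_token.lower(), set())
    return generic | specific


# Hypothetical page: only the (assumed) LinkSmart token is told to skip it.
PAGE = (
    "<html><head>"
    '<meta name="linksmartbot" content="noindex">'
    "</head><body>Story text</body></html>"
)

if __name__ == "__main__":
    for token in ("linksmartbot", "googlebot"):
        print(token, sorted(directives_for(PAGE, token)))
```

In this sketch a page tagged only for the hypothetical LinkSmart token carries no directive for Googlebot, which is the separation the question is asking about.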