Robots.txt vs noindex
-
I recently started working on a site that has thousands of member pages that are currently robots.txt'd out.
Most pages of the site have 1 to 6 links to these member pages, accumulating into what I regard as something of a link juice cul-de-sac.
The pages themselves have little to no unique content or other relevant search play, and for other reasons we still want them kept out of search.
Wouldn't it be better to "noindex, follow" these pages and remove the robots.txt block from this url type? At least that way Google could crawl these pages and pass the link juice on to still other pages vs flushing it into a black hole.
BTW, the site is currently dealing with a hit from Panda 4.0 last month.
Thanks! Best... Darcy
-
If you add the meta "noindex, follow" tag, it will keep the pages out of the SERPs but allow PageRank to flow through them to other pages.
See this interview with Matt Cutts for more info: http://www.stonetemple.com/articles/interview-matt-cutts.shtml
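
For illustration, here is a minimal Python sketch of what that tag looks like in a page's head and how its directives can be read back out. The sample markup and the names `RobotsMetaParser` and `robots_directives` are assumptions made up for this example, not anything from the site in question.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        # The content attribute is a comma-separated list of directives.
        for token in attr_map.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_directives(html):
    """Return the set of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical member page markup carrying the suggested tag.
SAMPLE_MEMBER_PAGE = """<html><head>
<title>Member profile</title>
<meta name="robots" content="noindex, follow">
</head><body>...</body></html>"""

print(sorted(robots_directives(SAMPLE_MEMBER_PAGE)))
```

A check like this could be run over a sample of member URLs after the change to confirm the tag is actually being emitted.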
-
Hi Saijo,
Thanks for the response. Do you think that would yield the benefit I'm looking for of recapturing that lost link juice?
Do you think there'd be any downside to the switcheroo from robots.txt to noindex, follow?
Best... Darcy
-
Since you said "The pages themselves have little to no unique content or other relevant search play and for other reasons still want them kept out of search," I would use the meta robots "noindex, follow" tag.
-
Hi Lesley,
Thanks for the thoughts. I don't see this as a real option for a number of reasons, not least that there are 50,000 profiles, most with very little information. The members of this site are 95% busy professionals who aren't trying to advance their careers via their profiles. So there'd be some privacy concern and the potential for tens of thousands of low-content, highly templated pages. Not really a search dream come true!
Also, converting it into a system where different levels of profile completeness are acknowledged would not really resonate with this community nor would it be near the top of our engineering priorities.
What I really want to get clear on is how best to keep them search invisible while not losing link value into a robots.txt'd black hole. Really just looking for confirmation if, with those goals, "noindex, follow" and remove from robots is the way to go. I'm pretty sure it is, but would like to hear more about that.
Thanks... Darcy
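
One point worth making concrete: a crawler can only see a noindex tag on a page it is allowed to fetch, which is why the robots.txt block has to come off for the tag to do anything. The sketch below uses Python's standard `urllib.robotparser` to compare the two setups. The `/members/` path, the example.com URL, and the rule sets are hypothetical stand-ins, and the standard-library parser only approximates how Googlebot matches rules.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_lines, url, user_agent="Googlebot"):
    """Return True if the given robots.txt lines allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(user_agent, url)


# Hypothetical member profile URL.
MEMBER_URL = "https://www.example.com/members/12345"

# Assumed current setup: member pages are blocked in robots.txt.
CURRENT_RULES = [
    "User-agent: *",
    "Disallow: /members/",
]

# Assumed proposed setup: the block is removed (an empty Disallow allows
# everything) and the pages carry a "noindex, follow" meta tag instead.
PROPOSED_RULES = [
    "User-agent: *",
    "Disallow:",
]

blocked_now = not can_crawl(CURRENT_RULES, MEMBER_URL)
crawlable_after = can_crawl(PROPOSED_RULES, MEMBER_URL)

print("Blocked under current rules:", blocked_now)
print("Crawlable under proposed rules:", crawlable_after)
```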
-
I think what I am going to say will sound like it goes against the grain, but it really doesn't. I have noticed that if you want an active community, you reward your members. Look at how Moz runs its forum: they don't really noindex the profile pages, but once you hit a certain point they pseudo-drop the nofollow from your profile link (it could be argued whether they really do). The point is to reward the members who are active.
I would set up an automatic noindex tag in the header that is driven by each user's post count. Then you can noindex all of the spammers and have prominent members shown in search. If it were me, that is how I would do it.
I have a PA of 49 on my profile in one forum I frequent, and I have seen the stats: it is regularly an entry page to the forum. Another member has a 64 on a 93 domain, and his profile is used a lot more than mine for entry as well. Think of it this way: if someone googles my name, the second result is Moz's forum (http://screencast.com/t/jIx7a4hcWV). Second-place search results still get a lot of clicks.
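
The post-count idea above could be sketched roughly as follows. This is only an illustration: the threshold of 25 posts and the names `MIN_POSTS_FOR_INDEXING` and `robots_meta_tag` are hypothetical, and a real forum would wire this into its own profile template.

```python
# Hypothetical cutoff: profiles with fewer posts than this stay out of search.
MIN_POSTS_FOR_INDEXING = 25


def robots_meta_tag(post_count, min_posts=MIN_POSTS_FOR_INDEXING):
    """Build the robots meta tag for a member profile page.

    Active members get an indexable profile; everyone else gets
    "noindex, follow" so links on the page can still be followed.
    """
    content = "index, follow" if post_count >= min_posts else "noindex, follow"
    return f'<meta name="robots" content="{content}">'


# Example: a new account versus a long-standing contributor.
print(robots_meta_tag(3))
print(robots_meta_tag(480))
```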