Robots.txt vs noindex
-
I recently started working on a site that has thousands of member pages that are currently robots.txt'd out.
Most pages of the site have 1 to 6 links to these member pages, accumulating into what I regard as something of a link juice cul-de-sac.
The pages themselves have little to no unique content or other relevant search play, and for other reasons we still want them kept out of search.
Wouldn't it be better to "noindex, follow" these pages and remove the robots.txt block from this url type? At least that way Google could crawl these pages and pass the link juice on to still other pages vs flushing it into a black hole.
BTW, the site is currently dealing with a hit from Panda 4.0 last month.
Thanks! Best... Darcy
-
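If you add the meta "noindex, follow" tag, it will keep the pages out of the SERPs but allow PageRank to flow through them to other pages. Google has to be able to crawl a page to see the tag, so the robots.txt block on that URL type needs to come off at the same time.
See this interview with Matt Cutts for more info: http://www.stonetemple.com/articles/interview-matt-cutts.shtml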
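To make the crawl-versus-index distinction concrete, here is a minimal sketch in Python (standard library only) that checks the two conditions separately: whether robots.txt lets a crawler fetch a URL, and whether the page's HTML carries a meta robots tag with the directives you expect. The helper names, the `/members/` path, and the example.com URLs are made up for illustration and are not taken from Darcy's site.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class MetaRobotsParser(HTMLParser):
    """Collects the directives found in any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for part in attr.get("content", "").split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def crawl_allowed(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


def meta_directives(html: str) -> set:
    """Return the set of meta robots directives present in the HTML."""
    parser = MetaRobotsParser()
    parser.feed(html)
    return parser.directives


# Hypothetical "before" state: member pages blocked in robots.txt.
blocked_robots = "User-agent: *\nDisallow: /members/\n"
# Hypothetical "after" state: block removed, tag added to the page template.
open_robots = ""
member_page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
member_url = "https://example.com/members/123"

print(crawl_allowed(blocked_robots, member_url))
print(crawl_allowed(open_robots, member_url))
print(meta_directives(member_page))
```

The tag only does anything in the second state: while the Disallow rule is in place, a crawler that respects robots.txt never fetches the page, so it never sees "noindex, follow".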
-
Hi Saijo,
Thanks for the response. Do you think that would yield the benefit I'm looking for of recapturing that lost link juice?
Do you think there'd be any downside to the switcheroo from robots.txt to noindex, follow?
Best... Darcy
-
Since you said "The pages themselves have little to no unique content or other relevant search play and for other reasons still want them kept out of search," I would use meta robots "noindex, follow".
-
Hi Lesley,
Thanks for the thoughts. I don't see this as a real option for a number of reasons, including but not limited to that there are 50,000 profiles, most with very little information. The members of this site are 95% busy professionals who aren't trying to advance their career via their profile. So, there'd be some privacy concern and the potential for tens of thousands of low content/highly templated pages. Not really a search dream come true!
Also, converting it into a system where different levels of profile completeness are acknowledged would not really resonate with this community nor would it be near the top of our engineering priorities.
What I really want to get clear on is how best to keep them search invisible while not losing link value into a robots.txt'd black hole. Really just looking for confirmation if, with those goals, "noindex, follow" and remove from robots is the way to go. I'm pretty sure it is, but would like to hear more about that.
Thanks... Darcy
-
I think what I am going to say is going to sound like it goes against the grain, but it really doesn't. I have noticed in some places that if you want an active community, you reward your members. Look at how Moz does their forum: they don't really noindex the pages, but once you hit a certain point they pseudo-drop the nofollow from your profile link (it could be argued whether they really do). The point is to reward your members who are active.
I would set up an automatic noindex tag in the header that checks the user's post count. Then you can noindex all of the spammers and have prominent members shown in search. If it were me, that is how I would do it.
I have a PA of 49 on my profile in one forum I frequent. I have seen the stats, and it is regularly an entry page to the forum. Another member has a 64 on a 93 domain, and his is used a lot more than mine for entry as well. Think of it this way: if someone is googling my name, the second result is Moz's forum (http://screencast.com/t/jIx7a4hcWV). Second-place search results still get a lot of clicks.
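A rough sketch of the post-count idea, in Python. The function name and the threshold of 25 posts are invented for illustration; the right cutoff, and whether post count is the right signal at all, depends on the community.

```python
def robots_meta_tag(post_count: int, min_posts: int = 25) -> str:
    """Build the meta robots tag for a member profile page.

    Profiles at or above `min_posts` (a hypothetical threshold) are left
    indexable; everything else gets "noindex, follow" so links on the
    page can still be followed.
    """
    content = "index, follow" if post_count >= min_posts else "noindex, follow"
    return f'<meta name="robots" content="{content}">'


# An empty profile versus an active member
print(robots_meta_tag(0))
print(robots_meta_tag(140))
```

The profile template would call something like this when rendering the page head, so a profile becomes indexable automatically once the member passes the threshold.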