Should comments and feeds be disallowed in robots.txt?

workathomecareers

Hi

My robots file is currently set up as listed below.

From an SEO point of view, is it good to disallow feeds, RSS, and comments?

I feel allowing comments would be a good thing: they are new content that may rank in the search engines, since the comments left on my blog often refer to questions or companies that folks are searching for more information on. And comments are added regularly.

What's your take? I'm also concerned about /page/ being blocked; I'm not sure how that benefits my blog from an SEO point of view either. I look forward to your feedback.

Thanks.

Eddy

User-agent: Googlebot
Crawl-delay: 10
Allow: /*

User-agent: *
Crawl-delay: 10
Disallow: /wp-
Disallow: /feed/
Disallow: /trackback/
Disallow: /rss/
Disallow: /comments/feed/
Disallow: /page/
Disallow: /date/
Disallow: /comments/

# Allow Everything
Allow: /*

FedeEinhorn

If I were going to disallow something, I would do it with noindex tags rather than robots.txt. The robots file is perfect with just those 2 lines.

Then, there are some plugins that will help you avoid SEO issues, like SEO by Yoast. Personally, I like to set tag, category, and archive pages to noindex, follow, and that's it. But again, that is noindex, follow set with a robots meta tag on the page, not in robots.txt. SEO by Yoast makes that about as easy as it can be, with just a few small configuration steps.

Give it a try; you can always disable plugins.
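To check that a page really carries the robots meta tag described above, something like the following could be used. It is a minimal sketch using only Python's standard library; the class name, the function name, and the sample HTML are made up for illustration and are not taken from any plugin.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        # The content attribute is a comma-separated list, e.g. "noindex,follow".
        for part in attr.get("content", "").split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def robots_directives(html):
    """Return the set of robots meta directives present in an HTML document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical archive page, as it might look with noindex, follow applied.
SAMPLE_ARCHIVE_PAGE = """<html><head>
<title>Archive</title>
<meta name="robots" content="noindex,follow">
</head><body></body></html>"""

if __name__ == "__main__":
    print(sorted(robots_directives(SAMPLE_ARCHIVE_PAGE)))
```

Keep in mind that a crawler can only see this tag if the page is not blocked in robots.txt, which is one reason to prefer the tag over a Disallow rule.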

Wish you the best!

DaveSottimano

WordPress is a funny platform: you would think that there isn't much to disallow, but there probably is quite a bit. I agree with Federico: you should allow comments, feed, and RSS.

I'm not going to make blind assumptions here, so you should check your log files to see what's being constantly crawled. Feel free to read this: http://moz.com/blog/server-log-essentials-for-seo.

FYI - This is a big job. Shout if you need help.

P.S. HostGator's cPanel will let you archive raw server logs. Make sure you check that option from now on, or they'll be overwritten!
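Once raw logs are being archived, a first pass at "what's being constantly crawled" could look roughly like this. It is only a sketch: it assumes the logs are in the common Apache combined log format, it trusts the user-agent string (which can be spoofed), and the function name and sample lines are invented for illustration.

```python
import re
from collections import Counter

# Matches the request, status, size, referrer, and user-agent fields
# of a combined-log-format line (an assumption about the log layout).
LOG_LINE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_sections(lines):
    """Count requests whose user agent mentions Googlebot, grouped by first path segment."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match or "googlebot" not in match.group("agent").lower():
            continue
        path = match.group("path").split("?", 1)[0]   # drop the query string
        first_segment = path.strip("/").split("/", 1)[0]
        hits["/" + first_segment] += 1
    return hits


# Hypothetical log lines, not real traffic.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
SAMPLE_LINES = [
    '66.249.66.1 - - [10/Oct/2013:13:55:36 -0700] "GET /feed/ HTTP/1.1" 200 2326 "-" "%s"' % GOOGLEBOT_UA,
    '66.249.66.1 - - [10/Oct/2013:13:56:02 -0700] "GET /comments/feed/ HTTP/1.1" 200 512 "-" "%s"' % GOOGLEBOT_UA,
    '66.249.66.1 - - [10/Oct/2013:13:57:10 -0700] "GET /feed/?paged=2 HTTP/1.1" 200 2100 "-" "%s"' % GOOGLEBOT_UA,
    '203.0.113.5 - - [10/Oct/2013:13:58:00 -0700] "GET /feed/ HTTP/1.1" 200 2326 "-" "Mozilla/5.0 (Windows NT 6.1)"',
]

if __name__ == "__main__":
    for section, count in googlebot_sections(SAMPLE_LINES).most_common():
        print(section, count)
```

Sections that soak up a large share of Googlebot's requests without offering anything worth ranking are the ones to look at more closely.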

workathomecareers

Thanks for the info!

I contacted HostGator to fix the robots file because it had been blocking Google's bot for some time. So that's the robots file they uploaded.

Yes, I use WordPress, and apparently some stupid plugin had originally blocked Google before HostGator fixed the robots file yesterday.

So, to confirm: you don't think anything else should be disallowed except for the /wp-admin directory? With the feeds, comments, etc., there aren't any SEO concerns, like duplicate content or anything else that might work against me, that should be blocked?

Is this safe to assume?

Thanks again!

Eddy

FedeEinhorn

Who wrote that robots.txt?

You shouldn't disallow the comments, or the feed, or almost anything.

I notice you are using WordPress, so if you just want to avoid the admin being indexed (which isn't going to happen, as Google does not have access anyway), your robots.txt should look like this:

User-agent: *
Disallow: /wp-admin/

That's it.
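To see what a given set of rules blocks before uploading it, the rules can be run through Python's standard urllib.robotparser. This is a rough sketch, not a reproduction of how Google reads the file: that module does plain prefix matching and does not treat * inside a path the way Google does, so the wildcard Allow lines from the original file are left out. The bot name, the helper names, and the sample paths are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The two-line file suggested above.
SUGGESTED = """\
User-agent: *
Disallow: /wp-admin/
"""

# Abbreviated copy of the catch-all group from the file posted in the question.
POSTED_CATCH_ALL = """\
User-agent: *
Disallow: /wp-
Disallow: /feed/
Disallow: /comments/feed/
Disallow: /page/
Disallow: /comments/
"""

# Hypothetical paths on a WordPress blog.
PATHS = [
    "/wp-admin/options.php",
    "/feed/",
    "/comments/feed/",
    "/page/2/",
    "/some-post/",
]


def load_rules(robots_txt):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def blocked_paths(robots_txt, agent="ExampleBot"):
    """Return the sample paths that the given rules disallow for the agent."""
    rules = load_rules(robots_txt)
    return [path for path in PATHS if not rules.can_fetch(agent, path)]


if __name__ == "__main__":
    print("suggested:", blocked_paths(SUGGESTED))
    print("posted:   ", blocked_paths(POSTED_CATCH_ALL))
```

With the two-line file only the admin path is blocked, while the posted catch-all group also shuts out the feeds and paginated pages.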
