Should comments and feeds be disallowed in robots.txt?
-
Hi
My robots file is currently set up as listed below.
From an SEO point of view is it good to disallow feeds, rss and comments?
I feel allowing comments would be a good thing because it's new content that may rank in the search engines as the comments left on my blog often refer to questions or companies folks are searching for more information on. And the comments are added regularly.
What's your take? I'm also concerned about the /page being blocked. Not sure how that benefits my blog from an SEO point of view as well. Look forward to your feedback.
Thanks.
Eddy
User-agent: Googlebot Crawl-delay: 10 Allow: /* User-agent: * Crawl-delay: 10 Disallow: /wp- Disallow: /feed/ Disallow: /trackback/ Disallow: /rss/ Disallow: /comments/feed/ Disallow: /page/ Disallow: /date/ Disallow: /comments/ # Allow Everything Allow: /*
-
If I were going to disallow something I would go with noindex tags. The robots file is perfect with just those 2 lines.
Then, there are some plugins that will help you avoid any SEO issue like SEO by Yoast. Personally I like to noindex,follow tags, categories, and archive pages, that's it. But again, noindex, follow with a robots tag on the page, not using the robots.txt. SEO by Yoast will make that as easy as it can ever be with just a small configuration steps.
Give it a try, you can always disable plugins
Wish you the best!
-
Wordpress is a funny platform, you would think that there isn't much to disallow but there probably is quite a bit. I agree with Federico - you should allow comments, feed, and rss.
I'm not going to make blind assumptions here, so you should check your log files to see what's being constantly crawled, feel free to read this http://moz.com/blog/server-log-essentials-for-seo.
FYI - This is a big job. Shout if you need help.
P.S - Hostgator's Cpanel will allow you to archive raw server logs, make sure you check that option from now on or they'll be overwritten!
-
Thanks for the info!
I contacted Hostgator to fix the robots file because it had been blocking Google's bot for some time now. So that's the robot file they uploaded.
Yes I use wordpress, and apparently some stupid plugin had originally blocked google before hostgator fixed the robots file yesterday.
So to confirm you don't think anything else should be disallowed except for the /wp-admin directory. With the feeds, comments, etc, there isn't any SEO concerns like duplicate content or anything else that may work against me that should be blocked.
Is this safe to assume?
Thanks again!
Eddy
-
Who wrote that robots.txt?
You shouldn't disallow the comments, or feed or almost anything.
I notice you are using wordpress, so if you just want to avoid the admin being indexed (which will isn't going to be as Google does not have access anyway), your robots.txt should look like this:
User-Agent:*
Disallow: /wp-admin/
That's it.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
URLs with parameters + canonicals + meta robots
Hi Moz community! I'm posting a new question here as I couldn't find specific answer to the case I'm facing. Along with canonical tags, we are implementing meta robots on our pages (e-commerce website with thousands of pages). Most of the cases have been covered but I still have one unanswered case: our products are linked from list pages (mostly categories) but they almost always include a tracking parameter (ie /my-product.html?ref=xxx) products urls are secured with a canonical tag (referring only to the clean url /my-product.html) but what would be the best solution regarding the meta robots? For now we opted for a meta robot 'noindex, follow' for non canonical urls (so the ones unfortunately linked from our category/list pages), but I'm afraid that it could hurt our SEO (apparently no juice is given from URLs with a noindex robots), and even maybe prevent bots from crawling our website properly ... Would it be best to have no meta robots at all on these product urls with parameters? (we obviously can't have 'index, follow' when the canonical ref points to another url!). Thanks for your help!
Intermediate & Advanced SEO | | JessicaZylberberg0 -
Have a Robots.txt Issue
I have a robots.txt file error that is causing me loads of headaches and is making my website fall off the SE grid. on MOZ and other sites its saying that I blocked all websites from finding it. Could it be as simple as I created a new website and forgot to re-create a robots.txt file for the new site or it was trying to find the old one? I just created a new one. Google's website still shows in the search console that there are severe health issues found in the property and that it is the robots.txt is blocking important pages. Does this take time to refresh? Is there something I'm missing that someone here in the MOZ community could help me with?
Intermediate & Advanced SEO | | primemediaconsultants0 -
If Robots.txt have blocked an Image (Image URL) but the other page which can be indexed has this image, how is the image treated?
Hi MOZers, This probably is a dumb question but I have a case where the robots.tags has an image url blocked but this image is used on a page (lets call it Page A) which can be indexed. If the image on Page A has an Alt tags, then how is this information digested by crawlers? A) would Google totally ignore the image and the ALT tags information? OR B) Google would consider the ALT tags information? I am asking this because all the images on the website are blocked by robots.txt at the moment but I would really like website crawlers to crawl the alt tags information. Chances are that I will ask the webmaster to allow indexing of images too but I would like to understand what's happening currently. Looking forward to all your responses 🙂 Malika
Intermediate & Advanced SEO | | Malika11 -
Why are these results being showed as blocked by robots.txt?
If you perform this search, you'll see all m. results are blocked by robots.txt: http://goo.gl/PRrlI, but when I reviewed the robots.txt file: http://goo.gl/Hly28, I didn't see anything specifying to block crawlers from these pages. Any ideas why these are showing as blocked?
Intermediate & Advanced SEO | | nicole.healthline0 -
Is our robots.txt file correct?
Could you please review our robots.txt file and let me know if this is correct. www.faithology.com/robots.txt Thank you!
Intermediate & Advanced SEO | | BMPIRE0 -
Using comment boxes for building links (the right way)
Some people see this kind of link building as spammy mainly because of automated systems I guess making it spammy. But what if you use your company name linking to your site to indicate who has posted it and then actually contribute some good discussion. A lot of these are no-follow (although I have got it into my head even though they are no follow not passing juice I still think Google counts the link and it does something). So I want to start doing some of this, for example squidoo. Lots of lens with great content that I could quite easily comment on with 50 words+
Intermediate & Advanced SEO | | activitysuper0 -
202 error page set in robots.txt versus using crawl-able 404 error
We currently have our error page set up as a 202 page that is unreachable by the search engines as it is currently in our robots.txt file. Should the current error page be a 404 error page and reachable by the search engines? Is there more value or is it a better practice to use 404 over a 202? We noticed in our Google Webmaster account we have a number of broken links pointing the site, but the 404 error page was not accessible. If you have any insight that would be great, if you have any questions please let me know. Thanks, VPSEO
Intermediate & Advanced SEO | | VPSEO0 -
Does anyone have any tips for optimizing your Google Product Feeds?
How often do you submit them? What have you seen work? Are there any tricks aside from filling out all of the data fields?
Intermediate & Advanced SEO | | eric_since1910.com1