Is blocking RSS Feeds with robots.txt necessary?
-
Is it necessary to block an RSS feed with robots.txt?
It seems feeds are automatically left out of web search results (http://googlewebmastercentral.blogspot.com/2007/12/taking-feeds-out-of-our-web-search.html)
And Google says here that it's important not to block RSS feeds
(http://googlewebmastercentral.blogspot.com/2009/10/using-rssatom-feeds-to-discover-new.html)
I'm just checking!
-
Hi Michelleh,
There's no need to block RSS feeds, as Googlebot uses them for discovery. Here's a quirky fact: RSS feeds actually help combat scraper sites, because they contain absolute URLs that clearly link back to your site. They're going to scrape your content anyhow, so let's hope they choose RSS!
How does Google know it's an RSS feed? Let's look at some of the markup on RSS pages:
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel>
Either this or something similar appears at the top of any RSS or Atom document, and it is easily read by Google. I'm not going to get too far into it, but you can start reading more here:
http://en.wikipedia.org/wiki/RSS
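To illustrate how a crawler could tell a feed from an ordinary page, here is a minimal Python sketch that looks at the document's root element. The function name `feed_kind` and the sample string are made up for this example, and this is only a guess at the general idea, not how Google actually does it.

```python
import xml.etree.ElementTree as ET

# Namespace used by Atom feeds; an Atom document's root is <feed> in this namespace.
ATOM_NS = "http://www.w3.org/2005/Atom"


def feed_kind(xml_text):
    """Return 'rss', 'atom', or None based on the document's root element.

    Hypothetical helper for illustration only.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # Not well-formed XML, so it cannot be treated as a feed here.
        return None
    if root.tag == "rss":
        return "rss"
    if root.tag == "{%s}feed" % ATOM_NS:
        return "atom"
    return None


# Sample markup modeled on the snippet quoted above (closing tags added so it parses).
sample = (
    '<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">'
    "<channel></channel></rss>"
)
print(feed_kind(sample))
```

A real crawler would most likely also look at the Content-Type header the server sends, so treat the root-element check as just one possible signal.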
Does Google index the XML file type? **Yes.**
Does that help?
-
How do they know it is an RSS feed? Does Google not index the XML file type?
-
If Google says not to block it, then don't block it. They may not index the RSS feed, but they can still crawl it.
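If you want to double-check that your robots.txt isn't blocking the feed by accident, here is a rough Python sketch using the standard library's `urllib.robotparser`. The rules, the feed URL, and the helper name `can_crawl` are made-up examples, not taken from any real site. Note that this parser does plain prefix matching and does not implement Google's wildcard extensions, so it is only an approximation of how Googlebot reads the file.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt, url, agent="Googlebot"):
    """Return True if the given robots.txt text lets `agent` fetch `url`.

    Hypothetical helper for illustration only.
    """
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network request is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Example rule sets (assumptions for this sketch).
blocking = "User-agent: *\nDisallow: /feed\n"
open_rules = "User-agent: *\nDisallow: /wp-admin/\n"

FEED_URL = "http://www.example.com/feed/"

print(can_crawl(blocking, FEED_URL))    # feed path is disallowed by the first rule set
print(can_crawl(open_rules, FEED_URL))  # feed path is not mentioned, so it stays crawlable
```

If the first case describes your site, removing the `Disallow: /feed` line is what keeps the feed available for discovery.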