What to do about "blocked by meta-robots"?
-
The crawl report tells me "Notices are interesting facts about your pages we found while crawling". One of these interesting facts is that my blog archives are "blocked by meta robots".
Articles are not blocked, just the archives.
What is a "meta" robot?
I think it's just normal (since the articles need only be crawled once), but I want a second opinion. Should I care about this?
-
Meta robots refers to the <meta name="robots"> tag in the page's <head>. You usually see this when a blog is set up with an SEO plugin such as All In One SEO, where you can manually set which content is blocked. It's common to block archives, tags, and other sections, on the theory that allowing these to be crawled could either cause duplicate content issues or drain link value from the primary category navigation.
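To see which directives a page actually carries, something like the following could be used. It is only a minimal sketch using Python's standard-library HTML parser; the archive page markup and the helper name meta_robots_directives are made up for illustration and are not taken from any crawl tool.

```python
from html.parser import HTMLParser


class MetaRobotsParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (bare attributes), so default to "".
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() == "robots":
            content = attr_map.get("content", "")
            self.directives.extend(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def meta_robots_directives(html):
    """Return the list of robots directives found in the given HTML."""
    parser = MetaRobotsParser()
    parser.feed(html)
    return parser.directives


# Hypothetical archive page markup, for illustration only.
archive_page = """
<html><head>
<title>Archives: March</title>
<meta name="robots" content="noindex, follow">
</head><body>...</body></html>
"""

directives = meta_robots_directives(archive_page)
print(directives)
print("blocked by meta robots:", "noindex" in directives)
```

A page with "noindex" in that list is what a crawl report would typically flag as blocked by meta robots; a page with no such tag returns an empty list.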
-
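In general, there are two ways you can block crawlers from your content:
- You can add a Disallow entry to your robots.txt file.
- You can add a robots meta tag to your pages, for example <meta name="robots" content="noindex">.
What you are saying in either case is, roughly, "please do not list this content in your search engine". Strictly speaking, a Disallow entry asks crawlers not to fetch the pages, while the meta tag asks them not to index pages they do fetch.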
In general, you would not want to block your archives. There certainly can be specific cases where you only want the public to see your most current content, in which case you can block them.
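For the robots.txt route, the effect of a Disallow entry can be checked locally. The sketch below uses Python's standard urllib.robotparser; the /archives/ path, the example.com URLs, and the is_crawlable helper are assumptions chosen for illustration, not anything from the original poster's site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks only the archive section.
robots_txt = """\
User-agent: *
Disallow: /archives/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())


def is_crawlable(url, user_agent="*"):
    """Return True if the rules above allow user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


print(is_crawlable("https://example.com/archives/2011/03/"))
print(is_crawlable("https://example.com/my-article/"))
```

With these rules the archive URL is reported as not crawlable while the article URL is, which mirrors the "articles are fine, archives are blocked" situation in the question.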