Robots.txt

Rong

I have a page used for a reference that lists 150 links to blog articles. I use in in a training area of my website. I now get warnings from moz that it has too many links. I decided to disallow this page in robots.text. Below is the what appears in the file.

Robots.txt file for http://www.boxtheorygold.com

User-agent: *

Disallow: /blog-links/

My understanding is that this simply has google bypass the page and not crawl it. However, in Webmaster Tools, I used the Fetch tool to check out a couple of my blog articles. One returned an expected result. The other returned a result of "access denied" due to robots.text. Both blog article links are listed on the /blog/links/ reference page.

Question: Why does google refuse to crawl the one article (using the Fetch tool) when it is not referenced at all in the robots.text file. Why is access denied? Should I have used a noindex on this page instead of robots.txt? I am fearful that robots.text may be blocking many of my blog articles. Please advise.

Thanks,
Ron

OlegKorneitchouk

User-agent: *
Disallow: /blog-links/

Will prevent spiders from crawling/indexing content that is located within that specific subfolder. If your articles are not located within that folder, then they should not be blocked. Maybe check for for meta noindex tags on the actual articles? You should also keep an eye on the "Blocked URLs" page in GWT to see if there are pages being blocked that shouldn't be.

Welcome to the Q&A Forum

Browse the forum for helpful insights and fresh discussions about all things SEO.

Robots.txt

Robots.txt file for http://www.boxtheorygold.com

Got a burning SEO question?

Browse Questions

Explore more categories

Related Questions

Our crawler was not able to access the robots.txt file on your site.

Block Moz (or any other robot) from crawling pages with specific URLs

Blocked by Meta Robots.

Do the SEOmoz Campaign Reports follow Robots.txt?

Seomoz bar: No Follow and Robots.txt

Why does SEOMoz crawler ignore robots.txt?

Robots review

How to get rid of the message "Search Engine blocked by robots.txt"