3,511 Pages Indexed and 3,331 Pages Blocked by Robots

PeaSoupDigital

Morning,

So I checked our site's index status in WMT, and I'm being told that Google is indexing 3,511 pages and that 3,331 are blocked by robots.txt. This seems slightly odd, as we're only disallowing 24 pages in the robots.txt file. In light of this, I have the following queries:

Do these figures mean that Google is indexing 3,511 pages and blocking 3,331 other pages? Or does it mean that it's blocking 3,331 pages of the 3,511 indexed?
As there are only 24 URLs being disallowed in robots.txt, why are 3,331 pages being blocked? Will these be variations of the URLs we've disallowed?
Currently, we don't have a sitemap. I know, I know, it's pretty unforgivable but the old one didn't really work and the developers are working on the new one. Once submitted, will this help?
I think I know the answer to this, but is there any way to ascertain which pages are being blocked?

Thanks in advance!

Lewis

PeaSoupDigital

Hi,

No more links than a standard e-commerce site should have...

I'm chasing the sitemap as we speak.

Cheers,

MonicaOConnor

The blocked URLs are probably nofollow links throughout the site. Do you have a lot of links pointing outward from your pages?

Google is indexing 3,511 pages, of which 3,331 are blocked by robots.txt. I would check some of the internal and external links on those disallowed pages. I don't see how 24 disallowed URLs could add up to 3,331 blocked pages, but it couldn't hurt to start there.

Definitely get a sitemap submitted ASAP. It will help for sure.
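
One possible explanation for the gap between 24 rules and 3,331 blocked URLs: a Disallow line matches by path prefix, so a single rule can cover every URL variation that starts with it (search results, filters, pagination and so on). Here is a rough sketch of checking a list of URLs against a set of rules with Python's standard urllib.robotparser. The rules and URLs below are made up for illustration and are not taken from the site in question. Note too that the standard-library parser does not handle Google's * and $ wildcards, so treat the result as an approximation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: two Disallow lines, not the real site's robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /basket/
Disallow: /search
"""

# Hypothetical URLs standing in for a crawl export or server log.
URLS = [
    "https://www.example.com/",
    "https://www.example.com/products/green-widget",
    "https://www.example.com/search?q=widget&page=1",
    "https://www.example.com/search?q=widget&page=2",
    "https://www.example.com/search/colour/green",
    "https://www.example.com/basket/add/123",
]


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


blocked = blocked_urls(ROBOTS_TXT, URLS)
for url in blocked:
    print(url)
print(f"{len(blocked)} of {len(URLS)} URLs blocked by 2 rules")
```

Running something like this over a full crawl export would list exactly which pages each rule catches, which also goes some way towards answering the question of how to find out which pages are being blocked.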

Whittie

Excuse the short reply.

Add the sitemap's URL to your robots.txt and submit it in Google WMT.

If the new one is still in development, why not just use a free sitemap generator in the meantime?
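
As a stopgap, a bare-bones sitemap can be put together in a few lines. This is only a sketch: the URLs and the sitemap location are placeholders, and it writes nothing but the required loc element for each page.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal sitemap document as UTF-8 bytes (Python 3.8+)."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as & in the URL text.
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


# Placeholder URLs; in practice these would come from the product database.
sitemap = build_sitemap([
    "https://www.example.com/",
    "https://www.example.com/products/green-widget",
])
print(sitemap.decode("utf-8"))

# Line to add to robots.txt so crawlers can find the file
# (placeholder location).
robots_line = "Sitemap: https://www.example.com/sitemap.xml"
print(robots_line)
```

The Sitemap line can sit anywhere in robots.txt and must be a full URL, not a relative path.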
