You could use Linksleuth to crawl your site. It will tell you how many pages it found, and you can then compare that number against the total number of pages Google has indexed.
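As a rough illustration of that comparison, here is a minimal Python sketch. It assumes you have already exported the crawled URLs and the indexed URLs into two lists yourself. The names `normalize` and `find_unindexed` and the sample URLs are hypothetical and are not part of any crawler or Google tool. It goes one small step past comparing totals by listing which crawled URLs are missing from the indexed list.

```python
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Lowercase the scheme and host and drop a trailing slash so that
    trivially different spellings of the same URL compare equal."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/") or "/"
    query = f"?{parts.query}" if parts.query else ""
    return f"{parts.scheme.lower()}://{parts.netloc.lower()}{path}{query}"


def find_unindexed(crawled: list[str], indexed: list[str]) -> list[str]:
    """Return the crawled URLs that do not appear in the indexed list."""
    indexed_set = {normalize(u) for u in indexed}
    return sorted({normalize(u) for u in crawled} - indexed_set)


# Hypothetical sample data standing in for a crawler export and an index export.
crawled_urls = [
    "https://example.com/",
    "https://example.com/rooms",
    "https://example.com/contact/",
]
indexed_urls = ["https://EXAMPLE.com", "https://example.com/rooms/"]

missing = find_unindexed(crawled_urls, indexed_urls)
print(f"crawled: {len(crawled_urls)}, indexed: {len(indexed_urls)}")
print("not indexed:", missing)
```

If the two totals differ, the `missing` list is where I would start looking.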
Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Best posts made by smarties954
- RE: How to determine which pages are not indexed (posted in Technical SEO)
- RE: Google insists robots.txt is blocking... but it isn't. (posted in Technical SEO)
24 hours is a short time, and Google has probably not reindexed or even looked at your new robots.txt yet.
Webmaster Tools is much slower than Bing's tools, so be patient.
As a rule of thumb, I wait at least a week with Google before worrying (my 2 cents).
- RE: Reducing Booking Engine Indexation (posted in Intermediate & Advanced SEO)
- You could use rel=nofollow on links pointing to page variations.
- If you can, you could also dynamically add a meta noindex, nofollow tag when a variant of the initial page is generated.
- You could also add a link rel=canonical pointing to the initial page. This tells bots that the initial page is the original.

In other words, you have to tell crawlers when a page is a variant and that you don't want them to index it.
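The second and third suggestions above could be sketched roughly like this in Python. This is only an illustration under stated assumptions: it assumes a variant differs from the initial page only by its query string or fragment, and `canonical_url`, `head_tag_for`, the `strategy` parameter, and the example.com URLs are all hypothetical names, not part of any booking engine.

```python
from html import escape
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Strip the query string and fragment, assuming (hypothetically) that
    variants differ from the initial page only by those parts."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def head_tag_for(url: str, strategy: str = "canonical") -> Optional[str]:
    """Return the <head> tag to add for a page variant, or None when the
    requested URL already is the initial page."""
    canonical = canonical_url(url)
    if url == canonical:
        return None  # initial page: nothing extra to emit
    if strategy == "noindex":
        # Ask crawlers not to index the variant or follow its links.
        return '<meta name="robots" content="noindex, nofollow">'
    # Default: point crawlers at the initial page as the original.
    return f'<link rel="canonical" href="{escape(canonical, quote=True)}">'


variant = "https://example.com/book?date=2024-01-01"
print(head_tag_for(variant))
print(head_tag_for(variant, strategy="noindex"))
print(head_tag_for("https://example.com/book"))
```

The sketch emits one tag or the other rather than both, since the suggestions above are alternatives. Which one fits depends on how your booking engine generates its variants.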