Crawl Diagnostics - Crawling way more pages than my site has?

LodestoneGen

Hello all,

I'm fairly new here, more of a paid search guy dabbling in SEO on the side. I have a client set up in SEOMoz, and the Crawl Diagnostics report is showing 10,000+ pages crawled, but I think the site has at most 800 pages (it's an e-commerce site using freewebstore.org as the platform).

Any reasons this would be happening?

LodestoneGen

OK, here is an update. I found that the crawl has a basketful of entries for each category, and the site has a pretty long list of categories.

Attached is an image showing what is happening in one category. There is an entry for each sort option, and I understand where those come from (Sort Name, Sort Price Ascending, Sort Price Descending). What I don't understand are all the "rw=1" entries, and why they stack up like they do.

Is this an issue? I am assuming it is, because there seems to be no real reason for it.
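One way to see what the crawler is picking up in a category is to group the crawled URLs by page and count the query-string variants of each. Below is a minimal Python sketch, not a definitive diagnosis. It assumes the sort option shows up as a query parameter named "sort" and that "stacking" means rw=1 is repeated inside a single URL. The "sort" name and the example.com URLs are made up for illustration; only "rw=1" comes from the report.

```python
from collections import defaultdict
from urllib.parse import urlsplit, parse_qsl


def group_by_path(urls):
    """Map each scheme+host+path to the set of query strings seen for it."""
    groups = defaultdict(set)
    for url in urls:
        parts = urlsplit(url)
        page = f"{parts.scheme}://{parts.netloc}{parts.path}"
        groups[page].add(parts.query)
    return dict(groups)


def repeated_params(url):
    """Return the parameter names that occur more than once in one URL."""
    counts = defaultdict(int)
    for name, _ in parse_qsl(urlsplit(url).query, keep_blank_values=True):
        counts[name] += 1
    return {name: n for name, n in counts.items() if n > 1}


# Hypothetical URLs; "sort" is an assumed parameter name, "rw" is from the report.
crawled = [
    "http://example.com/crystals",
    "http://example.com/crystals?sort=name",
    "http://example.com/crystals?sort=name&rw=1",
    "http://example.com/crystals?sort=name&rw=1&rw=1",
]

groups = group_by_path(crawled)
for page, variants in groups.items():
    print(page, "has", len(variants), "crawled variants")

print(repeated_params(crawled[-1]))
```

If one category page turns up with dozens of variants, that alone could account for a crawl count many times larger than the real page count.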


LodestoneGen

Thanks to both of you. I will start to dig in to your suggested steps later today.

I just took this client on and they really don't have anything set up. I only just got them set up on Webmaster Tools as well, so I'm not even sure whether their site was indexed before.

Crawl Diagnostics doesn't show much duplicate content (60 pages?), but the Too Many On-Page Links, Overly Dynamic URL, Duplicate Title, and Long URL warnings are each showing 6,000-10,000 pages.

The site sells crystals and each item is unique. On my first review I found they don't even have item descriptions written, let alone page titles and meta descriptions.

I am in analysis mode, working up my review comments and detailing an action plan to help them focus moving forward. I was just shocked by the 10,000 pages listed in one of the crawl warnings.

Anyway, I'll dig into this info and let you know what I find. It's an adventure!

DougRoberts

I'm guessing that as an ecommerce site you've got multiple ways to browse your content: by category, brand, special offers, etc. Each of those routes can produce its own URL for the same content, so the thing to watch out for is URLs with categories or lots of parameters in them. As a result, chances are you've got a duplicate content problem.

As Nakul mentioned, a good first step is to take a look at your crawl report, or use one of the tools he mentioned, to see if you've got the same content being indexed multiple times.

Once you've done that, check how many of the crawled pages are appearing in Google's index. Is Google doing a reasonable job of identifying the right version? How many pages are there in the index? Are recently added products being discovered quickly?

The site: operator will be your friend here, and Dr. Pete wrote a great article on ways you can use it.

http://www.seomoz.org/blog/25-killer-combos-for-googles-site-operator

Once you understand what is being crawled and what's making it to the index, decide which pages you really do want indexed, make sure these become the canonical versions, and block the remaining parts of your site using robots.txt. (But understand the problem and what you want to achieve before you start doing this.)
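To make the canonical step concrete, here is a rough Python sketch that maps a crawled URL to the version you might want indexed by dropping display-only parameters, then builds the matching canonical tag. The parameter names "sort" and "page" and the example.com URLs are assumptions for illustration ("rw" is the one seen in the report), and the set of parameters to drop is a decision you would make per site rather than a rule.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Parameters assumed to change only the presentation of a page.
# "sort" is a hypothetical name; "rw" is the one seen in the crawl report.
DISPLAY_ONLY = {"sort", "rw"}


def canonical_url(url, drop=DISPLAY_ONLY):
    """Return the URL with display-only query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in drop
    ]
    # The fragment is dropped as well, since it never reaches the server.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_tag(url):
    """Build the <link rel="canonical"> element for a crawled URL."""
    return f'<link rel="canonical" href="{escape(canonical_url(url), quote=True)}">'


print(canonical_url("http://example.com/crystals?sort=price&rw=1&page=2"))
print(canonical_tag("http://example.com/crystals?rw=1&rw=1"))
```

Running every crawled URL through something like canonical_url and counting the distinct results gives a quick estimate of how many pages you actually want in the index.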

Hope this helps.

NakulGoyal

You can download the entire crawl and see if there are actually that many pages. Or post the URL here.
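If you do download the crawl, a few lines of Python can compare the number of crawled URLs with the number of distinct pages behind them. This is only a sketch: it assumes the export is a CSV with a column named "URL" (check the header of your own file, since that column name is a guess), and the sample rows are made up.

```python
import csv
import io
from urllib.parse import urlsplit


def summarize_crawl(csv_file, url_column="URL"):
    """Count rows, distinct URLs, and distinct pages (URL minus query string)."""
    rows = 0
    urls = set()
    pages = set()
    for row in csv.DictReader(csv_file):
        url = (row.get(url_column) or "").strip()
        if not url:
            continue  # skip rows without a URL
        rows += 1
        urls.add(url)
        parts = urlsplit(url)
        pages.add((parts.netloc.lower(), parts.path))
    return {"rows": rows, "distinct_urls": len(urls), "distinct_pages": len(pages)}


# Made-up sample standing in for the downloaded crawl export.
sample = io.StringIO(
    "URL,Title\n"
    "http://example.com/crystals,Crystals\n"
    "http://example.com/crystals?sort=name,Crystals\n"
    "http://example.com/crystals?sort=name&rw=1,Crystals\n"
    "http://example.com/about,About\n"
)

summary = summarize_crawl(sample)
print(summary)
```

For a real export, replace the sample with open("crawl.csv", newline="", encoding="utf-8"), where crawl.csv is whatever you named the download. A large gap between distinct_urls and distinct_pages points at parameter variants rather than real pages.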

You can also test it with a crawling tool like Xenu or Screaming Frog.

You can also post/private message the link here and I can take a look.
