Issue in number of pages crawled

cchhita

I wanted to figure out how our friend Roger Bot works.

On the first crawl of one of my large sites, the number of pages crawled stopped at 10000 (due to the restriction on the pro account). However, after a few weeks, the number of pages crawled went down to about 5500. This number seemed to be a more accurate count of the pages on our site.

Today, it seems that Roger Bot has completed another crawl and the number is up to 10000 again.

I know there has been no downtime on our site, and the items that we fixed on our site did not reduce or increase the number of pages we had.

Just making sure there are no known issues with Roger Bot before I look deeper into our site to see if there is an issue.

Thanks!

Marcus_Miller

Hey Chirag

That is the point: if the crawler is seeing multiple versions of the same page, you will get a false page count.

If a single page resolves on multiple versions of the URL like...

/pagename

/pagename/

/pagename.html

Then one single page could get reported as three pieces of content.

So, if you have 100 pages, but every page resolves on, say, two URLs, then the crawl would show 200 pages. The duplicate content report should allow you to see if this is the case.
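To make that concrete, here is a rough Python sketch of how a list of crawled URLs could be collapsed to unique pages. The normalization rule (drop a trailing slash and a trailing .html) and the example.com URLs are assumptions for illustration only; this is not how the crawler itself counts pages, and your site may need different rules.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def canonical_key(url):
    """Map URL variants of one page to a single key.

    Assumed rule: /pagename, /pagename/ and /pagename.html are the same page.
    """
    parts = urlsplit(url)
    path = parts.path
    if path.endswith(".html"):
        path = path[: -len(".html")]
    path = path.rstrip("/") or "/"
    return parts.netloc.lower() + path


def group_duplicates(urls):
    """Group crawled URLs by their canonical key."""
    groups = defaultdict(list)
    for url in urls:
        groups[canonical_key(url)].append(url)
    return dict(groups)


# Hypothetical crawl output: three variants of one page plus one other page.
crawled = [
    "http://example.com/pagename",
    "http://example.com/pagename/",
    "http://example.com/pagename.html",
    "http://example.com/other",
]

groups = group_duplicates(crawled)
print(len(crawled), "URLs crawled,", len(groups), "unique pages")
# prints: 4 URLs crawled, 2 unique pages
```

Any group with more than one URL in it is a candidate for a redirect or a canonical tag.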

Hope that helps.
Marcus

cchhita

Hi Marcus,

Thanks for the reply.

Yes, the duplicate content report is quite large, but I am not certain why the number of pages crawled fluctuated by over 4000.

The duplicate content number went down by over 2000 last week, and then went straight back up again. So I am not sure if the crawler missed something, or if there was some other issue going on.
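One way to narrow down a swing like this, assuming the URL list from each crawl can be exported, is to diff two crawls and look at which URLs came and went. The sketch below uses made-up URL lists and a hypothetical compare_crawls helper; it is only meant to show the idea.

```python
def compare_crawls(previous, current):
    """Report which URLs dropped out of, or newly appeared in, the latest crawl."""
    prev, curr = set(previous), set(current)
    return {
        "dropped": sorted(prev - curr),
        "added": sorted(curr - prev),
        "unchanged": len(prev & curr),
    }


# Hypothetical exports from two consecutive crawls.
last_week = ["/a", "/b", "/c"]
this_week = ["/a", "/b", "/b/", "/c.html"]

diff = compare_crawls(last_week, this_week)
print(diff["dropped"])    # ['/c']
print(diff["added"])      # ['/b/', '/c.html']
print(diff["unchanged"])  # 2
```

If most of the added URLs are variants of pages that were already there, the jump is more likely duplicate URLs than new content.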

Cheers

Marcus_Miller

Hey Chirag

As a first suggestion, I would take a look at the duplicate content report. You may see some pages with multiple page names / URLs giving a falsely inflated page count.
