Issue in number of pages crawled
-
i wanted to figure out how our friend Roger Bot works.
On the first crawl of one of my large sites, the number of pages crawled stopped at 10000 (due to the restriction on the pro account). However after a few weeks, the number of pages crawled went down to about 5500. This number seemed to be a more accurate count of the pages on our site.
Today, it seems that Roger Bot has completed another crawl and the number is up to 10000 again.
I know there has been no downtime on our site, and the items that we fixed on our site did not reduce or increase the number of pages we had.
Just making sure there are no known issues with Roger Bot before I look deeper into our site to see if there is an issue.
Thanks!
-
Hey Chirag
That is the point, if the crawler is seeing multiple versions of the same page, you will get a false page count.
If a single page resolves on multiple versions of the URL like...
/pagename
/pagename/
/pagename.html
Then one single page could get reported as three pieces of content.
So, if you have 100 pages, but all pages resolve on say two page names then it would show 200 pages BUT the duplicate content report should allow you to see if this is the case.
Hope that helps.
Marcus -
Hi Marcus,
Thanks for the reply.
Yes the duplicate content report is quite large, but I am not certain why the number of pages crawled fluctuated by over 4000.
the Duplicate content number went down by over 2000 last week, and then went straight back up again. So I am not sure if the crawler missed something, or if there was some other issue going on.
Cheers
-
Hey Chirag
As a first suggestion, I would take a look at the duplicate content report and you may see some pages with multiple page names / urls giving a falsely inflated page count.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Could my Crawl Error Report be wrong?
HI there, I am using Yoast SEO plugin on a wordpress website. I am currently showing 70 Med priority crawl errors 'missing meta description' on my Moz pro account. This number of missing meta descriptions has increased over the last 6 weeks. But every single page / post / tag / category has both the title and meta description completed via Yoast. I requested a google bot to crawl the site a few weeks ago as thought it perhaps wasn't been crawled and updated. Any idea what the issue might be? Could the Moz report be incorrect for some reason? Or could something be blocking Moz / Google from seeing the Yoast plugin?
Moz Pro | | skehoe0 -
Moz Crawl Test error
Moz crawl test show blank report for my website test - guitarcontrol.com. Why??? Please suggest.
Moz Pro | | zoe.wilson170 -
Moz crawl duplicate pages issues
Hi According to the moz crawl on my website I have in the region of 800 pages which are considered internal duplicates. I'm a little puzzled by this, even more so as some of the pages it lists as being duplicate of another are not. For example, the moz crawler considers page B to be a duplicate of page A in the urls below: Not sure on the live link policy so ive put a space in the urls to 'unlive' them. Page A http:// nuchic.co.uk/index.php/jeans/straight-jeans.html?manufacturer=3751 Page B http:// nuchic.co.uk/index.php/catalog/category/view/s/accessories/id/92/?cat=97&manufacturer=3603 One is a filter page for Curvety Jeans and the other a filter page for Charles Clinkard Accessories. The page titles are different, the page content is different so Ive no idea why these would be considered duplicate. Thin maybe, but not duplicate. Like wise, pages B and C are considered a duplicate of page A in the following Page A http:// nuchic.co.uk/index.php/bags.html?dir=desc&manufacturer=4050&order=price Page B http:// nuchic.co.uk/index.php/catalog/category/view/s/purses/id/98/?manufacturer=4001 Page C http:// nuchic.co.uk/index.php/coats/waistcoats.html?manufacturer=4053 Again, these are product filter pages which the crawler would have found using the site filtering system, but, again, I cannot find what makes pages B and C a duplicate of A. Page A is a filtered result for Great Plains Bags (filtered from the general bags collection). Page B is the filtered results for Chic Look Purses from the Purses section and Page C is the filtered results for Apricot Waistcoats from the Waistcoat section. I'm keen to fix the duplicate content errors on the site before it goes properly live at the end of this month - that's why anyone kind enough to check the links will see a few design issues with the site - however in order to fix the problem I first need to work out what it is and I can't in this case. Can anyone else see how these pages could be considered a duplicate of each other please? Checking ive not gone mad!! Thanks, Carl
Moz Pro | | daedriccarl0 -
Ajax4SEO and rogerbot crawling
Has anyone had any experience with seo4ajax.com and moz? The idea is that it points a bot to a html version of an ajax page (sounds good) without the need for ugly urls. However, I don't know how this will work with rogerbot and whether moz can crawl this. There's a section to add in specific user agents and I've added "rogerbot". Does anyone know if this will work or not? Otherwise, it's going to create some complications. I can't currently check as the site is in development and the dev version is noindexed currently. Thanks!
Moz Pro | | LeahHutcheon0 -
Settings to crawl entire site
Not sure what happened but I started a third campaign yesterday and only 1 pages was crawled, The other two campaigns has 472 and 10K respectively. What is the proper setting to choose in the beginning of campaign setup to have the entire site crawled. Not sure what I did different and I must be reading the instructions incorrectly. Thanks, Don
Moz Pro | | NicheGuy210 -
SEOMoz Q&A having some issues?
Apologies if this has been asked. I was browsing Q&A and got a Ruby error page, reloaded and everything was fine. However a bunch of the topics are kind of messed up, and it looks like a chunk of them are missing now. For example my 'questions I've answered' section of My Q&A only has the newest question and the oldest question, and when I sort the SEOMoz Tools category by newest I get one recent post and then the next one is from November of last year. You guys having database problems?
Moz Pro | | icecarats0 -
Crawl Diagnostics Error Spike
With the last crawl update to one of my sites there was a huge spike in errors reported. The errors jumped by 16,659 -- majority of which are under the duplicate title and duplicate content category. When I look at the specific issues it seems that the crawler is crawling a ton of blank pages on the sites blog through pagination. The odd thing is that the site has not been updated in a while and prior to this crawl on Jun 4th there were no reports of these blank pages. Is this something that can be an error on the crawler side of things? Any suggestions on next steps would be greatly appreciated. I'm adding an image of the error spike Xovep.jpg?1 Xovep.jpg?1
Moz Pro | | VanadiumInteractive1 -
On page links tool here at Seomoz
Hi Seomoz - first of all, thanks for the best SEO tools I have ever worked with (this is my first question in this forum, and also I just subscribed as a paying customer after the 30 days trial you guys offer). My question: After having worked for several weeks on getting the numbers of links in our forum on www.texaspoker.dk down, we are somewhat surprised to see that we didn't succeed in getting lower numbers. For instance, this page: http://www.texaspoker.dk/forum/aktuelle-konkurrencer/coaching-projekt-bliver-du-den-udvalgte has (that's what Seomoz seo tool tells us): 239 on page links. Can this really be true? We can't find these links, and we actuually did a lot to lower the numbers of links, for instance the forum members picture was a link before, and also there was a "go to top" link in each post in the forum. Thanks a lot.
Moz Pro | | MPO0