Wild fluctuation in number of pages crawled
-
I am seeing huge fluctuations in the number of pages discovered by the crawl each week. Some weeks the crawl discovers > 10,000 pages, and other weeks I am seeing 400-500.
So this week, for example, I was hoping to see some changes reflected for warnings from last week's report (which discovered > 10,000 pages). However, the entire crawl this week was 448 pages.
The number of pages discovered each week seems to go back and forth between these two extremes. The more accurate count would be nearer the 10,000 mark than the 400 range.
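To make the swing concrete, here is a minimal sketch that flags the weeks whose crawl came back far smaller than usual. It assumes the weekly page counts have been copied out of the crawl reports by hand; the function name `flag_short_crawls`, the `ratio` cutoff, and every count except the 448 are hypothetical and only for illustration.

```python
from statistics import median


def flag_short_crawls(weekly_pages, ratio=0.5):
    """Return the indices of weeks whose page count is below ratio * median.

    weekly_pages: list of page counts, one per weekly crawl (oldest first).
    ratio: hypothetical cutoff; 0.5 means "less than half the typical crawl".
    """
    if not weekly_pages:
        return []
    typical = median(weekly_pages)
    return [i for i, pages in enumerate(weekly_pages) if pages < ratio * typical]


# Illustrative counts only: 448 is this week's crawl, the rest are made up
# to mimic the back-and-forth pattern described above.
counts = [10200, 448, 10150, 460, 10300]
print(flag_short_crawls(counts))  # [1, 3]
```

Because the cutoff is based on the median, this only works while the full crawls outnumber the short ones; if most weeks come back short, the median itself drops and nothing is flagged.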
Thanks.
Mark
-
No problem!
Glad to see Cyrus' response!
-
Hi Mark,
I used to troubleshoot these types of problems (mysteries) when I worked on the SEOmoz help team.
The best thing to do would be to contact the Help Team (help@seomoz.org) and include information about your account, URL, and campaign. They can take this information and see whether there is anything odd about your website, whether there is a bug in the crawling software, or whether some strange incompatibility is causing this behavior.
If you would rather, you can PM me with the info and I can try to troubleshoot it myself, but the Help Team has a few more tools and access to engineers, so they might be the better choice. Either way, let us know if you have any trouble.
-
Thank you for the response. I should have been clearer: it is the weekly SEOmoz crawl that is showing such inconsistent behavior, not Google. Sorry for the confusion.
We have very few (if any) broken links, errors, etc.
Thanks.
Mark
-
Hi there Mark!
We used to have the same issue using Joomla here. It turns out that Google will reduce their crawling if your site has too many errors, broken links, and so on.
We used GWT (Google Webmaster Tools) to look into the 404s, then redirected the broken links. Afterwards, we resubmitted the site to be reindexed. A few weeks later -VOILA- all was back to normal, and our page freshness stays where it should.
I'd recommend looking at your GWT first and fixing broken links, followed by resubmission to the search engines...
Good Luck!