Moz Crawl shows over 100 times more pages than my site has?
-
The latest crawl stats are attached. My site has just over 300 pages?
Wondering what I have done wrong?
-
You're right, Keri, the total pages figure is higher, but it's still only 581.
-
I believe this image looks at what's indexed, which is a subset of the sitemap you submitted. You may want to look at Google Index -> Index Status in GWT to see what it shows there.
-
latest Moz crawl
-
latest webmaster tools crawl
-
I will definitely be paying attention to those numbers, Keri. Webmaster Tools is showing the right number of pages (something over 300, with 90% of those indexed).
-
It's not going to be a penalty, but it'll be good to have a bit less of a load on your server (bots no longer crawling thousands of pages) and just have your real pages in the index.
Places to look for interesting changes in site metrics would be your organic traffic in analytics and your Google Webmaster Tools account (impressions, pages crawled, etc.).
-
Thanks Keri, I will update asap.
Could you let me know how big an issue this would be? (When you have the time, of course ;))
-
You're welcome! I may have opened a can of worms, however. That sitemap is generated by an automated tool (based on the footer at the bottom), so somehow it's finding that page 28 as well.
You may also want to ask the developer if you should be indexing the categories in the blog archives. There are resources on Moz about the best way to set that up in WordPress, but I don't have them at my fingertips at the moment (I have a snuggly baby sleeping on my lap instead that's slowing me down a tad).
To answer your next question, after you figure out where the page 28 is being linked from and cure that, yes, you can do a one-time crawl from Research Tools. It won't overwrite your campaign info, but you can at least see if Moz is seeing thousands of pages or just a few hundred to see if stuff was fixed. Again, happy to provide more detail if/when you need it (and others will likely jump in with help on the thread, too).
I'd love to also see a little update a few weeks down the line of any changes you've noticed on your site metrics after getting this fixed.
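If your developer wants a quick sanity check while waiting on that recrawl, one option is to parse the sitemap and flag any blog-pagination URLs past the real last page. Here's a rough Python sketch; it assumes a standard XML sitemap and that page 21 really is the last page, and the sample data is just illustrative:

```python
import re
import xml.etree.ElementTree as ET

def suspicious_pages(sitemap_xml, last_real_page):
    """Return sitemap URLs that point past the blog's real last page."""
    ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
    root = ET.fromstring(sitemap_xml)
    flagged = []
    for loc in root.findall(".//sm:loc", ns):
        url = (loc.text or "").strip()
        m = re.search(r"/blog/page/(\d+)$", url)
        if m and int(m.group(1)) > last_real_page:
            flagged.append(url)
    return flagged

sample = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.nineclouds.ca/blog/page/21</loc></url>
  <url><loc>http://www.nineclouds.ca/blog/page/28</loc></url>
</urlset>"""

print(suspicious_pages(sample, last_real_page=21))
# flags only the phantom page/28 entry
```

Anything that comes back flagged is a URL the sitemap tool picked up from that runaway pagination, not a real page.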
-
You rock:)
-
And I found it. The sitemap at http://www.nineclouds.ca/sitemap includes a page /28, which is where the crawlers are finding the non-existent pages.
-
If you look at http://www.nineclouds.ca/blog/page/23, you'll see that there's a double arrow in the pagination at the right that goes to page 24, even though the last page is page 21. Google has somehow found pages greater than 21, and once it found one of those, it keeps seeing the double-arrow link pointing to yet another page. The same happened with Rogerbot. I'm not sure where the bad originating link is (what legitimate page on your site links to something beyond page 21), but that's the loop that's causing a ton of pages to be indexed. Get rid of those, and you'll also get rid of most of your errors.
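The underlying fix is for the template to only render that "next" arrow when a next page actually exists. The exact template code depends on the blog platform, but the guard logic looks something like this sketch (the function name and URL pattern here are just illustrative):

```python
from typing import Optional

def next_page_link(current_page, last_page, base_url):
    # type: (int, int, str) -> Optional[str]
    """Emit a 'next' pagination link only when a next page really exists.

    Rendering the double arrow unconditionally is what creates the
    endless chain of empty pages past the real last page.
    """
    if current_page >= last_page:
        return None  # on (or past) the last page: no next link at all
    return "{}/blog/page/{}".format(base_url, current_page + 1)

print(next_page_link(20, 21, "http://www.nineclouds.ca"))  # real next page
print(next_page_link(23, 21, "http://www.nineclouds.ca"))  # past the end: None
```

With that guard in place, a crawler that somehow lands on page 23 hits a dead end instead of being handed yet another link to follow.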
-
Not shy about that at all, thanks Keri.
Any help you can provide is greatly appreciated.
-
Hi Bill,
Using my admin powers, I took a peek at your account. I'm still trying to figure out where it's coming from, but you have thousands of empty pages of your blog indexed. I'll dig around a little more and see if I can figure out what's up.
If you're comfortable with sharing your URL here in a public forum, other people can come take a look too. Otherwise, I'm happy to send you a private message with part of what's up and give your developer a place to start looking.
-
Thanks Keri. I am the owner of the site, not the programmer, so I am looking up the terms you are using as I write this response. If I am using pagination, is there a way to keep Moz from crawling it? If I understand your question about the calendar correctly, I do have one as part of my blog that dates each post. Can I get the bot to not crawl this calendar?
-
My first guess would be that parameters or something similar are being crawled. Do you have pagination? Sorting ascending and descending? A calendar that's getting crawled through the year 2525?
Your next step would be to look into what those duplicate pages are and see if something is amiss that's generating a ton of URLs.
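If the calendar or sort parameters do turn out to be the culprit, a common stopgap is blocking those crawl paths in robots.txt. The paths below are hypothetical placeholders; your developer would substitute the real ones (and note that the `*` wildcard is honored by Google and most major crawlers, but isn't part of the original robots.txt standard):

```
User-agent: *
Disallow: /blog/calendar/
Disallow: /*?sort=
```

That keeps well-behaved bots out of the infinite date and sort combinations while leaving the real posts crawlable.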