Moz Crawl shows over 100 times more pages than my site has?
-
The latest crawl stats are attached. My site has just over 300 pages.
I'm wondering what I have done wrong.
-
The total page count is higher, you are right, Keri, but it's still only 581.
-
I believe this image shows what's indexed, which is a subset of the sitemap you submitted. You may want to look at Google Index -> Index Status in GWT to see what it shows there.
-
latest Moz crawl
-
latest webmaster tools crawl
-
I will definitely be paying attention to those numbers, Keri. Webmaster Tools is showing the right number of pages (a little over 300, with 90% of those indexed).
-
It's not going to be a penalty, but it'll be good to have a bit less of a load on your server (bots no longer crawling thousands of pages) and just have your real pages in the index.
Places to look for interesting changes in site metrics would be your organic traffic in analytics and your Google Webmaster Tools account, where you can see your impressions, pages crawled, etc.
-
Thanks Keri, I will update ASAP.
Could you let me know how big an issue this would be? (When you have the time, of course ;))
-
You're welcome! I may have opened a can of worms, however. That sitemap is generated by an automated tool (based on the footer at the bottom), so somehow it's finding that page 28 as well.
You may also want to ask the developer if you should be indexing the categories in the blog archives. There are resources on Moz about the best way to set that up in WordPress, but I don't have them at my fingertips at the moment (I have a snuggly baby sleeping on my lap instead that's slowing me down a tad).
To answer your next question, after you figure out where the page 28 is being linked from and cure that, yes, you can do a one-time crawl from Research Tools. It won't overwrite your campaign info, but you can at least see if Moz is seeing thousands of pages or just a few hundred to see if stuff was fixed. Again, happy to provide more detail if/when you need it (and others will likely jump in with help on the thread, too).
I'd love to also see a little update a few weeks down the line of any changes you've noticed on your site metrics after getting this fixed.
-
You rock:)
-
And I found it. The sitemap at http://www.nineclouds.ca/sitemap includes a page /28, which is where the crawlers are finding the non-existent pages.
-
If you look at http://www.nineclouds.ca/blog/page/23, you'll see that there's a double arrow in the pagination at the right that goes to page 24, even though the last page is page 21. Google has somehow found pages greater than 21 (I'm not sure how), and once it found one of those, it keeps seeing the double-arrow link that goes to yet another page. The same happened with Rogerbot. I'm not sure where the bad originating link is (what legit page on your site is linking to something over page 21), but that's the loop that's happening and causing a ton of pages to be indexed. Get rid of those, and you'll also get rid of most of your errors.
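If your developer wants a quick way to hunt for the originating link, here is a rough Python sketch. It assumes you already have the HTML of a page saved as a string, that paginated blog URLs end in /blog/page/N as they do above, and that page 21 really is the last page. The names LinkCollector and overrun_links are made up for this example and are not part of any Moz or WordPress tool.

```python
import re
from html.parser import HTMLParser

# Assumption: paginated blog URLs end in /blog/page/<number>, optionally with a trailing slash.
PAGE_RE = re.compile(r"/blog/page/(\d+)/?$")


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def overrun_links(html, last_page):
    """Return the sorted page numbers linked from `html` that are beyond `last_page`."""
    collector = LinkCollector()
    collector.feed(html)
    found = set()
    for href in collector.hrefs:
        match = PAGE_RE.search(href)
        if match and int(match.group(1)) > last_page:
            found.add(int(match.group(1)))
    return sorted(found)
```

Running overrun_links(page_html, 21) against each legitimate page (and the sitemap page) should return an empty list everywhere except on the page that holds the bad link.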
-
Not shy about that at all, thanks Keri.
Any help you can provide is greatly appreciated.
-
Hi Bill,
Using my admin powers, I took a peek at your account. I'm still trying to figure out where it's coming from, but you have thousands of empty pages of your blog indexed. I'll dig around a little more and see if I can figure out what's up.
If you're comfortable with sharing your URL here in a public forum, other people can come take a look too. Otherwise, I'm happy to send you a private message with part of what's up and give your developer a place to start looking.
-
Thanks Keri. I am the owner of the site, not the programmer, so I am looking up the terms you are using as I write this response. If I am using pagination, is there a way to keep Moz from crawling it? If I understand your question about the calendar correctly, I do have one as part of my blog that dates each post. Can I get the bot to not recognize this calendar?
-
My first guess would be that parameters or something similar are being crawled. Do you have pagination? Sorting ascending and descending? A calendar that's getting crawled through the year 2525?
Your next step would be to look into what those duplicate pages are and see if something is amiss that's generating a ton of URLs.
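One rough way to do that, sketched in Python: take the list of crawled URLs (for example, copied out of a crawl export) and collapse each one into a template so that runaway patterns stand out. The function names url_pattern and count_patterns are invented for this sketch, and it assumes you already have the URLs as plain strings.

```python
import re
from collections import Counter
from urllib.parse import urlsplit


def url_pattern(url):
    """Collapse a URL into a template: digits become {n}, query values are dropped."""
    parts = urlsplit(url)
    path = re.sub(r"\d+", "{n}", parts.path)
    # Keep only parameter names, so ?sort=asc and ?sort=desc land in the same bucket.
    names = sorted({pair.split("=", 1)[0] for pair in parts.query.split("&") if pair})
    return path + ("?" + "&".join(names) if names else "")


def count_patterns(urls):
    """Count how many crawled URLs share each template."""
    return Counter(url_pattern(u) for u in urls)
```

Calling count_patterns(urls).most_common(10) lists the biggest buckets first. A template with thousands of hits on a site of a few hundred pages (pagination, a sort parameter, calendar dates) is the likely culprit.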