Investigating a huge spike in indexed pages
-
I've noticed an enormous spike in pages indexed through WMT in the last week. Now I know WMT can be a bit (OK, a lot) off base in its reporting but this was pretty hard to explain. See, we're in the middle of a huge campaign against dupe content and we've put a number of measures in place to fight it. For example:
-
Implemented a strong canonicalization effort
-
NOINDEX'd content we know to be duplicate programatically
-
Are currently fixing true duplicate content issues through rewriting titles, desc etc.
So I was pretty surprised to see the blow-up. Any ideas as to what else might cause such a counter intuitive trend? Has anyone else see Google do something that suddenly gloms onto a bunch of phantom pages?
-
-
I haven't contacted the forum yet but that's my next step.
Pages indexed: 91k
Blocked by robots.txt: 8.4million
I don't even know how you could create 8.4 million indexable pages from our content.
-
Have you contacted the Google Webmaster Help forums? As that seems to be a glitch in Google.
How many pages are scraped by Mozbot? If the amount that mozbot shows is different, then you should either sit and wait until Google removes those indexed pages or create a conversation on the forums so someone at google can give you a hint of what is going on.
-
Any help out there? Since the original question was posted, I've seen some improvement but even with aggressive canonicalization and noindexing, I'm still seeing a boatload of indexed pages. I am still seeing pages indexed that I've asked explicitly to be omitted by robots.txt (/search.aspx and */filter). I'm guessing it's just going to take a while to deindex what's there. Still, 91k pages indexed is quite a lot when you consider we only have about 3-4k pages and some articles.
Is anyone aware of any significant releases by Google?
-
Quite recent. We were actually seeing a nice downward trend in the huge number of pages indexed and then the number tripled. Crazy is an understatement. I would have thought the number of pages would fall given the number of pages that now use canonicals.
-
How long have you waited since you applied all the rules to avoid duplicate content, as if it was just recently, then Google should be "rebuilding" the index of your site and stats may be a little crazy while that is happening.
If it was over 2 month ago and you are seeing the increase now, then I'd suggest you revise the rules you created to see if your own Website isn't creating all those new pages.
Hope that helps.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Is my page being indexed?
To put you all in context, here is the situation, I have pages that are only accessible via an intern search tool that shows the best results for the request. Let's say i want to see the result on page 2, the page 2 will have a request in the url like this: ?p=2&s=12&lang=1&seed=3688 The situation is that we've disallowed every URL's that contains a "?" in the robots.txt file which means that Google doesn't crawl the page 2,3,4 and so on. If a page is only accessible via page 2, do you think Google will be able to access it? The url of the page is included in the sitemap. Thank you in advance for the help!
Technical SEO | | alexrbrg0 -
Why my website does not index?
I made some changes in my website after that I try webmaster tool FETCH AS GOOGLE but this is 2nd day and my new pages does not index www. astrologersktantrik .com
Technical SEO | | ramansaab0 -
Indexing pages content that is not needed
Hi All, I have a site that has articles and a side block that shows interesting articles in a column block. While we google for a keyword i can see the page but the meta description is picked from the side block "interesting articles" and not the actual article in the page. How can i deny indexing that block alone Thanks
Technical SEO | | jomin740 -
Duplicate Page Content but where?
Hi All Moz is telling me I have duplicate page content and sure enough the PA MR mT are all 0 but it doesnt give me a link to this content! This is the page: http://www.orsgroup.com/index.php?page=Scanning-services But I cant find where the duplicate content is other than on our own youtube page which I will get removed here: http://www.youtube.com/watch?v=Pnjh9jkAWuA Can anyone help please? Andy
Technical SEO | | ORS-Group0 -
Can you noindex a page, but still index an image on that page?
If a blog is centered around visual images, and we have specific pages with high quality content that we plan to index and drive our traffic, but we have many pages with our images...what is the best way to go about getting these images indexed? We want to noindex all the pages with just images because they are thin content... Can you noindex,follow a page, but still index the images on that page? Please explain how to go about this concept.....
Technical SEO | | WebServiceConsulting.com0 -
Indexing Issue
Hi, I am working on www.stjohnswaydentalpractice.co.uk Google only seems to be indexing two of the pages when i search site:www.stjohnswaydentalpractice.co.uk I have added the site to webmaster tools and created a new sitemap which is showing that it has only submitted two of the pages. Can anyone shed any light for why these pages are not being indexed? Thanks Faye
Technical SEO | | dentaldesign0 -
Should I allow index of category / tag pages on Wordpress?
Quite simply, is it best to allow index of category / tag pages on a Wordpress blog or no index them? My thought is Google will / might see it as duplicate content? Thanks, K
Technical SEO | | SEOKeith0 -
New Domain Page 7 Google but Page 1 Bing & Yahoo
Hi just wondered what other people's experience is with a new domain. Basically have a client with a domain registered end of May this year, so less than 3 months old! The site ranks for his keyword choice (not very competitive), which is in the domain name. For me I'm not at all surprised with Google's low ranking after such a short period but quite surprsied to see it ranking page 1 on Bing and Yahoo. No seo work has been done yet and there are no inbound links. Anyone else have experience of this? Should I be surprised or is that normal in the other two search engines? Thanks in advance Trevor
Technical SEO | | TrevorJones0