Investigating a huge spike in indexed pages
-
I've noticed an enormous spike in the number of indexed pages reported in WMT (Google Webmaster Tools) over the last week. Now, I know WMT can be a bit (OK, a lot) off base in its reporting, but this was pretty hard to explain. See, we're in the middle of a huge campaign against duplicate content and we've put a number of measures in place to fight it. For example:
- Implemented a strong canonicalization effort
- Programmatically NOINDEX'd content we know to be duplicate
- Currently fixing true duplicate content issues by rewriting titles, descriptions, etc.
So I was pretty surprised to see the blow-up. Any ideas as to what else might cause such a counterintuitive trend? Has anyone else seen Google suddenly glom onto a bunch of phantom pages?
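One way to confirm that the canonical and NOINDEX measures listed above are actually present in the served HTML is to parse a page and read the two signals back. The following is only a rough sketch using Python's standard library; the names `IndexSignalParser` and `index_signals` are made up for illustration, and it only looks at `<link rel="canonical">` and `<meta name="robots">`, not at HTTP headers.

```python
from html.parser import HTMLParser


class IndexSignalParser(HTMLParser):
    """Collects the canonical URL and robots meta directives from one HTML page."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.robots = set()

    def handle_starttag(self, tag, attrs):
        # Attribute values can be None for valueless attributes, so normalise them.
        attrs = {name: (value or "") for name, value in attrs}
        if tag == "link" and "canonical" in attrs.get("rel", "").lower().split():
            self.canonical = attrs.get("href") or None
        elif tag == "meta" and attrs.get("name", "").lower() == "robots":
            for directive in attrs.get("content", "").lower().split(","):
                if directive.strip():
                    self.robots.add(directive.strip())


def index_signals(html):
    """Return the canonical URL (or None) and whether the page asks for noindex."""
    parser = IndexSignalParser()
    parser.feed(html)
    return {"canonical": parser.canonical, "noindex": "noindex" in parser.robots}
```

Running `index_signals` over the HTML of a handful of known duplicate URLs would show whether each one carries the canonical or noindex signal it is supposed to.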
-
-
I haven't contacted the forum yet but that's my next step.
Pages indexed: 91k
Blocked by robots.txt: 8.4 million
I don't even know how you could create 8.4 million indexable pages from our content.
-
Have you contacted the Google Webmaster Help forums? That looks like a glitch on Google's side.
How many pages does Mozbot crawl? If the number Mozbot shows is different, you should either wait until Google removes those indexed pages or start a thread on the forums so that someone at Google can give you a hint about what is going on.
-
Any help out there? Since I posted the original question I've seen some improvement, but even with aggressive canonicalization and noindexing I'm still seeing a boatload of indexed pages, including pages I've explicitly asked to be excluded via robots.txt (/search.aspx and */filter). I'm guessing it's just going to take a while to deindex what's there. Still, 91k indexed pages is quite a lot when you consider we only have about 3-4k pages plus some articles.
Is anyone aware of any significant releases by Google?
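To check which URLs the two robots.txt rules mentioned above (/search.aspx and */filter) would actually cover, a small matcher can help. This is a rough sketch that assumes Google-style pattern rules (`*` matches any run of characters, a trailing `$` anchors the end, and everything else is a case-sensitive prefix match). It ignores Allow rules and rule precedence, and the function names are hypothetical. Keep in mind that a robots.txt block stops crawling rather than indexing, so a blocked URL can stay in the index, and a noindex tag on a blocked page cannot be read by the crawler.

```python
import re


def rule_to_regex(rule):
    """Translate a robots.txt path rule into a regex to be matched from the path start."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape literal characters and turn each '*' into "any run of characters".
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(path, disallow_rules):
    """Return True if any Disallow rule matches the given URL path (plus query string)."""
    return any(rule_to_regex(rule).match(path) for rule in disallow_rules)


# Rules taken from the post above; the sample paths are invented.
rules = ["/search.aspx", "*/filter"]
for path in ["/search.aspx?q=shoes", "/products/shoes/filter/red", "/products/shoes"]:
    print(path, is_blocked(path, rules))
```

If a URL you expected to be covered comes back as not blocked (for example because of different letter case), that would be one candidate explanation for pages still being crawled.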
-
Quite recent. We were actually seeing a nice downward trend in the huge number of pages indexed and then the number tripled. Crazy is an understatement. I would have thought the number of pages would fall given the number of pages that now use canonicals.
-
How long has it been since you applied all the rules to avoid duplicate content? If it was just recently, Google is probably still "rebuilding" its index of your site, and the stats may be a little crazy while that is happening.
If it was over 2 months ago and you are only seeing the increase now, then I'd suggest reviewing the rules you created to check whether your own website is generating all those new pages.
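A quick way to test whether the site itself is generating the extra URLs is to take a list of crawled or indexed URLs (from a crawler export or server logs, for instance) and count how many distinct URLs collapse onto the same path once query strings are ignored. The sketch below assumes such a list is available; the function names and the threshold are illustrative only.

```python
from collections import Counter
from urllib.parse import urlsplit


def variants_per_path(urls):
    """Count how many distinct URLs share each path once query strings are ignored."""
    counts = Counter()
    for url in set(urls):
        parts = urlsplit(url)
        counts[parts.path.lower() or "/"] += 1
    return counts


def suspicious_paths(urls, threshold=3):
    """Return (path, count) pairs for paths reachable through many URL variants."""
    return [(path, n) for path, n in variants_per_path(urls).most_common() if n >= threshold]
```

Paths with a very high variant count (search results, filters, session parameters) are the likeliest source of a page count far above the real number of pages.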
Hope that helps.