Investigating a huge spike in indexed pages
-
I've noticed an enormous spike in pages indexed through WMT in the last week. Now I know WMT can be a bit (OK, a lot) off base in its reporting but this was pretty hard to explain. See, we're in the middle of a huge campaign against dupe content and we've put a number of measures in place to fight it. For example:
-
Implemented a strong canonicalization effort
-
NOINDEX'd content we know to be duplicate programatically
-
Are currently fixing true duplicate content issues through rewriting titles, desc etc.
So I was pretty surprised to see the blow-up. Any ideas as to what else might cause such a counter intuitive trend? Has anyone else see Google do something that suddenly gloms onto a bunch of phantom pages?
-
-
I haven't contacted the forum yet but that's my next step.
Pages indexed: 91k
Blocked by robots.txt: 8.4million
I don't even know how you could create 8.4 million indexable pages from our content.
-
Have you contacted the Google Webmaster Help forums? As that seems to be a glitch in Google.
How many pages are scraped by Mozbot? If the amount that mozbot shows is different, then you should either sit and wait until Google removes those indexed pages or create a conversation on the forums so someone at google can give you a hint of what is going on.
-
Any help out there? Since the original question was posted, I've seen some improvement but even with aggressive canonicalization and noindexing, I'm still seeing a boatload of indexed pages. I am still seeing pages indexed that I've asked explicitly to be omitted by robots.txt (/search.aspx and */filter). I'm guessing it's just going to take a while to deindex what's there. Still, 91k pages indexed is quite a lot when you consider we only have about 3-4k pages and some articles.
Is anyone aware of any significant releases by Google?
-
Quite recent. We were actually seeing a nice downward trend in the huge number of pages indexed and then the number tripled. Crazy is an understatement. I would have thought the number of pages would fall given the number of pages that now use canonicals.
-
How long have you waited since you applied all the rules to avoid duplicate content, as if it was just recently, then Google should be "rebuilding" the index of your site and stats may be a little crazy while that is happening.
If it was over 2 month ago and you are seeing the increase now, then I'd suggest you revise the rules you created to see if your own Website isn't creating all those new pages.
Hope that helps.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Mobile website indexing
Hi we have a mobile version of our website at mobile.gardening-services-edinburgh.com its been live for 5, maybe 6 months, it has its own mobile-sitemap.xml have tried submitting this sitemap to google and for some reason it does not index these pages any ideas, most welcome
Technical SEO | | McSEO0 -
Why are my images not being indexed?
I have submitted an image sitemap with over 2,000 images yet only about 35 have been indexed. Could you please help me understand why Google is not indexing my images? www.creative-calendars.com
Technical SEO | | nicole20140 -
Why Google ranks a page with Meta Robots: NO INDEX, NO FOLLOW?
Hi guys, I was playing with the new OSE when I found out a weird thing: if you Google "performing arts school london" you will see w w w . mountview . org. uk at the 3rd position. The point is that page has "Meta Robots: NO INDEX, NO FOLLOW", why Google indexed it? Here you can see the robots.txt allows Google to index the URL but not the content, in article they also say the meta robots tag will properly avoid Google from indexing the URL either. Apparently, in my case that page is the only one has the tag "NO INDEX, NO FOLLOW", but it's the home page. so I said to myself: OK, perhaps they have just changed that tag therefore Google needs time to re-crawl that page and de-index following the no index tag. How long do you think it will take to don't see that page indexed? Do you think it will effect the whole website, as I suppose if you have that tag on your home page (the root domain) you will lose a lot of links' juice - it's totally unnatural a backlinks profile without links to a root domain? Cheers, Pierpaolo
Technical SEO | | madcow780 -
Page for page 301 redirects from old server to new server
Hi guys:
Technical SEO | | cindyt-17038
I have a client who is moving their entire ecommerce site from one hosting platform (Yahoo Store) to another (BigCommerce) and from one domain to another. The old domain is registered with the Yahoo as of yesterday and we have redirected the old domain (at the domain level) to the new domain. However, we are having trouble getting the pages to redirect page for page. Currently they are all redirecting to the new domain home page. We did just move the old domain from GoDaddy to Yahoo yesterday thinking this would solve it however as of this morning the old pages are still redirecting to the home page of the new domain. To complete the 301 redirect picture, we uploaded the redirects (all relative links for both from and to) to BigCommerce. And while the domain was hosted at GoDaddy with a redirect to the new domain, they were working. We moved the domain to Yahoo because of email issues thinking it should still work. Is it possibly just a waiting game now as the change populates across the DNS? old url to test:
rock-n-roll-action-figures.com/fender-jazz-bass-miniature-guitar-replica-classic-red-finish.html0 -
How to make my good sub-page rank ahead of my generic home page?
I have an ecommerce site for the clothes drying racks my family business makes, and it sells a few other laundry items also. It's about 5 years old. We used to rank on the first page for basic phrases like "clothes drying rack" and "umbrella clothesline". About 1.5 years ago we fell hard in the rankings. Since then "umbrella clothesline" has moved back to the first page, but "clothes drying rack" is stuck on the 3rd page and always with the result being the generic homepage instead of the good sub-page (which used to rank on the first page) that really shows-n-tells about our drying rack. Here are the three pages I am talking about. Home page = http://www.bestdryingrack.com/ Drying rack page = http://www.bestdryingrack.com/clothes-drying-rack-main.html and umbrella clothesline page = http://www.bestdryingrack.com/umbrella-clotheslines.html Any ideas on how to get the drying rack page to start ranking well again? (hopefully better than the generic homepage ranks) A little technical background: the Moz campaign on this site says that the home page has a PA = 42 with 190 LRD's and 344 external links. Both the umbrella clothesline page and the clothes drying rack page have almost equal statistics of PA = 35 with 20 LRD's and 23 external links. My anchor text distribution is maybe unbalanced. The drying rack page has 15 external links with the anchor of "Clothes Drying Rack". But the umbrella clothesline page has 14 external links with the anchor of "outdoor umbrella clothesline" and it ranks on the first page for that search. I can't figure out how to get OSE to tell me anchor text stats for just the homepage and not the whole site since www.bestdryingrack.com/index.html 301's to the plain www.bestdryingrack.com (if you know how, please share) What's wrong with my poor neglected clothes drying rack page? The only way I can get it to show up on the first page is to do a real specific search like "round wooden clothes drying rack" Your help could save a faltering family business. Thank you!
Technical SEO | | GregB1230 -
Is it bad to have your pages as .php pages?
Hello everyone, Is it bad to have your website pages indexed as .php? For example, the contact page is site.com/contact.php and not /contact. Does this affect your SEO rankings in any way? Is it better to have your pages without the extension? Also, if I'm working with a news site and the urls are dynamic for every article (ie site.com/articleid=2323.) Should I change all of those dynamic urls to static? Thank You.
Technical SEO | | BruLee0 -
Removing some of the indexed pages from my website
I am planning to remove some of the webpages from my website and these webpages are already indexed with search engine. Is there any way by which I need to inform search engine that these pages are no more available.
Technical SEO | | ArtiKalra0 -
Duplicate Page Content and Title for product pages. Is there a way to fix it?
We we're doing pretty good with our SEO, until we added product listing pages. The errors are mostly Duplicate Page Content/Title. e.g. Title: Masterpet | New Zealand Products MasterPet Product page1 MasterPet Product page2 Because the list of products are displayed on several pages, the crawler detects that these two URLs have the same title. From 0 Errors two weeks ago, to 14k+ errors. Is this something we could fix or bother fixing? Will our SERP ranking suffer because of this? Hoping someone could shed some light on this issue. Thanks.
Technical SEO | | Peter.Huxley590