Could a large number of "not selected" pages cause a penalty?
-
My site was penalized for specific pages in the UK on July 28 (corresponding with a Panda update).
I cleaned up my website and wrote to Google and they responded that "no manual spam actions had been taken".
The only other thing I can think of is that we suffered an automatic penalty.
I am having problems with my sitemap and it is indexing many error pages, empty pages, etc... According to our index status we have 2,679,794 not selected pages and 36,168 total indexed.
Could this have been what caused the error?
(If you have any articles to back up your answers, that would be greatly appreciated.)
Thanks!
-
Canonical tag to what? Themselves? Or the page they should be? Are these pages unique by some URL variables only? If so, you can instruct Google to ignore specific GET variables to resolve this issue, but you would also want to fix your sitemap woes: http://support.google.com/webmasters/bin/answer.py?hl=en&answer=1235687
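If it helps to see whether the duplicates really do differ only by URL variables, here is a minimal Python sketch that groups a list of crawled URLs after stripping the parameters you consider irrelevant. The parameter names in `IGNORED_PARAMS` and the example.com URLs are assumptions for illustration only; swap in whatever your site actually uses.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names that do not change the page content.
IGNORED_PARAMS = {"sessionid", "sort", "ref"}


def normalise(url):
    """Drop ignored query parameters and sort the rest so equivalent URLs match."""
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in IGNORED_PARAMS
    )
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), "")
    )


def group_duplicates(urls):
    """Return only the normalised URLs that more than one crawled URL maps to."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalise(url)].append(url)
    return {key: members for key, members in groups.items() if len(members) > 1}


if __name__ == "__main__":
    crawled = [
        "http://example.com/p?id=1&sort=asc",
        "http://example.com/p?sort=desc&id=1",
        "http://example.com/p?id=2",
    ]
    for target, members in group_duplicates(crawled).items():
        print(target, "<-", members)
```

Any group with more than one member is a candidate for a single canonical URL, and the parameters you had to ignore to form the group are the ones worth reviewing in Webmaster Tools.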
This is where it gets sticky, these pages are certainly not helping and not being indexed, Google Webmaster tools shows us that, but if you have this problem, how many other technical problems could the site have?
We can be almost certain you have some kind of Panda filter, but to diagnose it further we would need a link and access to analytics to determine what has gone wrong and provide more detailed guidance to resolve the issues.
This could be a red herring and your problem could be elsewhere but with no examples we can only give very general responses. If this was my site I would certainly look to identify the most likely issues and work through this in a pragmatic way to eliminate possible issues and look at other potentials.
My advice would be to have the site analysed by someone with distinct experience with Panda penalties who can give you specific feedback on the problems and provide guidance to resolve them.
If the URL is sensitive and can't be shared here, I can offer this service and am in the UK. I am sure several other users at SEOmoz can also help. I know Marie Haynes offers this service, and I am sure Ryan Kent could help also.
Shout if you have any questions or can provide more details (or a URL).
-
Hi,
Thanks for the detailed answer.
We have many duplicate pages, but they all have canonical tags on them... shouldn't that be solving the problem? Would pages with the canonical tag be showing up here?
-
Yes, this can definitely cause problems. In fact, this is a common footprint in sites hit by the Panda updates.
It sounds like you have some sort of canonical issue on the site: multiple copies of each page are being crawled. Google is finding lots of copies of the same thing, crawling them, but deciding that they are not sufficiently unique or useful to keep in the index. I've been working on a number of sites hit with the same issue and the clean-up can be a real pain.
The best starting point for reading is probably this article here on SEOmoz : http://www.seomoz.org/learn-seo/duplicate-content . That article includes some useful links on how to diagnose and solve the issues as well, so be sure to check out all the linked resources.
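As a rough way to answer the "canonical tag to what?" question for your own pages, the Python sketch below reads a page's HTML and reports whether its canonical link is missing, points at the page itself, or points at another URL. It uses only the standard library; the function name `canonical_status` and the example.com URLs are made up for illustration, and the comparison is deliberately simple (it only ignores a trailing slash).

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def canonical_status(page_url, html):
    """Classify the canonical tag as 'missing', 'self' or 'points elsewhere'."""
    finder = CanonicalFinder()
    finder.feed(html)
    if finder.canonical is None:
        return "missing"
    if finder.canonical.rstrip("/") == page_url.rstrip("/"):
        return "self"
    return "points elsewhere"


if __name__ == "__main__":
    sample = '<head><link rel="canonical" href="http://example.com/widgets"></head>'
    print(canonical_status("http://example.com/widgets?sort=asc", sample))
```

If most of your duplicates come back as "self", each copy is declaring itself the original, which would not consolidate anything. Duplicates that come back as "points elsewhere" are doing what a canonical tag is meant to do.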
-
Hey Sarah
There are always a lot of moving parts when it comes to penalties, but the very fact that you lost traffic on a known Panda date really points towards this being a Panda style of penalty. Panda is an algorithmic penalty, so you will not receive any kind of notification in Webmaster Tools. Likewise, a re-inclusion request will not help; you have to fix the problem to resolve the issues.
The not selected pages are likely a big part of your problem. Google classes not selected pages as follows:
"Not selected: Pages that are not indexed because they are substantially similar to other pages, or that have been redirected to another URL."
If you have the best part of 3 million of these pages that are 'substantially similar' to other pages, then there is every chance that this is a very big part of your problem.
Obviously, there are a lot of moving parts to this, but it sounds highly likely that this is part of your problem. Just think how this looks to Google: 2.6 million pages that are duplicated. It is a low-quality signal, a possible attempt at manipulation or god knows what else, but what we do know is that these pages are unlikely to be a strong result for any search users, so they have been dropped.
What to do?
Well, firstly, fix your sitemap and sort out these duplication problems. It's hard to give specifics without a link to the site in question, but just sort this out. Apply the noindex tag dynamically if needs be, remove these duplicates from the sitemap, heck, remove the sitemap altogether for a while if needs be until it is fixed. Just sort out these issues one way or another.
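To make that a bit more concrete, here is a minimal Python sketch of the idea: one rule decides whether a page is worth indexing, and the same rule drives both the dynamic noindex tag and what goes into the sitemap. The page records (`url`, `status`, `word_count`, `canonical`) are a hypothetical structure I have invented for illustration, so adapt the rule to whatever your CMS actually knows about each page.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def is_indexable(page):
    """Keep only live, non-empty pages whose canonical URL is the page itself."""
    return (
        page["status"] == 200
        and page["word_count"] > 0
        and page["canonical"] == page["url"]
    )


def robots_meta(page):
    """Return the meta tag to emit in <head>; empty string for indexable pages."""
    if is_indexable(page):
        return ""
    return '<meta name="robots" content="noindex, follow">'


def build_sitemap(pages):
    """Build sitemap XML containing only the indexable pages."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        if is_indexable(page):
            url_element = ET.SubElement(urlset, "url")
            ET.SubElement(url_element, "loc").text = page["url"]
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    pages = [
        {"url": "http://example.com/a", "status": 200, "word_count": 300,
         "canonical": "http://example.com/a"},
        {"url": "http://example.com/a?sort=asc", "status": 200, "word_count": 300,
         "canonical": "http://example.com/a"},
        {"url": "http://example.com/empty", "status": 200, "word_count": 0,
         "canonical": "http://example.com/empty"},
    ]
    print(build_sitemap(pages))
```

Driving both outputs from one function means the sitemap can never list a page that the site itself marks as noindex, which is the kind of mixed signal you want to get rid of.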
Happy to give more help here if I can but would need a link or some such to advise better.
Resources
You asked for some links but I am not completely sure what to provide here without a link but let me have a shot and provide some general points:
1. Good General Panda Overview from Dr. Pete
http://www.seomoz.org/blog/fat-pandas-and-thin-content
2. An overview of canonicalisation from Google
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=139066
3. A way to diagnose and hopefully recover from Panda from John Doherty at Distilled
http://www.distilled.net/blog/seo/beating-the-panda-diagnosing-and-rescuing-a-clients-traffic/
4. Index Status Overview from Google
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=2642366
Summary
You have a serious problem here, but hopefully one that can be resolved. Panda is primarily focused on on-page issues, and this is an absolute doozy of an on-page issue, so sort it out and you should see a recovery. Keep in mind you have roughly 74 times as many problem pages as actual content pages at the moment (2,679,794 not selected versus 36,168 indexed), so this may be the biggest case I have ever seen. I would be very keen to see how you get on and what happens when you resolve these issues, as I am sure would the wider SEOmoz community.
Hope this helps & please fire over any questions.
Marcus