Can too many "noindex" pages compared to "index" pages be a problem?
-
Hello,
I have a question for you: our website virtualsheetmusic.com includes thousands of product pages, and because of past Panda penalties we have noindexed most of them, hoping for some kind of recovery (not seen yet, though!). So we currently have about 4,000 "index" pages compared to about 80,000 "noindex" pages.
Now we plan to add 100,000 new product pages from a new publisher to offer our customers more music choice, and these new pages will also be marked as "noindex, follow".
At the end of the integration process, we will end up having something like 180,000 "noindex, follow" pages compared to about 4,000 "index, follow" pages.
Here is my question: can this huge discrepancy between 180,000 "noindex" pages and 4,000 "index" pages be a problem? Can this kind of scenario have any negative effect on our current natural search engine profile? Or is this something that doesn't actually matter?
Any thoughts on this issue are very welcome.
Thank you!
Fabrizio
-
Julian, we sell digital sheet music, and the additional 100,000 pages are products from the Alfred music publishing company. Of course they will not be "high quality pages", but they are product pages, each one offering a piece of music. We are an e-commerce website; how can we avoid having product pages?! But of course, as Wesley said above, we can improve the content quality of each product page by providing more custom information for each product, increasing user reviews, etc.
Other suggestions?
-
Thank you Wesley, yes, I think you are right. Our business is suffering far too much without the traffic from the "noindex" pages, and after many months we still don't see a recovery. I think the best approach would probably be to keep the pages in the index and differentiate them as much as we can.
Thank you!
-
Panda is probably the worst penalty to have. Very few sites ever recover, even when site owners have spent a lot of time, effort and money trying to solve it, e.g. http://searchengineland.com/google-panda-two-years-later-losers-still-losing-one-real-recovery-149491
In this video, at about 12:43, Matt Cutts is clear: if you think it's low quality, 404 it; in other words, delete it.
May I ask why you want to keep these 180,000 pages live? And why are you planning to add another 100,000 pages? Surely they can't be high quality pages?
-
Fabrizio, as far as I know Google Panda is now part of the standard Google algorithm and won't be a periodic event anymore. Penguin still is, though.
If Google considers your product pages duplicate content, try to see if you can do something about that instead of noindexing them. Is there no way you can update the products so they display a more prominent description? I understand that doing it manually is not an option because there are far too many products.
I did notice that a lot of your product pages have this standard text: "This item includes: PDF (digital sheet music to print), Scorch files (for online playing, transposition and printing), Videos, MIDI and Mp3 audio files (including Mp3 music accompaniment files)*
Genre: classical
Skill Level: medium"
Since this is basically the only text on a lot of pages, I think it's a big part of the problem. Maybe you can change this text so it looks different for every product?
Try tools like http://www.plagspotter.com/ to find the duplicate content and see which solution is best for your specific problem.
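As a rough illustration of how you could measure this yourself, here is a minimal Python sketch that compares page texts word by word and flags pairs that are mostly identical. The page names, the sample texts and the 0.8 threshold are all made up for the example (the boilerplate is an abbreviated version of the text quoted above), and this is not how Google or any particular tool detects duplicates:

```python
from difflib import SequenceMatcher
from itertools import combinations

# Abbreviated version of the standard text found on many product pages.
BOILERPLATE = (
    "This item includes: PDF (digital sheet music to print), "
    "Scorch files (for online playing, transposition and printing), "
    "Videos, MIDI and Mp3 audio files. Genre: classical Skill Level: medium"
)

# Hypothetical page texts; real input would be the visible text of each page.
pages = {
    "product-a": "Sonata in C for violin. " + BOILERPLATE,
    "product-b": "Minuet in G for flute. " + BOILERPLATE,
    "product-c": "A long, unique description of the piece, its history, "
                 "the arrangement and who it suits, written just for this page.",
}


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two texts, compared word by word."""
    return SequenceMatcher(None, a.split(), b.split(), autojunk=False).ratio()


def near_duplicates(texts: dict, threshold: float = 0.8) -> list:
    """Return pairs of page names whose texts are at least `threshold` similar."""
    return [
        (x, y)
        for x, y in combinations(sorted(texts), 2)
        if similarity(texts[x], texts[y]) >= threshold
    ]


print(near_duplicates(pages))  # only product-a and product-b are flagged
```

Pages that get flagged like this are the ones where a unique description would make the biggest difference.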
I hope I helped. If you need more help, let me know.
-
I understand what you mean and I agree with you in general, but as for our own website, I have no idea who put that link on that page, which is, by the way, a "nofollow" link. We never built links; all our incoming links are either natural or links from our own affiliates. I don't see much of "that stuff" in our backlink profile... am I wrong?
Anyhow, yes, we are aware the situation is quite complex. Thank you again.
-
I actually looked at the competitors ranking #3 and #4 for the phrase "download sheet music", since you're ranking 5th. Either way, it's not a matter of too much or too little. It's how much of the link profile is authentic vs. how much is made up of stuff like this...
http://www.dionneco.com/2011/02/love-is-a-parallax/
That's what I meant by fake links.
I think what you may be missing is how complex the situation really is. There's a lot more to be considered than a number in Open Site Explorer, which actually shows only a portion of what's really out there.
You may also want to look at changes you can make on-site. I'm a firm believer that proper HTML, accessibility, UX and all that really matter.
-
Thank you Takeshi, I think you got the problem right. The "crawling" side of the issue is something I was thinking about too!
We are actually working on every aspect of our website to improve its content because we have suffered a lot from Panda in the past two years, so here is the strategy we have been following since March:
1. "Noindexing" most of our thin or almost-duplicate content to get it removed from the index.
2. Improving our best content and differentiating it as much as we can with compelling content (this takes a long time!).
3. Consolidating similar pages with the use of canonical tags.
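For anyone wanting to audit which pages carry which of these tags, here is a minimal sketch using only Python's standard html.parser. The two sample pages and the example.com URL are hypothetical, not our real markup:

```python
from html.parser import HTMLParser


class RobotsAndCanonicalParser(HTMLParser):
    """Collect the robots meta directives and the canonical URL from a page."""

    def __init__(self):
        super().__init__()
        self.robots = []       # e.g. ["noindex", "follow"]
        self.canonical = None  # href of <link rel="canonical">, if present

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            self.robots = [d.strip().lower() for d in content.split(",") if d.strip()]
        elif tag == "link" and (attrs.get("rel") or "").lower() == "canonical":
            self.canonical = attrs.get("href")


def inspect(html: str) -> dict:
    """Report whether a page allows indexing and which canonical URL it declares."""
    parser = RobotsAndCanonicalParser()
    parser.feed(html)
    return {"indexable": "noindex" not in parser.robots, "canonical": parser.canonical}


# Hypothetical thin page (step 1) and hypothetical near-duplicate page (step 3).
thin_page = '<head><meta name="robots" content="noindex, follow"></head>'
similar_page = '<head><link rel="canonical" href="https://www.example.com/products/main"></head>'

print(inspect(thin_page))
print(inspect(similar_page))
```

Running something like this over a list of fetched pages gives a quick inventory of what is noindexed and what is consolidated.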
To tackle the "slower crawling" problem you have highlighted here, do you think it would be better for us to stop search engines from crawling those pages altogether via robots.txt once they have been removed from the index? Would that solve the crawl issue? I could do that at least with the 100,000 new product pages we plan to add!
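To make the idea concrete, here is a minimal sketch of what such a rule could look like and how it could be tested with Python's standard urllib.robotparser. The /new-catalog/ path and the example.com URLs are hypothetical. One caveat, as far as I understand it: a crawler that is blocked from a URL cannot read the "noindex" tag on that page, so the block would only make sense after the pages have dropped out of the index.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks the directory holding the new product pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /new-catalog/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the rules above allow `user_agent` to fetch `url`."""
    return parser.can_fetch(user_agent, url)


# A blocked page: its meta robots tag would never be fetched by the crawler.
print(is_crawlable("https://www.example.com/new-catalog/item-1.html"))
# A page outside the blocked directory stays crawlable.
print(is_crawlable("https://www.example.com/violin/sonata.html"))
```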
Thank you!
-
Wesley, that's because we were penalized by Panda several times in the past... so we are trying the "clean-up" strategy in the hope of being "de-penalized" by Panda at the next related algorithm update. It looks like we had too many "thin" or "almost duplicate" pages... that's why we removed so many pages from the index! But if we don't see improvements in the coming 1-2 months, I guess we'll put the product pages back in the index because our business is suffering a great deal!
-
Colin, what exactly do you mean by "fake links"? Our link profile actually looks to be in better shape than our main competitors':
virtualsheetmusic.com (our site): links: 614,013 root domains: 2,233
sheetmusicplus.com (competitor): links: 5,322,596 root domains: 6,149 (worse than our profile!)
musicnotes.com (competitor): links: 6,527,429 root domains: 2,914 (much worse than our profile!)
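The comparison I have in mind is links per linking root domain, which is only my own assumption about how to read these numbers, not an established quality metric. A quick Python sketch of that arithmetic, using the figures above:

```python
# (total links, linking root domains) as listed above.
profiles = {
    "virtualsheetmusic.com": (614_013, 2_233),
    "sheetmusicplus.com": (5_322_596, 6_149),
    "musicnotes.com": (6_527_429, 2_914),
}


def links_per_root_domain(links: int, root_domains: int) -> float:
    """Average number of links contributed by each linking root domain."""
    return links / root_domains


for site, (links, domains) in profiles.items():
    ratio = links_per_root_domain(links, domains)
    print(f"{site}: {ratio:.0f} links per root domain")
```

By this measure our site averages roughly 275 links per root domain, against roughly 866 and 2,240 for the two competitors.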
Am I missing anything?
-
The discrepancy between noindexed and indexed pages is not in itself a problem. However, having all those pages will present a challenge to Google in terms of crawling. Even though the pages won't be indexed, Google will need to spend some of your limited crawl budget crawling all of them.
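To put rough numbers on it, here is a back-of-the-envelope sketch in Python. The page counts come from the question; the daily crawl rate is purely a made-up figure, and real crawling is not spread evenly across URLs:

```python
INDEXED_PAGES = 4_000
NOINDEXED_PAGES = 180_000
ASSUMED_FETCHES_PER_DAY = 5_000  # hypothetical crawl rate, for illustration only


def noindexed_share(indexed: int, noindexed: int) -> float:
    """Fraction of all crawlable URLs that carry a noindex tag."""
    return noindexed / (indexed + noindexed)


def days_for_full_crawl(total_pages: int, fetches_per_day: int) -> float:
    """Days needed to fetch every URL once at a constant crawl rate."""
    return total_pages / fetches_per_day


share = noindexed_share(INDEXED_PAGES, NOINDEXED_PAGES)
days = days_for_full_crawl(INDEXED_PAGES + NOINDEXED_PAGES, ASSUMED_FETCHES_PER_DAY)

print(f"{share:.1%} of crawlable URLs would be noindexed")
print(f"{days:.1f} days for one full pass at the assumed rate")
```

Under those assumptions, almost 98% of the URLs Google could fetch would be pages that never appear in the index.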
Also, to recover from Panda it's necessary not only to noindex duplicate content, but also to improve your indexed content. That means things like consolidating similar pages into one page, writing unique content for your pages, and getting unique user-generated content such as reviews.
-
Why would you want to noindex your product pages? They seem like the kind of pages you want to get found on.
The ratio of indexed to noindexed pages shouldn't be a problem in itself, except that you won't get found on the noindexed ones. Product pages tend to be the kind of pages that you REALLY want to get found on.
I think you should rethink your strategy to recover from the penalties.
Try to find out where exactly the penalties came from and fix the errors in that area of your website.
-
Can't say I've been in that situation, but search engines seem to interpret that tag as an on/off switch, and I think you probably know that your problems aren't related to, or solvable by, robots meta tags.
You need fewer fake links. OSE finds well over half a million links from 3K root domains to your site. Look at your competitors: a few thousand links from a handful of domains.
It's a shame, because it seems like the internet wanted to make you the authority naturally. You've got a handful of really solid links coming in. If you could shed the spam somehow, you'd be doing a lot better.
So yeah, stating the obvious, I know. Best of luck to you, and I hope the site recovers!