Last Panda: removed a lot of duplicated content but no still luck!
-
Hello here,
my website virtualsheetmusic.com has been hit several times by Panda since its inception back in February 2011, and so we decided 5 weeks ago to get rid of about 60,000 thin, almost duplicate pages via noindex metatags and canonical (we have no removed physically those pages from our site giving back a 404 because our users may search for those items on our own website), so we expected this last Panda update (#25) to give us some traffic back... instead we lost an additional 10-12% traffic from Google and now it looks even really badly targeted.
Let me say how disappointing is this after so much work!
I must admit that we still have many pages that may look thin and duplicate content and we are considering to remove those too (but those are actually giving us sales from Google!), but I expected from this last Panda to recover a little bit and improve our positions on the index. Instead nothing, we have been hit again, and badly.
I am pretty desperate, and I am afraid to have lost the compass here. I am particularly afraid that the removal of over 60,000 pages via noindex metatags from the index, for some unknown reason, has been more damaging than beneficial.
What do you think? Is it just a matter of time? Am I on the right path? Do we need to wait just a little bit more and keep removing (via noindex metatags) duplicate content and improve all the rest as usual?
Thank you in advance for any thoughts.
-
Never mind, I have just found your site... thank you again!
-
Thank you very much Marie for your time and explanation, I appreciated it. Do you offer SEO consultation? Please, let me know.
Thank you again!
-
The short answer to this is that this is not what the disavow tool was meant for, so no I wouldn't use it. Affiliate links SHOULD be nofollowed though. However, affiliate links won't cause you to be affected by Panda. Link related issues are totally unrelated to Panda.
Unfortunately at this point though I'm going to bow out of taking this discussion any further due to time constraints. Q&A is a good place to get someone to take a quick look at your site, but if you've got lots of questions it may be worthwhile to pay a consultant to help out with your site's traffic drop issues.
-
Marie, I was thinking, do you think the new Google's Disavow Links Tool could help me with my affiliate's inbound links? I mean, in case I could be damaged by that kind of link profile...
-
Yes, I think will be easier to change our own contents and tell them to add the canonical tag to our page. Thanks!
-
Actually you can see the subsequent pages still in the index, just enter on Google:
site:virtualsheetmusic.com inurl:downloads/Indici/Guitar.html
and you will see what I mean. I see though that most of those pages have been cached before I put the canonical tag, so I guess it is just a matter of time.
Am I correct? I mean, if a page has a canonical tag that points to a different page it should NOT be in the index, right?Thank you for looking!
-
If there's duplicate content then you've either got to change yours, get theirs changed, or get them to use a rel-canonical tag pointing to your site or a noindex tag.
-
I just had a quick look but I don't see any other versions of the page you listed in the index. If you just added the rel prev and next it won't take effect until the pages are crawled which could take even up to a few weeks to happen.
-
Sorry Marie, I forgot to answer your inquiry about music2print.com: that's one of our affiliates! That's another issue we could suffer for... how do you suggest to tackle the affiliate-possible-duplicate content? Thanks!
-
Yes EGOL, I understand that my only way is to really thicken and differentiate the pages with real and unique content. I will try that and keep you posted! Thank you for your help again.
-
Marie, look at the following page, it is the main (first) page of our guitar index:
http://www.virtualsheetmusic.com/downloads/Indici/Guitar.html
Now, if you want to browse the guitar repertoire to the second page of the index, you click the page "2" or "next" link right? And then the second page appears:
http://www.virtualsheetmusic.com/downloads/Indici/Guitar.html?cp=2&lpg=20
And so on... well, those subsequent pages are the ones I was talking about: they have the rel=prev and rel-next tags together with the canonical tag that refers to the main (first) index page, but many of those subsequent pages are still in the index, Shouldn't they disappear and only the first page kept in the index?
As for what you wrote about how I can expect a recover from Panda, it makes sense and I really hope this new integration of Panda into the main algorithm will gradually speed things up. Thank you for your opinion on that.
I think my approach will be to keep noindexing those pages that really don't bring any business first and in the meantime improve all the others one by one. To nonidex all pages and start releasing just the optimized ones one by one scares me too much!
-
Most of the content on my site is articles that are 500 to 5000 words and one to ten photos - all on a single page.
It was very easy for me to "noindex" the republished content and "noindex" the blog posts that were very short.
For a site that consists of pages where most of the content is thin and duplicated a massive rewriting job is required in my opinion. That is what I would do if I wanted to make an attempt at recovering such a site.
I had to chop off my foot to save my ass.
-
I'm not sure that I'm following what you are saying. Which pages are in the index that you feel should not be because of their canonical tag?
You mentioned above that it sounds like it is "easy" to recover from Panda. I don't think that is true for most sites. Most likely in EGOL's case he had a site that had some fantastic content to go along with the duplicate and thin content. If there is good stuff there, then getting rid of the low quality stuff can sometimes be a quick fix. But, if you've got a site that consists almost completely of thin or duplicated content then it may not be so easy.
In my experience, when a site recovers from Panda, it does not happen gradually as the site gets cleaned up and improved. Rather, there is a sudden uptick when Panda refreshes provided that you have done enough work for Google to say that enough of your site is high quality. However, this may change now that Panda will be rolling out as part of the regular algorithm and not just every 4-6 weeks or so as before.
-
The academic year is coming to a close in the northern hemisphere. Hire a music scholar who is also a great writer and attack this. Or hire a writer who appreciates music. Better yet, hire one of each.
It is time to exert yourself.
-
Thank you Marie, yes, the canonical should tell Google what you said, but I don't understand why the other pages (subsequent index pages) are still in the index despite the canonical tag. Am I missing something?
About the thin content and how that affect the whole site, I have no more doubts, that's clear and I will tackle that page by page. I am just wondering if my presence on Google is going to improve little by little over time while I tackle the problem page-by-page, or will my site score get better only when everything will be clean and improved? To deindex everything and start rewriting with the best products first. as EGOL suggested really scares me since we live with the site and we could ending up making no money at all for too long.
-
Yes, I see, it's great to know you could recover pretty easily. I will keep working on the contents then, even though I guess is going to be a long way... thanks!
-
You have a canonical tag on that page which tells Google that this particular page is the version that you would like in the index. It is indeed in the index. But there's not much on the page of value.
EGOL explained well how Panda can affect an entire site. I look at it as a flag. So, if Google sees that you have a certain amount of duplicate or thin or otherwise low quality content, then they put a flag on the entire site that says, "This site is generally low quality." and as such, the whole site has trouble ranking, even if there are some good pages in the midst of the low quality ones.
-
When you have a Panda problem it can damage rankings across your site.
I had a Panda problem with two sites.
One had some republished content and some very short blog posts. Rankings went down for the entire site. I noindexed them and the rankings came back in a few weeks.
The other site had hundreds of printable .pdfs that contained only an image and a few words. These were images using the .pdf format to control the scale of the printer. Rankings went down for the entire site. I noindexed the .pdfs and rankings came back in a few weeks.
In my opinion, your site needs a huge writing job.
-
Thank you Egol for reinforcing what Marie said, but still I can't figure out why some of my best pages, with many reviews and unique content, have dropped from the top rankings (from 1st page to 13th page) the last November:
http://www.seomoz.org/q/what-can-do-to-put-these-pages-back-in-the-top-results
Thank you again.
-
Wow, thank you so much Marie for your extended reply and information, it is like gold for me!
Some thoughts about what you wrote:
For example, take this page:
http://www.virtualsheetmusic.com/downloads/Indici/Guitar.html
There is almost no text on that page that is unique to that page. Why should it be in the search results? I did a search for the text on the top of the page and saw that it was repeated on thousands of your pages. The rest of the text is all from other pages as well. If there is nothing on this page that is unique and adds value, then it needs to be noindexed.
I actually used to not care about subsequent pages in indexes such as the Guitar one because I thought that what Google needed was just the new rel=prev and rel=next tags to figure out that the important page was the first one only, but then I got scared by Panda and 5 weeks ago I put the canonical tag on subsequent pages pointing to the main page. So, I don't understand why you still find the subsequent pages on the index... shouldn't the canonical tag help on that?
And I get it now more than before: we really need to make our product pages more unique and compelling and we'll do that. Our best pages have many users reviews, but looks like that's not enough... look at what I am discussing on this thread about our best product pages with many and unique user reviews on them:
http://www.seomoz.org/q/what-can-do-to-put-these-pages-back-in-the-top-results
Those pages are dropped from page 1 to over page 10! Why?! Everything looks non-sense if you look at the data and how some thinner pages rank better than thicker ones. IN other words, despite what you write makes perfectly sense to me and I will try to pursue it, if I analyze Google results and my pages rankings, I cannot understand what Google wants from me (i.e. Why it's penalizing my good pages?).
And so, my last question is: have you idea when I will begin to see some improvements? So far I haven't seen any good results from my last action of dropping over 50,000 thin pages from the index, which I must say, it is not much encouraging!
Thank you again very much again.
-
I agree with Marie. The content is duplicate AND the content is very thin. Both of the Panda problems on every page.
A complete authorship job is needed.
Every page needs to be 100% unique and substantive.
Comments that appear on some pages are the only content that I saw that I would consider as unique.
If I owned this site and was willing to make a big investment I would deindex everything and start rewriting with the best products first.
-
Hi Fabrizo. I have a few thoughts for you. In order to recover from Panda you really need to make your pages compelling. Think, for each page, "Would Google want to show this page to people who are searching for information?"
I still see that there is a lot of work to be done to recover. For example, take this page:
http://www.virtualsheetmusic.com/downloads/Indici/Guitar.html
There is almost no text on that page that is unique to that page. Why should it be in the search results? I did a search for the text on the top of the page and saw that it was repeated on thousands of your pages. The rest of the text is all from other pages as well. If there is nothing on this page that is unique and adds value, then it needs to be noindexed.
Is music2print.com your site as well? I see that the pages redirect to your site, but they mustn't have always done that because they are still listed in the Google index. If you had duplicate versions of the site then this is a sure-fire way to get a Panda flag on your site. If you no longer want music2print.com in the index then you can use the url removal tool in WMT to get rid of it. With the 301 in place, eventually Google will figure it out but it could take some time.
When I look at a product page such as http://www.virtualsheetmusic.com/score/JesuGu.html, the page is extremely thin. This is one of the difficulties with having a commerce site that sells products. In order for Google to want to display your products prominently in search, they need to see that there is something there that users will want to see that is better than other sites selling this product. When I search for "Jesu, Joy of Man's Desiring sheet music" I see that there are 136,000 results. Why would Google want to display yours to a user? Now, the argument that I usually get when I say this is that everyone else is doing the same thing. Sometimes it can be a mystery why Panda affects one site and not the next, and comparing won't get us anywhere.
So, what can you do for products like this? You need to make these pages SUPER useful. I like giving thinkgeek.com as an example. This site sells products that you can buy on other sites but they go above and beyond to describe the product in unique ways. As such, they rank well for their products.
Also, the way you have your pages set up with tabs is inviting a duplicate content issue as well. For example, these pages are all considered separate pages:
http://www.virtualsheetmusic.com/score/JesuGu.html
http://www.virtualsheetmusic.com/score/JesuGu.html?tab=pdf
http://www.virtualsheetmusic.com/score/JesuGu.html?tab=mp3
http://www.virtualsheetmusic.com/score/JesuGu.html?tab=midi
...and so on. But they are creating a duplicate content problem because they are almost identical to each other. EDIT: Actually, you are using the canonical tag correctly so this is not as big an issue. However, if the canonical tag on http://www.virtualsheetmusic.com/score/JesuGu.html?tab=pdf is pointing to http://www.virtualsheetmusic.com/score/JesuGu.html, you are saying to Google, "http://www.virtualsheetmusic.com/score/JesuGu.html" is the main version of this page and I want this page to appear in your index. The problem is that THIS page contains almost no valuable information that can't be found elsewhere and the majority of the page is templated material that is seen on every page of your site.
Unfortunately there are a lot of issues here and I'm afraid that recovery from Panda is going to be very challenging.
If this were my site I would likely noindex EVERYTHING and then one page at a time work on creating the best page possible to put into the search results. You may start by looking at your analytics and finding out which pages were actually bringing in traffic at some time and then rewrite those pages. You may need to be creative. You could write something about the history of the composition. Is there a story around it? Was it ever played for someone famous? Has anyone famous every played it? If so, on what instrument? Is there anything unusual about the composition such as the key or tempo? Can you embed a video of someone playing the composition?
It may sound ridiculous to do so much work for each item, but unless you can add value that can't be found elsewhere, then Panda is going to continue to keep your rankings down.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Do you think this case would be of a duplicated content and what would be the consequences in such case?
At the webpage https://authland.com/ which is a food&wine tours and activities booking platform, primary content - services thumbnails containing information about the destination, title and prices of the particular services, can be found at several sub-pages/urls. For example, service https://authland.com/zadar/zadar-region-food-and-wine-tour/1/. Its thumbnail/card through which the service is available, can be found on multiple pages (Categories, Destinations, All services, Most recent services...) Is this considered a duplicated content? Since all of the thumbnails for services on the platform, are to be found on multiple pages. If it is, which would be the best way to avoid that content being perceived by Google bots as such? Thank you very much!
Intermediate & Advanced SEO | | ZD20200 -
Search console, duplicate content and Moz
Hi, Working on a site that has duplicate content in the following manner: http://domain.com/content
Intermediate & Advanced SEO | | paulneuteboom
http://www.domain.com/content Question: would telling search console to treat one of them as the primary site also stop Moz from seeing this as duplicate content? Thanks in advance, Best, Paul. http0 -
Duplicate content, the distrubutors are copying the content of the manufacturer
Hi everybody! While I was checking all points of the Technical Site Audit Checklist 2015 (great checklist!), I found that the distrubutors of my client are copying part of the content to add it in their websites. When I take a content snippet, and put it in quotes and search for it I get four or five sites that have copied the content. They are distributors of my client. The first result is still my client (the manufacturer), but... should I recommend any action to this situation. We don't want to bother the distributors with obstacles. This situation could be a problem or is it a common situation and Google knows perfectly where the content is comming from? Any recommendation? Thank you!
Intermediate & Advanced SEO | | teconsite0 -
Duplicate Content Issues :(
I am wondering how we can solve our duplicate content issues. Here is the thing: There are so many ways you can write a description about a used watch. http://beckertime.com/product/mens-rolex-air-king-no-date-stainless-steel-watch-wsilver-dial-5500/ http://beckertime.com/product/mens-rolex-air-king-stainless-steel-date-watch-wblue-dial-5500/ Whats different between these two? The dial color. We have a lot of the same model numbers but with different conditions, dial colors, and bands.. What ideas do you have?
Intermediate & Advanced SEO | | KingRosales0 -
301 or 404 Question for thin content Location Pages we want to remove
Hello All, I have a Hire Website with many categories and individual location pages for each of the 70 depots we operate. However, being dynamic pages, we have thousands of thin content pages. We have decided to only concentrate on our best performing locations and get rid of the rest as its physically impossible to write unique content for all our location pages for every categories. Therefore my question is. Would it cause me problems by having to many 301's for the location pages I am going to re-direct ( i was only going to send these back to the parent category page) or should I just 404 all those location pages and at some point in the future when we are in a position to concentrate on these locations then redo them with new content ? in terms of url numbers It would affect a few thousand 301's or 404's depending on people thoughts. Also , does anyone know what percentage of thin content on a site should be acceptable ?.. I know , none is best in an ideal world but it would be easier if there we could get away with a little percentage. We have been affected by Panda , so we are trying to tidy things up as best at possible, Any advice greatly appreciated? thanks Peter
Intermediate & Advanced SEO | | PeteC120 -
No-index pages with duplicate content?
Hello, I have an e-commerce website selling about 20 000 different products. For the most used of those products, I created unique high quality content. The content has been written by a professional player that describes how and why those are useful which is of huge interest to buyers. It would cost too much to write that high quality content for 20 000 different products, but we still have to sell them. Therefore, our idea was to no-index the products that only have the same copy-paste descriptions all other websites have. Do you think it's better to do that or to just let everything indexed normally since we might get search traffic from those pages? Thanks a lot for your help!
Intermediate & Advanced SEO | | EndeR-0 -
Duplicate content when changing a site's URL due to algorithm penalty
Greetings A client was hit by penguin 2.1, my guess is that this was due to linkbuilding using directories. Google webmaster tools has detected about 117 links to the site and they are all from directories. Furthermore, the anchor texts are a bit too "perfect" to be natural, so I guess this two factors have earned the client's site an algorithm penalty (no manual penalty warning has been received in GWT). I have started to clean some of the backlinks, on Oct the 11th. Some of the webmasters I asked complied with my request to eliminate backlinks, some didn´t, I disavowed the links from the later. I saw some improvements on mid october for the most important KW (see graph) but ever since then the rankings have been falling steadily. I'm thinking about giving up on the domain name and just migrating the site to a new URL. So FINALLY MY QUESTION IS: if I migrate this 6-page site to a new URL, should I change the content completely ? I mean, if I just copy paste the content of the curent site into a new URL I will incur in dpolicate content, correct?. Is there some of the content I can copy ? or should I just start from scratch? Cheers hRggeNE
Intermediate & Advanced SEO | | Masoko-T0 -
Duplicate Content www vs. non-www and best practices
I have a customer who had prior help on his website and I noticed a 301 redirect in his .htaccess Rule for duplicate content removal : www.domain.com vs domain.com RewriteCond %{HTTP_HOST} ^MY-CUSTOMER-SITE.com [NC]
Intermediate & Advanced SEO | | EnvoyWeb
RewriteRule (.*) http://www.MY-CUSTOMER-SITE.com/$1 [R=301,L,NC] The result of this rule is that i type MY-CUSTOMER-SITE.com in the browser and it redirects to www.MY-CUSTOMER-SITE.com I wonder if this is causing issues in SERPS. If I have some inbound links pointing to www.MY-CUSTOMER-SITE.com and some pointing to MY-CUSTOMER-SITE.com, I would think that this rewrite isn't necessary as it would seem that Googlebot is smart enough to know that these aren't two sites. -----Can you comment on whether this is a best practice for all domains?
-----I've run a report for backlinks. If my thought is true that there are some pointing to www.www.MY-CUSTOMER-SITE.com and some to the www.MY-CUSTOMER-SITE.com, is there any value in addressing this?0