Large-Scale Penguin Cleanup - How to prioritize?
-
We are conducting a large-scale Penguin cleanup / link cleaning exercise across 50+ properties, most of which have been on the market for 10+ years. There is a lot of link data to sift through and we are wondering how we should prioritize the effort.
So far we have been collecting backlink data for all properties from Ahrefs, GWT, Majestic SEO and OSE, and have consolidated the data using home-grown tools.
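In case it is useful context, the consolidation step in our home-grown tooling is essentially a de-duplication pass over the exported CSVs, along these lines (a simplified Python sketch, not our actual code; the file pattern and column names are placeholders):

```python
import csv
import glob

def consolidate_backlinks(pattern: str, out_path: str) -> None:
    """Merge backlink CSV exports and de-duplicate on (source URL, target URL)."""
    seen = set()
    rows = []
    for path in glob.glob(pattern):  # e.g. exports from Ahrefs, GWT, Majestic, OSE
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                # Assumes each export has source_url / target_url / anchor_text columns.
                key = (row["source_url"], row["target_url"])
                if key not in seen:
                    seen.add(key)
                    rows.append(row)
    with open(out_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["source_url", "target_url", "anchor_text"])
        writer.writeheader()
        writer.writerows(rows)

consolidate_backlinks("exports/*.csv", "consolidated_backlinks.csv")
```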
As a next step we are obviously going through the link cleaning process. We are interested in getting feedback on how we are planning to prioritize the link removal work. Put another way, we want to vet whether the community agrees with what we consider to be the most harmful types of links for Penguin.
- Priority 1: Clean up site-wide links with money-words; if possible keep a single-page link
- Priority 2: Clean up or re-anchor all money-keyword links whose keywords appear in the top 10 of the anchor text distribution (see the sketch after this list)
- Priority 3: Clean up no-brand sitewide links; if possible keep a single-page link
- Priority 4: Clean up low-quality links (other niche or no link juice)
- Priority 5: Clean up multiple links from same IP C class
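For Priority 2, the top-10 anchor text distribution we refer to is computed roughly like this (a simplified sketch; it assumes the consolidated CSV and column names from the placeholder example above):

```python
import csv
from collections import Counter

def top_anchors(path, n=10):
    """Return the n most common anchor texts in the consolidated backlink file."""
    counts = Counter()
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            counts[row["anchor_text"].strip().lower()] += 1
    return counts.most_common(n)

for anchor, count in top_anchors("consolidated_backlinks.csv"):
    print(f"{count:6d}  {anchor}")
```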
Does this seem like a sound approach? Would you prioritize this list differently?
Thank you for any feedback /T
-
Your data sources are solid (Ahrefs, GWT, OSE & Majestic), but I recommend including Bing Webmaster Tools as well. The data is free and you will find at least some links not shown in the other sources.
The link prioritization you shared is absolutely incorrect.
"Priority 1: Clean up site-wide links with money-words; if possible keep a single-page link"
While it is true that site-wide links are commonly manipulative, removing the site-wide link and keeping a single one does not necessarily make it less manipulative. You have only removed one of the elements that are often used to identify manipulative links.
"Priority 2: Clean up or rename all money keyword links for money keywords in the top 10 anchor link name distribution"
A manipulative link is still manipulative regardless of the anchor text used. Back in April 2012, Google used anchor text as a means to identify manipulative links. That was over 18 months ago and Google's link identification process has evolved substantially since that time.
"Priority 3: Clean up no-brand sitewide links; if possible keep a single-page link"
Same response as #1 & 2
"Priority 4: Clean up low-quality links (other niche or no link juice)"
See below
"Priority 5: Clean up multiple links from same IP C class"
The IP address should not be given any consideration whatsoever. You are using a concept that had validity years ago and is completely outdated.
bonegear.net IP address 66.7.211.83
vitopian.com IP address 64.37.49.163
There are no commonalities between the above two IP addresses, be it C block or otherwise, yet they are both hosted on the same server.
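To illustrate why the heuristic fails: a "C class" check only compares the first three octets of an IPv4 address, so a filter like the hypothetical sketch below would treat these two IPs as completely unrelated even though they sit on the same physical server:

```python
import ipaddress

def same_c_block(ip_a: str, ip_b: str) -> bool:
    """Classic 'C class' heuristic: do two IPv4 addresses share a /24?"""
    net_a = ipaddress.ip_network(f"{ip_a}/24", strict=False)
    return ipaddress.ip_address(ip_b) in net_a

# Different /24s, same server -- a C-block filter wrongly calls them unrelated.
print(same_c_block("66.7.211.83", "64.37.49.163"))  # False
```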
You have identified the issue affecting your site (Step 1) and collected a solid list of your backlinks using multiple sources (Step 2). The backlink report is an excellent step which places you well above most site owners and SEOs in your situation.
Step 3 - Identify links from every linking domain.
a. Have an experienced, knowledgeable human visit each and every linking domain. Yes, that is a lot of work, but it is what's necessary if you are going to accurately identify all of the manipulative links. Prior to beginning this step, be absolutely sure the person can identify manipulative links with AT LEAST 95% accuracy, although 100% is strongly desired.
b. Document the effort. I have had 3 clients who approached me with a Penguin issue; we confirmed there was no manual action in place when we began the clean-up process, but before we finished, the sites incurred a manual penalty. Solid documentation of the clean-up effort is required by Google in case the Penguin issue morphs into a manual penalty. Also, it just makes sense. You mentioned 50+ web properties, so clearly others will be performing these tasks (a sketch of the kind of log I mean follows this list).
c. Audit the effort. A wise former boss once stated "You must inspect what you expect". Unless you carefully audit the work, the process will fail. Evaluators will mis-identify links. You will lose some quality links and manipulative links will be missed as well.
d. While you are on the site, capture the manipulative site's e-mail address and contact form URL (if any). This information is helpful when contacting site owners to request link removal.
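A minimal sketch of the kind of log I mean, covering the documentation in 3b and the contact details in 3d (the field names are just an example; a shared spreadsheet works equally well):

```python
import csv
from datetime import date

# Hypothetical field names -- adapt these to your own audit process.
FIELDS = ["domain", "evaluator", "verdict", "whois_email", "contact_form_url",
          "first_contact", "second_contact", "form_submission", "outcome"]

def append_log_row(path: str, row: dict) -> None:
    """Append one linking-domain record to the clean-up log (CSV)."""
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if f.tell() == 0:  # write a header row on first use
            writer.writeheader()
        writer.writerow(row)  # missing fields are left blank

append_log_row("cleanup_log.csv", {
    "domain": "example-spam-site.com",
    "evaluator": "reviewer_1",
    "verdict": "manipulative",
    "whois_email": "owner@example-spam-site.com",
    "contact_form_url": "http://example-spam-site.com/contact",
    "first_contact": date.today().isoformat(),
})
```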
Step 4 - Conduct a Webmaster Outreach Campaign. Each manipulative domain needs to be contacted in a comprehensive manner. In my experience, most SEOs and site owners do not put in the required level of effort.
a. Send a professional request to the site's WHOIS e-mail address.
b. After 3 business days, if no response is received, send the same letter to the site's e-mail address found on the website.
c. After another 3 business days, if no response is received, submit the e-mail via the site's contact form. Take a screenshot of the submission on the site (no documentation is strictly required for Penguin, but it is helpful for the process).
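If you are scripting this cadence across 50+ properties, the three-business-day spacing is easy to compute; a minimal sketch (the example dates are arbitrary):

```python
from datetime import date, timedelta

def add_business_days(start: date, days: int) -> date:
    """Return the date `days` business days (Mon-Fri) after `start`."""
    current = start
    while days > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            days -= 1
    return current

first_contact = date(2013, 11, 4)             # e.g., a Monday: WHOIS e-mail
second = add_business_days(first_contact, 3)  # follow-up to on-site e-mail
third = add_business_days(second, 3)          # contact-form submission
print(second, third)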
All of the manipulative link penalties (Penguin and manual) I have worked with have been cleaned up manually. With that said, we use Rmoov to manage the Webmaster Outreach process. It sends and maintains a copy of every e-mail sent. It even has a place to add the Contact Form URL. A big time saver.
If a site owner responds and removes the link, that's great. CHECK IT! If there are only a few links, manually confirm link removal. If there are many URLs, use Screaming Frog or another tool to confirm link removal.
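For the many-URLs case, a simple scripted check works as well as a crawler; here is a minimal sketch (assumes the `requests` and `beautifulsoup4` packages are installed; the domain and URLs are placeholders):

```python
import requests
from bs4 import BeautifulSoup

MY_DOMAIN = "example.com"  # placeholder: the domain you are cleaning up

def link_still_present(page_url: str) -> bool:
    """Fetch a page and report whether it still links to MY_DOMAIN."""
    try:
        resp = requests.get(page_url, timeout=15)
        resp.raise_for_status()
    except requests.RequestException:
        return False  # page gone or unreachable: treat as removed, but log it
    soup = BeautifulSoup(resp.text, "html.parser")
    return any(MY_DOMain in (a.get("href") or "") for a in soup.find_all("a")) if False else any(
        MY_DOMAIN in (a.get("href") or "") for a in soup.find_all("a"))

for url in ["http://example-spam-site.com/page1.html"]:
    print(url, "->", "still present" if link_still_present(url) else "removed")
```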
If a site owner refuses or requests money, you can often achieve link removal by having further respectful conversations.
If a site owner does not respond, you can use "extra measures". Call the phone number listed in WHOIS. Send a physical letter to the WHOIS address. Reach out to them on social media sites. Is it a .com domain with missing WHOIS information? You can report it to InterNIC. Is it a spammy wordpress.com or blogspot site? You can report that as well.
When Matt Cutts introduced the Disavow Tool, he clearly said "...at the point where you have written to as many people as you can, multiple times, you have really tried hard to get in touch and you have only been able to get a fraction of those links down and there is still a small fraction of those links left, that's where you can use our Disavow Tool".
The above process satisfies that requirement. In my experience, anything much less than the above process does not meet that need. The overwhelming majority of those tackling these penalties try to perform the minimal amount of work possible, which is why forums are flooded with complaints from people who have made numerous failed attempts to remove manipulative link penalties.
Upon completion of the above, THEN upload a Disavow list of the links you could not remove after every reasonable human effort. In my experience you should have removed at least 20% of the linking DOMAINS (with rare exceptions).
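For reference, the disavow file itself is plain text with one entry per line: lines starting with `#` are comments, `domain:` entries disavow an entire domain, and bare URLs disavow a single page (the entries below are placeholders):

```text
# Outreach attempted 3x (WHOIS e-mail, on-site e-mail, contact form); no response.
domain:example-spam-site.com

# Site owner requested payment for removal; documented on 2013-11-07.
http://another-spam-site.example/links/page1.html
```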
It can take up to 60 days thereafter, but if you truly cleaned up the links in a quality manner, then the Penguin issues should be fully resolved.
The top factors in determining whether you succeed or fail are:
1. Your determination to follow the above process thoroughly
2. The experience, training and focus of your team
You can fix the issue in one round of effort and have the Penguin problem resolved within a few months...or you can be one of those site owners who thinks it is impossible and still be struggling with the same issue a year later. If you are not 100% committed, RUN AWAY. By that I mean change domain names and start over.
Good Luck.
TLDR - Don't try to fool Google. Anchor text and site wide links are part of the MECHANISM used to identify manipulative links. Don't confuse the mechanism with the message. Google's clear message: EARN links, don't "build" links. Polishing up the old manipulative links is a complete waste of your time. AT BEST, you will enjoy limited success for a period of time until Google catches up. Many site owners and SEOs have already been there, and it is a painful process.
-
When you say "clean up" do you mean removing the links or disavowing them?
You will never be able to get them all removed, so in the end you will need to do a Disavow anyway. If your time frame is short, you may want to make Priority One doing a Disavow for each of the 50+ sites you are working with. Then you can proceed with attempting to get the links removed. I have not heard that there is any downside to having a link removed that already appears in your disavow file...
As for the order of the Priorities, you may want to shuffle them a bit depending on the different situations on the different websites. I suggest you read this Moz Blog article called It's Penguin-Hunting Season: How to Be the Predator and Not the Prey
...and then test a few of your sub-pages that used to rank well with the program used in that article, the Penguin Analysis Tool. I say sub-page because the tool needs a single keyword phrase you want to rank that particular page for so it can do the anchor text analysis, and that works better on focused sub-pages than on general homepages. $10 per website will let you fully evaluate two typical pages on each site and see which facet of the link profile is most valuable to attack first.
-
Have you read the post at http://moz.com/blog/ultimate-guide-to-google-penalty-removal? Matt Cutts even called it out on Twitter as a good post. That's where I'd first look for ideas.