Large-Scale Penguin Cleanup - How to prioritize?
-
We are conducting a large-scale Penguin cleanup / link cleaning exercise across 50+ properties, most of which have been on the market for 10+ years. There is a lot of link data to sift through and we are wondering how we should prioritize the effort.
So far we have collected backlink data for all properties from Ahrefs, GWT, Majestic SEO and OSE and consolidated the data using home-grown tools.
The next step is the link cleaning process itself. We would like feedback on how we are planning to prioritize the link removal work. Put another way, we want to check whether the community agrees with what we consider the most harmful types of links for Penguin.
- Priority 1: Clean up site-wide links with money-words; if possible keep a single-page link
- Priority 2: Clean up or rename all money keyword links for money keywords in the top 10 anchor link name distribution
- Priority 3: Clean up no-brand sitewide links; if possible keep a single-page link
- Priority 4: Clean up low-quality links (other niche or no link juice)
- Priority 5: Clean up multiple links from same IP C class
Does this sound like a sound approach? Would you prioritize this list differently?
Thank you for any feedback /T
-
Your data sources are correct (Ahrefs, GWT, OSE & Majestic) but I recommend including Bing as well. The data is free and you will find at least some links not shown in other sources.
The link prioritization you shared is absolutely incorrect.
"Priority 1: Clean up site-wide links with money-words; if possible keep a single-page link"
While it is true site-wide links are commonly manipulative, removing the site wide link and keeping a single one does not necessarily make it less manipulative. You have only removed one of the elements which are often used to identify manipulative links.
"Priority 2: Clean up or rename all money keyword links for money keywords in the top 10 anchor link name distribution"
A manipulative link is still manipulative regardless of the anchor text used. Back in April 2012, Google used anchor text as a means to identify manipulative links. That was over 18 months ago and Google's link identification process has evolved substantially since that time.
"Priority 3: Clean up no-brand sitewide links; if possible keep a single-page link"
Same response as #1 & 2
"Priority 4: Clean up low-quality links (other niche or no link juice)"
See below
"Priority 5: Clean up multiple links from same IP C class"
The IP address should not be given any consideration whatsoever. You are using a concept that had validity years ago and is completely outdated.
bonegear.net IP address 66.7.211.83
vitopian.com IP address 64.37.49.163
There are no commonalities between the above two IP addresses, be it C block or otherwise, yet they are both hosted on the same server.
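To make the point concrete, here is a minimal sketch of the kind of "same C class" check the question describes, using Python's standard ipaddress module. The function name same_c_block is my own for illustration, and it treats "C class" as a /24 network, which is an assumption about how your tools define it.

```python
import ipaddress


def same_c_block(ip_a: str, ip_b: str) -> bool:
    """Return True if both IPv4 addresses fall in the same /24 ("C class") network."""
    # strict=False lets us pass a host address and get its enclosing /24 network.
    net_a = ipaddress.ip_network(f"{ip_a}/24", strict=False)
    return ipaddress.ip_address(ip_b) in net_a


# The two sites from the example above share a server but not a C block.
print(same_c_block("66.7.211.83", "64.37.49.163"))  # False
```

A check like this would treat the two sites as unrelated even though they sit on the same server, which is why it is not a useful signal.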
You have identified the issue affecting your site (Step 1) and collected a solid list of your backlinks using multiple sources (Step 2). The backlink report is an excellent step which places you well above most site owners and SEOs in your situation.
Step 3 - Identify links from every linking domain.
a. Have an experienced, knowledgeable human visit each and every linking domain. Yes, that is a lot of work but it is what's necessary if you are going to accurately identify all of the manipulative links. Prior to beginning this step, be absolutely sure the person can accurately identify manipulative links with AT LEAST 95% accuracy, although 100% is strongly desired.
b. Document the effort. I have had 3 clients who approached me with a Penguin issue, we confirmed there was not any manual action in place at the time we began the clean up process, but before we finished the sites incurred a manual penalty. Solid documentation of the clean up effort is required by Google in case the Penguin issue morphs into a manual penalty. Also, it just makes sense. You mentioned 50+ web properties so clearly others will be performing these tasks.
c. Audit the effort. A wise former boss once stated "You must inspect what you expect". Unless you carefully audit the work, the process will fail. Evaluators will mis-identify links. You will lose some quality links and manipulative links will be missed as well.
d. While you are on the site, capture the manipulative site's e-mail address and contact form URL (if any). This information is helpful when contacting site owners to request link removal.
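Since Step 3 works domain by domain, it helps to turn the consolidated backlink list into a per-domain review sheet first. The following is only a rough sketch: the helper names are hypothetical, and it uses the hostname with a leading "www." stripped as a stand-in for the linking domain (a real tool would likely use a public suffix list to find the registered domain).

```python
from collections import defaultdict
from urllib.parse import urlsplit


def linking_domain(url: str) -> str:
    """Approximate the linking domain as the lowercased hostname without 'www.'."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def group_by_domain(backlinks):
    """Group backlink URLs by linking domain, dropping duplicate URLs."""
    grouped = defaultdict(set)
    for url in backlinks:
        domain = linking_domain(url)
        if domain:  # URLs without a scheme have no parsed hostname and are skipped
            grouped[domain].add(url)
    return {domain: sorted(urls) for domain, urls in sorted(grouped.items())}


backlinks = [
    "http://www.example.com/a",
    "https://example.com/b",
    "http://blog.example.org/post",
    "http://www.example.com/a",  # duplicate reported by a second data source
]
for domain, urls in group_by_domain(backlinks).items():
    print(domain, len(urls))
```

Each key in the result is one domain for the human reviewer to visit, with the reported URLs alongside it for the documentation and audit steps.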
Step 4 - Conduct a Webmaster Outreach Campaign. Each manipulative domain needs to be contacted in a comprehensive manner. In my experience, most SEOs and site owners do not put in the required level of effort.
a. Send a professional request to the site's WHOIS e-mail address.
b. After 3 business days if no response is received, send the same letter to the site's e-mail address found on the website.
c. After another 3 business days, if no response is received, submit the e-mail via the site's contact form. Take a screenshot of the submission on the site (Penguin requires no documentation, so this is not required, but it is helpful for the process).
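The a/b/c timing above can be sketched as a small scheduling helper. This is an illustration only: the function names are made up, weekends are the only non-business days considered (public holidays are ignored), and the 3-day wait is taken from the steps above.

```python
from datetime import date, timedelta


def add_business_days(start: date, days: int) -> date:
    """Return the date that is `days` business days (Mon-Fri) after `start`."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            remaining -= 1
    return current


def outreach_schedule(first_contact: date, wait_days: int = 3):
    """Plan the three contact attempts for one linking domain."""
    second = add_business_days(first_contact, wait_days)
    third = add_business_days(second, wait_days)
    return [
        ("WHOIS e-mail", first_contact),
        ("website e-mail", second),
        ("contact form", third),
    ]


for channel, when in outreach_schedule(date(2024, 1, 1)):
    print(channel, when.isoformat())
```

The later attempts only go out if no response has been received, so in practice each row would be checked against the replies before sending.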
All of the manipulative link penalties (Penguin and manual) I have worked with have been cleaned up manually. With that said, we use Rmoov to manage the Webmaster Outreach process. It sends and maintains a copy of every e-mail sent. It even has a place to add the Contact Form URL. A big time saver.
If a site owner responds and removes the link, that's great. CHECK IT! If there are only a few links, manually confirm link removal. If there are many URLs, use Screaming Frog or another tool to confirm link removal.
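For the "CHECK IT!" step, a crawler does the fetching, but the check itself is simple. Here is a minimal sketch, assuming you already have the page's HTML as a string; the class and function names are hypothetical, and it only looks at href attributes on anchor tags.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class _LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def still_links_to(html: str, target_domain: str) -> bool:
    """Return True if the page still has a link to target_domain or a subdomain."""
    collector = _LinkCollector()
    collector.feed(html)
    target = target_domain.lower()
    for href in collector.hrefs:
        host = (urlsplit(href).hostname or "").lower()
        if host == target or host.endswith("." + target):
            return True
    return False


page = '<p>Visit <a href="http://www.example.com/widgets">cheap widgets</a></p>'
print(still_links_to(page, "example.com"))  # True
```

Relative links and links injected by JavaScript are not covered, so a manual look is still worthwhile when only a few URLs are involved.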
If a site owner refuses or requests money, you can often achieve link removal by having further respectful conversations.
If a site owner does not respond, you can use "extra measures". Call the phone number listed in WHOIS. Send a physical letter to the WHOIS address. Reach out to them on social media sites. Is it a .com domain with missing WHOIS information? You can report them on INTERNIC. Is it a spammy wordpress.com or blogspot site? You can report that as well.
When Matt Cutts introduced the Disavow Tool, he clearly said "...at the point where you have written to as many people as you can, multiple times, you have really tried hard to get in touch and you have only been able to get a fraction of those links down and there is still a small fraction of those links left, that's where you can use our Disavow Tool".
The above process satisfies that requirement. In my experience, not much less than the above process meets that need. The overwhelming majority of those tackling these penalties try to perform the minimal amount of work possible, which is why forums are flooded with complaints about numerous failed attempts to remove manipulative link penalties.
Upon completion of the above, THEN upload a Disavow list of the links you could not remove after every reasonable human effort. In my experience you should have removed at least 20% of the linking DOMAINS (with rare exceptions).
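As a rough illustration of that last step, the sketch below builds the text of a disavow file (lines starting with "#" are comments, "domain:" entries cover a whole domain, and a bare URL covers a single page) and computes the share of domains removed. The function names and the sample domains are placeholders, not part of any tool.

```python
def build_disavow_file(domains, urls=(), note=""):
    """Build disavow file text: an optional comment, domain entries, then URLs."""
    lines = []
    if note:
        lines.append(f"# {note}")
    lines.extend(f"domain:{domain}" for domain in sorted(set(domains)))
    lines.extend(sorted(set(urls)))
    return "\n".join(lines) + "\n"


def removal_rate(removed_domains: int, total_manipulative_domains: int) -> float:
    """Share of manipulative linking domains whose links were actually removed."""
    if total_manipulative_domains == 0:
        return 0.0
    return removed_domains / total_manipulative_domains


text = build_disavow_file(
    domains=["spam-directory.example", "link-farm.example"],
    urls=["http://forum.example/thread?id=1"],
    note="Owners contacted three times, no response",
)
print(text)
print(removal_rate(25, 100) >= 0.20)  # True
```

Only the domains you could not get removed after the outreach above belong in this file.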
It can take up to 60 days thereafter, but if you truly cleaned up the links in a quality manner, then the Penguin issues should be fully resolved.
The top factors in determining whether you succeed or fail are:
1. Your determination to follow the above process thoroughly
2. The experience, training and focus of your team
You can resolve the issue in one round of effort and have the Penguin issue resolved within a few months....or you can be one of those site owners who thinks it is impossible and be struggling with the same issue a year later. If you are not 100% committed, RUN AWAY. By that I mean change domain names and start over.
Good Luck.
TLDR - Don't try to fool Google. Anchor text and site wide links are part of the MECHANISM used to identify manipulative links. Don't confuse the mechanism with the message. Google's clear message: EARN links, don't "build" links. Polishing up the old manipulative links is a complete waste of your time. AT BEST, you will enjoy limited success for a period of time until Google catches up. Many site owners and SEOs have already been there, and it is a painful process.
-
When you say "clean up" do you mean removing the links or disavowing them?
You will never be able to get them all removed, so in the end you will need to do a Disavow anyway. If your time frame is short, you may want to make Priority One doing a Disavow for each of the 50+ sites you are working with. Then you can proceed with attempting to get the links removed. I have not heard that there is any downside to having a link removed that already appears in your disavow file...
As for the order of the Priorities, you may want to shuffle them a bit depending on the different situations on the different websites. I suggest you read this Moz Blog article called It's Penguin-Hunting Season: How to Be the Predator and Not the Prey
...and then test a few of your sub-pages that used to rank well with the program used in that article, the Penguin Analysis Tool. I say sub-page because the tool needs a single keyword phrase you want that particular page to rank for so it can do the anchor text analysis. That works better on focused sub-pages than on general homepages. $10 per website will let you fully evaluate two typical pages on each and see which facet of the link profile is most valuable to attack first.
-
Have you read the post at http://moz.com/blog/ultimate-guide-to-google-penalty-removal? Matt Cutts even called it out on Twitter as a good post. That's where I'd first look for ideas.