How to Not Scrape Content, but Still Be a Hub
-
Hello SEOmoz members. I'm relatively new to SEO, so please forgive me if my questions are a little basic.
One of the sites I manage is GoldSilver.com. We sell gold and silver coins and bars, but we also have a very important news aspect to our site.
For about 2-3 years now we have been a major hub as a gold and silver news aggregator. About 1.5 years ago (before we knew much about SEO), we switched from linking to the original news site to scraping their content and putting it on our site. The chief reason was that users would click outbound to read an article, see an ad for a competitor, then buy elsewhere. We were trying to avoid this (a relatively stupid decision in hindsight).
We have realized that the search engines are penalizing us for having this scraped content on our site, which I don't blame them for.
So I'm trying to figure out how to move forward from here. We would like to remain a hub for news related to gold and silver and not be penalized by SEs, but we also need to sell bullion and would like to avoid losing clients to competitors through ads on the news articles.
One of the solutions we are thinking about is using an iFrame to display the original URL, but within our experience. An example is how trap.it does this (see attached picture). This way we can still control the experience somewhat, while still remaining a hub.
Thoughts?
Thank you,
nick
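For readers unfamiliar with the iFrame idea described above, here is a minimal sketch of what such a wrapper page could look like. The function name, the header markup, and the example URL are all hypothetical; this is not how trap.it or GoldSilver.com actually implements it.

```python
from html import escape


def render_framed_article(source_url: str, title: str) -> str:
    """Return an HTML page that shows the original article in an iframe
    beneath our own header (hypothetical layout, for illustration only)."""
    safe_url = escape(source_url, quote=True)   # safe inside an attribute
    safe_title = escape(title, quote=True)
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        f"<head><title>{safe_title}</title></head>\n"
        "<body>\n"
        '  <div id="hub-header">Our news hub header and navigation</div>\n'
        f'  <iframe src="{safe_url}" title="{safe_title}" '
        'width="100%" height="800"></iframe>\n'
        "</body>\n"
        "</html>\n"
    )


# Example with a made-up article URL
page = render_framed_article(
    "https://example.com/gold-news?id=1&ref=hub", "Gold hits new high"
)
```

The article text stays on the publisher's server and is only displayed inside the frame. Note that a publisher can refuse to be framed by sending an `X-Frame-Options` or `Content-Security-Policy: frame-ancestors` response header, in which case the browser will not render the article.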
-
I honestly can't offer any short-term suggestions. It's a big challenge to know what the best short-term path is. Ultimately, you'll need to remove all the scraped content. If you do that without replacing it, you won't see any gains in the short term, and you may even see some short-term losses, since it's possible you're not being purely penalized.
-
Alan,
Thank you for your thoughts. I agree we need to change our strategy and move away from scraped content. Any technical workarounds we try (like an iFrame) may work now, but ultimately we seem to just be delaying the inevitable.
Since that strategy will take a while to implement, what would you recommend for the shorter term?
-
Nick,
You're in a difficult situation, to say the least. iFrames were a safe bet a couple of years ago; however, Google has gotten better and better at discovering content contained in previously safe environments within the code. And they're just going to get better at it over time.
The only truly safe solution for a long term view is to change strategy drastically. Find quality news elsewhere, and have content writers create unique articles built on the core information contained in those. Become your own news site with a unique voice.
The expense is significant given you'll need full-time writers. However, with a couple of entry-level writers right out of college, or just a year or two into the content writing / journalism path, you've got a relatively low cost of entry. The key is picking really good talent.
I was able to replace an entire team of 12 poorly chosen writers with 3 very good writers, for example.
The other reality with that is needing to lose all the scraped content. It's got to go. You can't salvage it, or back-date newly written content around it, not in the volume you're dealing with. So you're going to have to earn ranking all over again, but through real, value added reasons.
Related Questions
-
Is this considered duplicate content?
Hi Guys, We have a blog for our e-commerce store. We have a full-time in-house writer producing content. As part of our process, we do content briefs, and as part of the brief we analyze competing pieces of content existing on the web. Most of the time, the sources are large publications (e.g., HGTV, elledecor, apartmenttherapy, Housebeautiful, NY Times, etc.). The analysis is basically a summary/breakdown of the article, and is sometimes 2-3 paragraphs long for longer pieces of content. The competing content analysis is used to create an outline of our article, and incorporates the most important details/facts from competing pieces, but not all. Most of our articles run 1500-3000 words. Here are the questions (NOTE: the summaries are written by us, and not copied/pasted from other websites): Would it be considered duplicate content, or bad SEO practice, if we list the sources/links we used at the bottom of our blog post, with the summary from our content brief? Could this be beneficial as far as SEO? If we do this, should we nofollow the links, or use regular dofollow links? For example: For your convenience, here are some articles we found helpful, along with brief summaries: <summary>I want to use as much of the content that we have spent time on. TIA</summary>
White Hat / Black Hat SEO | | kekepeche1 -
Content Regurgitators
Hey, There are a few websites such as http://bestthenews.com/ which regularly copy and paste articles from one of our sites onto theirs, along with all the links back to our site. The sites don't have a high spam score, but I can't imagine these sites serve any purpose (i.e. genuine readership) other than trying to boost their traffic. At the moment we haven't done anything about these, as they are backlinks after all, but could these sites have a negative impact, and should we be asking them to remove the content? We have even had our content copied and pasted by AGDA (Australian Graphic Design Association), which is OK as the site has great authority so the links are good; however, it's still strange that a large reputable organization would just copy and paste articles without notifying us. Curious to hear others' experiences / opinions on the matter. Cheers!
White Hat / Black Hat SEO | | wearehappymedia1 -
Competitor ranking well with duplicate content—what are my options?
A competitor is ranking #1 and #3 for a search term (see attached) by publishing two separate sites with the same content. They've modified the title of the page, and serve it in a different design, but are using their branded domain and a keyword-rich domain to gain multiple rankings. This has been going on for years, and I've always told myself that Google would eventually catch it with an algorithm update, but that doesn't seem to be happening. Does anyone know of other options? It doesn't seem like this falls under any of the categories that Google lists on their web spam report page. Is there any other way to bring this up with the powers that be, or is it something that I just have to live with and hope that Google figures out some day? Any advice would help. Thanks! how_to_become_a_home_inspector_-_Google_Search_2015-01-15_18-45-06.jpg
White Hat / Black Hat SEO | | inxilpro0 -
How to re-rank an established website with new content
I can't help but feel this is a somewhat untapped resource with a distinct lack of information.
White Hat / Black Hat SEO | | ChimplyWebGroup
There is a massive amount of information around on how to rank a new website, or techniques in order to increase SEO effectiveness, but to rank a whole new set of pages or indeed to 're-build' a site that may have suffered an algorithmic penalty is a harder nut to crack in terms of information and resources. To start I'll provide my situation; SuperTED is an entertainment directory SEO project.
It seems likely we may have suffered an algorithmic penalty at some point around Penguin 2.0 (May 22nd), as traffic has dropped steadily since then, though not too aggressively. Then, to coincide with the newest Panda 27 (according to Moz) in late September this year, we decided it was time to re-assess tactics to keep in line with Google's guidelines over the two years. We've slowly built a natural link-profile over this time but it's likely thin content was also an issue. So from the beginning of September up to the end of October we took these steps:
-Contacted webmasters to remove links (unfortunately there was some 'paid' link-building before I arrived).
-'Disavowed' the rest of the unnatural links that we couldn't have removed manually.
-Worked on pagespeed as per Google guidelines until we received high scores in the majority of 'speed testing' tools (e.g. WebPageTest).
-Redesigned the entire site with speed, simplicity and accessibility in mind.
-Htaccessed 'fancy' URLs to remove file extensions and simplify the link structure.
-Completely removed two or three pages that were quite clearly just trying to 'trick' Google. Think a large page of links that simply said 'Entertainers in London', 'Entertainers in Scotland', etc. 404'ed, asked for URL removal via WMT, thinking of 410'ing?
-Added new content and pages that seem to follow Google's guidelines as far as I can tell, e.g. Main Category Page, Sub-category Pages.
-Started to build new links to our now 'content-driven' pages naturally by asking our members to link to us via their personal profiles. We offered a reward system internally for this so we've seen a fairly good turnout.
-Many other 'possible' ranking factors, such as adding Schema data, optimising for mobile devices as best we can, adding a blog and beginning to blog original content, utilising and expanding our social media reach, custom 404 pages, removing duplicate content, utilising Moz and much more.
It's been a fairly exhaustive process but we were happy to do so to be within Google guidelines. Unfortunately, some of those link-wheel pages mentioned previously were the only pages driving organic traffic, so once we were rid of these, traffic dropped to not even 10% of what it was previously. Equally, with the changes (htaccess) to the link structure and the creation of brand new pages, we've lost many of the pages that previously held Page Authority.
We've 301'ed those pages that have been 'replaced' with much better content and a different URL structure - http://www.superted.com/profiles.php/bands-musicians/wedding-bands to simply http://www.superted.com/profiles.php/wedding-bands, for example. Therefore, with the loss of the 'spammy' pages and the creation of brand new 'content-driven' pages, we've probably lost up to 75% of the old website, including those that were driving any traffic at all (even with potential thin-content algorithmic penalties). Because of the loss of entire pages, the changes of URLs and the rest discussed above, it's likely the site looks very new and probably very updated in a short period of time. What I need to work out is a campaign to drive traffic to the 'new' site.
We're naturally building links through our own customerbase, so they will likely be seen as quality, natural link-building.
Perhaps the sudden occurrence of a large amount of 404's and 'lost' pages are affecting us?
Perhaps we're yet to really be indexed properly, but it has been almost a month since most of the changes were made, and we'd often be re-indexed 3 or 4 times a week prior to the changes.
Our events page is the only one without the new design left to update, could this be affecting us? It potentially may look like two sites in one.
Perhaps we need to wait until the next Google 'link' update to feel the benefits of our link audit.
Perhaps simply getting rid of many of the 'spammy' links has done us no favours - I should point out we've never been issued with a manual penalty. Was I perhaps too hasty in following the rules? Would appreciate some professional opinion or from anyone who may have experience with a similar process before. It does seem fairly odd that following guidelines and general white-hat SEO advice could cripple a domain, especially one with age (10 years+ the domain has been established) and relatively good domain authority within the industry. Many, many thanks in advance. Ryan.0 -
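The 301 mapping mentioned in this post (dropping the category segment from the profile URLs) could be sketched roughly as follows. This assumes every old URL follows the same `/profiles.php/<category>/<sub-category>` pattern as the single example given; the function name is hypothetical and the real rules presumably live in the site's .htaccess.

```python
from typing import Optional


def redirect_target(old_path: str) -> Optional[str]:
    """Map /profiles.php/<category>/<sub-category> to
    /profiles.php/<sub-category>. Return None when the path does not
    match the old three-part pattern, so no redirect is issued."""
    parts = [p for p in old_path.split("/") if p]
    if len(parts) == 3 and parts[0] == "profiles.php":
        return f"/{parts[0]}/{parts[2]}"
    return None


# The example pair from the post
new_path = redirect_target("/profiles.php/bands-musicians/wedding-bands")
```

The server would answer a request for the old path with a 301 status and a `Location` header pointing at the returned path, and leave already-short paths alone.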
Dynamic Content Boxes: how to use them without getting a Duplicate Content Penalty?
Hi everybody, I am starting a project with a travel website which has some standard category pages like Last Minute, Offers, Destinations, Vacations, Fly + Hotel. Every category contains a lot of destinations with their own landing pages, such as: Last Minute New York, Last Minute Paris, Offers New York, Offers Paris, etc. My question is: I am trying to simplify my job by writing some dynamic content boxes for Last Minute, Offers and the other categories, changing only the destination city (Rome, Paris, New York, etc.), repeated X times in X different combinations inside the content box. This would greatly simplify my content writing for the main generic landing pages of each category, but I'm worried about getting penalized for duplicate content. Do you think my solution could work? If not, what is your suggestion? Is there a rule for categorizing content as duplicate (for example, the number of identical words in a row, ...)? Thanks in advance for your help! A.
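To get a feel for how similar two templated pages are, one common way to measure overlap is to compare runs of consecutive words ("shingles"). The sketch below is only an illustration of that general technique; it is not a rule any search engine has published, and the template text, function names, and shingle size are made up.

```python
def shingles(text: str, size: int = 3) -> set:
    """Return the set of consecutive word runs of the given size."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a: str, b: str, size: int = 3) -> float:
    """Share of word runs the two texts have in common (0.0 to 1.0)."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


# Hypothetical content box where only the city changes
TEMPLATE = (
    "Book your last minute trip to {city} today. Our last minute offers "
    "for {city} include flights and hotels at the best price."
)

rome = TEMPLATE.format(city="Rome")
paris = TEMPLATE.format(city="Paris")
score = jaccard(rome, paris)
```

With only the city name swapped, more than half of the three-word runs are shared between the two pages, and the share grows as the template gets longer relative to the number of city mentions.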
White Hat / Black Hat SEO | | OptimizedGroup0 -
Noindexing Thin Content Pages: Good or Bad?
If you have a massive number of pages with super thin content (such as pagination pages) and you noindex them, once they are removed from Google's index (and if these pages aren't viewable to the user and/or don't get any traffic), is it smart to completely remove them (404?) or is there any valid reason that they should be kept? If you noindex them, should you keep all URLs in the sitemap so that Google will recrawl and notice the noindex tag? If you noindex them, and then remove the sitemap, can Google still recrawl and recognize the noindex tag on its own?
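As a point of reference for the question above, noindex can be delivered either as a `<meta name="robots" content="noindex">` tag in the page or as an `X-Robots-Tag: noindex` HTTP response header. Here is a minimal sketch of the header approach, assuming thin pages can be recognised by a `page=` query parameter; that pattern and the function name are hypothetical.

```python
import re

# Hypothetical rule: paginated URLs such as /blog?page=7 count as thin pages
THIN_PAGE_PATTERNS = [re.compile(r"[?&]page=\d+")]


def robots_headers(url: str) -> dict:
    """Return extra response headers for a URL: ask crawlers not to
    index pages matching a thin-page pattern, nothing otherwise."""
    if any(pattern.search(url) for pattern in THIN_PAGE_PATTERNS):
        return {"X-Robots-Tag": "noindex"}
    return {}


thin = robots_headers("/blog?page=7")
normal = robots_headers("/blog")
```

Either form is only seen when a crawler actually fetches the page, so the URL has to stay crawlable for the noindex to be noticed.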
White Hat / Black Hat SEO | | WebServiceConsulting.com0 -
Why are "outdated" or "frowned upon" tactics still dominating?
Hey, my first post here. I recently picked up a new client in real estate for a highly competitive market. One trend I'm noticing with all the top sites is that they are using old tactics such as:
White Hat / Black Hat SEO | | Jay328
-Paid Directories
-Terrible/Spam Directories
-Overuse of exact text keywords for example: City name + real estate
-Blogroll/link exchange
-Tons of meta key words
-B.S. press releases
-Blog commenting with kw as name
Out of all the competition there is only one guy who is following the rules of today. One thing I'm noticing is that nobody is doing legit guest blogging, has a great social presence, has awesome on-page, etc. It's pretty frustrating as I'm trying to follow the rules and seeing these guys kill it by doing "bad seo". Anybody else find themselves in this situation? I know I'm probably beating a dead horse but I needed to vent about this 😉2
"take care about the content" is it always true?
Hi everyone, I keep reading answers, in reference to ranking advice, in which the verdict is always the same: "TAKE CARE ABOUT THE CONTENT INSTEAD OF PR", and phrases like "you don't have to waste your time buying links, you have first of all to engage your visitors". Ideally it works, but not when you have to deal with small sites, and especially when you want to rank for keywords where there's not too much to write. I'll give you an example that is still unsolved: I've got a client who just wants to be ranked first for his flagship store. His site is currently in fourth position, and the first-ranked site has no content and low authority, but it has the exact keyword match domain. Tell me! What kind of content should I produce in order to rank for the name of the shop and the city? The only way is to get links... or to stay fourth... If you would like to help me, see more details below: page: http://poltronafraubrescia.zenucchi.it keyword: poltrona frau brescia competitor ranked first: http://turra.poltronafraubrescia.it/ competitor ranked second: http:// poltronafraubrescia.com/
White Hat / Black Hat SEO | | guidoboem0