How to Not Scrape Content, but Still Be a Hub
-
Hello SEOmoz members. I'm relatively new to SEO, so please forgive me if my questions are a little basic.
One of the sites I manage is GoldSilver.com. We sell gold and silver coins and bars, but we also have a very important news aspect to our site.
For about 2-3 years now we have been a major hub as a gold and silver news aggregator. About 1.5 years ago (before we knew much about SEO), we switched from linking out to the original news sites to scraping their content and putting it on our site. The chief reason was that users would click outbound to read an article, see an ad for a competitor, and then buy elsewhere. We were trying to avoid this (a relatively stupid decision, in hindsight).
We have realized that the search engines are penalizing us for having this scraped content on our site, and I don't blame them.
So I'm trying to figure out how to move forward from here. We would like to remain a hub for news related to gold and silver without being penalized by the search engines, but we also need to sell bullion and would like to avoid losing clients to competitors through ads on the news articles.
One of the solutions we are considering is using an iFrame to display the original URL, but within our own experience. An example is how trap.it does this (see attached picture). This way we can still control the experience somewhat while remaining a hub.
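Roughly what we have in mind, sketched in TypeScript (the container ID, header text, and article URL below are placeholders, not our actual markup):

```typescript
// Sketch of a trap.it-style wrapper: the original article loads in an iframe
// beneath a thin branded header, so the reading experience stays on our page.
function embedArticle(articleUrl: string): void {
  const container = document.getElementById("reader");
  if (!container) return;

  // Branded bar kept above the framed article.
  const header = document.createElement("div");
  header.textContent = "You are reading this story on GoldSilver.com";

  const frame = document.createElement("iframe");
  frame.src = articleUrl;
  frame.style.width = "100%";
  frame.style.height = "90vh";
  frame.style.border = "none";

  container.append(header, frame);
}

embedArticle("https://example.com/some-gold-news-story");
```

One caveat we're aware of: publishers that send X-Frame-Options or frame-ancestors headers will block their articles from rendering inside the iframe at all.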
Thoughts?
Thank you,
Nick
-
I honestly can't offer any short-term suggestions; it's a big challenge to know what the best short-term path is. Ultimately, you'll need to remove all the scraped content. If you do that without replacing it, you won't see any gains in the short term, and you may even see some short-term losses, as it's possible you're not being purely penalized.
-
Alan,
Thank you for your thoughts. I agree we need to change our strategy and move away from scraped content. Any technical workarounds we try (like iFrames) may work now, but ultimately we would just be delaying the inevitable.
Since that strategy will take a while to implement, what would you recommend for the shorter term?
-
Nick,
You're in a difficult situation, to say the least. iFrames were a safe bet a couple of years ago; however, Google has gotten better and better at discovering content contained in previously "safe" places within the code, and they're only going to get better at it over time.
The only truly safe long-term solution is to change strategy drastically. Find quality news elsewhere, and have content writers create unique articles built on the core information in those sources. Become your own news site with a unique voice.
The expense is significant given that you'll need full-time writers. However, with a couple of entry-level writers right out of college, or just a year or two into the content writing / journalism path, the cost of entry is relatively low. The key is picking really good talent.
I was able to replace an entire team of 12 poorly chosen writers with 3 very good writers, for example.
The other reality is that you'll need to lose all the scraped content. It's got to go. You can't salvage it or back-date newly written content around it, not in the volume you're dealing with. So you're going to have to earn your rankings all over again, but through real, value-added means.
Related Questions
-
Duplicate content management across a subdirectory-based multisite where the subsites are projects of the main site and naturally adopt some of its ideas and goals
Hi, I have the following problem and would like to know the best solution for it: I have a site, codex21.gal, that is actually part of a subdirectory-based multisite (galike.net). It has a domain mapping setup, but it is hosted in a folder of the galike.net multisite (galike.net/codex21).

My main site (galike.net) works as an umbrella brand for a series of projects aimed at promoting the cultural and natural heritage of a region in NW Spain through creative projects focused on the entertainment, tourism, and educational areas. The projects themselves are a concretion (a putting into practice) of the general views of the brand, which acts more like a company brand. CodeX21 is one of those projects: it has its own logo and is effectively a child brand, though more focused on a particular theme. I don't want to hide that it is part of the GALIKE brand (in fact, I am planning to add the Galike logo to it, and a link to the main site in the menu). I will be making other projects, each with its own brand, hosted in subsites (subfolders) of the galike.net multisite. Not all of them will have their own TLD mapped; some could simply be www.galike.net/projectname. The codex21.gal subsite might become galike.net/codex21 if that would be better for SEO.

Now, the problem is that my subsite codex21.gal restates some principles, concepts, and goals that have been defined (in other words) on the main site. There are some ideas (such as my particular vision of the possibilities of sustainable exploitation of that heritage, and concepts I have developed myself such as "narrative tourism" and "the geographical map as a non-linear story") that need to be present here and there on the subsite, since they are also the philosophy of the project. BUT it seems that Google can penalise overlapping content in subdirectory-based multisites, since they can look like a collection of doorways to the same product (*).

I have considered substituting those overlapping ideas with links to the main site, though it seems unnatural from the user's point of view to be taken off the page to read a piece of information that is actually part of the project description (every other child project of Galike might have the same problem). I have also considered taking the codex21 subsite out of the network and hosting it as a standalone site on another server, but the duplicate content problem might persist, and in any case I should link it to my brand Galike somewhere, because that is, so to speak, its "production house".

So which would be the best (white hat) strategy, from an SEO point of view, to resolve this overlap of brand and project philosophy?

(*) "All the same IP address — that's really not a problem for us. It's really common for sites to be on the same IP address. That's kind of the way the internet works. A lot of CDNs (content delivery networks) use the same IP address as well for different sites, and that's also perfectly fine. I think the bigger issue that he might be running into is that all these sites are very similar. So, from our point of view, our algorithms might look at that and say "this is kind of a collection of doorway sites" — in that essentially they're being funnelled toward the same product. The content on the sites is probably very similar. Then, from our point of view, what might happen is we will say we'll pick one of these pages and index that and show that in the search results. That might be one variation that we could look at. In practice that wouldn't be so problematic, because one of these sites would be showing up in the search results. On the other hand, our algorithm might also be looking at this and saying this is clearly someone trying to overdo things with a collection of doorway sites, and we'll demote all of them. So what I recommend doing here is really trying to take a step back and focus on fewer sites and making those really strong, and really good and unique. So that they have unique content, unique products that they're selling. So then you don't have this collection of a lot of different sites that are essentially doing the same thing." (John Mueller, Senior Webmaster Trends Analyst at Google: https://www.youtube.com/watch?time_continue=1&v=kQIyk-2-wRg&feature=emb_logo)
White Hat / Black Hat SEO | PabloCulebras
-
Regular links may still be nofollowed?
Good day: I understand guest articles are a good way to pass link juice, and some authors have a link to their website in the "Author Bio" section of the article. These links are usually regular (followed) links.

However, I noticed that some of these sites (running WordPress) have SEO plugins with a setting like the following: "Nofollow: Tell search engines not to spider links on this webpage."

My question is: if that setting were activated, I would assume the author's website link would still look like a regular link, but some other code could be present on the page (e.g., in the header) that prevents the link from being followed, meaning the guest writer would not receive any link juice. Is there a way to tell whether this scenario is happening? What code would we look for?
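To make the question concrete, here is a rough sketch (TypeScript, Node 18+ for the built-in fetch; the URL is illustrative) of the two page-level markers I imagine we would check, alongside inspecting the <a> tag itself:

```typescript
// Rough sketch: a "regular-looking" link can still be neutralized at the page
// level, either by an HTTP header or by a meta robots tag in the <head>.
async function checkPageLevelNofollow(url: string): Promise<void> {
  const res = await fetch(url);
  const html = await res.text();

  // 1. HTTP response header, e.g. "X-Robots-Tag: nofollow"
  const headerValue = res.headers.get("x-robots-tag") ?? "(none)";
  console.log("X-Robots-Tag:", headerValue);

  // 2. Meta robots tag, e.g. <meta name="robots" content="nofollow">
  //    (crude regex; attribute order can vary in real markup)
  const meta = html.match(
    /<meta[^>]+name=["']robots["'][^>]+content=["']([^"']*)["']/i
  );
  console.log("Meta robots:", meta ? meta[1] : "(none)");

  // 3. The link itself may carry rel="nofollow"; that one is only visible
  //    by inspecting the raw <a> tag in view-source.
}

checkPageLevelNofollow("https://example.com/guest-post").catch(console.error);
```

Is that the right set of things to look at, or is there something else?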
White Hat / Black Hat SEO | Audreythenurse
-
How do I make a content calendar to increase my rank for a keyword?
I've watched more than a few seminars on having a content calendar. Now I'm curious what I would need to do to increase ranking for a specific keyword in local SEO. Let's say I wanted to help a local client increase their rank for "used trucks" in Buffalo, NY. Would I just regularly publish blog posts about used trucks? Thanks!
White Hat / Black Hat SEO | oomdomarketing
-
Have just submitted a disavow file to Google: shall I wait until the bad links have been discounted before starting a new content-led SEO campaign?
Hi guys, I am currently conducting some SEO work for a client. Their previous SEO company built a lot of low-quality/spam links to their site, and as a result their rankings and traffic have dropped dramatically. I have analysed their current link profile and submitted the spammiest domains to Google via the Disavow tool.

The question I had was: do I wait until Google has discounted the spam links I submitted and then start the new content-based SEO campaign, or would it be okay to start the content-based campaign now, even though the current spam links haven't been dealt with yet? Look forward to your replies on this...
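For context, the file I submitted follows Google's plain-text disavow format, one entry per line (the domains below are made up for illustration):

```text
# Spam domains built by the previous SEO company
# (a "domain:" entry disavows every link from that domain)
domain:spammy-directory-example.com
domain:paid-link-network-example.net

# Individual URLs can also be disavowed, one per line
http://low-quality-blog-example.org/cheap-links-post.html
```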
White Hat / Black Hat SEO | sanj5050
-
Low-quality websites with spammy EMDs still ranking higher than genuine websites?
Hey guys, I've just been doing some searching and can't quite believe how heavily low-quality, spammy EMDs (exact-match domains) still dominate some Google searches. Just take "cheap kitchens", for instance. Here is a list of URLs that appeared:

http://kitchenunitsdoors.co.uk/
http://www.kitchenunits9.co.uk/
http://www.aboutkitchenunits.co.uk/
http://www.cheapkitchenunits1.co.uk/
http://www.cheapkitchensonline.com/
http://www.buycheapkitchens.com/
http://www.cheapkitchenscheapkitchen.co.uk/
http://www.cheapkitchensforsale1.co.uk/
http://cheapkitchensaberdeen.co.uk/
http://www.kitchensderby1.co.uk/
http://www.cheapcheapkitchens.co.uk/
http://kitchen-cheap.co.uk/
http://www.cheapestkitchensinbritain.co.uk/
http://www.cheapkitchenss.co.uk/
http://www.cheaperthanmfi.com/
http://cheapkitchenuk.co.uk/

As you can see, none of them appear to be genuine retailers; they look set up purely to influence Google rankings. I'm amazed that Google still gives so much weight to these types of sites, especially considering how search is meant to be better than it has ever been. Any insights into why this is?
White Hat / Black Hat SEO | Webrevolve
-
DIV Attribute containing full DIV content
Hi all, I recently watched the latest Mozinar, "Making Your Site Audits More Actionable", presented by the guys at SEOgadget. In it, one of the presenters said he loves the website www.sportsbikeshop.co.uk and that a lot of money has been spent on it from an SEO point of view (presumably with SEOgadget), so I looked through the source and noticed something I had not seen before; I'm wondering if anyone can shed any light on it.

On this page (http://www.sportsbikeshop.co.uk/motorcycle_parts/content_cat/852/(2;product_rating;DESC;0-0;all;92)/page_1/max_20) there is a paragraph of text that begins with 'The ever reliable UK weather...', and when you view the source of the containing DIV you will notice a bespoke attribute called "threedots=" that contains the entire text content of that DIV.

Any thoughts as to why they would put that there? I can't see any way this would benefit a site, and it's invalid markup, for one. Am I missing a trick? Thoughts would be greatly appreciated. Kris

P.S. For those who can't be bothered to visit the site, here is a smaller version of what they have done: <div threedots="This is an introductory paragraph of text for this page.">This is an introductory paragraph of text for this page.</div>
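My best guess so far is that it's a client-side text-truncation script (the attribute name matches the jQuery "ThreeDots" ellipsis plugin), which would stash the full text in the attribute so it can re-truncate on resize without losing the original. A rough sketch of that idea in TypeScript (the function and selector are my guesses, not their actual code):

```typescript
// Guessed mechanism: keep the full text in a custom attribute, then trim the
// visible text to fit; because the original is never lost, the script can
// re-truncate whenever the layout changes.
function truncate(el: HTMLElement, maxChars: number): void {
  // Stash the full copy the first time we touch the element.
  if (!el.hasAttribute("threedots")) {
    el.setAttribute("threedots", el.textContent ?? "");
  }
  const full = el.getAttribute("threedots") ?? "";
  el.textContent =
    full.length > maxChars ? full.slice(0, maxChars - 3) + "..." : full;
}

const intro = document.querySelector<HTMLElement>(".intro-text");
if (intro) {
  truncate(intro, 120);
  window.addEventListener("resize", () => truncate(intro, 120));
}
```

If that's what it is, it's a usability feature rather than an SEO play; but does anyone know for sure?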
White Hat / Black Hat SEO | yousayjump
-
Duplicate Content due to Panda update!
I can see that a lot of you are worrying about this new Panda update just as I am! I have such a headache trying to figure this one out; can any of you help me? I have thousands of pages flagged as "duplicate content", and I just can't for the life of me see how. Take these two, for example:

http://www.eteach.com/Employer.aspx?EmpNo=18753
http://www.eteach.com/Employer.aspx?EmpNo=31241

My campaign crawler is telling me these are duplicate content pages because of the same title (which I can see) and because of the content (which I can't see). Can anyone see how Google is interpreting these two pages as duplicate content? Stupid Panda!
White Hat / Black Hat SEO | Eteach_Marketing
-
I'm worried my client is asking me to post duplicate content. Am I just being paranoid?
Hi SEOMozzers, I'm building a website for a client that provides photo galleries for travel destinations. As of right now, the website is basically a collection of photo galleries. My client believes Google might like us a bit more if we had more text content, so he has been sending me content that tourism organizations provide for free (tourism organizations will often provide free "one-pagers" about their destination for media).

My concern is that if this content is free, it seems likely that other people have already posted it somewhere on the web, and I'm worried Google could penalize us for posting content that already exists elsewhere. I know that conventionally there are ways around this (you can tell crawlers that the content shouldn't be indexed), but in our case we are specifically trying to produce crawlable, indexable content.

Do you think I should advise my client to hire some bloggers to produce original content, or am I just being paranoid? Thanks everyone. This is my first post to the Moz community 🙂
White Hat / Black Hat SEO | steve_benjamins