How do I use public content without being penalized for duplication?
-
The NHTSA publishes a list of all automobile recalls, and their "terms of use" state that the information may be copied. I want to add that list to our site so our audience has an up-to-date version. However, I'd just be copying and pasting. The NHTSA allows it, but Google will probably flag it, right? Is there a way to do this without being penalized?
Thanks,
Ruben
-
I didn't think about other sites, but that's a fabulous point. Best to play it safe.
Thanks for your input!
- Ruben
-
My gut says your idea to keep the content noindexed is best. Even if the content is unique, it borders on auto-generated. Now, I might change my mind if a lot of users were actively interacting with most of these pages. If not, though, you'll end up with a large portion of your site consisting of auto-generated content that doesn't seem useful. Plus, it's also possible that other sites are using this information, so you'd end up with content that is duplicated on other sites too.
I could be wrong, but my gut says not to try to use this content for ranking purposes.
-
I appreciate the follow-up, Marie. Please give me your thoughts on the following idea:
The NHTSA only posts the updates for the past month. If I noindex the page for now (which is what I'm doing) and wait five months, then what would happen? At that point, yes, the current month would be duplicated, but I'd have four months of "unique" content because the NHTSA deletes theirs. Plus, I could add pictures of all the automobiles, too. Do you think that would be enough to index it?
(I'm most likely going to keep it noindexed, because this borders on shady, or at least I could see Google taking it that way. But just as a thought experiment, what do you think? Or anyone else?)
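The "wait and accumulate" idea boils down to merging each month's feed into a local archive keyed by recall ID, so entries survive after the NHTSA drops them. Here is a minimal Python sketch of that merging step; the field names (`campaign_id`, etc.) and sample records are hypothetical, not the NHTSA's actual schema:

```python
# Hypothetical sketch: merge each month's recall feed into a local archive
# keyed by campaign ID, so records persist after the source removes them.

def merge_recalls(archive, latest):
    """Merge this month's recall records into the archive, keyed by ID."""
    merged = dict(archive)
    for record in latest:
        merged[record["campaign_id"]] = record  # newer data wins on re-runs
    return merged

archive = {}
january = [{"campaign_id": "23V-001", "make": "Acme", "summary": "Brake defect"}]
february = [{"campaign_id": "23V-002", "make": "Widget", "summary": "Airbag fault"}]

archive = merge_recalls(archive, january)
archive = merge_recalls(archive, february)
# The archive now holds both months' campaigns, even though the source
# page would only be showing February's at this point.
```

Run monthly, this would build up the multi-month archive described above, though whether that makes the content "unique enough" is the open question.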
Thanks,
Ruben
-
To expand on EGOL's answer: if you take someone else's content (even with their permission) and ask Google to index it, Google can see that a large portion of your site is copied content. That can trigger the Panda filter and cause Google to treat your whole site as low quality.
You can add a noindex tag as EGOL suggested, or you could use a canonical tag to show Google who the originator of the content is, but the noindex tag is probably easiest.
There is one other option as well. If you think it is possible that you can add significant value to the content that is being provided then you can still keep it indexed. If you can combine the recall information with other valuable information then that might be ok to index. But, you have to truly be providing value, not just padding the page with words to make it look unique.
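For reference, the two tag options described above look like this in a page's `<head>` (the canonical URL here is a hypothetical placeholder, not a verified NHTSA address):

```html
<!-- Option 1: keep the page out of the index entirely -->
<meta name="robots" content="noindex, follow" />

<!-- Option 2: point Google at the content's originator (hypothetical URL) -->
<link rel="canonical" href="https://www.nhtsa.gov/recalls" />
```

Note that a cross-domain canonical is only a hint to Google, not a directive, which is another reason noindex is the safer of the two.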
-
Alright, that sounds good. Thanks!
-
I had a bunch of republished articles on my website, posted mostly at the request of government agencies and universities. That site got hit in one of the early Panda updates. So, I deleted a lot of that content, and to the rest I added this line inside the `<head>` section:
<meta name="robots" content="noindex, follow" />
That tells google not to index the page but to follow the links and allow pagerank to flow through. My site recovered a few weeks later.
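If you apply this tag across many pages, it is worth verifying it actually made it into the markup. Here is a small standard-library Python sketch that checks a page's HTML for a noindex robots meta tag; you would feed it the fetched HTML of each page:

```python
# Check whether a page's HTML carries a "noindex" robots meta tag,
# using only the standard library's html.parser.
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if (tag == "meta"
                and attrs.get("name", "").lower() == "robots"
                and "noindex" in attrs.get("content", "").lower()):
            self.noindex = True

def has_noindex(html):
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex

page = '<head><meta name="robots" content="noindex, follow" /></head>'
print(has_noindex(page))  # True
```

This only inspects the HTML you give it; fetching each URL (and checking the `X-Robots-Tag` HTTP header, which can also carry noindex) is left out of the sketch.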