Is this dangerous (a content question)
-
Hi
I am building a new shop with unique products, but I also want to offer tips and articles on the same topic as the products (fishing). I think if I was to add the articles and advice one piece at a time, the site would look very empty and give people little reason to come back often.
The plan, therefore, is to launch the site pulling articles from a number of article websites, with the sites' permission. Obviously this would be 100% duplicate content, but it would make the user experience much better and add value to my site, as people are likely to keep returning even when they are not in the mood to purchase anything. It also offers the potential for people to email links to friends etc. Note: over time we will be adding more unique content and slowly turning off the pulled articles.
Anyway, from an SEO point of view I know the duplicate content would harm the site, but if I was to tell Google not to index the directory and block it from even crawling the directory, would it still know there is duplicate content on the site and apply the penalty to the non-duplicate pages? I'm guessing no, but it's always worth a second opinion.
Thanks
Carl
-
Hi Carl,
Several large publications do this sort of thing already, but they have a lot of content of their own to back up the duplicate / blocked content. The largest-scale example of this is newspapers that syndicate content from other papers, often internationally. I was the SEO on a project like this for a large UK paper, and we blocked the duplicated content's subfolder via robots.txt so that the newspaper was not re-publishing indexable content from its international sister.
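As a rough illustration of the robots.txt approach, here is a minimal Python sketch that checks which URLs a blocking rule would cover. The `/articles/` directory name, the `example.com` URLs and the `is_crawlable` helper are all assumptions made up for the example, not details of the newspaper project or of Carl's site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: "/articles/" is an assumed name for the
# directory that holds the syndicated (duplicate) content.
ROBOTS_TXT = """\
User-agent: *
Disallow: /articles/
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if user_agent may fetch url under the given robots.txt rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Syndicated article inside the blocked directory (assumed URL).
    print(is_crawlable(ROBOTS_TXT, "https://www.example.com/articles/carp-bait-tips"))
    # Product page outside the blocked directory (assumed URL).
    print(is_crawlable(ROBOTS_TXT, "https://www.example.com/products/fly-rod"))
```

One thing to keep in mind: a robots.txt rule stops crawling rather than indexing, so a crawler that obeys the block never fetches the page and cannot see a noindex tag placed on it.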
Your other option is to use the canonical tag to point back to the original version of the content.
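For the canonical route, each syndicated page would carry a link element in its head that names the original article's URL. A small Python sketch of generating that element follows; the `canonical_link_tag` helper and the example URL are hypothetical.

```python
from html import escape


def canonical_link_tag(original_url: str) -> str:
    """Build a <link rel="canonical"> element pointing at the original article.

    The URL is HTML-escaped so characters such as "&" in a query string
    stay valid inside the attribute value.
    """
    return f'<link rel="canonical" href="{escape(original_url, quote=True)}">'


if __name__ == "__main__":
    # Assumed URL of the original article on the source site.
    print(canonical_link_tag("https://www.example-articles.com/carp-bait-tips"))
```

The tag only works if the page can be crawled, so for any given page this is an alternative to the robots.txt block rather than something to combine with it.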
Syndication shouldn't be harmful, and if you were doing this with a lot of content on the site to begin with, it would be normal and fine. What worries me is Google seeing a new site where there is literally no content (to begin with) and a large, blocked section. After the Panda update, it's pretty important to show a resource-heavy website, even if the site fulfils its purpose without articles. For instance, a property search engine I worked on saw a huge Panda penalty because all of their articles were on an article subdomain, not on the same subdomain as the "money" part of their site. We had to move the articles over to the main site.
It's not possible for me to say exactly what will happen if you go ahead with this, but I'd advise building out your unique content both before launch and quickly post-launch. It's vital that unique, indexable content be live on the site for it to perform well, even for commercial queries that don't rely on a site having articles.
Cheers,
Jane