Duplicate content q - Can search engines tell where the original text was copied from?
-
I was under the impression that when a search engine comes across duplicate content it won't be able to determine which one is the original. Is this not the case?
-
Not sure canonical tags work that way! You use the canonical tag to provide the search engines with the "official" URL of your content. This is especially useful when you're using a CMS that delivers your content on a number of alternate urls. It won't help search engines determine the "owner" of the content.
Matt Cutts has said that Google is getting better at identifying the original owner of the site, but it's not perfect. Google don't have a magic way of seeing everything that goes on, they have to crawl which takes time.
I don't know if it would help if you manually submit your page to google using the Google Webmaster tools / fetch as googlebot option as soon as you purplish your page. Might help get your page into the index first. Not exactly a scalable approach if you're publishing a lot of material though.
One of the best ways of protecting yourself is to embed relevant (absolute) links into the body of your content pointing to related articles on your own site. One website I'm working on gets a surprising number of referrals that way! Oh,also think about adding links to your new content from some of your older content, if it's relevant and makes sense to do so.
Authorship markup might also help.
-
What if both sites added canonical tags on their content?
-
That all depends if the site that you have taken the content from has placed canonical tags on there content which says that there site was the original owner of the content. This means that you will get seen as having duplicate content and they wont.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
URL slash creating duplicate content
Hi All, I currently have an issue whereby by domain name (just homepage) has: mydomain.com and: mydomain.com/ Moz crawler flags this up as duplicate content - does anyone know of a way I can fix this? Thanks! Jack
Technical SEO | | Jack11660 -
Duplicate content and rel canonicals?
Hi. I have a question relating to 2 sites that I manage with regards to duplicate content. These are 2 separate companies but the content is off a data base from the one(in other words the same). In terms of the rel canonical, how would we do this so that google does not penalise either site but can also have the content to crawl for both or is this just a dream?
Technical SEO | | ProsperoDigital0 -
Duplicate content or Duplicate page issue?
Hey Moz Community! I have a strange case in front of me. I have published a press release on my client's website and it ranked right away in Google. A week after the page completely dropped and it completely disappeared. The page is being indexed in Google, but when I search "title of the PR", the only results I get for that search query are the media and news outlets that have reported the news. No presence of my client's page. I also have to mention that I found two URLs of the same page: one with lower case letters and one with capital letters. Is this a duplicate page or a duplicate content issue coming from the news websites? How can I solve it? Thanks!
Technical SEO | | Workaholic0 -
Duplicate Content - Reverse Phone Directory
Hi, Until a few months ago, my client's site had about 600 pages. He decided to implement what is essentially a reverse phone directory/lookup tool. There are now about 10,000 reverse directory/lookup pages (.html), all with short and duplicate content except for the phone number and the caller name. Needless to say, I'm getting thousands of duplicate content errors. Are there tricks of the trade to deal with this? In nosing around, I've discovered that the pages are showing up in Google search results (when searching for a specific phone number), usually in the first or second position. Ideally, each page would have unique content, but that's next to impossible with 10,000 pages. One potential solution I've come up with is incorporating user-generated content into each page (maybe via Disqus?), which over time would make each page unique. I've also thought about suggesting that he move those pages onto a different domain. I'd appreciate any advice/suggestions, as well as any insights into the long-term repercussions of having so many dupes on the ranking of the 600 solidly unique pages on the site. Thanks in advance for your help!
Technical SEO | | sally580 -
URL query considered duplicate content?
I have a Magento site. In order to reduce duplicate content for products of the same style but with different colours I have combined them on to 1 product page. I would like to allow the pictures to be dynamic, i.e. allow a user to search for a colour and all the products that offer that colour appear in the results, but I dont want the default product image shown but the product image for that colour applying to the query. Therefore to do this I have to append a query string to the end of the URL to produce this result: www.website.com/category/product-name.html?=red My question is, will the query variations then be picked up as duplicate content: www.website.com/category/product-name.html www.website.com/category/product-name.html?=red www.website.com/category/product-name.html?=yellow Google suggest it has contingencies in its algorithm and I will not be penalised: http://googlewebmastercentral.blogspot.co.uk/2007/09/google-duplicate-content-caused-by-url.html But other sources suggest this is not accurate. Note the article was written in 2007.
Technical SEO | | BlazeSunglass0 -
How to find original URLS after Hosting Company added canonical URLs, URL rewrites and duplicate content.
We recently changed hosting companies for our ecommerce website. The hosting company added some functionality such that duplicate content and/or mirrored pages appear in the search engines. To fix this problem, the hosting company created both canonical URLs and URL rewrites. Now, we have page A (which is the original page with all the link juice) and page B (which is the new page with no link juice or SEO value). Both pages have the same content, with different URLs. I understand that a canonical URL is the way to tell the search engines which page is the preferred page in cases of duplicate content and mirrored pages. I also understand that canonical URLs tell the search engine that page B is a copy of page A, but page A is the preferred page to index. The problem we now face is that the hosting company made page A a copy of page B, rather than the other way around. But page A is the original page with the seo value and link juice, while page B is the new page with no value. As a result, the search engines are now prioritizing the newly created page over the original one. I believe the solution is to reverse this and make it so that page B (the new page) is a copy of page A (the original page). Now, I would simply need to put the original URL as the canonical URL for the duplicate pages. The problem is, with all the rewrites and changes in functionality, I no longer know which URLs have the backlinks that are creating this SEO value. I figure if I can find the back links to the original page, then I can find out the original web address of the original pages. My question is, how can I search for back links on the web in such a way that I can figure out the URL that all of these back links are pointing to in order to make that URL the canonical URL for all the new, duplicate pages.
Technical SEO | | CABLES0 -
Duplicate Content Caused By Blog Filters
We are getting some duplicate content warnings based on our blog. Canonical URL's can work for some of the pages, but most of the duplicate content is caused by blog posts appearing on more than 1 URL. What is the best way to fix this?
Technical SEO | | Marketpath0 -
Anchor text in Flash Discoverable by Search Engines?
What recommendations do you all have to make anchor text discoverable in flash? More importantly is it even possible and does it contribute to link juice?
Technical SEO | | sunfever0