How much (%) of the content of a page is considered too much duplication?
-
Google is not fond of duplication, I have been very kindly told. So how much would you suggest is too much?
-
I would not use a canonical tag for www vs. non-www; use a 301 redirect.
There is also a tutorial there to fix the index.html problem. These tutorials are for Microsoft IIS server; if you are on Linux, you will need to find the .htaccess alternatives.
I always go for the non-www version, as www is of no use, so why have it? But in your case I would look at what your links point to.
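To make the redirect idea concrete, here is a minimal Python sketch of the mapping a 301 rule would enforce: every www/non-www and /index variant resolves to one target URL. The function name `redirect_target`, the `prefer_www` flag, and the `INDEX_NAMES` set are made up for illustration; the real rule would live in your IIS or .htaccess configuration, not in Python.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: these file names are served as the directory index.
INDEX_NAMES = {"index", "index.html", "index.htm"}


def redirect_target(url: str, prefer_www: bool = False) -> str:
    """Return the single URL that this variant should 301 to."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # Strip any leading "www." and add it back only if that is the preferred form.
    bare = host[4:] if host.startswith("www.") else host
    host = "www." + bare if prefer_www else bare
    if parts.port:
        host = f"{host}:{parts.port}"
    path = parts.path or "/"
    # Collapse /index and /index.html onto the directory URL.
    head, _, tail = path.rpartition("/")
    if tail.lower() in INDEX_NAMES:
        path = head + "/"
    # Keep the query string, drop any fragment.
    return urlunsplit((parts.scheme, host, path, parts.query, ""))


print(redirect_target("http://www.waspkilluk.co.uk/index"))
print(redirect_target("http://www.waspkilluk.co.uk"))
```

Both calls print the same non-www root URL. If most of your inbound links point at the www version, pass `prefer_www=True` and the same variants collapse onto the www root instead.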
-
Hi Alan
Thank you for taking the time to offer advice. I have read your pages and they raise some interesting points. One that, although basic, I haven't yet paid much attention to is "the choice of www or non-www".
This is interesting in respect of how I set my canonical tags up. I noticed that I rank differently for www.waspkilluk.co.uk than for www.waspkilluk.co.uk/index. So it seems I need to add a canonical tag there. I guess index is my home page - but then isn't the root domain also my default homepage?
In fact - do you think you should set up canonical tags without the www. or won't this work?
Sorry for creating questions from questions.
Warm Regards
Simon
-
Hi James
I have had a thorough study of this issue today and your ideas have proved fruitful. I checked out the article by Matt Cutts (http://www.mattcutts.com/blog/canonical-link-tag/) and then read the article by Rand Fishkin (http://www.seomoz.org/blog/canonical-url-tag-the-most-important-advancement-in-seo-practices-since-sitemaps).
It will take a few weeks to implement across the thousand or so pages I have, but it will be interesting to see how, or if, it finally affects the root domain's ranking.
Many thanks
Simon
-
James gives a good response.
I have a few tutorial pages where a lot of the instructions are the same, but they are still indexed and rank.
They may be a guide to what you can get away with:
http://thatsit.com.au/seo/tutorials/how-to-fix-canonical-domain-name-issues
http://thatsit.com.au/seo/tutorials/how-to-fix-canonical-issues-involving-the-trailing-slash
http://thatsit.com.au/seo/tutorials/how-to-fix-canonical-issues-involving-the-upper-and-lower-case
-
It is hard to give an accurate percentage. In my eyes, if you want to be in the clear, just put unique content on each page; if the content is not unique, place a canonical tag pointing to the right page.
Google is coming down harder and harder on sites for poor-quality or duplicate content. If you play by the rules and do things right, it will pay off as a long-term strategy.
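If you still want a rough number for how much two of your own pages overlap, one common approach is to compare their word "shingles" (overlapping runs of words) and take the Jaccard overlap. The sketch below is only an illustration: the function names and the five-word shingle size are assumptions, and nothing here reflects how Google measures duplication or where it draws a line.

```python
import re


def shingles(text: str, size: int = 5) -> set:
    """Return the set of overlapping word runs ('shingles') in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one shingle: treat the whole text as a single run.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def percent_shared(page_a: str, page_b: str, size: int = 5) -> float:
    """Jaccard overlap of the two pages' shingle sets, as a percentage."""
    a, b = shingles(page_a, size), shingles(page_b, size)
    if not a and not b:
        return 0.0
    return 100.0 * len(a & b) / len(a | b)
```

Run it on the visible text of two pages (with the shared template stripped out) to see which pairs overlap most; pages near the top of that list are the ones to rewrite or point at a canonical.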