How do SEOMOZ calculate duplicate content?
-
first of all i have to much duplicate stuff on my website end cleaning it up. But if i look at GWMC the duplicate stuff is a lot less than in SEOMOZ? can someone explain to me what the difference is?
Thnx, Leonie.
-
Hi Andre, Thnx for the reply. i'll read it
-
Moz doesn't just look at the text of a page, it also looks at the template and how "similar" it appears compared to other pages.
Here's a quote from Dr. Pete:
"Our system currently uses a threshold of 95% to determine whether content is duplicated. This is based on the source code (not the text copy), so the amount of actual duplicate content may vary depending on the code/content ratio."
Here are a few articles you can read to get a deeper understanding.
http://www.seomoz.org/blog/duplicate-content-in-a-post-panda-world
http://www.seomoz.org/blog/duplicate-content-block-redirect-or-canonical
http://www.seomoz.org/blog/the-illustrated-guide-to-duplicate-content-in-the-search-engines
http://www.seomoz.org/blog/rethinking-duplicate-content
http://www.seomoz.org/blog/fat-pandas-and-thin-content
http://www.seomoz.org/blog/the-illustrated-guide-to-duplicate-content-in-the-search-engines
Greg
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
In Wordpress getting marked as duplicate content for tags
Moz is marking 11 high priority items for duplicate content. Just switched to wordpress and publishing articles for the site but only have a few. The problem is on the tag pages. Since there aren't very many articles so when you go to the tag pages it lists one or two articles and hence there are pages with duplicate content. Most of the articles have the same tags / categories. Perhaps I'm using too many tags and categories? I'm using about 7 tags and around 2 categories for each post / event. I've read the solution is using canonical tags but a little confused on which page I should use for the tag and then I believe I need to point the duplicate pages to the correct page. For example, I have two events that are for dances and both have the same tags. So when you visit, site.com/tags/dance or site.com/events both pages have the same articles listed. Which page do I select as having the original content? Does it matter? Does that make sense? Someone was also saying I could use the Yoast plugin to fix, but not really seeing anything in the Yoast tools. I also see 301 redirects mentioned as a solution but the tag pages will be changing as we add new articles and they have a purpose so not really seeing that as a solution.
Web Design | | limited70 -
Is it cloaking/hiding text if textual content is no longer accessible for mobile visitors on responsive webpages?
My company is implementing a responsive design for our website to better serve our mobile customers. However, when I reviewed the wireframes of the work our development company is doing, it became clear to me that, for many of our pages, large parts of the textual content on the page, and most of our sidebar links, would no longer be accessible to a visitor using a mobile device. The content will still be indexable, but hidden from users using media queries. There would be no access point for a user to view much of the content on the page that's making it rank. This is not my understanding of best practices around responsive design. My interpretation of Google's guidelines on responsive design is that all of the content is served to both users and search engines, but displayed in a more accessible way to a user depending on their mobile device. For example, Wikipedia pages have introductory content, but hide most of the detailed info in tabs. All of the information is still there and accessible to a user...but you don't have to scroll through as much to get to what you want. To me, what our development company is proposing fits the definition of cloaking and/or hiding text and links - we'd be making available different content to search engines than users, and it seems to me that there's considerable risk to their interpretation of responsive design. I'm wondering what other people in the Moz community think about this - and whether anyone out there has any experience to share about inaccessable content on responsive webpages, and the SEO impact of this. Thank you!
Web Design | | mmewdell0 -
Duplicate Content? Designing new site, but all content got indexed on developer's sandbox
An ecommerce I'm helping is getting a complete redesign. Their developer had a sandbox version of their new site for design & testing. Several thousand products were loaded into the sandbox site. Then Google/Bing crawled and indexed the site (because developer didn't have a robots.txt), picking up and caching about 7,200 pages. There were even 2-3 orders placed on the sandbox site, so people were finding it. So what happens now?
Web Design | | trafficmotion
When the sandbox site is transferred to the final version on the proper domain, is there a duplicate content issue?
How can the developer fix this?0 -
Unique content but exactly the same graphical layout - a problem?
Hello, I have a coaching website at www.bobweikel.com I want to make a second coaching website with the same exact wordpress theme, totally identical except a slightly different logo image. Everything the same. The only difference is that all the content will be unique on the new site. Does it matter to Google that the graphics are absolutely identical? I assume it's fine, I just am making sure.
Web Design | | BobGW0 -
How much does on-site duplicated content affect SERPs?
Hi, We've recently gotten into Moz, with our E-commerce websites, and discovered that it's crawler takes note of about 2500 pages which it thinks are the same (duplicated). We've now begun to completely rewrite every description of every product (including Meta Title/Description) so that this number may be reduced. Since this is the biggest issue Moz spots I'm wondering what the effect of fixing it will be on our position in the SERP (mainly Google). Does anybody have some stories or experience about this topic? Thanks in Advance! 🙂 Alexander
Web Design | | WebmasterAlex0 -
Duplicate Titles for Large Lists
Our blog (www.cowleyweb.com/blog) has recently been given topic categories so we can utilize our old blogs. Otherwise, users would only see what's new and never look back (our blogs are organized by the month they were published) and all that hard work would kind of be a waste after a while. So we came up with a few topics (i.e. social media, internet marketing, etc.) and adding those as tags to blogs. Now, users can click the topics and get a results page on our blog of all the previously published blogs related to that topic. Sounds great. BUT, it's hurting our SEO crawl report. If the list goes beyond one page of search results, the 2nd and subsequent pages get dinged as "duplicate title" b/c they share the same title (i.e. "Social Media"). How can I fix this? I'm not the web designer but something tells me maybe some sort of tag that says "Page 2" or something would do the trick. We use Drupal which is good for customization. I assume tons of bloggers and websites have dealt with this problem. Please help. Want to give the web guy some solutions. Thank you.
Web Design | | JCunningham0 -
Which Content Causes Duplicate Content Errors
My Duplicate Content list starts off with this URL: http://www.nebraskamed.com/about-us/branding/bellevue-medical-center-logo Then it lists the five below as Duplicate Content: http://www.nebraskamed.com/about-us/branding/fonts http://www.nebraskamed.com/about-us/branding/clear-zone http://www.nebraskamed.com/about-us/social-media http://www.nebraskamed.com/about-us/branding/order-stationery http://www.nebraskamed.com/about-us/branding/logo I do notice that most of these pages have images and/or little or no content outside of our sites template. Is this causing SEOmoz to see it as duplicate? Should I use noindex, follows to fix this? This error is happening with branding pages so noindex is an option. What should I do if that's not an option? Should I change our mega menus to be ajax driven to so the links aren't showing up in the code of every page?
Web Design | | Patrick_at_Nebraska_Medicine0 -
Suggestions for content slider/image slider copy/paste application.
Hey Moz Community, I am looking for a content slider that can be easily changed by non-technicals for posting different styles of content/calls to action and this seems to be best: http://www.slidedeck.com/ I have installed a nivo slider on a Seattle Painting site, and flash slider on a commercial painting site. But I want my blog clients to be able to format..then copy/paste code..linke embedding a video. opinions? Thanks John
Web Design | | johnshearer0