Duplicate Page Content
-
Hey Moz Community,
Newbie here. I'm in my second week with Moz and I love it, but I have two questions about crawl errors:
1. I have a few pages flagged for duplicate content, but the report says 0 duplicate URLs. How do I know what is duplicated in this instance?
2. I'm not sure if anyone here is familiar with IDX for real estate websites, but I have it set up on my site, and it seems that all the links it generates for different homes for sale show up as duplicate pages.
For instance, http://www.handyrealtysa.com/idx/mls...tonio_tx_78258 is listed as having duplicate page content, with 7 duplicate URLs:
http://www.handyrealtysa.com/idx/mls...tonio_tx_78247
http://www.handyrealtysa.com/idx/mls...tonio_tx_78253
http://www.handyrealtysa.com/idx/mls...tonio_tx_78245
http://www.handyrealtysa.com/idx/mls...tonio_tx_78261
http://www.handyrealtysa.com/idx/mls...tonio_tx_78258
http://www.handyrealtysa.com/idx/mls...tonio_tx_78260
http://www.handyrealtysa.com/idx/mls...tonio_tx_78260
I've attached a screenshot showing 2 of the pages that are flagged for duplicate page content but have 0 duplicate URLs. It also shows some of the IDX duplicate pages.
rel="canonical" appears to be working on these pages, at least judging by the page source.
Any help is greatly appreciated.
-
The contact-us page redirects to a different URL (about-us/contact-us), but the original source code for www.handyrealtysa.com/contact-us matches http://www.handyrealtysa.com/community and http://www.handyrealtysa.com/resources, both of which have no content in the main area.
Although pages with a high percentage of matching code can be considered duplicates, our crawler also looks at the main content area to see whether anything matches there. In the links above, the content outside the navigation and header is different.
-
Can you provide me with a couple of pages that are similar but not flagged as a duplicate?
-
Thanks for the responses.
I used the page checker and it shows most of the IDX pages are 98% similar. That can't be good. I've put the question to my IDX provider and am awaiting their answer.
With regard to the similar pages that show 0 duplicate URLs, what can I do to look into this? These seem to be non-IDX pages, so I could likely do more to fix the error on these pages.
Thanks again!
-
Campaigns have a 90% tolerance for duplicate content. This covers all the source code on the page, not just the viewable text, so if a URL is at least 90% similar in code to another URL, this warning will appear. Although the pages in question may appear different on the front end, they are duplicates based on this percentage (at least for the example URLs I checked in your campaigns).
You can run your own tests using this tool: http://www.webconfs.com/similar-page-checker.php
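If you would rather compare two pages locally, here is a rough Python sketch of the same idea. It is only an illustration, not the crawler's actual algorithm (which isn't described in this thread): it compares the full source of two pages with the standard library's difflib and applies the 90% threshold mentioned above. The function names and the sample pages are made up for the example.

```python
from difflib import SequenceMatcher

# Assumption: the 90% tolerance is applied to a simple character-level ratio.
DUPLICATE_THRESHOLD = 0.90


def source_similarity(html_a: str, html_b: str) -> float:
    """Return a similarity ratio between 0 and 1 over the full page source."""
    return SequenceMatcher(None, html_a, html_b, autojunk=False).ratio()


def is_flagged_duplicate(html_a: str, html_b: str) -> bool:
    """True when the two sources are at least 90% similar."""
    return source_similarity(html_a, html_b) >= DUPLICATE_THRESHOLD


# Hypothetical pages: a large shared template with one small unique line,
# similar to listing pages that differ only by ZIP code.
template = (
    "<html><head><title>Homes</title></head><body><nav>"
    + "<a href='/x'>link</a>" * 50
    + "</nav><p>{}</p></body></html>"
)
page_a = template.format("Listing in 78258")
page_b = template.format("Listing in 78247")

print(is_flagged_duplicate(page_a, page_b))
```

The point of the example is that shared navigation and template code can dominate the ratio, so pages that look different in the browser can still cross the threshold.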
We don't know what standard Google uses, but it's safe to say they are a bit more sophisticated than us, so you might be okay in this regard as long as you have a couple hundred words of unique text per page. Google won't say how much duplicate content is too much, so we prefer to err on the side of caution.
Hope this helps!
-
Looking at your problem from an SEO viewpoint, it's always best for a website not to have any duplicate content. So maybe try linking to the source of the listing on the IDX website.
Your rel="canonical" is in place and in the section where it needs to be.
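If you want to repeat that check on other pages without reading the source by hand, something like the sketch below could work. It uses Python's built-in html.parser to look for a rel="canonical" link inside the head section; the class name, function name, and example.com URL are hypothetical, and it only inspects the HTML you pass in.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Records the first <link rel="canonical"> found inside <head>."""

    def __init__(self) -> None:
        super().__init__()
        self.in_head = False
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link" and self.in_head and self.canonical is None:
            attr = dict(attrs)
            # rel can hold several space-separated values
            rel_values = (attr.get("rel") or "").lower().split()
            if "canonical" in rel_values:
                self.canonical = attr.get("href")

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def find_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in <head>, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical page source
sample = (
    '<html><head><link rel="canonical" href="http://example.com/a">'
    "</head><body></body></html>"
)
print(find_canonical(sample))
```

A result of None means no canonical link was found in the head, which is worth a closer look since that is the section where the tag needs to be.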
The duplicate content may be coming not from what you are doing, but from what other similar sites are doing. How many other real estate sites use the identical keywords and description for the same listing as you? These similar listings on other sites could be the cause of the duplicate content issues on your site. I guess my question would be: how many other sites have a house listed at 20615 Wild Springs Dr, San Antonio, TX 78258 (MLS # 1034019) using the same address and description as you?
My understanding is that this is a common problem with IDX. I'm not sure this solves your problem, but it may explain why you are having a duplicate content issue.