Crawl reports urls with duplicate content but its not the case
-
Hi guys!
Some hours ago I received my crawl report.I noticed several records with urls with duplicate content so I went to open those urls one by one.
Not one of those urls were really with duplicate content but I have a concern because website is about product showcase and many articles are just images with href behind them. Many of those articles are using the same images so maybe thats why the seomoz crawler duplicate content flag is raised. I wonder if Google has problem with that too.See for yourself how it looks like:
http://by.vg/NJ97y
http://by.vg/BQypEThose two url's are flagged as duplicates...please mind the language(Greek) and try to focus on the urls and content.
ps: my example is simplified just for the purpose of my question.
<colgroup><col width="3436"></colgroup>
| URLs with Duplicate Page Content (up to 5) | -
Disclaimer: I just answered a question just like this on another thread, so I literally copied and pasted my response from there, and edited where necessary.
The SEOmoz web app uses a similarity threshold of 95% of the html code. This takes everything on the page, both hidden and visible into account.
In this case, it's counting all of the navigation and sidebar as well, which is significant. What's left of the unique content - the part that matters, makes up less than 5% of the code.Here's a tool you can use to check the similarity: http://www.duplicatecontent.net/
I ran the pages through a couple of tools which showed 98% similarity. (but only 75% text similarity, which is good, but not great)
SEOKeith is absolutely right that there's very little on those pages to help them rank. Without text, you're fighting an uphill battle.
Hope this helps! Best of luck with your SEO.
-
Yeah, thats what I m going to do in my next meeting. Either way I also feel such websites need to have more pics than anything else, maybe a blog page or separate pages with articles could link to those products one by one with related description having a side content website for the actual product pages.
-
Maybe explain to the client it's not going to rank as well without text and has less chance of getting found by searches (generally speaking...).
I get duplicate content flagging as well sometimes, I check the pages manually when it happens.
-
Thanks Keith. I ve been using seomoz for some days so I wasnt sure about this.
Client wants website with as less text as possible so I guess my only hopes are title and alt attributes.
-
Those pages are very similar so it's probably throwing the duplicate content switch in SEOmoz, you might want to ignore it in this case.
I would add some more text to those pages personally to aid with ranking, you can position the text over the images with CSS.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Duplicate Content/Missing Meta Description | Pages DO NOT EXISIT!
Hello all, For the last few months, Moz has been showing us that our site has roughly 2,000 duplicate content errors. Pages that were actually duplicate content, I took care of accordingly using best practice (301 redirects, canonicalization,etc.). Still remaining after these fixes were errors showing for pages that we have never created. Our homepage is www.primepay.com. An example of pages that are being shown as duplicate content is http://primepay.com/blog/%5BLink%20to%20-%20http:/www.primepay.com/en/payrollservices/payroll/payroll/payroll/online-payroll with a referring page of http://primepay.com/blog/%5BLink%20to%20-%20http:/www.primepay.com/en/payrollservices/payroll/payroll/online-payroll. Some of these are even now showing up as 403 and 404 errors. The only real page on our site within that URL strand is primepay.com/payroll or primepay.com/payroll/online-payroll. Therefore, I am not sure where Moz is getting these pages from. Another issue we are having in relation to duplicate content is that moz is showing old campaign url’s tacked on to our blog page i.e. http://primepay.com/blog?title=&page=2&utm_source=blog&utm_medium=blogCTA&utm_campaign=IRSblogpost&qt-blog_tabs=1. As of this morning, our duplicate content went from 2,000 to 18,000. I exported all of our crawl diagnostics data and looked to see what the referring pages were, and even they are not pages that we have created. When you click on these links, they take you to a random point in time from the homepage of our blog; some dating back to 2010. I checked our crawl stats in both Google and Bing’s Webmaster tool, and there are no duplicate content or 400 level errors being reporting from their crawl. My team is truly at a loss with trying to resolve this issue and any help with this matter would be greatly appreciated.
Moz Pro | | PrimePay0 -
Duplicate content across two websites
Hi. I'm looking at ways to compare duplicate content across two different websites instead of one, as with the Moz crawler. Instead it will flag up up duplicates present on both site A and B.
Moz Pro | | Blink-SEO0 -
Why do I see a duplicate content errors when rel="canonical" tag is present
I was reviewing my first Moz crawler report and noticed the crawler returned a bunch of duplicate page content errors. The recommendations to correct this issue are to either put a 301 redirect on the duplicate URL or use the rel="canonical" tag so Google knows which URL I view as the most important and the one that should appear in the search results. However, after poking around the source code I noticed all of the pages that are returning duplicate content in the eyes of the Moz crawler already have the rel="canonical" tag. Does the Moz crawler simply not catch whether that tag is being used? If I have that tag in place, is there anything else I need to do in order to get that error to stop showing up in the Moz crawler report?
Moz Pro | | shinolamoz0 -
Crawl diagnostics incorrectly reporting duplicate page titles
Hi guys, I have a question in regards to the duplicate page titles being reported in my crawl diagnostics. It appears that the URL parameter "?ctm" is causing the crawler to think that duplicate pages exist. In GWT, we've specified to use the representative URL when that parameter is used. It appears to be working, since when I search site:http://www.causes.com/about?ctm=home, I am served a single search result for www.causes.com/about. That begs the question, why is the SEOMoz crawler saying there is duplicate page titles when Google isn't (doesn't appear under the HTML improvements for duplicate page titles)? A canonical URL is not used for this page so I'm assuming that may be one reason why. The only other thing I can think of is that Google's crawler is simply "smarter" than the Moz crawler (no offense, you guys put out an awesome product!). Any help is greatly appreciated and I'm looking forward to being an active participant in the Q&A community! Cheers, Brad
Moz Pro | | brad_dubs0 -
Crawl test from tools
Hi, I notice that the crawl test which is from the Research Tools doesn't really get a new crawl even though there is 2 crawl per day. It will only provide the data which was acquire from the crawl diagnostics in my pro account. There is no point for me to get the data which I get from my crawl diagnostic isn't it? Even seomoz provided with more than 2 crawl per day also useless in this case. This whole thing doesn't make sense as the crawl diagnostics will only perform a full crawl test once every week. but even the crawl test also not helping any thing out for me.
Moz Pro | | hanzoz0 -
Rank Report Not Updating
The ranking tool indicates that a new rank report will be generated every Tuesday. My last report was on November 15th. It is now November 25th. It missed last week. Should I report something wrong or is this normal? How do I keep it updating on a weekly basis like it's supposed to? I checked in settings to see if I had done anything wrong and couldn't find anything. Regards, Dino
Moz Pro | | Dino640 -
Campaign Crawl Report
Hello, Just a quicky, is there anyway I can do a crawl report for something in a campaign so I can compare the changes? I know you can do a separate crawl test, but it wont show the differences,and the next crawl date isnt untill the 28th.
Moz Pro | | Prestige-SEO0 -
On-Page URL
Hopefully I am missing something basic... I can't see how to specifically add and delete On-Page reports. It seems like running a report adds it but how to delete? Also, how does one change the URL for a report? I have re-organized some pages and can't seem the get the on-page report to keep my URL change. Here is what I tried. From the On-Page report card for a keyword I changed the URL and ran the test. Test runs ok but if I navigate back to the summary my old bad URL is still there.
Moz Pro | | Banknotes0