Crawl reports URLs with duplicate content, but that's not the case
-
Hi guys!
A few hours ago I received my crawl report. I noticed several records for URLs with duplicate content, so I opened those URLs one by one.
None of those URLs actually had duplicate content, but I have a concern: the website is a product showcase, and many articles are just images with links behind them. Many of those articles use the same images, so maybe that's why the SEOmoz crawler raises the duplicate content flag. I wonder if Google has a problem with that too. See for yourself what it looks like:
http://by.vg/NJ97y
http://by.vg/BQypE
Those two URLs are flagged as duplicates. Please ignore the language (Greek) and focus on the URLs and content.
PS: My example is simplified just for the purpose of my question.
-
Disclaimer: I just answered a question just like this on another thread, so I literally copied and pasted my response from there, and edited where necessary.
The SEOmoz web app uses a similarity threshold of 95% of the HTML code. This takes everything on the page, both hidden and visible, into account.
In this case, it's counting all of the navigation and sidebar as well, which is significant. The unique content that's left, the part that matters, makes up less than 5% of the code.
Here's a tool you can use to check the similarity: http://www.duplicatecontent.net/
I ran the pages through a couple of tools, which showed 98% similarity (but only 75% text similarity, which is good, but not great).
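To make the difference between code similarity and text similarity concrete, here is a minimal Python sketch. It is only an illustration: it uses the standard library's `difflib.SequenceMatcher` as a stand-in measure, not whatever the SEOmoz crawler or the tools above actually compute, and the two pages (`page_a`, `page_b`), the `NAV` block, and the helper names are hypothetical.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML document, ignoring tags and attributes."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())


def text_only(html):
    """Return the text content of the page with all markup stripped."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def similarity(a, b):
    """Ratio in [0, 1]; 1.0 means the two strings are identical."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


# Hypothetical shared navigation: image links only, identical on every page.
NAV = "<ul id='nav'>" + "".join(
    f"<li><a href='/category/{i}'><img src='/icons/{i}.png' alt=''></a></li>"
    for i in range(20)
) + "</ul>"


def make_page(title, link, alt, text):
    """Build a hypothetical product page: shared template plus a small unique part."""
    return (
        f"<html><head><title>{title}</title></head><body>{NAV}"
        f"<div id='content'><a href='{link}'>"
        f"<img src='/img/shared.jpg' alt='{alt}'></a><p>{text}</p></div>"
        "</body></html>"
    )


page_a = make_page("Red sofa", "/products/red-sofa", "Red sofa", "Red leather sofa")
page_b = make_page("Oak table", "/products/oak-table", "Oak table", "Round oak table")

# Comparing the full source counts the shared navigation markup;
# comparing the stripped text counts only what differs between the pages.
html_sim = similarity(page_a, page_b)
text_sim = similarity(text_only(page_a), text_only(page_b))

print(f"HTML similarity: {html_sim:.0%}")
print(f"Text similarity: {text_sim:.0%}")
```

Because the shared navigation makes up most of the source, the HTML score comes out far higher than the text score, which is the same pattern as on the two flagged pages.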
SEOKeith is absolutely right that there's very little on those pages to help them rank. Without text, you're fighting an uphill battle.
Hope this helps! Best of luck with your SEO.
-
Yeah, that's what I'm going to do in my next meeting. Either way, I also feel such websites need more pictures than anything else. Maybe a blog page or separate pages with articles could link to those products one by one with related descriptions, acting as a side content site for the actual product pages.
-
Maybe explain to the client that it's not going to rank as well without text and has less chance of being found through search (generally speaking...).
I get duplicate content flags as well sometimes; I check the pages manually when it happens.
-
Thanks Keith. I've only been using SEOmoz for a few days, so I wasn't sure about this.
The client wants a website with as little text as possible, so I guess my only hopes are title and alt attributes.
-
Those pages are very similar, so that's probably what trips the duplicate content flag in SEOmoz. You might want to ignore it in this case.
Personally, I would add some more text to those pages to aid with ranking; you can position the text over the images with CSS.