What is Considered Duplicate Content by Crawlers?
-
I am asking because I use a couple of site audit tools to crawl a site I work on every week, and they are showing duplicate content issues (which I know there are a lot of on this site), but some of what is flagged as duplicate content makes no sense.
For example, the following URLs were grouped together as duplicate content:
https://www.firefold.com/contact-us
https://www.firefold.com/sale
How are these pages duplicate content? I am confused about what site audit tools consider duplicate content.
Just FYI, this is data from Moz crawl diagnostics, but the SEMrush site auditor is giving me the same type of data.
Any help would be greatly appreciated.
Ryan
-
Yeah, I just started working on this site. I haven't used Moz Analytics much, so I just wanted to see how their crawler crawls pages.
And yes, I agree, there are a lot of BIG BIG BIG issues with this site.
I've got a large workload over the next few months haha.
-
I would add that there is no text on any of those three pages - any "text" one would see there is actually just embedded in an image - which is a huge issue for a number of reasons:
- Search engines see that there's no text - a big no-no.
- You're getting practically no SEO value from the content that would be there, even if there isn't much.
- It's heavier this way - which makes load times slower.
I want to clarify that there are many bigger issues with these pages, but as your question concerns only duplicate content, I'll leave all of that out for the time being. To summarize, Google, Yahoo, and Bing are just seeing some duplicate banners, sidebars, etc. and then some images in the body of your pages. Hence, duplicate content.
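To make the "no text" point concrete, here is a minimal sketch of how one could count the words a text-only crawler would find on a page. It uses Python's standard `html.parser`; the class name, the function name, and the sample markup are all made up for illustration, and no real crawler is claimed to work exactly this way.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects words from text nodes, skipping <script> and <style> contents."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.words = []
        self._skip_depth = 0  # > 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.words.extend(data.split())


def visible_word_count(html: str) -> int:
    """Return the number of whitespace-separated words outside script/style."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return len(parser.words)


# Hypothetical markup: the first page keeps its "copy" inside an image,
# the second has the same navigation plus real body text.
image_only = '<html><body><div id="nav">Home Sale</div><img src="contact-banner.jpg"></body></html>'
with_text = '<html><body><div id="nav">Home Sale</div><p>Call us weekdays from nine to five.</p></body></html>'

print(visible_word_count(image_only))
print(visible_word_count(with_text))
```

On the image-only sample, the only words found come from the shared navigation, which is the situation described above: every page looks like the same template with nothing unique in the body.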
-
Thanks for that information.
It makes sense looking at the data and pages from that perspective.
-
Hi Ryan!
Our crawler will flag pages that have at least 90% similarity across their entire source code, not just the body.
The way you want to interpret the report is that the contact-us page has 35 duplicates, so "gabe" and "sale" are not dupes of each other in this section; each is only a duplicate of "contact-us". Those URLs might appear with their own duplicates of the same pages further down in the report.
While the pages do not appear to be similar on the front end, the issue is likely with the amount of JavaScript code on those pages.
Our crawler cannot read JavaScript, so we are likely only able to see the template of the page. Other search tools are probably seeing the same thing, as this tool returns 79% similarity: http://www.freebulkseotools.com/similar-page-checker-tool.php
I can't provide much insight from a dev perspective but hope this helps!
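For anyone who wants to sanity-check a pair of pages themselves, here is a rough sketch of a whole-source similarity check. It is not Moz's actual algorithm, which is not described in this thread; it simply compares the raw source of two pages token by token with Python's standard `difflib.SequenceMatcher` and applies the 90% threshold mentioned above. The function names and the sample pages are hypothetical.

```python
from difflib import SequenceMatcher


def source_similarity(html_a: str, html_b: str) -> float:
    """Return a 0..1 similarity ratio over whitespace-separated tokens of the full source."""
    tokens_a = html_a.split()
    tokens_b = html_b.split()
    return SequenceMatcher(None, tokens_a, tokens_b, autojunk=False).ratio()


def is_duplicate(html_a: str, html_b: str, threshold: float = 0.90) -> bool:
    """Flag a pair as duplicate when the similarity meets the threshold (assumed 90%)."""
    return source_similarity(html_a, html_b) >= threshold


# Hypothetical pages: a large shared template (navigation) and a tiny unique body.
template = " ".join(f"<li>nav{i}</li>" for i in range(50))
page_a = template + " <img src=a.jpg>"
page_b = template + " <img src=b.jpg>"
page_c = "<p>Completely different copy about cables and adapters.</p>"

print(is_duplicate(page_a, page_b))  # shared template dominates the source
print(is_duplicate(page_a, page_c))  # nothing in common
```

The point of the sketch is that when the template makes up nearly all of the source a crawler can read, two pages that look different in a browser can still land above the threshold.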