Craw Diagnostics Questions
-
SEO Moz is reporting that I have 50+ pages with a duplicate content issue based on this URL: http://www. f r e d aldous.co.uk/art-shop/art-supplies/art-canvas.html?manufacturer=178
But I have included this tag in the source: rel="canonical" href="http://www.f r e daldous.co.uk/art-shop/art-supplies/art-canvas.html"/>
(I have purposefully added white space to the URLs in this message as I'm not sure about the rules for posting links here)
I though this "canonical" tag prevented the duplicate content being indexed?
is the reporting by SEOMoz wrong or being over cautious?
-
Hi Niall,
This isn't a case of the canonical tag being properly applied, but a case where two or more pages are so similar in code that they are setting off the SEOmoz duplicate content flags.
First of all, those pages look different to us humans. But the SEOmoz web app uses a similarity threshold of 95% of the html code. This takes everything on the page, both hidden and visible into account.
In this case, it's counting all of the navigation and sidebar as well, which is significant. What's left of the unique content - the part that matters, makes up less than 5% of the code.
Here's a tool you can use to check the similarity: http://www.duplicatecontent.net/
I ran the pages through a couple of tools which showed 98% HTML similarity. And 99% text similarity.
For perspective, take a look at Google's cached versions of one of these pages. This is how googlebot sees the page: http://webcache.googleusercontent.com/search?q=cache:mdybPKIjOxUJ:www.fredaldous.co.uk/craft-shop/general-crafts.html+http://www.fredaldous.co.uk/craft-shop/general-crafts.html&hl=en&gl=us&strip=1
That, as we say, is a lot of links!
Since Panda, when I see a site with this many navigation links, I usually advise them to restructure their site architecture into more of a Pyramid shape, so that you reduce the overall navigation on each page.
Hope this helps! Best of luck with your SEO.
-
It claims that this is one of the duplicate URLS:
http://www.f r e daldous.co.uk/photo-gift/design-led-gifts.html?manufacturer=436
Now I am confused as page is no where near duplicate content of the URL I posted 1st.
Can anyone explain this?
-
Helo Niall,
It seems that you have inserted the rel="canonical" href= in the correct spot. I think the software is giving you the potentials which is always a bonus precaution. I really don't want to make a premature determination without knowing which 50 pages are showing up as duplicate. A deeper look will allow me to give you a more accurate response.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Hi - I have a question about IP addresses
- would it hurt link juice to host a blog on a different server to the rest of your website? I have a web host saying they can't run Wordpress as they won't support PHP for "security reasons" - one solution would be to set up Wordpress on a different server and redirect domain.com/blog there (I presume this is do-able?). But I don't know if that affects the SEO adversely?
Technical SEO | | abisti21 -
Sitemap Question - E-commerce - Magento
Good Morning... I have an ecommerce site running on Magento and the sitemap is automatically generated by Magento based on the categories and sub categories and products. I have recently created new categories that i want to replace the old categories, but they are both in the auto-generated sitemap. The old categories are "active" (as in still exist if you know the URL to type) but not visible (you can't find it just by navigating through the site). The new category pages are active and visible... If i want Google to rank one page (the new category page) and not the old page (old category page) should i remove the old page from the sitemap? Would removing the old page that used to target the same keywords improve my rankings on the newer category page? Sitemap currently contains: www.example.com/oldcategorypage www.example.com/newcategorypage Did I confuse you yet? Any help or guidance is appreciated. Thanks,
Technical SEO | | Prime850 -
Question About Using Disqus
I'm thinking about implementing Disqus on my blog. I'd like to know if the Disqus comments are indexed by search engines? It looks like they are displayed using Ajax or jQuery.
Technical SEO | | sbrault740 -
Questionable Referral Traffic
Hey SEOMozers, I'm working with a client that has a suspicious traffic pattern going on. In October, a referral domain called profitclicking.com started passing visits to the site. Almost, in parallel the overall visits decreased anywhere from 35 to 50%. After checking out profitclicking.com more, it promises more traffic "with no SEO knowledge". The client doesn't think that this service was signed up for internally. Regardless, it obviously smells pretty fishy, and I'm searching for a way I can disallow traffic from this site. Could I simply just write a simple disallow statement in the robots.txt and be done with it? Just wanted to see if anyone else had any other ideas before recommending a solution. Thanks!
Technical SEO | | kylehungate0 -
SEOMoz Crawler vs Googlebot Question
I read somewhere that SEOMoz’s crawler marks a page in its Crawl Diagnostics as duplicate content if it doesn’t have more than 5% unique content.(I can’t find that statistic anywhere on SEOMoz to confirm though). We are an eCommerce site, so many of our pages share the same sidebar, header, and footer links. The pages flagged by SEOMoz as duplicates have these same links, but they have unique URLs and category names. Because they’re not actual duplicates of each other, canonical tags aren’t the answer. Also because inventory might automatically come back in stock, we can’t use 301 redirects on these “duplicate” pages. It seems like it’s the sidebar, header, and footer links that are what’s causing these pages to be flagged as duplicates. Does the SEOMoz crawler mimic the way Googlebot works? Also, is Googlebot smart enough not to count the sidebar and header/footer links when looking for duplicate content?
Technical SEO | | ElDude0 -
Crawl Diagnostics - How to find where broken links are located?
Hi, One of my sites has a 4xx error that has been picked up in the crawl diagnostics section. It is a broken link. Does anybody know if it is possible for me to find out which page the broken link was found on? I have checked all of the pages on the site that I thought were linking to the page that seems to have a problem but all of these links are fine / not broken. Any ideas? Thanks
Technical SEO | | CherryK0 -
Sub-domains for keyword targeting? (specific example question)
Hey everyone, I have a question I believe is interesting and may help others as well. Our competitor heavily (over 100-200) uses sub-domains to rank in the search engines... and is doing quite well. What's strange, however, is that all of these sub-domains are just archives -- they're 100% duplicate content! An example can be seen here where they just have a bunch of relevant posts archived with excerpts. How is this ranking so well? Many of them are top 5 for keywords in the 100k+ range. In fact their #1 source of traffic is SEO for many of the pages. As an added question: is this effective if you were to actually have a quality/non-duplicate page? Thanks! Loving this community.
Technical SEO | | naturalsociety0 -
Question about an older more obsolete site
I have a website that I don't use much anymore but it ranks on the first page for one of my main keywords. I am using another few websites in different niches right now that are doing better and are more functional. It may cost around 1,300 or so to get the website that I don't use anymore, to look and function in the new ways of the internet. Would you suggest that I: Do a site redesign (which is more difficult because to make the site do what I want it needs to be out of a wordpress theme) or 301 redirect the site to another one of my sites? Would it make sense to do a 301? The domain is 5 years old but doesn't bring in any leads anymore because it would take a redesign for that to happen. How can I still benefit from the SEO that I have done on that site? Thanks and sorry if this message is hard to follow. If I need to clear anything up please let me know.
Technical SEO | | blake-766240