Crawl Diagnostics Questions
-
SEOmoz is reporting that I have 50+ pages with a duplicate content issue based on this URL: http://www. f r e d aldous.co.uk/art-shop/art-supplies/art-canvas.html?manufacturer=178
But I have included this tag in the source: <link rel="canonical" href="http://www.f r e daldous.co.uk/art-shop/art-supplies/art-canvas.html"/>
(I have purposefully added white space to the URLs in this message as I'm not sure about the rules for posting links here)
I thought this "canonical" tag prevented the duplicate content from being indexed?
Is the reporting by SEOmoz wrong, or is it being overcautious?
-
Hi Niall,
This isn't a case of the canonical tag being improperly applied, but a case where two or more pages are so similar in code that they are setting off the SEOmoz duplicate content flags.
First of all, those pages look different to us humans. But the SEOmoz web app uses a similarity threshold of 95% of the HTML code, which takes everything on the page, both hidden and visible, into account.
In this case, it's counting all of the navigation and the sidebar as well, which is significant. What's left, the unique content that actually matters, makes up less than 5% of the code.
Here's a tool you can use to check the similarity: http://www.duplicatecontent.net/
I ran the pages through a couple of tools, which showed 98% HTML similarity and 99% text similarity.
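To make the idea concrete, here is a minimal sketch of how a shared template can push two pages over a 95% similarity threshold. It uses Python's standard `difflib` as a stand-in; I don't know which algorithm the SEOmoz crawler actually uses, and the navigation block, page bodies, and the `similarity` and `page` helpers are all made up for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two strings (difflib's measure)."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


# Hypothetical shared template: a navigation list repeated on every page.
NAV = "".join(
    f'<li><a href="/cat-{i}.html">Category {i}</a></li>' for i in range(60)
)


def page(unique: str) -> str:
    """Wrap a small piece of unique content in the shared template."""
    return f"<html><body><ul>{NAV}</ul><div id='main'>{unique}</div></body></html>"


# Two pages that look different to a human but share almost all of their code.
page_a = page("<h1>Art canvas</h1><p>Stretched canvases and boards.</p>")
page_b = page("<h1>Design-led gifts</h1><p>Notebooks, mugs and prints.</p>")

score = similarity(page_a, page_b)
print(f"HTML similarity: {score:.1%}")
print("Flagged as duplicate" if score >= 0.95 else "Not flagged")
```

The more navigation a template carries relative to the unique content, the closer the ratio gets to 100%, which is why trimming the navigation or adding more unique content per page lowers the score.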
For perspective, take a look at Google's cached version of one of these pages. This is how Googlebot sees the page: http://webcache.googleusercontent.com/search?q=cache:mdybPKIjOxUJ:www.fredaldous.co.uk/craft-shop/general-crafts.html+http://www.fredaldous.co.uk/craft-shop/general-crafts.html&hl=en&gl=us&strip=1
That, as we say, is a lot of links!
Since Panda, when I see a site with this many navigation links, I usually advise the owners to restructure their site architecture into more of a pyramid shape, so that there are fewer navigation links on each page.
Hope this helps! Best of luck with your SEO.
-
It claims that this is one of the duplicate URLs:
http://www.f r e daldous.co.uk/photo-gift/design-led-gifts.html?manufacturer=436
Now I am confused, as that page is nowhere near duplicate content of the URL I posted first.
Can anyone explain this?
-
Hello Niall,
It seems that you have inserted the rel="canonical" href= tag in the correct spot. I think the software is flagging potential duplicates, which is a useful extra precaution. I really don't want to make a premature determination without knowing which 50 pages are showing up as duplicates. A deeper look will allow me to give you a more accurate response.
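If you want to double-check the tag placement yourself, something along these lines could work. It is only a rough sketch using Python's standard `html.parser`; the `CanonicalFinder` class, the `find_canonical` helper, and the example.com URLs are hypothetical names chosen for illustration, and it only looks at the HTML you pass in (it does not fetch pages or check HTTP headers).

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> found inside <head>."""

    def __init__(self) -> None:
        super().__init__()
        self.in_head = False
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link" and self.in_head:
            attr = dict(attrs)
            rel_values = (attr.get("rel") or "").lower().split()
            if "canonical" in rel_values and attr.get("href"):
                self.canonicals.append(attr["href"])

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def find_canonical(html: str) -> Optional[str]:
    """Return the canonical URL if exactly one is declared in <head>, else None."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals[0] if len(finder.canonicals) == 1 else None


# Hypothetical filtered page that points back to its unfiltered version.
sample = """
<html>
  <head>
    <title>Art canvas</title>
    <link rel="canonical" href="http://www.example.com/art-canvas.html"/>
  </head>
  <body><p>Filtered by manufacturer.</p></body>
</html>
"""

print(find_canonical(sample))
```

Running this over both the filtered URL (with `?manufacturer=...`) and the plain URL should give the same canonical address if the tag is set up as intended. A tag that sits outside `<head>`, or more than one canonical tag on a page, makes the helper return `None`.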