Question about duplicate content in crawl reports
-
Okay, this one's a doozy:
My crawl report is listing all of these as separate URLs with duplicate content issues, even though they are all the home page, and the preferred URL, http://www.ccisolutions.com, has a rel="canonical" tag pointing to http://www.ccisolutions.com:
http://www.ccisolutions.com/StoreFront/IAFDispatcher?iafAction=showMain
I will add that OSE is recognizing that there is a 301-redirect on http://ccisolutions.com, but the duplicate content report doesn't seem to recognize the redirect.
Also, every single one of our 404-error pages (we have set up a custom 404 page) is being identified as having duplicate content. The duplicate content on all of them is identical.
Where do I even begin sorting this out? Any suggestions on how/why this is happening?
Thanks!
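For anyone who wants to check by hand what canonical tag each home page variant is actually serving, here is a minimal sketch in Python. It only parses HTML you have already fetched; the `CanonicalFinder` class, the `find_canonicals` helper, and the sample markup are made up for illustration and are not part of any Moz tool.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel can hold several space-separated values, so split before checking.
        rel = (attr.get("rel") or "").lower().split()
        if "canonical" in rel and attr.get("href"):
            self.canonicals.append(attr["href"])


def find_canonicals(html):
    """Return the list of canonical URLs declared in the given HTML string."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


# Hypothetical markup standing in for one fetched home page variant.
sample = """
<html><head>
<title>Home</title>
<link rel="canonical" href="http://www.ccisolutions.com/" />
</head><body></body></html>
"""
print(find_canonicals(sample))
```

If every variant returns the same single canonical URL, the pages are at least declaring the preferred version consistently, and the duplicate flags are more likely down to how the crawler reports them.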
-
Well, I confirmed it when a crawl came back with 12,500 errors (all from the "email a friend" URL), which is a no-crawl page.
Over the last 2 weeks we made sure our site was 100% valid, revalidated it with W3C, and it came back 100%. Google is now crawling us 2 to 3 times a week.
So, I think the crawl at Moz went out and drank a few too many cold ones....
Have a good holiday.
Chad
-
Thanks very much Chad. Yes, I kinda thought the same thing, but it's good to hear from someone else. I think it's a perfect example of using common sense and "know-how" at the same time as using tools, and not to blindly trust all of the results the tools feed us. If something looks fishy, it probably is!
Sorry it took me so long to respond and mark this one as answered. I appreciate it!
Dana
-
Dana-
I was waiting for someone to step up and say something. It is happening to us too. I was on a consulting call with Jason Dowdell about another topic on our site and I brought this up. We then ran several different investigations and concluded there has to be a glitch. We ran some quick analysis and discovered what I call Bullshhhhht.
We then reviewed about 200 pages and found that not a single page had duplicate anything.
He told me to worry about other things, like real content created by humans.
Chad
Related Questions
-
Hosted Wordpress Blog creating Duplicate Content
In my first report from SEOmoz, I see that there are a bunch of "duplicate content" errors that originate from our blog hosted on Wordpress. For example, it's showing that the following URLs all have duplicate content:
http://blog.kultureshock.net/2012/11/20/the-secret-merger/ys/
http://blog.kultureshock.net/2012/11/16/vendome-prize-website/gallery-7701/
http://blog.kultureshock.net/2012/11/20/the-secret-merger/sm/
http://blog.kultureshock.net/2012/11/26/top-ten-tips-to-mastering-the-twitterverse/unknown/
http://blog.kultureshock.net/2012/11/20/the-secret-merger/bv/
They all lead to the various images that have been used in various blog posts. But I'm not sure why they are considered duplicate content, because they have unique URLs and the title meta tag is unique for each one, too. Even so, I don't want these extraneous URLs cluttering up our search results, so I'm removing all of the links that were automatically created when placing the images in the posts. Once I do that, will these URLs eventually disappear, or continue to be there? Because our blog is hosted by Wordpress, I unfortunately can't add any of the SEO plugins I've read about, so I'm wondering how to fix this without special plugins. Thanks!
Tom
Technical SEO | TomHu
-
What is the best practice to handle duplicate content?
I have several large sections that SEOmoz is indicating have duplicate content, even though the content is not identical. For example, the Leather Passport section: Leather Passports - Black, Leather Passports - Blue, Leather Passports - Tan, etc. Each of the items has good content, but it is identical, since they are the same products. What is the best practice here?
1. Have only one product with a drop-down (the fear is that this is not best for the customer)?
2. Make up content to have them sound different?
3. Put a nofollow on the passport section?
4. Use a rel canonical even though the sections are technically not identical?
Thanks!
Technical SEO | trophycentraltrophiesandawards
-
Pages with different content and meta description marked as duplicate content
I am running into an issue where I have pages with completely different body and meta description but they are still being marked as having the same content (Duplicate Page Content error). What am I missing here? Examples:
http://www.wallstreetoasis.com/forums/what-to-expect-in-the-summer-internship
and
http://www.wallstreetoasis.com/blog/something-ventured
http://www.wallstreetoasis.com/forums/im-in-the-long-run
and
http://www.wallstreetoasis.com/image/jhjpeg
Technical SEO | WallStreetOasis.com
-
How to prevent duplicate content in archives?
My news site has a number of excerpts in the form of category-based archives that are causing duplicate content problems. Here's an example with the nutrition archive. The articles here are already posts, so the archive duplicates their content. Should I nofollow/noindex this category page along with the rest and the 2011, 2012 archives etc. (see archives here)? Thanks so much for any input!
Technical SEO | naturalsociety
-
API for testing duplicate content
Does anyone know a service, API, or PHP lib to compare two (or more) pages and return their similarity (Level-3-Shingles)? An API would be greatly preferred.
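Not an API, but the shingle comparison itself is small enough to sketch. The following is a rough Python illustration, assuming "Level-3-Shingles" means overlapping 3-word windows and that similarity is measured as Jaccard overlap of the two shingle sets; both are assumptions, and the function names are invented for the example.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        # Text shorter than one window: treat the whole text as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def similarity(text_a, text_b, k=3):
    """Jaccard similarity of the two shingle sets, from 0.0 to 1.0."""
    a, b = shingles(text_a, k), shingles(text_b, k)
    if not a and not b:
        return 1.0  # two empty texts are trivially identical
    return len(a & b) / len(a | b)


print(similarity("the quick brown fox jumps", "the quick brown fox sleeps"))
```

This compares plain text only, so page markup would need to be stripped first. How a given crawler scores duplicates is not documented in this thread, so treat the numbers as a rough guide rather than a match for any tool's threshold.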
Technical SEO | Sebes
-
Query string in url - duplicate content?
Hi everyone, I would appreciate some advice on the following. I have a page which has some nice content on it, but it also has search functionality. When a search is run, a query string is appended, so I will get something like mypage.php?id=20 etc. With many different potential URLs, will each query string be seen as a different page? If so, I don't want duplicate content. Am I best putting canonical tags in the head of mypage.php to avoid Google seeing potential duplicate content? Many thanks for all your advice.
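To make the idea concrete, here is a small Python sketch of how the canonical URL for such a page could be computed: drop the query string parameters that do not change the content and keep only the ones that do. The `canonical_url` function, the `keep_params` argument, and the example.com URLs are illustrative assumptions; which parameters matter depends on the site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_url(url, keep_params=()):
    """Strip the fragment and every query parameter not listed in keep_params."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key in keep_params
    ]
    # An empty query and fragment are omitted entirely by urlunsplit.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


# Every search variation collapses to the base page.
print(canonical_url("http://example.com/mypage.php?id=20"))
# If "id" selects genuinely different content, keep it and drop the rest.
print(canonical_url("http://example.com/mypage.php?id=20&sort=asc#top", keep_params=("id",)))
```

The resulting URL is what would go in the href of the canonical tag on each variation of the page.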
Technical SEO | pauledwards
-
Crawl report showing only 1 crawled page
Hi, I'm really new to this and have just set up some Campaigns. I set up a Campaign for the root domain, portaldeldiablo.com.uy, which returned only 2 crawled pages. As this page had a 301 redirect from the non-www to the www version, I deleted this Campaign and set up a new one for www.portaldeldiablo.com.uy, which returned only 1 crawled page. I really don't know why my website is not being crawled. Thanks in advance for your help.
Technical SEO | ceci2710
-
Canonical Link for Duplicate Content
A client of ours uses some unique keyword tracking for their landing pages where they append certain metrics in a query string and pull that information out dynamically to learn more about their traffic (kind of like Google's UTM tracking). Nonetheless, these query strings are now being indexed as separate pages in Google and Yahoo and are being flagged as duplicate content/title tags by the SEOmoz tools. For example:
Base Page: www.domain.com/page.html
Tracking: www.domain.com/page.html?keyword=keyword#source=source
Now both of these are being indexed even though it is only one page. So I suggested placing a canonical link tag in the header pointing back to the base page to start discrediting the tracking URLs. But this means that the base pages will be pointing to themselves as well. Would that be an issue? Is there a better way to solve this issue without removing the query tracking altogether? Thanks - Kyle Chandler
Technical SEO | kchandler