Duplicate content check picking up weird urls
-
Hi everyone,
I love the duplicate content feature; we have a lot of duplicate content issues due to the way our site is structured, so we're working on them. However, I'm not fully understanding the results. For example, say I have an article on breast cancer symptoms. It shows up as duplicate content because two URLs point to the exact same page: http://www.healthchoices.ca/articles/breast cancer symptoms and http://www.healthchoices.ca/somerandomstringofcode. I fully understand why that is duplicate content.
What I am not sure about is this: it picks up the same URL twice and calls it duplicate content. For example, it says that http://www.healthchoices.ca/dr.-so-and-so and http://www.healthchoices.ca/dr.-so-and-so are duplicates... but is this not the same page? Is there something I'm missing? Many of the URLs are identical.
Thanks,
Erin
-
Hi Erin -
Is that a Google Webmaster file?
Looking at those URLs in the SERPs, it seems you have some content causing duplicates (although the file doesn't seem to represent it that way).
Here are the URLs in Google search results for Term-Life-Insurance:
- http://www.healthchoices.ca/video/insurance-and-disability-planning/term-life-insurance
- http://www.healthchoices.ca/video/insurance-and-disability-planning/term-life-insurance/montreal/quebec (duplicate of previous)
- http://www.healthchoices.ca/video-link/insurance-and-disability-planning/Term-Life-Insurance
- http://www.healthchoices.ca/video/insurance-and-disability-planning/term-life-insurance/laval/quebec (duplicate of previous)
Looking at the first two as an example: when you look at the pages themselves, they are currently not exact duplicates. The first one is a video of a guy talking about term life insurance with some other video links, and the second shows the error "Error: Video Category Page is currently unavailable." where the page content should be. But that second page was an exact duplicate of the first URL the last time Google visited it.
Here is the first page again:
http://www.healthchoices.ca/video/insurance-and-disability-planning/term-life-insurance
Here is the cached version of the second (duplicate) page (as I'm currently seeing it, it was last cached on Apr 19, 2011):
To see these pages (or any potential duplicate URL issues), do this search in Google:
- site:www.healthchoices.ca
- To find pages with a specific URL pattern (like the term life insurance pages) try "site:www.healthchoices.ca inurl:Term-Life-Insurance" (without the quotation marks)
- Then at the end of the URL you see in the address bar, add "&filter=0" (without the quotes).
So what is in your browser address bar would look like this (although your URL may have some additional things in it, like your previous query, your browser, and your language; that's OK):
http://www.google.com/search?q=site:www.healthchoices.ca+inurl:Term-Life-Insurance&filter=0
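As a rough sketch of the steps above, the same search address can be assembled in Python. The function name `duplicate_check_url` is made up for illustration, and the snippet only builds the address; it does not fetch anything from Google.

```python
from urllib.parse import urlencode


def duplicate_check_url(site, inurl=None):
    """Build a Google search URL for a site: query with filter=0 appended."""
    query = f"site:{site}"
    if inurl:
        # Narrow the results to URLs containing a specific pattern.
        query += f" inurl:{inurl}"
    # safe=":" keeps the colons readable; spaces are encoded as "+".
    params = urlencode({"q": query, "filter": "0"}, safe=":")
    return "http://www.google.com/search?" + params


print(duplicate_check_url("www.healthchoices.ca", "Term-Life-Insurance"))
```

Pasting the printed address into a browser should give the same results as typing the query by hand and adding "&filter=0".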
I'm not sure what the URL issue is that you're referring to exactly based on the info you pasted and where you may have gotten it from - but I hope this is helpful.
-
Hi Erin,
Can I enquire a little more about where you are lifting these URLs from? I'm assuming you are downloading them from a Campaign? Are the URLs in question lifted from the same row in the CSV? What is the header of the columns they are lifted from? I just need a little more specificity about what we're looking at here in order to respond fully.
-
Thanks for your responses. Hmm... I'm not sure how to do a screen shot, as the only way I could see the errors was to download the file. I've pasted a few below straight from the doc:
| www.healthchoices.ca/video/ice-sports/default | www.healthchoices.ca/video/ice-sports/default |
| www.healthchoices.ca/video/insurance-and-disability-planning/Key-Man-Insurance | www.healthchoices.ca/video/insurance-and-disability-planning/Key-Man-Insurance |
| www.healthchoices.ca/video/insurance-and-disability-planning/Long-Term-Care-Coverage | www.healthchoices.ca/video/insurance-and-disability-planning/Long-Term-Care-Coverage |
| www.healthchoices.ca/video/insurance-and-disability-planning/Term-Life-Insurance | www.healthchoices.ca/video/insurance-and-disability-planning/Term-Life-Insurance |
| www.healthchoices.ca/video/insurance-and-disability-planning/default | www.healthchoices.ca/video/insurance-and-disability-planning/default |
-
Erin, what tool are you using to find this? It might have something to do with the language your CMS is written in; it might also be a matter of a trailing slash or a non-www version.
I'd be happy to help if you could provide a little more info, perhaps a screen shot?
Aaron
-
Duplicate content, by definition, is the same content on different URLs. I've never had the tool tell me I have duplicate content on the same URL. You must be missing something. Is it www vs non-www perhaps? I don't know how you can get identical URLs showing up in there.
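One way to check for the differences suggested in this thread (www vs non-www, a trailing slash, letter case) is to compare the two columns programmatically rather than by eye. The sketch below is only an illustration: `hidden_differences` is a hypothetical helper, not part of any Moz tool, and it assumes the URLs are pasted as plain strings from the downloaded file.

```python
from urllib.parse import urlsplit


def _split(url):
    # Rows pasted from the file have no scheme, so prefix "//" to let
    # urlsplit recognise the host part.
    url = url.strip()
    return urlsplit(url if "//" in url else "//" + url)


def _strip_www(host):
    return host[4:] if host.startswith("www.") else host


def hidden_differences(url_a, url_b):
    """List ways two URL strings differ that are easy to miss by eye."""
    if url_a == url_b:
        return []  # the strings really are identical
    if url_a.strip() == url_b.strip():
        return ["leading or trailing whitespace"]
    reasons = []
    a, b = _split(url_a), _split(url_b)
    if a.scheme != b.scheme:
        reasons.append("scheme")
    host_a, host_b = a.netloc.lower(), b.netloc.lower()
    if host_a != host_b and _strip_www(host_a) == _strip_www(host_b):
        reasons.append("www vs non-www")
    if a.path.endswith("/") != b.path.endswith("/"):
        reasons.append("trailing slash")
    path_a, path_b = a.path.rstrip("/"), b.path.rstrip("/")
    if path_a != path_b and path_a.lower() == path_b.lower():
        reasons.append("letter case in path")
    return reasons


base = "www.healthchoices.ca/video/insurance-and-disability-planning/"
print(hidden_differences(base + "term-life-insurance", base + "Term-Life-Insurance"))
print(hidden_differences(base + "default", base + "default"))
```

An empty list means the two strings are byte-for-byte the same, in which case the cause lies elsewhere. Note that the URLs quoted in this thread include both a lowercase and a capitalized form of the term life insurance path, so letter case may be worth ruling out first.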
Related Questions
-
Tags, Categories, & Duplicate Content
Looking for some advice on a duplicate content issue that we're having that definitely isn't unique to us. See, we are allowing all our tag and category pages, as well as our blog pagination be indexed and followed, but Moz is detecting that all as duplicate content, which is obvious since it is the same content that is on our blog posts. We've decided in the past to keep these pages the way they are as it hasn't seemed to hurt us specifically and we hoped it would help our overall ranking. We haven't seen positive or negative signals either way, just the warnings from Moz. We are wondering if we should noindex these pages and if that could cause a positive change, but we're worried it might cause a big negative change as well. Have you confronted this issue? What did you decide and what were the results? Thanks in advance!
Technical SEO | | bradhodson0 -
Https Duplicate Content
My previous host was using shared SSL, and my site was also working with https which I didn’t notice previously. Now I am moved to a new server, where I don’t have any SSL and my websites are not working with https version. Problem is that I have found Google have indexed one of my blog http://www.codefear.com with https version too. My blog traffic is continuously dropping I think due to these duplicate content. Now there are two results one with http version and another with https version. I searched over the internet and found 3 possible solutions.
1 No-Index https version
2 Use rel=canonical
3 Redirect https versions with 301 redirection
Now I don’t know which solution is best for me as now https version is not working. One more thing I don’t know how to implement any of the solution. My blog is running on WordPress. Please help me to overcome from this problem, and after solving this duplicate issue, do I need Reconsideration request to Google. Thank you
Technical SEO | | RaviAhuja0 -
Tired of finding solution for duplicate contents.
Just my site was scanned by seomoz and seen lots of duplicate content and titles found. Well I am tired of finding solutions of duplicate content for a shopping site product category page. You can see the screenshot below. http://i.imgur.com/TXPretv.png You can see below in every link its showing "items_per_page=64, 128 etc.". This happened in every category in which I was created. I am already using Canonical add-on to avoid this problem but still it's there. You can check my domain here - http://www.plugnbuy.com/computer-software/pc-security/antivirus-internet-security/ and see if the add-on working correct. I recently submitted my sitemap to GWT, so that's why it's not showing me any report regarding duplicate issues. Please help ME
Technical SEO | | chandubaba0 -
Category URL Duplicate Content
I've recently been hired as the web developer for a company with an existing web site. Their web architecture includes category names in product urls, and of course we have many products in multiple categories thus generating duplicate content. According to the SEOMoz Site Crawl, we have roughly 1600 pages of duplicate content, I expect primarily from this issue. This is out of roughly 3600 pages crawled. My questions are: 1. Fixing this for the long term will obviously mean restructuring the URLs for the site. Is this worthwhile and what will the ramifications be of performing such a move? 2. How can I determine the level and extent of the effects of this duplicated content? 3. Is it possible the best course of action is to do nothing? The site has many, many other issues, and I'm not sure how highly to prioritize this problem. In addition, the IT man is highly doubtful this is causing an SEO issue, and I'm going to need to be able to back up any action I request. I do feel I will need to strongly justify any possible risks this level of site change could cause. Thanks in advance, and please let me know if any more information is needed.
Technical SEO | | MagnetsUSA0 -
Duplicate Content - Mobile Site
We think that a mobile version of our site is causing a duplicate content issue; what's the best way to stop the mobile version being indexed. Basically the site forwards mobile users to "/mobile" which is just a mobile optimised version of the original site. Is it best to block the /mobile folder from being crawled?
Technical SEO | | nsmith7870 -
Business/Personal Blog Duplicate Content
Quick Question. I am in the process of launching a new website for my IT business which will include a blog. I also want to start up my personal blog again. I want to publish some blog posts to both my business and personal blogs but I don't want to have any duplicate content issues. I am not concerned with building the SERPs of my personal blog but I am very focused on the business blog/site. I am looking for some ideas of how I can publish content to both sites without getting hurt by duplicate content. Again, I am not concerned with building up the placement of my personal site but I do want to have a strong personal site that helps build my name. Any help on this would be great. Thanks!
Technical SEO | | ZiaTG0 -
How to resolve this Duplicate content?
Hi , There is page i get when i do proper menu navigation Caratlane.com>jewellery>rings>casualsrings> http://www.caratlane.com/jewellery/rings/casual-rings/leaves-dew-diamond-0-03-ct-peridot-1-ct-ring-18k-yellow-gold.html When i do a site search in my search box by my product code number "JR00219" The same page is appears with different url http://www.caratlane.com/leaves-dew-diamond-0-03-ct-peridot-1-ct-ring-18k-yellow-gold.html So there is a duplicate content. How can we resolve it. Regards, kathir caratlane.com
Technical SEO | | kathiravan0 -
Duplicate Content Penalties, International Sites
We're in the process of rolling out a new domestic (US) website design. If we copy the same theme/content to our International subsidiaries, would the duplicate content penalty still apply? All International sites would carry the Country specific domain, .co.uk, .eu, etc. This question is for English only content, I'm assuming translated content would not carry a penalty.
Technical SEO | | endlesspools0