Duplicate content check picking up weird urls
-
Hi everyone,
I love the duplicate content feature; we have a lot of duplicate content issues due to the way our site is structured. So, we're working on them. However, I'm not fully understanding the results. For example, say I have an article on breast cancer symptoms. It shows up as duplicate content, by having two urls that point to the exact same page. http://www.healthchoices.ca/articles/breast cancer symptoms and http://www.healthchoices.ca/somerandomstringofcode. I fully understand why that is duplicate content.
I am not sure about this though, it picks up the same url twice and calls it duplicate content. For example, saying that http://www.healthchoices.ca/dr.-so-and-so and http://www.healthchoices.ca/dr.-so-and-so is duplicate...however is this not the same page? Is there something I'm missing? Many of the URL's are identical.
Thanks,
Erin
-
Hi Erin -
Is that a Google Webmaster file?
Looking at those URLs in SERPS, it seems you have some content causing duplicates (although the file doesnt seem to represent it that way).
Here's the URLs in Google search results for Term-Life-Insurance:
- http://www.healthchoices.ca/video/insurance-and-disability-planning/term-life-insurance
- http://www.healthchoices.ca/video/insurance-and-disability-planning/term-life-insurance/montreal/quebec (duplicate of previous)
- http://www.healthchoices.ca/video-link/insurance-and-disability-planning/Term-Life-Insurance
- http://www.healthchoices.ca/video/insurance-and-disability-planning/term-life-insurance/laval/quebec (duplicate of previous)
Looking at the first two as an example, when you look at th pages themselves they are currently not exact duplicates. The first one is a video of a guy talking about term life insurance with some other video links, and the second page is a page that has an error "Error: Video Category Page is currently unavailable." where the page content should be. But that page had previously been an exact duplicate of the first URL the last time Google visited the page.
Here is the first page again:
http://www.healthchoices.ca/video/insurance-and-disability-planning/term-life-insurance
Here is the cached version of the second (duplicate) page (as I'm currently seeing it, it was last cached on Apr 19, 2011):
To see these pages (or any potential duplicate URL issues), do this search in Google:
- site:www.healthchoices.ca
- To find pages with a specific URL pattern (like the term life insurance pages) try "site:www.healthchoices.ca inurl:Term-Life-Insurance" (without the quotation marks)
- Then at the end of the URL you see in the address bar, add "&filter=0" (without the quoutes).
So what is in your browser address bar would look like this (although it may have some additional thinkgs in your URL like your previous query and your browser and language for example - that's ok):
http://www.google.com/search?q=site:www.healthchoices.ca+inurl:Term-Life-Insurance&filter=0
I'm not sure what the URL issue is that you're referring to exactly based on the info you pasted and where you may have gotten it from - but I hope this is helpful.
-
Hi Erin,
Can I enquire a little more about where you are lifting these URLs from. I'm assuming you are downloading them from a Campaign? Are the URLs in question lifted from the same row in the CSV? What is the header of the columns they are lifted from? Just need a little more specificity about what we're looking at here in order to respond fully.
-
Thanks for your responses. Hmm...I'm not sure how to do a screen shot as the only way I could see the errors was to download the file. I've pasted a few below straight from the doc
<colgroup><col width="775"><col width="968"></colgroup>
| www.healthchoices.ca/video/ice-sports/default | www.healthchoices.ca/video/ice-sports/default |
| www.healthchoices.ca/video/insurance-and-disability-planning/Key-Man-Insurance | www.healthchoices.ca/video/insurance-and-disability-planning/Key-Man-Insurance |
| www.healthchoices.ca/video/insurance-and-disability-planning/Long-Term-Care-Coverage | www.healthchoices.ca/video/insurance-and-disability-planning/Long-Term-Care-Coverage |
| www.healthchoices.ca/video/insurance-and-disability-planning/Term-Life-Insurance | www.healthchoices.ca/video/insurance-and-disability-planning/Term-Life-Insurance |
| www.healthchoices.ca/video/insurance-and-disability-planning/default | www.healthchoices.ca/video/insurance-and-disability-planning/default | -
Erin, what tool are you using to find this? It might be something to do with the language that your CMS is written in - it might also be a matter of a trailing slash or a non www. version.
I'd be happy to help if you could provide a little more info, perhaps a screen shot?
Aaron
-
Duplicate content by definition is having the same content on different URL's. I've never had the tool tell me I have duplicate content on the same URL. You must be missing something. Is it www vs non-www perhaps? I don't know how you can get identical url's showing up in there.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Duplicate Page Content
Hello, After crawling our site Moz is detecting high priority duplicate page content for our product and article listing pages, For example http://store.bmiresearch.com/bangladesh/power and http://store.bmiresearch.com/newzealand/power are being listed as duplicate pages although they have seperate URLs, page titles and H1 tags. They have the same product listed but I would have thought the differentiation in other areas would be sufficient for these to not be deemed as duplicate pages. Is it likely this issue will be impacting on our search rankings? If so are there any recommendations as to how this issue can be overcome. Thanks
Technical SEO | | carlsutherland0 -
Duplicate page content & titles on the same domain
Hey, My website: http://www.electromarket.co.uk is running Magento Enterprise. The issue I'm running into is that the URLs can be shortened and modified to display different things on the website itself. Here's a few examples. Product Page URL: http://www.electromarket.co.uk/speakers-audio-equipment/dj-pa-speakers/studio-bedroom-monitors/bba0051 OR I could remove everything in the URL and just have: http://www.electromarket.co.uk/bba0051 and the link will work just as well. Now my problem is, these two URL's load the same page title, same content, same everything, because essentially they are the very same web page. But how do I tell Google that? Do I need to tell Google that? And would I benefit by using a redirect for the shorter URLs? Thanks!
Technical SEO | | tomhall900 -
Changed URL of all web pages to a new updated one - Keywords still pick the old URL
A month ago we updated our website and with that we created new URLs for each page. Under "On-Page", the keywords we put to check ranking on are still giving information on the old urls of our websites. Slowly, some new URLs are popping up. I'm wondering if there's a way I can manually make the keywords feedback information from the new urls.
Technical SEO | | Champions0 -
Duplicate Content - That Old Chestnut!!!
Hi Guys, Hope all is well, I have a question if I may? I have several articles which we have written and I want to try and find out the best way to post these but I have a number of concerns. I am hoping to use the content in an attempt to increase the Kudos of our site by providing quality content and hopefully receiving decent back links. 1. In terms of duplicate content should I only post it on one place or should I post the article in several places? Also where would you say the top 5 or 10 places would be? These are articles on XML, Social Media & Back Links. 2. Can I post the article on another blog or article directory and post it on my websites blog or is this a bad idea? A million thanks for any guidance. Kind Regards, C
Technical SEO | | fenwaymedia0 -
Duplicate content + wordpress tags
According to SEOMoz platform, one of my wordpress websites deals with duplicate content because of the tags I use. How should I fix it? Is it loyal to remove tag links from the post pages?
Technical SEO | | giankar0 -
Mod Rewrite question to prevent duplicate content
Hi, I'm having problems with a mod rewrite issue and duplicate content On my website I have Website.com Website.com/directory Website.com/directory/Sub_directory_more_stuff_here Both #1 and #2 are the same page (I can't change this). #3 is different pages. How can I use mod rewrite to to make #2 redirect to #1 so I don't have duplicate content WHILE #3 still works?
Technical SEO | | kat20 -
Duplicate Content from Google URL Builder
Hello to the SEOmoz community! I am new to SEOmoz, SEO implementation, and the community and recently set up a campaign on one of the sites I managed. I was surprised at the amount of duplicate content that showed up as errors and when I took a look in deeper, the majority of errors were caused by pages on the root domain I put through Google Analytics URL Builder. After this, I went into webmaster tools and changed the parameter handling to ignore all of the tags the URL Builder adds to the end of the domain. SEOmoz recently recrawled my site and the errors being caused by the URL Builder are still being shown as duplicates. Any suggestions on what to do?
Technical SEO | | joshuaopinion0 -
Duplicate content issues caused by our CMS
Hello fellow mozzers, Our in-house CMS - which is usually good for SEO purposes as it allows all the control over directories, filenames, browser titles etc that prevent unwieldy / meaningless URLs and generic title tags - seems to have got itself into a bit of a tiz when it comes to one of our clients. We have tried solving the problem to no avail, so I thought I'd throw it open and see if anyone has a soultion, or whether it's just a fault in our CMS. Basically, the SEs are indexing two identical pages, one ending with a / and the other ending /index.php, for one of our sites (www.signature-care-homes.co.uk). We have gone through the site and made sure the links all point to just one of these, and have done the same for off-site links, but there is still the duplicate content issue of both versions getting indexed. We also set up an htaccess file to redirect to the chosen version, but to no avail, and we're not sure canonical will work for this issue as / pages should redirect to /index.php anyway - and that's we can't work out. We have set the access file to point to index.php, and that should be what should be happening anyway, but it isn't. Is there an alternative way of telling the SE's to only look at one of these two versions? Also, we are currently rewriting the content and changing the structure - will this change the situation we find ourselves in?
Technical SEO | | themegroup0