Duplicate content check picking up weird urls
-
Hi everyone,
I love the duplicate content feature; we have a lot of duplicate content issues due to the way our site is structured, so we're working on them. However, I don't fully understand the results. For example, say I have an article on breast cancer symptoms. It shows up as duplicate content because two URLs point to the exact same page: http://www.healthchoices.ca/articles/breast cancer symptoms and http://www.healthchoices.ca/somerandomstringofcode. I fully understand why that is duplicate content.
What I'm not sure about is this: it also picks up the same URL twice and calls it duplicate content. For example, it says that http://www.healthchoices.ca/dr.-so-and-so and http://www.healthchoices.ca/dr.-so-and-so are duplicates... but is this not the same page? Is there something I'm missing? Many of the URLs are identical.
Thanks,
Erin
-
Hi Erin -
Is that a Google Webmaster file?
Looking at those URLs in the SERPs, it seems you have some content causing duplicates (although the file doesn't seem to represent it that way).
Here are the URLs in Google search results for Term-Life-Insurance:
- http://www.healthchoices.ca/video/insurance-and-disability-planning/term-life-insurance
- http://www.healthchoices.ca/video/insurance-and-disability-planning/term-life-insurance/montreal/quebec (duplicate of previous)
- http://www.healthchoices.ca/video-link/insurance-and-disability-planning/Term-Life-Insurance
- http://www.healthchoices.ca/video/insurance-and-disability-planning/term-life-insurance/laval/quebec (duplicate of previous)
Looking at the first two as an example, the pages themselves are currently not exact duplicates. The first one is a video of a guy talking about term life insurance, with some other video links. The second page shows an error, "Error: Video Category Page is currently unavailable.", where the page content should be. But that second page was an exact duplicate of the first URL the last time Google visited it.
Here is the first page again:
http://www.healthchoices.ca/video/insurance-and-disability-planning/term-life-insurance
Here is the cached version of the second (duplicate) page (as I'm currently seeing it, it was last cached on Apr 19, 2011):
To see these pages (or any potential duplicate URL issues), do this search in Google:
- site:www.healthchoices.ca
- To find pages with a specific URL pattern (like the term life insurance pages) try "site:www.healthchoices.ca inurl:Term-Life-Insurance" (without the quotation marks)
- Then at the end of the URL you see in the address bar, add "&filter=0" (without the quotes).
So what is in your browser address bar would look like this (it may have some additional things in the URL, like your previous query, your browser, and your language; that's ok):
http://www.google.com/search?q=site:www.healthchoices.ca+inurl:Term-Life-Insurance&filter=0
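If you want to run this check for several URL patterns, the steps above can be scripted. Here is a minimal Python sketch that builds the same kind of search URL; the function name `build_site_search_url` and its parameters are made up for illustration, and it only builds the URL (it does not query Google for you).

```python
from urllib.parse import urlencode


def build_site_search_url(domain, inurl=None, unfiltered=True):
    """Build a Google search URL for a site: query (hypothetical helper).

    domain     -- the host to restrict the search to
    inurl      -- optional URL fragment to match with the inurl: operator
    unfiltered -- append filter=0, as described in the steps above
    """
    terms = [f"site:{domain}"]
    if inurl:
        terms.append(f"inurl:{inurl}")

    params = {"q": " ".join(terms)}
    if unfiltered:
        params["filter"] = "0"

    # safe=":" keeps the colons in site: and inurl: readable;
    # spaces between the terms are encoded as "+".
    return "http://www.google.com/search?" + urlencode(params, safe=":")


if __name__ == "__main__":
    print(build_site_search_url("www.healthchoices.ca", "Term-Life-Insurance"))
```

Paste the printed URL into your browser to see the results.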
I'm not sure what the URL issue is that you're referring to exactly based on the info you pasted and where you may have gotten it from - but I hope this is helpful.
-
Hi Erin,
Can I enquire a little more about where you are lifting these URLs from? I'm assuming you are downloading them from a Campaign? Are the URLs in question lifted from the same row in the CSV? What is the header of the columns they are lifted from? We just need a little more specificity about what we're looking at here in order to respond fully.
-
Thanks for your responses. Hmm... I'm not sure how to do a screenshot, as the only way I could see the errors was to download the file. I've pasted a few rows below, straight from the doc:
| www.healthchoices.ca/video/ice-sports/default | www.healthchoices.ca/video/ice-sports/default |
| www.healthchoices.ca/video/insurance-and-disability-planning/Key-Man-Insurance | www.healthchoices.ca/video/insurance-and-disability-planning/Key-Man-Insurance |
| www.healthchoices.ca/video/insurance-and-disability-planning/Long-Term-Care-Coverage | www.healthchoices.ca/video/insurance-and-disability-planning/Long-Term-Care-Coverage |
| www.healthchoices.ca/video/insurance-and-disability-planning/Term-Life-Insurance | www.healthchoices.ca/video/insurance-and-disability-planning/Term-Life-Insurance |
| www.healthchoices.ca/video/insurance-and-disability-planning/default | www.healthchoices.ca/video/insurance-and-disability-planning/default |
-
Erin, what tool are you using to find this? It might have something to do with the language your CMS is written in. It might also be a matter of a trailing slash or a non-www version of the URL.
I'd be happy to help if you could provide a little more info, perhaps a screenshot?
Aaron
-
Duplicate content, by definition, is having the same content on different URLs. I've never had the tool tell me I have duplicate content on the same URL. You must be missing something. Is it www vs non-www, perhaps? I don't know how you can get identical URLs showing up in there.
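One way to check whether two URLs that look identical really are is to compare them piece by piece. Below is a rough Python sketch that reports the usual suspects mentioned in this thread (www vs non-www, trailing slash) plus a few related ones (letter case, stray whitespace). The function names are hypothetical, and the list of checks is an assumption about what might differ in the export, not a description of how the tool itself compares URLs.

```python
from urllib.parse import urlsplit


def _split(url):
    """Parse a URL, tolerating exports that omit the http:// prefix."""
    if "://" not in url:
        url = "//" + url  # lets urlsplit see the host as the netloc
    return urlsplit(url)


def _bare_host(host):
    """Drop a leading 'www.' so www and non-www hosts compare equal."""
    return host[4:] if host.startswith("www.") else host


def describe_differences(url_a, url_b):
    """Return a list of reasons the two URL strings differ (empty if identical)."""
    if url_a == url_b:
        return []
    if url_a.strip() == url_b.strip():
        return ["leading or trailing whitespace"]

    a, b = _split(url_a.strip()), _split(url_b.strip())
    reasons = []

    if a.scheme != b.scheme:
        reasons.append("scheme")

    if a.netloc != b.netloc:
        host_a, host_b = a.netloc.lower(), b.netloc.lower()
        if host_a == host_b:
            reasons.append("host letter case")
        elif _bare_host(host_a) == _bare_host(host_b):
            reasons.append("www vs non-www host")
        else:
            reasons.append("different host")

    if a.path != b.path:
        if a.path.rstrip("/") == b.path.rstrip("/"):
            reasons.append("trailing slash")
        elif a.path.lower() == b.path.lower():
            reasons.append("path letter case")
        else:
            reasons.append("different path")

    if a.query != b.query:
        reasons.append("query string")
    if a.fragment != b.fragment:
        reasons.append("fragment")

    return reasons


if __name__ == "__main__":
    print(describe_differences(
        "http://www.healthchoices.ca/dr.-so-and-so",
        "http://healthchoices.ca/dr.-so-and-so/",
    ))
```

If every pair from the export comes back with an empty list, the two columns really are the same string, and the question of what the columns mean goes back to the tool.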
Related Questions
-
Duplicate Content for Locations on my Directory Site
I have a pretty big directory site using Wordpress with lots of "locations", "features", "listing-category" etc.... Duplicate Content: https://www.thecbd.co/location/california/ https://www.thecbd.co/location/canada/ referring URL is www.thecbd.co is it a matter of just putting a canonical URL on each location, or just on the main page? Would this be the correct code to put: on the main page? Thanks Everyone!
Technical SEO | | kay_nguyen0 -
Duplicate content w/ same URLs
I am getting high priority issues for our privacy & terms pages that have the same URL. Why would this show up as duplicate content? Thanks!
Technical SEO | | RanvirGujral0 -
150+ Pages of URL Parameters - Mass Duplicate Content Issue?
Hi, we run a large e-commerce site and while doing some checking through GWT we came across these URL parameters and are now wondering if we have a duplicate content issue. If so, we are wondering what is the best way to fix them: is this a task for GWT or a rel=canonical task? Many of the URLs are driven from the filters in our category pages and are coming up like this: page04%3Fpage04%3Fpage04%3Fpage04%3F (See the image for more). Does anyone know if these links are duplicate content and if so how should we handle them? Richard
Technical SEO | | Richard-Kitmondo0 -
Why are some pages now duplicate content?
It is probably a silly question, but all of a sudden, the following pages of one of my clients are reported as Duplicate content. I cannot understand why. They weren't before...
http://www.ciaoitalia.nl/product/pizza-originale/mediterranea-halal
http://www.ciaoitalia.nl/product/pizza-originale/gyros-halal
http://www.ciaoitalia.nl/product/pizza-originale/döner-halal
http://www.ciaoitalia.nl/product/pizza-originale/vegetariana
http://www.ciaoitalia.nl/product/pizza-originale/seizoen-pizza-estate
http://www.ciaoitalia.nl/product/pizza-originale/contadina
http://www.ciaoitalia.nl/product/pizza-originale/4-stagioni
http://www.ciaoitalia.nl/product/pizza-originale/shoarma
Thanks for any help in the right direction 🙂
Technical SEO | | MarketingEnergy0 -
Avoiding Cannibalism and Duplication with content
Hi, For the example I will use a computers e-commerce store... I'm working on creating guides for the store -
How to choose a laptop
How to choose a desktop
I believe that each guide will be great on its own and that it answers a specific question (meaning that someone looking for a laptop will search specifically for laptop info, and the same goes for desktop). This is why I didn't create a "How to choose a computer" guide. I also want each guide to have all the information and not to start sending the user to secondary pages in order to fill in missing info. However, even though there are several details that are different between laptops and desktops, like the importance of weight, screen size etc., a lot of things on the checklist (like deciding on how much memory is needed, graphic card, core etc.) are the same. Please advise on how to pursue it. Should I just write two guides and make sure that the same duplicated content ideas are simply written in a different way?
Technical SEO | | BeytzNet0 -
What could be the cause of this duplicate content error?
I only have one index.htm and I'm seeing a duplicate content error. What could be causing this? IUJvfZE.png
Technical SEO | | ScottMcPherson1 -
Duplicate Page Content for sorted archives?
Experienced backend dev, but SEO newbie here 🙂 When SEOmoz crawls my site, I get notified of DPC errors on some list/archive sorted pages (appending ?sort=X to the url). The pages all have rel=canonical to the archive home. Some of the pages are shorter (have only one or two entries). Is there a way to resolve this error? Perhaps add rel=nofollow to the sorting menu? Or perhaps find a method that utilizes a non-link navigation method to sort / switch sorted pages? No issues with duplicate content are showing up on google webmaster tools. Thanks for your help!
Technical SEO | | jwondrusch0 -
Mapping Internal Links (Which are causing duplicate content)
I'm working on a site that is throwing off a -lot- of duplicate content for its size. A lot of it appears to be coming from bad links within the site itself, which were caused when it was ported over from static HTML to Expression Engine (by someone else). I'm finding EE an incredibly frustrating platform to work with, as it appears to be directing 404's on sub-pages to the page directly above that subpage, without actually providing a 404 response. It's very weird. Does anyone have any recommendations on software to clearly map out a site's internal link structure so that I can find what bad links are pointing to the wrong pages?
Technical SEO | | BedeFahey0