URL Capitalization Inconsistencies Registering Duplicate Content Crawl Errors
-
Hello,
I have a very large website that has a good number of "Duplicate Content" issues according to Moz. In reality, though, the problem is not duplicate content but URLs. For example, http://acme.com/product/features and http://acme.com/Product/Features both land on the same page, but Moz sees them as separate pages and therefore assumes they are duplicates.
We have recently implemented a solution to automatically de-capitalize all characters in the URL, so when you type acme.com/Products, the URL will automatically change to acme.com/products, but Moz continues to flag multiple "Duplicate Content" issues. I noticed that many of the links on the website still have uppercase letters in the URL, even though the URL changes to all lowercase when clicked. Could this be causing the issue?
What is the best way to remove the "Duplicate Content" issues that are not actually duplicate content?
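One way to check whether internal links still carry uppercase paths is to scan a page's HTML for them. The following is only a rough sketch using the Python standard library; the names `LinkCollector` and `mixed_case_links` and the sample markup are made up for illustration and are not part of any Moz tool.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def mixed_case_links(html):
    """Return the hrefs whose path component contains an uppercase letter."""
    collector = LinkCollector()
    collector.feed(html)
    flagged = []
    for href in collector.hrefs:
        path = urlsplit(href).path
        # Only the path is compared; query strings are ignored here.
        if path != path.lower():
            flagged.append(href)
    return flagged


# Sample markup based on the URLs mentioned in the question.
sample = '<a href="http://acme.com/Product/Features">A</a> <a href="/products">B</a>'
print(mixed_case_links(sample))
```

Any link this reports is one a crawler can still discover in its uppercase form, even if the server later changes the URL to lowercase.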
-
http://moz.com/learn/seo/canonicalization
"Another option for dealing with duplicate content is to utilize the rel=canonical tag. The rel=canonical tag passes the same amount of link juice (ranking power) as a 301 redirect, and often takes much less development time to implement."
-
If you check Google Analytics, GA is probably seeing it too. We had a similar problem. Canonicalization will help with duplicate content, but it won't help with rankings. Internally, you are sending link juice to multiple versions of the same page. In addition, you could have backlinks pointing at multiple duplicate pages, and splitting the link love.
Canonicalization does not transfer link juice the way a 301 redirect does. All the canonical tag does is tell Google "Rank This Page". If you don't care about rankings, the canonical is fine. If you do care, you need to 301 all of your pages to the lowercase version.
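As a rough illustration of a 301 to the lowercase version, here is a minimal sketch written as Python WSGI middleware. It assumes the site is served by a WSGI application, which nothing in this thread confirms; on most sites the same rule would live in the web server or CMS configuration instead. `lowercase_redirect` and `demo_app` are hypothetical names.

```python
def lowercase_redirect(app):
    """Wrap a WSGI app so mixed-case paths get a 301 to the lowercase path."""

    def middleware(environ, start_response):
        path = environ.get("PATH_INFO", "")
        if path != path.lower():
            location = path.lower()
            query = environ.get("QUERY_STRING", "")
            if query:
                # The query string is passed through unchanged.
                location += "?" + query
            start_response("301 Moved Permanently", [("Location", location)])
            return [b""]
        # Already lowercase: hand the request to the wrapped app.
        return app(environ, start_response)

    return middleware


def demo_app(environ, start_response):
    """Stand-in for the real site (assumption for this example)."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


app = lowercase_redirect(demo_app)

# To try it locally:
# from wsgiref.simple_server import make_server
# make_server("", 8000, app).serve_forever()
```

The point of the sketch is that the server answers with an actual 301 status and a `Location` header, rather than rewriting the address in the browser after the page loads.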
If you decide to 301, first build an HTML sitemap with all of the uppercase URLs. After you do the 301, have Google fetch the sitemap and submit it. This will help Googlebot find all of the pages that were 301ed.
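The HTML sitemap of old uppercase URLs can be generated rather than written by hand. This is a small sketch under the assumption that you already have the list of uppercase URLs (for example, from a crawl export); `html_sitemap` and the sample URL list are illustrative only.

```python
from html import escape


def html_sitemap(urls, title="Old URLs"):
    """Build a plain HTML page that links to every URL in the list."""
    lines = [
        "<!DOCTYPE html>",
        "<html>",
        f"<head><title>{escape(title)}</title></head>",
        "<body>",
        "<ul>",
    ]
    for url in urls:
        # escape() guards against characters such as & or " in the URL.
        safe = escape(url)
        lines.append(f'<li><a href="{safe}">{safe}</a></li>')
    lines += ["</ul>", "</body>", "</html>"]
    return "\n".join(lines) + "\n"


# Example input taken from the URL in the original question.
old_urls = ["http://acme.com/Product/Features"]
page = html_sitemap(old_urls)
print(page)
```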
-
Hey man. If your store is a Magento store, there are settings for adding canonicalization tags to categories and products under:
System => Configuration => Catalog => Search Engine Optimization
h/t to Yoast for reminding me of the string to get there.
I am optimizing a Magento store that had a similar issue after a relaunch. Found that to be a very easy way to fix it.
Hope this helps.
-
I have a complex CMS for my main website; I just asked my developer to do it. On my WordPress sites, I use an SEO plugin for this (Yoast).
-
Thank you so much Linda.
Do you know of a fast way to add a rel=canonical tag to all the pages? The website is quite large, and it would likely take months to do it manually.
-
Hi,
The same question has been asked here on Moz: Duplicate Content and URL Capitalization
-
The best way to fix this is with a rel=canonical URL. Tag each page with the lower-case version. (I had this same problem.)
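On a large site, the tag would normally be emitted from a shared page template rather than added page by page. A minimal sketch of the per-page logic is below; `canonical_tag` is a hypothetical helper, and keeping the query string while dropping the fragment is an assumption you may want to change for your own site.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_tag(url):
    """Return a rel=canonical <link> tag pointing at the lowercase URL."""
    parts = urlsplit(url)
    canonical = urlunsplit(
        (
            parts.scheme,
            parts.netloc.lower(),
            parts.path.lower(),  # only host and path are lowercased
            parts.query,         # query string kept as-is (assumption)
            "",                  # fragment dropped
        )
    )
    return f'<link rel="canonical" href="{escape(canonical)}">'


print(canonical_tag("http://acme.com/Product/Features"))
```

Called from the template that renders every page's `<head>`, this gives both the uppercase and lowercase versions of a URL the same canonical target.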