URL Capitalization Inconsistencies Registering Duplicate Content Crawl Errors
-
Hello,
I have a very large website that has a good amount of "Duplicate Content" issues according to MOZ. In reality though, it is not a problem with duplicate content, but rather a problem with URLs. For example: http://acme.com/product/features and http://acme.com/Product/Features both land on the same page, but MOZ is seeing them as separate pages, therefor assuming they are duplicates.
We have recently implemented a solution to automatically de-captialize all characters in the URL, so when you type acme.com/Products, the URL will automatically change to acme.com/products – but MOZ continues to flag multiple "Duplicate Content" issues. I noticed that many of the links on the website still have the uppercase letters in the URL even though when clicked, the URL changes to all lower case. Could this be causing the issue?
What is the best way to remove the "Duplicate Content" issues that are not actually duplicate content?
-
http://moz.com/learn/seo/canonicalization
"Another option for dealing with duplicate content is to utilize the rel=canonical tag. The rel=canonical tag passes the same amount of link juice (ranking power) as a 301 redirect, and often takes much less development time to implement."
-
If you check Google Analytics, GA is probably seeing it too. We had a similar problem. Canonicalization will help with duplicate content, but it won't help with rankings. Internally, you are sending link juice to multiple versions of the same page. In addition, you could have backlinks pointing at multiple duplicate pages, and splitting the link love.
Canonicalization does not transfer link juice the way a 301 Redirect does. All the canonical tag does is tell Google "Rank This Page". If you don't care about rankings the canonical is fine. If you do care, you need to 301 all of your pages to the lower case version.
If you decide to 301, first, build an HTML sitemap with all of the uppercase URLs. After you do the 301, have Google fetch the sitemap and submit it, This will help Googlebot wind all of the pages that were 301ed.
-
Hey man. If your store is a Magento store, there are settings for adding canonicalization tags to categories and products under:
System => Configuration => Catalog => Search Engine Optimization
h/t to Yoast for reminding me of the string to get there.
I am optimizing a Magento store that had a similar issue after a relaunch. Found that to be a very easy way to fix it.
Hope this helps.
-
I have a complex CMS for my main website; I just asked my developer to do it. On my Wordpress sites, I use an SEO plugin for this (Yoast).
-
Thank you so much Linda.
Do you know of a fast way to add a rel=canonical tag to all the pages, the website is quite large, and it would likely take months over months to do it manually.
-
Hi,
There is the same question here on moz: Duplicate Content and URL Capitalization
-
The best way to fix this is with a rel=canonical URL. Tag each page with the lower-case version. (I had this same problem.)
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Duplicate URLs on eCommerce site caused by parameters
Hi there, We have a client with a large eCommerce site with about 1500 duplicate URLs caused by the parameters in the URLs (such as the sort parameter where the list of products are then sorted by price, age etc.) Example: www.example.com/cars/toyota First duplicate URL: www.example.com/cars/toyota?sort=price-ascending Second duplicate URL: www.example.com/cars/toyota?sort=price-descending Third duplicate URL: www.example.com/cars/toyota?sort=age-descending Originally we had advised to add a robots.txt file to block search engines from crawling the URLs with parameters but this hasn't been done. My question: If we add the robots.txt now and exclude all URLs with filters - how long will it take for Google to disregard the duplicate URLs? We could ask the developers to add canonical tags to all the duplicates but these are about 1500... Thanks in advance for any advice!
Intermediate & Advanced SEO | | Gabriele_Layoutweb0 -
How bad is duplicate content for ecommerce sites?
We have multiple eCommerce sites which not only share products across domains but also across categories within a single domain. Examples: http://www.artisancraftedhome.com/sinks-tubs/kitchen-sinks/two-tone-sinks/medium-rounded-front-farmhouse-sink-two-tone-scroll http://www.coppersinksonline.com/copper-kitchen-and-farmhouse-sinks/two-tone-kitchen-farmhouse-sinks/medium-rounded-front-farmhouse-sink-two-tone-scroll http://www.coppersinksonline.com/copper-sinks-on-sale/medium-rounded-front-farmhouse-sink-two-tone-scroll We have selected canonical links for each domain but I need to know if this practice is having a negative impact on my SEO.
Intermediate & Advanced SEO | | ArtisanCrafted0 -
Duplicate content issues from mirror subdomain : facebook.domianname.com
Hey Guys,
Intermediate & Advanced SEO | | b2bmarketer
Need your suggestions.
I have got a website that has duplicate content issue.
a sub-domain called facebook.asherstrategies .com comes from no where and is getting indexed.
Website Link : asherstrategies .com
subdomain link: facebook.asherstrategies .com This sub domain is actually a mirror of the website and i have no idea how is is created.
trying to resolve the issue but could not find the clue.0 -
URL Parameter & crawl stats
Hey Guys,I recently used the URL parameter tool in WBT to mark different urls that offers the same content.I have the parameter "?source=site1" , "?source=site2", etc...It looks like this: www.example.com/article/12?source=site1The "source parameter" are feeds that we provide to partner sites and this way we can track the referral site with our internal analytics platform.Although, pages like:www.example.com/article/12?source=site1 have canonical to the original page www.example.com/article/12, Google indexed both of the URLs
Intermediate & Advanced SEO | | Mr.bfz
www.example.com/article/12?source=site1andwww.example.com/article/12Last week I used the URL parameter tool to mark "source" parameter "No, this parameter doesnt effect page content (track usage)" and today I see a 40% decrease in my crawl stats.In one hand, It makes sense that now google is not crawling the repeated urls with different sources but in the other hand I thought that efficient crawlability would increase my crawl stats.In additional, google is still indexing same pages with different source parameters.I would like to know if someone have experienced something similar and by increasing crawl efficiency I should expect my crawl stats to go up or down?I really appreciate all the help!Thanks!0 -
Duplicate content on sites from different countries
Hi, we have a client who currently has a lot of duplicate content with their UK and US website. Both websites are geographically targeted (via google webmaster tools) to their specific location and have the appropriate local domain extension. Is having duplicate content a major issue, since they are in two different countries and geographic regions of the world? Any statement from Google about this? Regards, Bill
Intermediate & Advanced SEO | | MBASydney0 -
Syndicating duplicate content descriptions - Can these be canonicalised?
Hi there, I have a site that contains descriptions of accommodation and we also use this content to syndicate to our partner sites. They then use this content to fill their descriptions on the same accommodation locations. I have looked at copyscape and Google and this does appear as duplicate content across these partnered sites. I do understand as well that certain kinds of content will not impact Google's duplication issue such as locations, addresses, opening times those kind of things, but would actual descriptions of a location around 250 words long be seen and penalised as duplicate content? Also is there a possible way to canonicalise this content so that Google can see it relates back to our original site? The only other way I can think of getting round a duplicate content issue like this is ordering the external sites to use tags like blockquotes and cite tags around the content.
Intermediate & Advanced SEO | | MalcolmGibb0 -
What constitutes duplicate content?
I have a website that lists various events. There is one particular event at a local swimming pool that occurs every few months -- for example, once in December 2011 and again in March 2012. It will probably happen again sometime in the future too. Each event has its own 'event' page, which includes a description of the event and other details. In the example above the only thing that changes is the date of the event, which is in an H2 tag. I'm getting this as an error in SEO Moz Pro as duplicate content. I could combine these pages, since the vast majority of the content is duplicate, but this will be a lot of work. Any suggestions on a strategy for handling this problem?
Intermediate & Advanced SEO | | ChatterBlock0 -
Accepting RSS feeds. Does it = duplicate content?
Hi everyone, for a few years now I've allowed school clients to pipe their news RSS feed to their public accounts on my site. The result is a daily display of the most recent news happening on their campuses that my site visitors can browse. We don't republish the entire news item; just the headline, and the first 150 characters of their article along with a Read more link for folks to click if they want the full story over on the school's site. Each item has it's own permanent URL on my site. I'm wondering if this is a wise practice. Does this fall into the territory of duplicate content even though we're essentially providing a teaser for the school? What do you think?
Intermediate & Advanced SEO | | peterdbaron0