URL Capitalization Inconsistencies Registering Duplicate Content Crawl Errors
-
Hello,
I have a very large website that has a good amount of "Duplicate Content" issues according to MOZ. In reality though, it is not a problem with duplicate content, but rather a problem with URLs. For example: http://acme.com/product/features and http://acme.com/Product/Features both land on the same page, but MOZ is seeing them as separate pages, therefor assuming they are duplicates.
We have recently implemented a solution to automatically de-captialize all characters in the URL, so when you type acme.com/Products, the URL will automatically change to acme.com/products – but MOZ continues to flag multiple "Duplicate Content" issues. I noticed that many of the links on the website still have the uppercase letters in the URL even though when clicked, the URL changes to all lower case. Could this be causing the issue?
What is the best way to remove the "Duplicate Content" issues that are not actually duplicate content?
-
http://moz.com/learn/seo/canonicalization
"Another option for dealing with duplicate content is to utilize the rel=canonical tag. The rel=canonical tag passes the same amount of link juice (ranking power) as a 301 redirect, and often takes much less development time to implement."
-
If you check Google Analytics, GA is probably seeing it too. We had a similar problem. Canonicalization will help with duplicate content, but it won't help with rankings. Internally, you are sending link juice to multiple versions of the same page. In addition, you could have backlinks pointing at multiple duplicate pages, and splitting the link love.
Canonicalization does not transfer link juice the way a 301 Redirect does. All the canonical tag does is tell Google "Rank This Page". If you don't care about rankings the canonical is fine. If you do care, you need to 301 all of your pages to the lower case version.
If you decide to 301, first, build an HTML sitemap with all of the uppercase URLs. After you do the 301, have Google fetch the sitemap and submit it, This will help Googlebot wind all of the pages that were 301ed.
-
Hey man. If your store is a Magento store, there are settings for adding canonicalization tags to categories and products under:
System => Configuration => Catalog => Search Engine Optimization
h/t to Yoast for reminding me of the string to get there.
I am optimizing a Magento store that had a similar issue after a relaunch. Found that to be a very easy way to fix it.
Hope this helps.
-
I have a complex CMS for my main website; I just asked my developer to do it. On my Wordpress sites, I use an SEO plugin for this (Yoast).
-
Thank you so much Linda.
Do you know of a fast way to add a rel=canonical tag to all the pages, the website is quite large, and it would likely take months over months to do it manually.
-
Hi,
There is the same question here on moz: Duplicate Content and URL Capitalization
-
The best way to fix this is with a rel=canonical URL. Tag each page with the lower-case version. (I had this same problem.)
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Same content different URL - Google Analytics other Options
One of my clients came to me today asking to mirror their current website and host it on a different domain (i.e domain.com and domain.ca). Their reasoning is that there is no actual way to confirm via form entries from their website which visitors are submitting the form from an organic perspective and which ones are submitting the form through an Adwords ad campaign referral they are running (which I have nothing to do with). I know Google Analytics will show you visits to pages on a site and then you can find out which source (organic vs. cpc) they came from, but it won't confirm the source on the actual form entry. So my client feels the only way to find out this information would be to mirror the website so that the separate analytics would validate their Ad spend. Does anyone know of any tools that I could use for something like this? I do NOT want to mirror the website as I am fearful of duplicate content and the YEARS of organic SEO work I have put into this website going to waste. The other element I should mention is the client only wants to have this "mirorred" site up for 2 months. Any thoughts, suggestions and arguments for a mirorred website are welcomed! Thanks!
Intermediate & Advanced SEO | | MainstreamMktg0 -
Duplicate content issue
Hello! We have a lot of duplicate content issues on our website. Most of the pages with these issues are dictionary pages (about 1200 of them). They're not exactly duplicate, but they contain a different word with a translation, picture and audio pronunciation (example http://anglu24.lt/zodynas/a-suitcase-lagaminas). What's the better way of solving this? We probably shouldn't disallow dictionary pages in robots.txt, right? Thanks!
Intermediate & Advanced SEO | | jpuzakov0 -
Duplicating content from manufacturer for client site and using canonical reference.
We manage content for many clients in the same industry, and many of them wish to keep their customers on their individualized websites (understandably). In order to do this, we have duplicated content in part from the manufacturers' pages for several "models" on the client's sites. We have put in a Canonical reference at the start of the content directing back to the manufacturer's page where we duplicated some of the content. We have only done a handful of pages while we figure out the canonical reference potential issue. So, my questions are: Is this necessary? Does this hurt, help or not do anything SEO-wise for our ranking of the site? Thanks!
Intermediate & Advanced SEO | | moz1admin1 -
Noindex Valuable duplicate content?
How could duplicate content be valuable and why question no indexing it? My new client has a clever african safari route builder that you can use to plan your safari. The result is 100's of pages that have different routes. Each page inevitably has overlapping content / destination descriptions. see link examples. To the point - I think it is foolish to noindex something like this. But is Google's algo sophisticated enough to not get triggered by something like this? http://isafari.nathab.com/routes/ultimate-tanzania-kenya-uganda-safari-july-november
Intermediate & Advanced SEO | | Rich_Coffman
http://isafari.nathab.com/routes/ultimate-tanzania-kenya-uganda-safari-december-june0 -
Duplicate Content for Deep Pages
Hey guys, For deep, deep pages on a website, does duplicate content matter? The pages I'm talk about are image pages associated with products and will never rank in Google which doesn't concern me. What I'm interested to know though is whether the duplicate content would have an overall effect on the site as a whole? Thanks in advance Paul
Intermediate & Advanced SEO | | kevinliao1 -
Duplicate content for hotel websites - the usual nightmare? is there any solution other than producing unique content?
Hiya Mozzers I often work for hotels. A common scenario is the hotel / resort has worked with their Property Management System to distribute their booking availability around the web... to third party booking sites - with the inventory goes duplicate page descriptions sent to these "partner" websites. I was just checking duplication on a room description - 20 loads of duplicate descriptions for that page alone - there are 200 rooms - so I'm probably looking at 4,000 loads of duplicate content that need rewriting to prevent duplicate content penalties, which will cost a huge amount of money. Is there any other solution? Perhaps ask booking sites to block relevant pages from search engines?
Intermediate & Advanced SEO | | McTaggart0 -
I try to apply best duplicate content practices, but my rankings drop!
Hey, An audit of a client's site revealed that due to their shopping cart, all their product pages were being duplicated. http://www.domain.com.au/digital-inverter-generator-3300w/ and http://www.domain.com.au/shop/digital-inverter-generator-3300w/ The easiest solution was to just block all /shop/ pages in Google Webmaster Tools (redirects were not an easy option). This was about 3 months ago, and in months 1 and 2 we undertook some great marketing (soft social book marking, updating the page content, flickr profiles with product images, product manuals onto slideshare etc). Rankings went up and so did traffic. In month 3, the changes in robots.txt finally hit and rankings decreased quite steadily over the last 3 weeks. Im so tempted to take off the robots restriction on the duplicate content.... I know I shouldnt but, it was working so well without it? Ideas, suggestions?
Intermediate & Advanced SEO | | LukeyJamo0 -
Duplicate Content Through Sorting
I have a website that sells images. When you search you're given a page like this: http://www.andertoons.com/search-cartoons/santa/ I also give users the option to resort results by date, views and rating like this: http://www.andertoons.com/search-cartoons/santa/byrating/ I've seen in SEOmoz that Google might see these as duplicate content, but it's a feature I think is useful. How should I address this?
Intermediate & Advanced SEO | | andertoons0