URL Capitalization Inconsistencies Registering Duplicate Content Crawl Errors
-
Hello,
I have a very large website that has a good number of "Duplicate Content" issues according to Moz. In reality, though, the problem is not duplicate content but URLs. For example, http://acme.com/product/features and http://acme.com/Product/Features both land on the same page, but Moz sees them as separate pages and therefore assumes they are duplicates.
We recently implemented a solution that automatically lowercases all characters in the URL, so when you type acme.com/Products, the URL automatically changes to acme.com/products, but Moz continues to flag multiple "Duplicate Content" issues. I noticed that many of the links on the website still have uppercase letters in the URL, even though the URL changes to all lowercase when clicked. Could this be causing the issue?
What is the best way to remove the "Duplicate Content" issues that are not actually duplicate content?
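One way to check the suspicion about internal links is to scan a page's HTML for anchors whose path still contains uppercase letters. Here is a minimal Python sketch using only the standard library. The names `MixedCaseLinkFinder` and `find_mixed_case_links` are made up for illustration, and it assumes you already have the page HTML as a string (fetching and crawling are left out).

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class MixedCaseLinkFinder(HTMLParser):
    """Collects href values of <a> tags whose URL path contains uppercase letters."""

    def __init__(self):
        super().__init__()
        self.mixed_case_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Only the path is compared; the host name is case-insensitive anyway.
                path = urlsplit(value).path
                if path != path.lower():
                    self.mixed_case_links.append(value)


def find_mixed_case_links(html):
    """Return every link in the HTML whose path is not already all lowercase."""
    finder = MixedCaseLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.mixed_case_links
```

Running `find_mixed_case_links` over each template or rendered page gives a list of links to rewrite so that crawlers only ever see the lowercase form.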
-
http://moz.com/learn/seo/canonicalization
"Another option for dealing with duplicate content is to utilize the rel=canonical tag. The rel=canonical tag passes the same amount of link juice (ranking power) as a 301 redirect, and often takes much less development time to implement."
-
If you check Google Analytics, GA is probably seeing it too. We had a similar problem. Canonicalization will help with duplicate content, but it won't help with rankings. Internally, you are sending link juice to multiple versions of the same page. In addition, you could have backlinks pointing at multiple duplicate pages, which splits the link love.
Canonicalization does not transfer link juice the way a 301 redirect does. All the canonical tag does is tell Google "Rank This Page". If you don't care about rankings, the canonical is fine. If you do care, you need to 301 all of your pages to the lowercase version.
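To make the 301 idea concrete, here is a rough sketch of a redirect to the lowercase path. It assumes the site is served by a Python WSGI application, which nothing in this thread confirms, and `lowercase_redirect_middleware` is a hypothetical name. On Apache, nginx, or IIS you would use that server's own rewrite rules instead.

```python
def lowercase_redirect_middleware(app):
    """Wrap a WSGI app so that mixed-case paths get a 301 to the lowercase path."""

    def middleware(environ, start_response):
        path = environ.get("PATH_INFO", "")
        lowered = path.lower()
        if path != lowered:
            # Keep the query string untouched; parameter values may be case-sensitive.
            location = lowered
            query = environ.get("QUERY_STRING", "")
            if query:
                location += "?" + query
            start_response(
                "301 Moved Permanently",
                [("Location", location), ("Content-Length", "0")],
            )
            return [b""]
        # Path is already lowercase: hand the request to the real application.
        return app(environ, start_response)

    return middleware
```

This sketch ignores `SCRIPT_NAME`, so it assumes the app is mounted at the site root. The important part is that the response is a real 301 rather than a client-side URL rewrite.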
If you decide to 301, first build an HTML sitemap with all of the uppercase URLs. After you do the 301, have Google fetch the sitemap and submit it. This will help Googlebot find all of the pages that were 301ed.
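The HTML sitemap can be generated rather than typed by hand. A minimal sketch, assuming you can export the old mixed-case URLs as a list of strings (from a crawl, server logs, or the CMS); `build_html_sitemap` is a hypothetical helper name:

```python
from html import escape


def build_html_sitemap(urls, title="Sitemap"):
    """Return a simple HTML page that links to every URL in the list."""
    items = "\n".join(
        f'    <li><a href="{escape(url, quote=True)}">{escape(url)}</a></li>'
        for url in urls
    )
    return (
        "<!DOCTYPE html>\n"
        f"<html>\n<head><title>{escape(title)}</title></head>\n"
        f"<body>\n  <ul>\n{items}\n  </ul>\n</body>\n</html>\n"
    )
```

Write the returned string to a file, upload it, and use that page's URL when asking Google to fetch it.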
-
Hey man. If your store is a Magento store, there are settings for adding canonicalization tags to categories and products under:
System => Configuration => Catalog => Search Engine Optimization
h/t to Yoast for reminding me of the string to get there.
I am optimizing a Magento store that had a similar issue after a relaunch. Found that to be a very easy way to fix it.
Hope this helps.
-
I have a complex CMS for my main website; I just asked my developer to do it. On my WordPress sites, I use an SEO plugin for this (Yoast).
-
Thank you so much Linda.
Do you know of a fast way to add a rel=canonical tag to all the pages? The website is quite large, and it would likely take months to do it manually.
-
Hi,
The same question has been asked here on Moz: Duplicate Content and URL Capitalization
-
The best way to fix this is with a rel=canonical URL. Tag each page with the lower-case version. (I had this same problem.)
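On a large site the tag is normally produced once in the shared page template instead of being added page by page. A minimal sketch of the logic, assuming the template can call a helper with the requested URL; `canonical_link_tag` is a made-up name, and keeping the query string while dropping the fragment is just one possible choice:

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link_tag(requested_url):
    """Build a rel=canonical tag pointing at the lowercase version of the URL."""
    parts = urlsplit(requested_url)
    canonical = urlunsplit(
        # Lowercase host and path; keep the query as-is; drop any fragment.
        (parts.scheme, parts.netloc.lower(), parts.path.lower(), parts.query, "")
    )
    return f'<link rel="canonical" href="{escape(canonical, quote=True)}">'
```

For example, `canonical_link_tag("http://acme.com/Product/Features")` returns `<link rel="canonical" href="http://acme.com/product/features">`, which belongs in the page's `<head>`.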