Caps in URL creating duplicate content
-
I'm getting a bunch of duplicate content errors where the crawl says
www.url.com/abc has a duplicate at www.url.com/ABC
The content is in Magento and the URL settings are lowercase, and I can't figure out why it thinks there is duplicate content. These are pages with a decent number of inbound links.
-
I checked, and it is a Magento feature to rewrite caps to lowercase.
I added this to .htaccess anyway:
<code>
# Note: RewriteMap is only valid in the server or virtual host config, not in .htaccess
RewriteMap lc int:tolower
RewriteCond %{REQUEST_URI} [A-Z]
RewriteRule (.*) ${lc:$1} [R=301,L]
</code>
One last question before I take this to a Magento forum: how can I look at a page with a caps URL and a lowercase URL and see whether they are really different pages or resolve to the same address?
When you change random letters to caps on our site, it sends you to the right page, but my browser still shows the mixed-caps URL instead of replacing it with an all-lowercase URL. Is that really a different page, or is the browser just not updating the displayed URL when it is really getting the lowercase page?
-
Hi John,
I checked the URL you sent me. You do have duplicate pages:
http://www.madebysurvivors.com/destiny
http://www.madebysurvivors.com/DESTINY
Both work and return the same page.
I also tried clicking on other links on your site and then changing a few letters to uppercase, something like this:
http://www.madebysurvivors.com/LEArn-human-trafficking-slavery
and it returns the same page
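The same check can be scripted by requesting each URL without following redirects and reading the status code. This is only a minimal sketch: it assumes the server answers HEAD requests the same way it answers GET, and `fetch_status`, `describe`, and `compare` are illustrative names, not part of Magento or any Moz tool.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url):
    """Request `url` without following redirects.

    Returns (status code, Location header or None).
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def describe(status, location):
    """Summarise what a crawler sees for one response."""
    if status in (301, 308):
        return f"permanent redirect to {location}"
    if status in (302, 303, 307):
        return f"temporary redirect to {location}"
    if status == 200:
        return "served directly, no redirect"
    return f"status {status}"


def compare(urls):
    """Print the result for each URL. This performs real network requests."""
    for url in urls:
        status, location = fetch_status(url)
        print(url, "->", describe(status, location))
```

Calling `compare(["http://www.madebysurvivors.com/destiny", "http://www.madebysurvivors.com/DESTINY"])` prints one line per URL. If both are served directly, the browser keeps whatever casing was typed and a crawler counts two URLs; if the uppercase one reports a permanent redirect to the lowercase one, there is only one page.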
From what I can tell, it's one of the features in Magento that is making this possible. I would go into the settings and disable the setting that forces Magento to use lowercase.
Then test it to make sure that you DO get a 404 page if you change the letter case on any of your links.
Once you have confirmed that you get a 404 page, set the page URLs to lowercase. I'm not familiar with Magento, so I'm not sure whether it has this option, but many CMS and ecommerce platforms have a field where you can specify the URL for a page. I would change that field to all lowercase.
Test it again. If it works, there is one more step if you want to keep the link juice from the pages that had the uppercase URLs.
You need to duplicate those pages, keeping each URL exactly as it was before (in all caps), and then 301 redirect each one to the new lowercase page.
Hope this helps and makes sense.
-
This is intended functionality in Magento. It's supposed to help the user experience, as a user can navigate to a page even if they aren't sure of the casing of the words.
Of course, that's bad for SEO. You'll need to implement canonicalization. Here's a free extension by Yoast:
http://www.magentocommerce.com/magento-connect/canonical-url-for-magento.html
Cheers.
Update: seeing your response, your solution of putting in individual redirects wouldn't be practical. You'd have to cover all combinations of caps and non-caps, and that's more work than you should want :). As for why this happens, the uppercase characters are lowercased when Magento checks whether something in the database matches the URL. Again, this is intended functionality.
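To make the canonicalization idea concrete, here is a small Python sketch of the rule involved: every casing variant of a URL maps to one lowercase form, and that form goes into a `<link rel="canonical">` tag. It illustrates the concept only and is not how the Yoast extension is implemented; `canonical_url` and `canonical_link_tag` are hypothetical helper names, and lowercasing the whole path assumes the site uses lowercase URLs throughout.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url):
    """Lowercase the host and path; keep the query string, drop any fragment."""
    parts = urlsplit(url)
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path.lower(), parts.query, "")
    )


def canonical_link_tag(url):
    """Build the canonical element that would go in the page's <head>."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}" />'
```

Because `/DESTINY`, `/Destiny`, and `/destiny` all produce the same tag, search engines are told which single URL to index without a separate rule for each casing.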
-
Looks like I do need some more help.
I get a redirect loop if I enter a redirect from
http://www.madebysurvivors.com/DESTINY
to
http://www.madebysurvivors.com/destiny
but I checked and there is no redirect the other way in our database or .htaccess.
If I leave the redirect off I get duplicate content, but in the CMS part of Magento there is only one table entry for this page.
-
I actually moved all the content from a Drupal install, so I don't have that many URLs with the problem. It looks like the fastest way to do this is just to redirect the caps URLs to lowercase, as that's what we use elsewhere.
I dug into the underlying database and can't find any duplicate entries for these pages or any odd redirects, so I have no idea of the cause.
For some of the pages I think you are right that Magento is moving caps down to lowercase, but there are a few others where it goes from lowercase to caps, and those were caps on the Drupal site.
Anyway, good to know Google sees them differently, so I'll put in redirects. It's only about 20 pages.
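For a short list like this, the redirect rules can be generated rather than typed by hand. The Python sketch below is a hedged example, not a tested Magento configuration: it assumes the rules go into .htaccess ahead of Magento's own rewrite rules, that the paths contain no spaces, and that `redirect_rules` (a hypothetical helper) is given the exact uppercase paths to retire. mod_rewrite patterns are case-sensitive unless the `NC` flag is used, so each rule matches only the listed casing and skips the lowercase target.

```python
import re


def redirect_rules(paths):
    """Return one case-sensitive RewriteRule per path containing an uppercase letter."""
    rules = []
    for path in paths:
        bare = path.lstrip("/")  # .htaccess patterns match without the leading slash
        lower = bare.lower()
        if bare == lower:
            continue  # already lowercase: a rule here would redirect to itself
        rules.append(f"RewriteRule ^{re.escape(bare)}$ /{lower} [R=301,L]")
    return rules
```

For example, `redirect_rules(["/DESTINY", "/destiny"])` yields the single line `RewriteRule ^DESTINY$ /destiny [R=301,L]`; the lowercase entry is skipped so the output never contains a rule that points at itself.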
-
Hello John,
If you can provide us with a URL, we might be able to dig in and see what is going on. Without it, it's almost impossible to tell. Also, it doesn't matter whether you have a decent number of inbound links; duplicate content only refers to pages with similar content.
I'm not familiar with the Magento platform, so this is just a guess: when you originally created (or imported) pages or categories in Magento, were they lowercased? If not, it's possible that Magento added them in all caps and is now forcing them to lowercase, so you might have duplicates. Once again, this is just a guess, and without a URL to your site I doubt anyone will be able to help you further.
-
www.url.com/abc and www.url.com/ABC are two completely different pages according to Google.
I would redirect any and all pages with capitals to the corresponding lowercase URLs.
Don't worry about the link juice, as it will pass over via the redirect. It will also be much better than having two identical pages competing with each other (according to Google).
Greg