Capitals in url creates duplicate content?
-
Hey Guys,
I had a quick look around however I couldn't find a specific answer to this.
Currently, the SEOmoz tools come back and show a heap of duplicate content on my site. And there's a fair bit of it.
However, a heap of those errors are relating to random capitals in the urls.
for example.
"www.website.com.au/Home/information/Stuff" is being treated as duplicate content of "www.website.com.au/home/information/stuff" (Note the difference in capitals).
Anyone have any recommendations as to how to fix this server side(keeping in mind it's not practical or possible to fix all of these links) or to tell Google to ignore the capitalisation?
Any help is greatly appreciated.
LM.
-
The IIS url-rewrite addon works great!
-
From my memory Google does treat urls as case sensitive.
Best to keep al urls as lower case.
-
Thanks for your reply Alan!
Bing is irrelevant in Belgium Maybe marketshare of 0,00005 or so
When I look at the SEOMoz crawling reports I panic, but when I look at GWT, I'm happy... The difference is huge.
So, no sure I will keep on using these reports..
-
I don't know that Google does ignore it. anyhow Bing does not http://perthseocompany.com.au/seo/reports/violation/the-page-contains-multiple-canonical-formats
-
If Google ignores the mixed usage of capitals in URL's, then why is the SEOMoz reporting it? If it is irrelevant, why not leaving it out?? It takes quite some work to filter out the irrelevant stuff!
-
Thanks Semil - The same duplicates are not showing in Google Webmaster Tools, for instance SEOMoz is showing 639 duplicate page content and 646 duplicate page titles. Webmaster tools is 88 and 37 respectively.
Looking into the numbers in SEOmoz again (and they've risen since the original post) there's a huge number which fall under the capitalisation discussed but also some which seem to register as HTTPS and HTTP.
-
Thanks Alan - I'll get on this...
-
Yes its seen as too different urls
http://perthseocompany.com.au/seo/reports/violation/the-page-contains-multiple-canonical-formats
If you are uisng a windows server (IIS), you can fix this easy by using the IIS url-rewrite addon. it had a rewite as lowercase preset
-
Google does count this as duplicate content. Semil is right. You want to have someone do url rewrites on the server side to 301 these to lowercase.
-
Hi LucasM,
Yes its possible by server side that you cant open a url with capital letters if you are using small letters.
But I dont think google will talke capitalisation in consideration.
Is it showing you in Google webmaster tool in duplicate titles and duplicate descriptions ?
If its showing then ask your coder to play with .htaccess to stop opening a url with different small - capital letter combination.
Thanks,
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Penalty for duplicate content on the same website?
Is it possible to get a penalty for duplicate content on the same website? I have a old custom-built site with a large number of filters that are pre-generated for speed. Basically the only difference is the meta title and H1 tag, with a few text differences here and there. Obviously I could no-follow all the filter links but it would take an enormous amount of work. The site is performing well in the search. I'm trying to decide whether if there is a risk of a penalty, if not I'm loath to do anything in case it causes other issues.
Intermediate & Advanced SEO | | seoman100 -
Questions about duplicate photo content?
I know that Google is a mystery, so I am not sure if there are answers to these questions, but I'm going to ask anyway! I recently realized that Google is not happy with duplicate photo content. I'm a photographer and have sold many photos in the past (but retained the rights for) that I am now using on my site. My recent revelations means that I'm now taking down all of these photos. So I've been reverse image searching all of my photos to see if I let anyone else use it first, and in the course of this I found out that there are many of my photos being used by other sites on the web. So my questions are: With photos that I used first and others have stolen, If I edit these photos (to add copyright info) and then re-upload them, will the sites that are using these images then get credit for using the original image first? If I have a photo on another one of my own sites and I take it down, can I safely use that photo on my main site, or will Google retain the knowledge that it's been used somewhere else first? If I sold a photo and it's being used on another site, can I safely use a different photo from the same series that is almost exactly the same? I am unclear what data from the photo Google is matching, and if they can tell the difference between photos that were taken a few seconds apart.
Intermediate & Advanced SEO | | Lina5000 -
Robots.txt & Duplicate Content
In reviewing my crawl results I have 5666 pages of duplicate content. I believe this is because many of the indexed pages are just different ways to get to the same content. There is one primary culprit. It's a series of URL's related to CatalogSearch - for example; http://www.careerbags.com/catalogsearch/result/index/?q=Mobile I have 10074 of those links indexed according to my MOZ crawl. Of those 5349 are tagged as duplicate content. Another 4725 are not. Here are some additional sample links: http://www.careerbags.com/catalogsearch/result/index/?dir=desc&order=relevance&p=2&q=Amy
Intermediate & Advanced SEO | | Careerbags
http://www.careerbags.com/catalogsearch/result/index/?color=28&q=bellemonde
http://www.careerbags.com/catalogsearch/result/index/?cat=9&color=241&dir=asc&order=relevance&q=baggallini All of these links are just different ways of searching through our product catalog. My question is should we disallow - catalogsearch via the robots file? Are these links doing more harm than good?0 -
Need help with huge spike in duplicate content and page title errors.
Hi Mozzers, I come asking for help. I've had a client who's reported a staggering increase in errors of over 18,000! The errors include duplicate content and page titles. I think I've found the culprit and it's the News & Events calender on the following page: http://www.newmanshs.wa.edu.au/news-events/events/07-2013 Essentially each day of the week is an individual link, and events stretching over a few days get reported as duplicate content. Do you have any ideas how to fix this issue? Any help is much appreciated. Cheers
Intermediate & Advanced SEO | | bamcreative0 -
Duplicate Content and Titles
Hi Mozzers, I saw a considerable amount of duplicate content and page titles on our clients website. We are just implementing a fix in the CMS to make sure that these are all fixed. What changes do you think I could see in terms of rankings?
Intermediate & Advanced SEO | | KarlBantleman0 -
Duplicate content on subdomains.
Hi Mozer's, I have a site www.xyz.com and also geo targeted sub domains www.uk.xyz.com, www.india.xyz.com and so on. All the sub domains have the content which is same as the content on the main domain that is www.xyz.com. So, I want to know how can i avoid content duplication. Many Thanks!
Intermediate & Advanced SEO | | HiteshBharucha0 -
Duplicate blog content and NOINDEX
Suppose the "Home" page of your blog at www.example.com/domain/ displays your 10 most recent posts. Each post has its own permalink page (where you have comments/discussion, etc.). This obviously means that the last 10 posts show up as duplicates on your site. Is it good practice to use NOINDEX, FOLLOW on the blog root page (blog/) so that only one copy gets indexed? Thanks, Akira
Intermediate & Advanced SEO | | ahirai0 -
301 redirect for duplicate content
Hey, I have just started working on a site which is a video based city guide, with promotional videos for restaurants, bars, activities,etc. The first thing that I have noticed is that every video on the site has two possible urls:- http://www.domain.com/venue.php?url=rosemarino
Intermediate & Advanced SEO | | AdeLewis
http://www.domain.com/venue/rosemarino I know that I can write a .htaccess line to redirect one to the other:- redirect 301 /venue.php?url=rosemarino http://www.domain.com/venue/rosemarino but this would involve creating a .htaccess line for every video on the site and new videos that get added may get missed. Does anyone know a way of creating a rule to rewrite these urls? Any help would be most gratefully received. Thanks. Ade.0