Do capitals in URLs create duplicate content?
-
Hey Guys,
I had a quick look around, but I couldn't find a specific answer to this.
Currently, the SEOmoz tools report a heap of duplicate content on my site.
However, a lot of those errors relate to random capitals in the URLs.
For example:
"www.website.com.au/Home/information/Stuff" is being treated as duplicate content of "www.website.com.au/home/information/stuff" (Note the difference in capitals).
Does anyone have any recommendations on how to fix this server side (keeping in mind it's not practical or possible to fix all of these links), or how to tell Google to ignore the capitalisation?
Any help is greatly appreciated.
LM.
-
The IIS url-rewrite addon works great!
-
From memory, Google does treat URLs as case sensitive.
Best to keep all URLs in lower case.
-
Thanks for your reply Alan!
Bing is irrelevant in Belgium, with a market share of maybe 0,00005 or so.
When I look at the SEOMoz crawling reports I panic, but when I look at GWT, I'm happy... The difference is huge.
So I'm not sure I will keep on using these reports.
-
I don't know that Google does ignore it. Anyhow, Bing does not: http://perthseocompany.com.au/seo/reports/violation/the-page-contains-multiple-canonical-formats
-
If Google ignores the mixed usage of capitals in URLs, then why is SEOMoz reporting it? If it is irrelevant, why not leave it out? It takes quite some work to filter out the irrelevant stuff!
-
Thanks Semil - The same duplicates are not showing in Google Webmaster Tools. For instance, SEOMoz is showing 639 duplicate page content errors and 646 duplicate page titles. Webmaster Tools shows 88 and 37 respectively.
Looking into the numbers in SEOmoz again (and they've risen since the original post), a huge number fall under the capitalisation issue discussed, but some also seem to be the same page registered under both HTTPS and HTTP.
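
To see how much of a report like this comes from capitalisation or HTTP/HTTPS variants, the crawled URLs can be grouped by a normalised key. The following is only a rough sketch in Python: the function names `normalise` and `group_variants` are made up for illustration, and it assumes you have exported the crawled URLs as a plain list of strings.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def normalise(url: str) -> str:
    """Build a comparison key: drop the scheme, lowercase host and path.

    The query string is kept as-is, since its values may be case sensitive.
    """
    parts = urlsplit(url)
    key = parts.netloc.lower() + parts.path.lower()
    if parts.query:
        key += "?" + parts.query
    return key


def group_variants(urls):
    """Return only the keys that are reached by more than one distinct URL."""
    groups = defaultdict(set)
    for url in urls:
        groups[normalise(url)].add(url)
    return {key: sorted(variants) for key, variants in groups.items() if len(variants) > 1}
```

Each group with more than one entry is a set of URLs that differ only by capitalisation or scheme, so those are the candidates for a redirect or canonical tag rather than genuinely duplicated pages.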
-
Thanks Alan - I'll get on this...
-
Yes, it's seen as two different URLs.
http://perthseocompany.com.au/seo/reports/violation/the-page-contains-multiple-canonical-formats
If you are using a Windows server (IIS), you can fix this easily with the IIS url-rewrite addon. It has a rewrite-as-lowercase preset.
-
Google does count this as duplicate content. Semil is right. You want to have someone do url rewrites on the server side to 301 these to lowercase.
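
The idea of 301-redirecting mixed-case URLs to lowercase can be sketched independently of the server. The example below is a minimal illustration in Python, assuming a WSGI application; the thread itself is about IIS and .htaccess, so treat this as a picture of the logic rather than the configuration for those servers. The name `lowercase_redirect` is hypothetical.

```python
def lowercase_redirect(app):
    """Wrap a WSGI app so any path containing capitals gets a 301 to lowercase."""

    def middleware(environ, start_response):
        path = environ.get("PATH_INFO", "")
        lowered = path.lower()
        if path != lowered:
            # Keep the query string untouched; only the path is lowercased.
            query = environ.get("QUERY_STRING", "")
            location = lowered + ("?" + query if query else "")
            start_response("301 Moved Permanently", [("Location", location)])
            return [b""]
        # Already lowercase: hand the request to the wrapped application.
        return app(environ, start_response)

    return middleware
```

With this in place, a request for /Home/information/Stuff would be answered with a permanent redirect to /home/information/stuff, so crawlers only ever index the lowercase version. It only makes sense if every real page on the site is reachable at its lowercase URL.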
-
Hi LucasM,
Yes, it's possible on the server side to stop a URL from opening with capital letters if your URLs use lowercase letters.
But I don't think Google will take capitalisation into consideration.
Is it showing up in Google Webmaster Tools under duplicate titles and duplicate descriptions?
If it is showing, then ask your coder to use .htaccess to stop URLs from opening with a different combination of lowercase and capital letters.
Thanks,