Capitals in URLs create duplicate content?
-
Hey guys,
I had a quick look around, but I couldn't find a specific answer to this.
Currently, the SEOmoz tools come back and show a heap of duplicate content on my site, and there's a fair bit of it.
However, a heap of those errors relate to random capitals in the URLs.
For example:
"www.website.com.au/Home/information/Stuff" is being treated as duplicate content of "www.website.com.au/home/information/stuff" (note the difference in capitalisation).
Does anyone have recommendations on how to fix this server-side (keeping in mind it's not practical or possible to fix all of these links by hand), or on how to tell Google to ignore the capitalisation?
Any help is greatly appreciated.
LM.
-
The IIS URL Rewrite add-on works great!
-
From memory, Google does treat URLs as case-sensitive.
Best to keep all URLs lowercase.
-
Thanks for your reply, Alan!
Bing is irrelevant in Belgium; maybe a market share of 0.00005 or so.
When I look at the SEOmoz crawling reports I panic, but when I look at GWT I'm happy... the difference is huge.
So, I'm not sure I will keep on using these reports.
-
I don't know that Google does ignore it. In any case, Bing does not: http://perthseocompany.com.au/seo/reports/violation/the-page-contains-multiple-canonical-formats
-
If Google ignores the mixed usage of capitals in URLs, then why is SEOmoz reporting it? If it is irrelevant, why not leave it out? It takes quite some work to filter out the irrelevant stuff!
-
Thanks Semil - the same duplicates are not showing in Google Webmaster Tools. For instance, SEOmoz is showing 639 duplicate page content errors and 646 duplicate page titles, while Webmaster Tools shows 88 and 37 respectively.
Looking into the numbers in SEOmoz again (they've risen since the original post), a huge number fall under the capitalisation issue discussed above, but some also seem to be flagged because the same page is reachable over both HTTPS and HTTP.
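For the HTTP/HTTPS pairs, one common fix is a rel=canonical tag served on both protocol versions, pointing at whichever one you want indexed. A minimal sketch, reusing the example domain from the question rather than a real page:

    <!-- Served identically on the http:// and https:// versions of the page; -->
    <!-- tells crawlers which URL is the preferred, canonical one. -->
    <link rel="canonical" href="http://www.website.com.au/home/information/stuff" />

A server-side 301 from one protocol to the other consolidates the duplicates just as well, and is usually the cleaner option when the second version serves no purpose.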
-
Thanks Alan - I'll get on this...
-
Yes, it's seen as two different URLs.
http://perthseocompany.com.au/seo/reports/violation/the-page-contains-multiple-canonical-formats
If you are using a Windows server (IIS), you can fix this easily with the IIS URL Rewrite add-on; it has a preset that rewrites URLs as lowercase.
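For reference, that kind of lowercase rule typically ends up in web.config looking something like this (a sketch; the rule name is arbitrary, and it assumes the URL Rewrite module is installed):

    <!-- Inside <system.webServer> in web.config. -->
    <rewrite>
      <rules>
        <!-- Match any URL containing an uppercase letter (case-sensitive match) -->
        <!-- and 301-redirect it to the all-lowercase equivalent. -->
        <rule name="Enforce lowercase" stopProcessing="true">
          <match url="[A-Z]" ignoreCase="false" />
          <action type="Redirect" url="{ToLower:{URL}}" redirectType="Permanent" />
        </rule>
      </rules>
    </rewrite>

Worth excluding any genuinely case-sensitive paths (file downloads, third-party callbacks) before switching this on site-wide.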
-
Google does count this as duplicate content; Semil is right. You want to have someone set up URL rewrites on the server side to 301 these to lowercase.
-
Hi LucasM,
Yes, it's possible server-side to stop URLs with capital letters from opening (or, better, to redirect them) when the lowercase versions are the ones you use.
But I don't think Google will take capitalisation into consideration.
Is Google Webmaster Tools showing these as duplicate titles and duplicate descriptions?
If it is, ask your developer to set up .htaccess rules so the same page stops opening under different lower/uppercase combinations.
Thanks,
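On Apache, a minimal sketch of such a rule looks like the following. One caveat: the RewriteMap line has to live in the main server or virtual-host config, not in .htaccess itself, so this is easiest to set up entirely at the vhost level:

    # Define a map using mod_rewrite's built-in internal "tolower" function.
    # (Valid only in server/vhost config, not in .htaccess.)
    RewriteMap lowercase int:tolower

    RewriteEngine On
    # If the requested path contains any uppercase letter...
    RewriteCond %{REQUEST_URI} [A-Z]
    # ...301-redirect to the all-lowercase version of the same path.
    RewriteRule (.*) ${lowercase:$1} [R=301,L]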