Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Capitals in url creates duplicate content?
-
Hey Guys,
I had a quick look around however I couldn't find a specific answer to this.
Currently, the SEOmoz tools come back and show a heap of duplicate content on my site. And there's a fair bit of it.
However, a heap of those errors are relating to random capitals in the urls.
for example.
"www.website.com.au/Home/information/Stuff" is being treated as duplicate content of "www.website.com.au/home/information/stuff" (Note the difference in capitals).
Anyone have any recommendations as to how to fix this server side(keeping in mind it's not practical or possible to fix all of these links) or to tell Google to ignore the capitalisation?
Any help is greatly appreciated.
LM.
-
The IIS url-rewrite addon works great!
-
From my memory Google does treat urls as case sensitive.
Best to keep al urls as lower case.
-
Thanks for your reply Alan!
Bing is irrelevant in Belgium
Maybe marketshare of 0,00005 or so
When I look at the SEOMoz crawling reports I panic, but when I look at GWT, I'm happy... The difference is huge.
So, no sure I will keep on using these reports..
-
I don't know that Google does ignore it. anyhow Bing does not http://perthseocompany.com.au/seo/reports/violation/the-page-contains-multiple-canonical-formats
-
If Google ignores the mixed usage of capitals in URL's, then why is the SEOMoz reporting it? If it is irrelevant, why not leaving it out?? It takes quite some work to filter out the irrelevant stuff!
-
Thanks Semil - The same duplicates are not showing in Google Webmaster Tools, for instance SEOMoz is showing 639 duplicate page content and 646 duplicate page titles. Webmaster tools is 88 and 37 respectively.
Looking into the numbers in SEOmoz again (and they've risen since the original post) there's a huge number which fall under the capitalisation discussed but also some which seem to register as HTTPS and HTTP.
-
Thanks Alan - I'll get on this...
-
Yes its seen as too different urls
http://perthseocompany.com.au/seo/reports/violation/the-page-contains-multiple-canonical-formats
If you are uisng a windows server (IIS), you can fix this easy by using the IIS url-rewrite addon. it had a rewite as lowercase preset
-
Google does count this as duplicate content. Semil is right. You want to have someone do url rewrites on the server side to 301 these to lowercase.
-
Hi LucasM,
Yes its possible by server side that you cant open a url with capital letters if you are using small letters.
But I dont think google will talke capitalisation in consideration.
Is it showing you in Google webmaster tool in duplicate titles and duplicate descriptions ?
If its showing then ask your coder to play with .htaccess to stop opening a url with different small - capital letter combination.
Thanks,
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Duplicate content in Shopify - subsequent pages in collections
Hello everyone! I hope an expert in this community can help me verify the canonical codes I'll add to our store is correct. Currently, in our Shopify store, the subsequent pages in the collections are not indexed by Google, however the canonical URL on these pages aren't pointing to the main collection page (page 1), e.g. The canonical URL of page 2, page 3 etc are used as canonical URLs instead of the first page of the collections. I have the canonical codes attached below, it would be much appreciated if an expert can urgently verify these codes are good to use and will solve the above issues? Thanks so much for your kind help in advance!! -----------------CODES BELOW--------------- <title><br /> {{ page_title }}{% if current_tags %} – tagged "{{ current_tags | join: ', ' }}"{% endif %}{% if current_page != 1 %} – Page {{ current_page }}{% endif %}{% unless page_title contains shop.name %} – {{ shop.name }}{% endunless %}<br /></title>
Intermediate & Advanced SEO | | ycnetpro101
{% if page_description %} {% endif %} {% if current_page != 1 %} {% else %} {% endif %}
{% if template == 'collection' %}{% if collection %}
{% if current_page == 1 %} {% endif %}
{% if template == 'product' %}{% if product %} {% endif %}
{% if template == 'collection' %}{% if collection %} {% endif %}0 -
Duplicate content on URL trailing slash
Hello, Some time ago, we accidentally made changes to our site which modified the way urls in links are generated. At once, trailing slashes were added to many urls (only in links). Links that used to send to
Intermediate & Advanced SEO | | yacpro13
example.com/webpage.html Were now linking to
example.com/webpage.html/ Urls in the xml sitemap remained unchanged (no trailing slash). We started noticing duplicate content (because our site renders the same page with or without the trailing shash). We corrected the problematic php url function so that now, all links on the site link to a url without trailing slash. However, Google had time to index these pages. Is implementing 301 redirects required in this case?1 -
Duplicate content on recruitment website
Hi everyone, It seems that Panda 4.2 has hit some industries more than others. I just started working on a website, that has no manual action, but the organic traffic has dropped massively in the last few months. Their external linking profile seems to be fine, but I suspect usability issues, especially the duplication may be the reason. The website is a recruitment website in a specific industry only. However, they posts jobs for their clients, that can be very similar, and in the same time they can have 20 jobs with the same title and very similar job descriptions. The website currently have over 200 pages with potential duplicate content. Additionally, these jobs get posted on job portals, with the same content (Happens automatically through a feed). The questions here are: How bad would this be for the website usability, and would it be the reason the traffic went down? Is this the affect of Panda 4.2 that is still rolling What can be done to resolve these issues? Thank you in advance.
Intermediate & Advanced SEO | | iQi0 -
Removing duplicate content
Due to URL changes and parameters on our ecommerce sites, we have a massive amount of duplicate pages indexed by google, sometimes up to 5 duplicate pages with different URLs. 1. We've instituted canonical tags site wide. 2. We are using the parameters function in Webmaster Tools. 3. We are using 301 redirects on all of the obsolete URLs 4. I have had many of the pages fetched so that Google can see and index the 301s and canonicals. 5. I created HTML sitemaps with the duplicate URLs, and had Google fetch and index the sitemap so that the dupes would get crawled and deindexed. None of these seems to be terribly effective. Google is indexing pages with parameters in spite of the parameter (clicksource) being called out in GWT. Pages with obsolete URLs are indexed in spite of them having 301 redirects. Google also appears to be ignoring many of our canonical tags as well, despite the pages being identical. Any ideas on how to clean up the mess?
Intermediate & Advanced SEO | | AMHC0 -
International SEO - cannibalisation and duplicate content
Hello all, I look after (in house) 3 domains for one niche travel business across three TLDs: .com .com.au and co.uk and a fourth domain on a co.nz TLD which was recently removed from Googles index. Symptoms: For the past 12 months we have been experiencing canibalisation in the SERPs (namely .com.au being rendered in .com) and Panda related ranking devaluations between our .com site and com.au site. Around 12 months ago the .com TLD was hit hard (80% drop in target KWs) by Panda (probably) and we began to action the below changes. Around 6 weeks ago our .com TLD saw big overnight increases in rankings (to date a 70% averaged increase). However, almost to the same percentage we saw in the .com TLD we suffered significant drops in our .com.au rankings. Basically Google seemed to switch its attention from .com TLD to the .com.au TLD. Note: Each TLD is over 6 years old, we've never proactively gone after links (Penguin) and have always aimed for quality in an often spammy industry. **Have done: ** Adding HREF LANG markup to all pages on all domain Each TLD uses local vernacular e.g for the .com site is American Each TLD has pricing in the regional currency Each TLD has details of the respective local offices, the copy references the lacation, we have significant press coverage in each country like The Guardian for our .co.uk site and Sydney Morning Herlad for our Australia site Targeting each site to its respective market in WMT Each TLDs core-pages (within 3 clicks of the primary nav) are 100% unique We're continuing to re-write and publish unique content to each TLD on a weekly basis As the .co.nz site drove such little traffic re-wrting we added no-idex and the TLD has almost compelte dissapread (16% of pages remain) from the SERPs. XML sitemaps Google + profile for each TLD **Have not done: ** Hosted each TLD on a local server Around 600 pages per TLD are duplicated across all TLDs (roughly 50% of all content). These are way down the IA but still duplicated. Images/video sources from local servers Added address and contact details using SCHEMA markup Any help, advice or just validation on this subject would be appreciated! Kian
Intermediate & Advanced SEO | | team_tic1 -
Duplicate content on subdomains.
Hi Mozer's, I have a site www.xyz.com and also geo targeted sub domains www.uk.xyz.com, www.india.xyz.com and so on. All the sub domains have the content which is same as the content on the main domain that is www.xyz.com. So, I want to know how can i avoid content duplication. Many Thanks!
Intermediate & Advanced SEO | | HiteshBharucha0 -
Is SEOmoz.org creating duplicate content with their CDN subdomain?
Example URL: http://cdn.seomoz.org/q/help-with-getting-no-conversions Canonical is a RELATIVE link, should be an absolute link pointing to main domain: http://www.seomoz.org/q/help-with-getting-no-conversions <link href='[/q/help-with-getting-no-conversions](view-source:http://cdn.seomoz.org/q/help-with-getting-no-conversions)' rel='<a class="attribute-value">canonical</a>' /> 13,400 pages indexed in Google under cdn subdomain go to google > site:http://cdn.seomoz.org https://www.google.com/#hl=en&output=search&sclient=psy-ab&q=site:http%3A%2F%2Fcdn.seomoz.org%2F&oq=site:http%3A%2F%2Fcdn.seomoz.org%2F&gs_l=hp.2...986.6227.0.6258.28.14.0.0.0.5.344.3526.2-10j2.12.0.les%3B..0.0...1c.Uprw7ko7jnU&pbx=1&bav=on.2,or.r_gc.r_pw.r_cp.r_qf.&fp=97577626a0fb6a97&biw=1920&bih=936
Intermediate & Advanced SEO | | irvingw1 -
Duplicate Content on Press Release?
Hi, We recently held a charity night in store. And had a few local celebs turn up etc... We created a press release to send out to various media outlets, within the press release were hyperlinks to our site and links on certain keywords to specific brands on our site. My question is, should we be sending a different press release to each outlet to stop the duplicate content thing, or is sending the same release out to everyone ok? We will be sending approx 20 of these out, some going online and some not. So far had one local paper website, a massive football website and a local magazine site. All pretty much same content and a few pics. Any help, hints or tips on how to go about this if I am going to be sending out to a load of other sites/blogs? Cheers
Intermediate & Advanced SEO | | YNWA0