Do capitals in a URL create duplicate content?
-
Hey Guys,
I had a quick look around but couldn't find a specific answer to this.
Currently, the SEOmoz tools come back showing a heap of duplicate content on my site, and there's a fair bit of it.
However, a heap of those errors relate to random capitals in the URLs.
For example:
"www.website.com.au/Home/information/Stuff" is being treated as duplicate content of "www.website.com.au/home/information/stuff" (note the difference in capitals).
Does anyone have any recommendations on how to fix this server side (keeping in mind it's not practical or possible to fix all of these links), or how to tell Google to ignore the capitalisation?
Any help is greatly appreciated.
LM.
-
The IIS url-rewrite addon works great!
-
From memory, Google does treat URLs as case sensitive.
Best to keep all URLs lowercase.
-
Thanks for your reply Alan!
Bing is irrelevant in Belgium, with a market share of maybe 0,00005 or so.
When I look at the SEOmoz crawling reports I panic, but when I look at GWT, I'm happy... The difference is huge.
So I'm not sure I will keep on using these reports.
-
I don't know that Google does ignore it. Anyhow, Bing does not: http://perthseocompany.com.au/seo/reports/violation/the-page-contains-multiple-canonical-formats
-
If Google ignores the mixed usage of capitals in URLs, then why is SEOmoz reporting it? If it is irrelevant, why not leave it out? It takes quite some work to filter out the irrelevant stuff!
-
Thanks Semil - the same duplicates are not showing in Google Webmaster Tools. For instance, SEOmoz is showing 639 duplicate page content errors and 646 duplicate page titles, while Webmaster Tools shows 88 and 37 respectively.
Looking into the numbers in SEOmoz again (and they've risen since the original post), a huge number fall under the capitalisation issue discussed, but there are also some where the same page seems to register under both HTTPS and HTTP.
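To see how many of the reported duplicates are only capitalisation or HTTP/HTTPS variants, something like the rough sketch below could be run over an export of the crawled URLs. The function name `group_variants` and the sample URLs are made up for illustration, and it assumes every URL in the export includes its scheme (http:// or https://).

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_variants(urls):
    """Group URLs that differ only by letter case or by http vs https.

    Hypothetical helper: assumes each URL includes its scheme.
    """
    groups = defaultdict(list)
    for url in urls:
        parts = urlsplit(url)
        # The key ignores the scheme and the case of host and path.
        # The query string is kept as-is, since parameter values can be
        # case sensitive.
        key = (parts.netloc.lower(), parts.path.lower(), parts.query)
        groups[key].append(url)
    # Only groups with more than one member are duplicate candidates.
    return [sorted(members) for members in groups.values() if len(members) > 1]


crawled = [
    "http://www.website.com.au/Home/information/Stuff",
    "http://www.website.com.au/home/information/stuff",
    "https://www.website.com.au/home/information/stuff",
    "http://www.website.com.au/about",
]
for group in group_variants(crawled):
    print(group)
```

Any group it prints is a set of addresses that would collapse into one page once lowercase and single-scheme redirects are in place, so those can be separated from the duplicates that need a different fix.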
-
Thanks Alan - I'll get on this...
-
Yes, it's seen as two different URLs.
http://perthseocompany.com.au/seo/reports/violation/the-page-contains-multiple-canonical-formats
If you are using a Windows server (IIS), you can fix this easily with the IIS url-rewrite addon. It has a "rewrite as lowercase" preset.
-
Google does count this as duplicate content. Semil is right. You want to have someone set up URL rewrites on the server side to 301 redirect these to lowercase.
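As a rough illustration of what such a rewrite rule does, here is a minimal Python sketch of the decision logic. It is not the IIS or .htaccess configuration itself, and the function name `lowercase_redirect` is hypothetical. It lowercases only the host and path and leaves the query string alone, on the assumption that query parameter values may be case sensitive.

```python
from urllib.parse import urlsplit, urlunsplit


def lowercase_redirect(url):
    """Return (301, target) if the path contains capitals, else None.

    Hypothetical helper showing the logic a server-side lowercase
    rewrite rule would apply to each incoming request.
    """
    parts = urlsplit(url)
    lowered_path = parts.path.lower()
    if lowered_path == parts.path:
        # Already lowercase: serve the page normally, no redirect.
        return None
    target = urlunsplit(
        (parts.scheme, parts.netloc.lower(), lowered_path,
         parts.query, parts.fragment)
    )
    return 301, target


print(lowercase_redirect("http://www.website.com.au/Home/information/Stuff"))
print(lowercase_redirect("http://www.website.com.au/home/information/stuff"))
```

Because the redirect is a 301, the mixed-case links that can't be fixed by hand still end up pointing at the single lowercase address.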
-
Hi LucasM,
Yes, it's possible on the server side to stop a URL from opening with capital letters if you are using lowercase letters.
But I don't think Google will take capitalisation into consideration.
Is it showing in Google Webmaster Tools under duplicate titles and duplicate descriptions?
If it is showing, then ask your coder to use .htaccess to stop URLs from opening with different combinations of lowercase and capital letters.
Thanks,