Duplicate content
-
I have just run a report in SEOmoz on my domain and noticed that there are duplicate content issues. The affected URLs are:
www.domainname/directory-name/
www.domainname/directory-name/index.php
All my internal and external links point to the first URL, as I prefer this style because it looks clear and concise. However, the site itself has an index.php page inside /directory-name/ to show the page, so both URLs serve the same content and are reported as duplicates.
Could anyone give me some advice on what I should do, please?
Kind Regards
-
Hi Gary,
Here's some code from an htaccess file I've used before that solves the issue you've got with index.php at the end of all your urls:
# Remove /index.php and ensure admin works okay
RewriteCond %{REQUEST_URI} !^/administrator
RewriteCond %{THE_REQUEST} ^.*/index\.php\ HTTP/
RewriteRule ^(.*)index\.php$ /$1 [R=301,L]
Notice the line that contains ^/administrator. In Joomla, the admin login is usually at http://site.com/administrator/index.php, so removing index.php from the admin URL would prevent any access to the admin screens! If your CMS has a similar URL, be sure to replace 'administrator' with the relevant path.
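For a CMS other than Joomla, the rule above might be adapted along the following lines. This is only a sketch under assumptions: "/cms-admin" is a made-up placeholder for whatever path your CMS back end uses, and the .htaccess file is assumed to live in the site root. It also limits the redirect to GET and HEAD requests, since a 301 on a form POST to index.php would typically drop the submitted data.

```apache
RewriteEngine On

# Leave the CMS back end alone. "/cms-admin" is a hypothetical placeholder;
# replace it with the real admin path of your CMS.
RewriteCond %{REQUEST_URI} !^/cms-admin

# Act only when the browser literally requested .../index.php with GET or HEAD.
# THE_REQUEST holds the original request line, so internal lookups of
# index.php for a directory URL do not trigger the redirect.
RewriteCond %{THE_REQUEST} ^(GET|HEAD)\s/([^?\s]*/)?index\.php[?\s]

# Redirect /any/directory/index.php to /any/directory/ (and /index.php to /).
RewriteRule ^(.*/)?index\.php$ /$1 [R=301,L]
```

Because the pattern captures the directory part, one rule should cover every directory on the site, so there is no need to list pages individually.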
-
Hi Ade,
Thanks for all your help.
I will post a new question on the Q&A Forum regarding the .htaccess rule.
Kind Regards
-
Hi Gary,
That one is a bit beyond me I'm afraid and I am not familiar with WebEdition at all.
With most CMS there are normally either built-in or add-on extensions to help with re-writing your urls but you need to be really careful that you don't end up with a completely new set of urls that don't match either of your originals.
A .htaccess rewrite rule may be your best option but I don't know what the coding for it would be.
-
We are using a CMS called WebEdition. Is there a technical question I should ask them about what I need to do?
Kind Regards
-
Ahhhh. No, definitely not practical. I thought that it was just the one URL.
Are you using a content management system for your site such as Joomla?
-
Do you think it's practical to do that?
I would need to do a 301 on literally every page if I don't want to show the /index.php.
Is this what the seomoz.org website does? For example:
-
In that case you can just add a 301 redirect to your .htaccess file, below the code you added earlier.
redirect 301 /football-teams/index.php http://www.mydomain.com/football-teams/
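One caveat worth checking: when index.php is also the directory index, a plain redirect like this can end up in a redirect loop on some setups, because Apache looks up index.php internally when serving /football-teams/. A mod_rewrite version that only fires when the visitor literally asked for index.php is sketched below. It assumes the .htaccess file is in the site root and that mod_rewrite is enabled.

```apache
RewriteEngine On

# Redirect only when the original request line contains the index.php URL,
# not when Apache serves index.php internally for /football-teams/
RewriteCond %{THE_REQUEST} \s/football-teams/index\.php[?\s]
RewriteRule ^football-teams/index\.php$ http://www.mydomain.com/football-teams/ [R=301,L]
```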
-
Hi Ade,
Yes, I tested http://www.mydomain.com/football-teams//index.php, but it did not resolve to http://www.mydomain.com/football-teams/
Any ideas?
-
Hi Gary.
Have you tried visiting the url http://www.mydomain.com/football-teams/index.php to see if it now resolves to http://www.mydomain.com/football-teams/ ?
If it does, then the issue is fixed, and the error will disappear the next time SEOmoz crawls your site.
Cheers.
Ade.
-
Hi Ade,
Thanks for the speedy reply.
I have now implemented this and it works fantastically on http://www.mydomain.com/
Thank you very much.
There is another issue, however. I hope I can make sense here, so here goes:
The SEOmoz tool reports duplicate content on both of these URLs:
http://www.mydomain.com/football-teams/
http://www.mydomain.com/football-teams/index.php
I want to use http://www.mydomain.com/football-teams/ as this just looks nice and clean.
What would be best practice to fix this issue?
Kind Regards
-
Hey Gary.
Here's the solution that I use.
All my sites are hosted on a Linux server, so this won't be relevant if your site is hosted on a Windows server.
1. Create/modify your .htaccess file in your site's root directory.
2. Add the following code to the top of the file:
RewriteEngine On
RewriteBase /
# Redirect a direct request for /index.php to the site root
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.php\ HTTP/
RewriteRule ^index\.php$ http://www.yourdomain.com/ [R=301,L]
# Send requests that are not real files or directories to index.php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
# Redirect the non-www host to the www host
RewriteCond %{HTTP_HOST} ^yourdomain\.com [NC]
RewriteRule ^(.*)$ http://www.yourdomain.com/$1 [L,R=301]
This will ensure that any requests sent to http://yourdomain.com are redirected to http://www.yourdomain.com and that the index.php part of the URL is removed.
If you need more help on creating or modifying your .htaccess file then you can find more info here - http://httpd.apache.org/docs/1.3/howto/htaccess.html
All the best.
Ade.
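One thing that may be worth checking with the rules above is their order. As far as I can tell, a request on the non-www host for a path that is not a real file reaches the internal rewrite to /index.php before the host redirect, so the visitor could be sent to http://www.yourdomain.com/index.php instead of the page they asked for. A reordered sketch that puts the external redirects first is shown below. It uses the same placeholder domain, and the final block is only an assumption for CMSs that route every request through a single index.php; a site whose directories each contain their own index.php probably does not need it.

```apache
RewriteEngine On
RewriteBase /

# 1. Canonical host first: non-www to www (external 301)
RewriteCond %{HTTP_HOST} ^yourdomain\.com$ [NC]
RewriteRule ^(.*)$ http://www.yourdomain.com/$1 [R=301,L]

# 2. Strip a literally requested /index.php at the site root (external 301)
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\s/index\.php[?\s]
RewriteRule ^index\.php$ http://www.yourdomain.com/ [R=301,L]

# 3. Internal front-controller rewrite last. Only keep this block if the CMS
#    expects every non-file, non-directory request to go through /index.php.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
```

Keeping redirects ahead of internal rewrites means the address shown in the browser is settled before any request is handed to the CMS.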