Duplicate content
-
I have just ran a report in seomoz on my domain and has noticed that there are duplicate content issues, the issues are:
www.domainname/directory-name/
www.domainname/directory-name/index.php
All my internal links and external links point to the first domain, as i prefer this style as it looks clear & concise, however doing this has created duplicate content as within the site itself i have an index.php page inside this /directory-name/ to show the page.
Could anyone give me some advice on what i should do please?
Kind Regards
-
Hi Gary,
Here's some code from an htaccess file I've used before that solves the issue you've got with index.php at the end of all your urls:
#remove /index.php and ensure admin works okay
RewriteCond %{REQUEST_URI} !^/administrator
RewriteCond %{THE_REQUEST} ^.*/index.php\ HTTP/
RewriteRule ^(.*)index.php$ /$1 [R=301,L]
notice the line that contains ^/administrator , in Joomla, admin login is usuall on http://site.com/administrator/index.php
so, removing the index.php from the admin url would prevent any access to the admin screens! If your cms has a similar url, be sure to replace 'administrator' with the relevant url.
-
Hi Ade,
Thanks for all your help.
I will post a new question on the Q&A Forum regarding the .htaccess rule.
Kind Regards
-
Hi Gary,
That one is a bit beyond me I'm afraid and I am not familiar with WebEdition at all.
With most CMS there are normally either built-in or add-on extensions to help with re-writing your urls but you need to be really careful that you don't end up with a completely new set of urls that don't match either of your originals.
A .htaccess rewrite rule may be your best option but I don't know what the coding for it would be.
-
We are using a CMS, its called WebEdition, is there a technical question i should ask them in what i need to do?
Kind Regards
-
Ahhhh. No definitely not practical, I thought that it was just the one url.
Are you using a content management system for your site such as Joomla?
-
Do you think that's practical to do that?
As i will need to do a 301 on literally every page if i don't want to show the /index.php
Is this what seomoz.org website does? for example:
-
In that case you can just add a 301 redirect in to your .htaccess file below the code you added earlier.
redirect 301 /football-teams/index.php http://www.mydomain.com/football-teams/
-
Hi Ade,
Yes, i tested http://www.mydomain.com/football-teams//index.php however it did not resolve to http://www.mydomain.com/football-teams/
Any ideas?
-
Hi Gary.
Have you tried visiting the url http://www.mydomain.com/football-teams/index.php to see if it now resolves to http://www.mydomain.com/football-teams/ ?
If it does then the issue is fixed, the next time SEOMoz crawls your site the error will dissapear.
Cheers.
Ade.
-
Hi Ade,
Thanks for the speedy reply.
I have now implemented this and works fantastic on the http://www.mydomain.com/
Thank you very much.
There is another issue however, i hope i can make sense here, here goes:
seomoz tool gives me back duplicate content on both these URL's
http://www.mydomain.com/football-teams/
http://www.mydomain.com/football-teams/index.php
I want to use http://www.mydomain.com/football-teams/ as this just look nice & clean.
What would be best practice to fix this issue?
Kind Regards
-
Hey Gary.
Here's the solution that I use.
All my sites are hosted on a linux server so this won't be relevant if your site is hosted on a windows server.
1. create/modify your .htaccess file in your site's root directory.
2. Add the following code to the top of the file:-
RewriteEngine On
RewriteBase /
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index.php\ HTTP/
RewriteRule ^index.php$ http://www.yourdomain.com/ [R=301,L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]RewriteCond %{HTTP_HOST} ^yourdomain.com [NC]
RewriteRule ^(.*)$ http://www.yourdomain.com/$1 [L,R=301]This will ensure that any requests sent to http://yourdomain.com are redirected to http://www.yourdomain.com and that the index.php part of the url is removed.
If you need more help on creating or modifying your .htaccess file then you can find more info here - http://httpd.apache.org/docs/1.3/howto/htaccess.html
All the best.
Ade.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Canonical Tags - Do they only apply to internal duplicate content?
Hi Moz, I've had a complaint from a company who we use a feed from to populate a restaurants product list.They are upset that on our products pages we have canonical tags linking back to ourselves. These are in place as we have international versions of the site. They believe because they are the original source of content we need to canonical back to them. Can I please confirm that canonical tags are purely an internal duplicate content strategy. Canonical isn't telling google that from all the content on the web that this is the original source. It's just saying that from the content on our domains, this is the original one that should be ranked. Is that correct? Furthermore, if we implemented a canonical tag linking to Best Restaurants it would de-index all of our restaurants listings and pages and pass the authority of these pages to their site. Is this correct? Thanks!
Technical SEO | | benj20341 -
Why are some pages now duplicate content?
It is probably a silly question, but all of a sudden, the following pages of one of my clients are reported as Duplicate content. I cannot understand why. They weren't before... http://www.ciaoitalia.nl/product/pizza-originale/mediterranea-halal
Technical SEO | | MarketingEnergy
http://www.ciaoitalia.nl/product/pizza-originale/gyros-halal
http://www.ciaoitalia.nl/product/pizza-originale/döner-halal
http://www.ciaoitalia.nl/product/pizza-originale/vegetariana
http://www.ciaoitalia.nl/product/pizza-originale/seizoen-pizza-estate
http://www.ciaoitalia.nl/product/pizza-originale/contadina
http://www.ciaoitalia.nl/product/pizza-originale/4-stagioni
http://www.ciaoitalia.nl/product/pizza-originale/shoarma Thanks for any help in the right direction 🙂 | |
| |
| |
| |
| |
| |
| |
| | <colgroup><col style="mso-width-source: userset; mso-width-alt: 17225; width: 353pt;" width="471"></colgroup>
| http://www.ciaoitalia.nl/product/pizza-originale/mediterranea-halal |
| http://www.ciaoitalia.nl/product/pizza-originale/gyros-halal |
| http://www.ciaoitalia.nl/product/pizza-originale/döner-halal |
| http://www.ciaoitalia.nl/product/pizza-originale/vegetariana |
| http://www.ciaoitalia.nl/product/pizza-originale/seizoen-pizza-estate |
| http://www.ciaoitalia.nl/product/pizza-originale/contadina |
| http://www.ciaoitalia.nl/product/pizza-originale/4-stagioni |
| http://www.ciaoitalia.nl/product/pizza-originale/shoarma |0 -
Duplicate Content - Captcha on Contact Form
I am going to be working on a site where the contact form is being flagged as duplicate content the URL is the same apart from having: /contact/10119 contact/31010 ...at the end of it. The only difference in the content of the page that I can see is the Captcha numbers? Is there a way to overcome this to stop duplicate content? Thanks in advance
Technical SEO | | J_Sinclair0 -
Duplicate content in Magento
Hi all We got some serious issues with duplicate content on a Magento site that we are marketing. For example: http://www.citcop.se/varmepumpar-luft-luft/panasonic/panasonic-nordic-ce9nke-5-0kw http://www.citcop.se/panasonic/panasonic-nordic-ce9nke-5-0kw http://www.citcop.se/panasonic-nordic-ce9nke-5-0kw All of the above seem to work just fine as it is now but since they are excatly the same product they should ofcourse do a 301 redirect to the main page. Any ideas on how to sort this out in Magnto without having to resort to manual work in .htaccess? Have a great day Fredrik
Technical SEO | | Resultify0 -
Question about duplicate content in crawl reports
Okay, this one's a doozie: My crawl report is listing all of these as separate URLs with identical duplicate content issues, even though they are all the home page and the one that is http://www.ccisolutions.com (the preferred URL) has a canonical tag of rel= http://www.ccisolutions.com: http://www.ccisolutions.com http://ccisolutions.com http://www.ccisolutions.com/StoreFront/IAFDispatcher?iafAction=showMain I will add that OSE is recognizing that there is a 301-redirect on http://ccisolutions.com, but the duplicate content report doesn't seem to recognize the redirect. Also, every single one of our 404-error pages (we have set up a custom 404 page) is being identified as having duplicate content. The duplicate content on all of them is identical. Where do I even begin sorting this out? Any suggestions on how/why this is happening? Thanks!
Technical SEO | | danatanseo1 -
URL query considered duplicate content?
I have a Magento site. In order to reduce duplicate content for products of the same style but with different colours I have combined them on to 1 product page. I would like to allow the pictures to be dynamic, i.e. allow a user to search for a colour and all the products that offer that colour appear in the results, but I dont want the default product image shown but the product image for that colour applying to the query. Therefore to do this I have to append a query string to the end of the URL to produce this result: www.website.com/category/product-name.html?=red My question is, will the query variations then be picked up as duplicate content: www.website.com/category/product-name.html www.website.com/category/product-name.html?=red www.website.com/category/product-name.html?=yellow Google suggest it has contingencies in its algorithm and I will not be penalised: http://googlewebmastercentral.blogspot.co.uk/2007/09/google-duplicate-content-caused-by-url.html But other sources suggest this is not accurate. Note the article was written in 2007.
Technical SEO | | BlazeSunglass0 -
Url rewrites / shortcuts - Are they considered duplicate content?
When creating a url rewrite or shortcut, does this create duplicate content issues? split your rankings / authority with google/search engines? Scenario 1 wwwlwhatthehellisahoneybooboo.com/dqotd/ -> www.whatthehellisahoneybooboo.com/08/12/2012/deep-questions-of-the-day.html Scenario 2 bitly.com/hbb -> www.whatthehellisahoneybooboo.com/08/12/2012/deep-questions-of-the-day.html (or to make it more compicated...directs to the above mentioned scenario 1 url rewrite) www.whatthehellisahoneybooboo.com/dqotd/ *note well- there's no server side access so mentions of optimizing .htacess are useless in this situation. To be clear, I'm only referring to rewrites, not redirects...just trying to understand the implications of rewrites. Thanks!
Technical SEO | | seosquared0 -
Duplicate content
I'm getting an error showing that two separate pages have duplicate content. The pages are: | Help System: Domain Registration Agreement - Registrar Register4Less, Inc. http://register4less.com/faq/cache/11.html 1 27 1 Help System: Domain Registration Agreement - Register4Less Reseller (Tucows) http://register4less.com/faq/cache/7.html | These are both registration agreements, one for us (Register4Less, Inc.) as the registrar, and one for Tucows as the registrar. The pages are largely the same, but are in fact different. Is there a way to flag these pages as not being duplicate content? Thanks, Doug.
Technical SEO | | R4L0