Duplicate content
-
I have just ran a report in seomoz on my domain and has noticed that there are duplicate content issues, the issues are:
www.domainname/directory-name/
www.domainname/directory-name/index.php
All my internal links and external links point to the first domain, as i prefer this style as it looks clear & concise, however doing this has created duplicate content as within the site itself i have an index.php page inside this /directory-name/ to show the page.
Could anyone give me some advice on what i should do please?
Kind Regards
-
Hi Gary,
Here's some code from an htaccess file I've used before that solves the issue you've got with index.php at the end of all your urls:
#remove /index.php and ensure admin works okay
RewriteCond %{REQUEST_URI} !^/administrator
RewriteCond %{THE_REQUEST} ^.*/index.php\ HTTP/
RewriteRule ^(.*)index.php$ /$1 [R=301,L]
notice the line that contains ^/administrator , in Joomla, admin login is usuall on http://site.com/administrator/index.php
so, removing the index.php from the admin url would prevent any access to the admin screens! If your cms has a similar url, be sure to replace 'administrator' with the relevant url.
-
Hi Ade,
Thanks for all your help.
I will post a new question on the Q&A Forum regarding the .htaccess rule.
Kind Regards
-
Hi Gary,
That one is a bit beyond me I'm afraid and I am not familiar with WebEdition at all.
With most CMS there are normally either built-in or add-on extensions to help with re-writing your urls but you need to be really careful that you don't end up with a completely new set of urls that don't match either of your originals.
A .htaccess rewrite rule may be your best option but I don't know what the coding for it would be.
-
We are using a CMS, its called WebEdition, is there a technical question i should ask them in what i need to do?
Kind Regards
-
Ahhhh. No definitely not practical, I thought that it was just the one url.
Are you using a content management system for your site such as Joomla?
-
Do you think that's practical to do that?
As i will need to do a 301 on literally every page if i don't want to show the /index.php
Is this what seomoz.org website does? for example:
-
In that case you can just add a 301 redirect in to your .htaccess file below the code you added earlier.
redirect 301 /football-teams/index.php http://www.mydomain.com/football-teams/
-
Hi Ade,
Yes, i tested http://www.mydomain.com/football-teams//index.php however it did not resolve to http://www.mydomain.com/football-teams/
Any ideas?
-
Hi Gary.
Have you tried visiting the url http://www.mydomain.com/football-teams/index.php to see if it now resolves to http://www.mydomain.com/football-teams/ ?
If it does then the issue is fixed, the next time SEOMoz crawls your site the error will dissapear.
Cheers.
Ade.
-
Hi Ade,
Thanks for the speedy reply.
I have now implemented this and works fantastic on the http://www.mydomain.com/
Thank you very much.
There is another issue however, i hope i can make sense here, here goes:
seomoz tool gives me back duplicate content on both these URL's
http://www.mydomain.com/football-teams/
http://www.mydomain.com/football-teams/index.php
I want to use http://www.mydomain.com/football-teams/ as this just look nice & clean.
What would be best practice to fix this issue?
Kind Regards
-
Hey Gary.
Here's the solution that I use.
All my sites are hosted on a linux server so this won't be relevant if your site is hosted on a windows server.
1. create/modify your .htaccess file in your site's root directory.
2. Add the following code to the top of the file:-
RewriteEngine On
RewriteBase /
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index.php\ HTTP/
RewriteRule ^index.php$ http://www.yourdomain.com/ [R=301,L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]RewriteCond %{HTTP_HOST} ^yourdomain.com [NC]
RewriteRule ^(.*)$ http://www.yourdomain.com/$1 [L,R=301]This will ensure that any requests sent to http://yourdomain.com are redirected to http://www.yourdomain.com and that the index.php part of the url is removed.
If you need more help on creating or modifying your .htaccess file then you can find more info here - http://httpd.apache.org/docs/1.3/howto/htaccess.html
All the best.
Ade.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Duplicate Content Issues - Where to start???
Dear All I have recently joined a new company Just Go Holidays - www.justgoholidays.com I have used the SEO Moz tools (yesterday) to review the site and see that I have lots of duplicate content/pages and also lots of duplicate titles all of which I am looking to deal with. Lots of the duplicate pages appear to be surrounding, additional parameters that are used on our site to refine and or track various marketing campaigns. I have therefore been into Google Webmaster Tools and defined each of these parameters. I have also built a new XML sitemap and submitted that too. It looks as is we have two versions of the site, one being at www.justgoholidays.com and the other without the www It appears that there are no redirects from the latter to the former, do I need to use 301's here or is it ok to use canonicalisation instead? Any thoughts on an action plan to try to address these issues in the right order and the right way would be very gratefully received as I am feeling a little overwhelmed at the moment. (we also use a CMS system that is not particularly friendly and I think I will have to go directly to the developers to make lots of the required changes which is sure to cost - therefore really don't want to get this wrong) All the best Matt
Technical SEO | | MattByrne0 -
Partially duplicated content on separate pages
TL;DR: I am writing copy for some web pages. I am duplicating some bits of copy exactly on separate web pages. And in other cases I am using the same bits of copy with slight alterations. Is this bad for SEO? Details: We sell about 10 different courses. Each has a separate page. I'm currently writing copy for those pages. Some of the details identical for each course. So I can duplicate the content and it will be 100% applicable. For example, when we talk about where we can run courses (we go to a company and run it on their premises) – that's applicable to every course. Other bits are applicable with minor alterations. So where we talk about how we'll tailor the course, I will say for example: "We will the tailor the course to the {technical documents|customer letters|reports} your company writes." Or where we have testimonials, the headline reads "Improving {customer writing|reports|technical documents} in every sector and industry". There is original content on each page. The duplicate stuff may seem spammy, but the alternative is me finding alternative re-wordings for exactly the same information. This is tedious and time-consuming and bizarre given that the user won't notice any difference. Do I need to go ahead and re-write these bits ten slightly different ways anyway?
Technical SEO | | JacobFunnell0 -
Duplicate Content - Captcha on Contact Form
I am going to be working on a site where the contact form is being flagged as duplicate content the URL is the same apart from having: /contact/10119 contact/31010 ...at the end of it. The only difference in the content of the page that I can see is the Captcha numbers? Is there a way to overcome this to stop duplicate content? Thanks in advance
Technical SEO | | J_Sinclair0 -
Duplicate content due to credit card testing
I recently launched a site - http://www.footballtriviaquestions.co.uk and the site uses Paypal. In order to test the PayPal functionality I set up a zapto.org domain via a permanent IP service that points directly to the computer I've written the website on. It appears that Google has now indexed the zapto.org website. Will this cause problems to my main website, as the zapto.org website will pretty much contain content that is an exact duplicate of what is held on the main website. I've looked in Google webmaster tools for the main website and it doesn't mention any duplicate content, but I'm currently not in the top 50 ranking for "football trivia questions' on Google despite SEOMoz ranking my home page with an A rating. The page does rank at position 16 in Yahoo and Bing. This seems odd to me, although I do have very few back links pointing to my site. If the duplicate content is likely to be causing me problems what would be the best way to knock the zapto.org results out of Google
Technical SEO | | ipr1010 -
Duplicate Content?
My site has been archiving our newsletters since 2001. It's been helpful because our site visitors can search a database for ideas from those newsletters. (There are hundreds of pages with similar titles: archive1-Jan2000, archive2-feb2000, archive3-mar2000, etc.) But, I see they are being marked as "similar content." Even though the actual page content is not the same. Could this adversely affect SEO? And if so, how can I correct it? Would a separate folder of archived pages with a "nofollow robot" solve this issue? And would my site visitors still be able to search within the site with a nofollow robot?
Technical SEO | | sakeith0 -
How do i deal with duplicate content on the same domain?
I'm trying to find out if there's a way we can combat similar content on different pages on the same site, without having to re write the whole lot? Any ideas?
Technical SEO | | indurain0 -
Duplicate content issues caused by our CMS
Hello fellow mozzers, Our in-house CMS - which is usually good for SEO purposes as it allows all the control over directories, filenames, browser titles etc that prevent unwieldy / meaningless URLs and generic title tags - seems to have got itself into a bit of a tiz when it comes to one of our clients. We have tried solving the problem to no avail, so I thought I'd throw it open and see if anyone has a soultion, or whether it's just a fault in our CMS. Basically, the SEs are indexing two identical pages, one ending with a / and the other ending /index.php, for one of our sites (www.signature-care-homes.co.uk). We have gone through the site and made sure the links all point to just one of these, and have done the same for off-site links, but there is still the duplicate content issue of both versions getting indexed. We also set up an htaccess file to redirect to the chosen version, but to no avail, and we're not sure canonical will work for this issue as / pages should redirect to /index.php anyway - and that's we can't work out. We have set the access file to point to index.php, and that should be what should be happening anyway, but it isn't. Is there an alternative way of telling the SE's to only look at one of these two versions? Also, we are currently rewriting the content and changing the structure - will this change the situation we find ourselves in?
Technical SEO | | themegroup0 -
Up to my you-know-what in duplicate content
Working on a forum site that has multiple versions of the URL indexed. The WWW version is a top 3 and 5 contender in the google results for the domain keyword. All versions of the forum have the same PR, but but the non-WWW version has 3,400 pages indexed in google, and the WWW has 2,100. Even worse yet, there's a completely seperate domain (PR4) that has the forum as a subdomain with 2,700 pages indexed in google. The dupe content gets completely overwhelming to think about when it comes to the PR4 domain, so I'll just ask what you think I should do with the forum. Get rid of the subdomain version, and sometimes link between two obviously related sites or get rid of the highly targeted keyword domain? Also what's better, having the targeted keyword on the front of Google with only 2,100 indexed pages or having lower rankings with 3,400 indexed pages? Thanks.
Technical SEO | | Hondaspeder0