Duplicate Content for index.html
-
In the Crawl Diagnostics Summary, it says that I have two pages with duplicate content which are:
I read in a Dreamweaver tutorial that you should name your home page "index.html" and then let www.mywebsite.com automatically direct the user to index.html. Is this a bug in SEOmoz's crawler, or is it a real problem with my site?
Thank you,
Dan
-
The code should definitely go into the .htaccess file in the website's root directory. However, .htaccess can be weird: a few days ago I ran into a similar issue with a client's website, and I was able to remedy it with a variation of the code.
# index Redirect
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.(php|html|htm|asp)\ HTTP/
RewriteRule ^(([^/]+/)*)index\.(php|html|htm|asp)$ http://yoursite.com/$1 [R=301,L]
If you give me the URL for the site I will take a look at it and let you know what would be feasible.
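For anyone who wants to see what the RewriteCond and RewriteRule patterns above are meant to catch before touching a live .htaccess, here is a rough Python sketch. It is not how Apache evaluates the rules, only an approximation with Python's `re` module. It assumes the directory group `([^/]+/)` is meant to repeat zero or more times (`*`) and that the dot before the extension is literal. The function name `redirect_target` and the sample request lines are made up for illustration.

```python
import re

# Python translation of the RewriteCond pattern. In .htaccess the spaces are
# written as "\ " because a bare space would end the argument.
THE_REQUEST = re.compile(r"^[A-Z]{3,9} /([^/]+/)*index\.(php|html|htm|asp) HTTP/")

# Python translation of the RewriteRule pattern; group 1 is the directory part.
RULE = re.compile(r"^(([^/]+/)*)index\.(php|html|htm|asp)$")


def redirect_target(request_line, site="http://yoursite.com/"):
    """Return the URL the rule would redirect to, or None if it does not apply."""
    # The condition looks at the raw request line sent by the browser, so only
    # requests that literally asked for an index file are redirected.
    if not THE_REQUEST.search(request_line):
        return None
    # In a per-directory .htaccess the rule sees the path without the leading slash.
    path = request_line.split(" ")[1].lstrip("/")
    match = RULE.match(path)
    if match is None:
        return None
    return site + match.group(1)


if __name__ == "__main__":
    for line in (
        "GET /index.html HTTP/1.1",
        "GET /design/business/index.php HTTP/1.1",
        "GET / HTTP/1.1",
        "GET /about.html HTTP/1.1",
    ):
        print(line, "->", redirect_target(line))
```

Because the condition checks the original request line, a request for "/" that the server answers with index.html behind the scenes is left alone, and only a browser that literally asked for the index file is sent to the directory URL.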
-
Hi Daniel, can you share with us the URL of your site? We can take a look at it and give you a more precise answer that way. Thanks!
-
I eventually figured out that your method was a 301 redirect, and I definitely broke my site trying to use the code you posted... haha. It's OK, though. I just removed the code and it went back to normal. At first, I was editing the .htaccess file in the public_html folder, which wasn't working. Then I tried the root folder for the site (I created the .htaccess file since it did not exist). Neither of those worked. (I am using Bluehost, so I do not think that I have root access, and I am not sure if it is a Linux server or not.)
If there is an easy way to explain what I am doing wrong, please do so. Otherwise, I will use canonical.
Thanks for everything!
-
@Dan
Sorry about the delay in responding; I didn't realize right away that you were asking me a question. Placing the code I provided in my previous answer will cause a 301 permanent redirect to the original URL. That is what the [R=301,L] portion of the code states: R means redirect, 301 is the HTTP status code for a permanent redirect, and L marks the rule as the last one to process. After reviewing the Matt Cutts video, I realize that I should have asked you whether you were operating on a Linux server that you had root access to. We actually use both redirects and canonical tags, since that was recommended by the on-page optimization reports. Heck, Google uses them, I would assume because it's easier for the user to be referred to a single page URL. Obviously, though, if you don't have server header access and are not familiar with .htaccess (you can accidentally break your site), then the canonical solution is appropriate.
-
Josh,
Thanks for your reply. It seems like there are lots of different ways to solve this problem. I just watched this video on Matt Cutts's blog where he discusses his preference for 301 redirects over the rel canonical tag.
Where would you say your solution fits in?
Thanks,
Dan
-
I use the link rel canonical tag on all my homepages, pointing to http://www.yoursite.com
-
Oddly enough, I just recently answered this question. The SEOmoz crawler is correct, because without a redirect both versions of the page can be accessed in a browser.
To resolve this issue, simply redirect index.html to the root URL by placing the following code in the .htaccess file in your root directory.
Options +FollowSymlinks
RewriteEngine on
# Index Rewrite
RewriteRule ^index.(htm|html|php) http://www.yoursite.com/ [R=301,L]
RewriteRule ^(.*)/index.(htm|html|php) http://www.yoursite.com/$1/ [R=301,L]
You can also do the same with the index file in any subdirectories that you might create, by simply placing an .htaccess file into those subdirectories and using variations of the above code. This is how you create nice, tight URLs without the duplicate content issue, which look like this: http://www.semclix.com/design/business/
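A quick way to confirm whether both versions of the page are still reachable, or whether the redirect has taken effect, is to request each URL without following redirects. The sketch below is only one way to do that, using Python's standard `http.client`. The helper names `head` and `is_duplicate` are made up, it assumes the site answers plain HTTP on port 80, and www.yoursite.com is a placeholder.

```python
import http.client


def head(host, path):
    """Send one HEAD request without following redirects; return (status, Location)."""
    conn = http.client.HTTPConnection(host, timeout=10)
    try:
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def is_duplicate(root, index):
    """True when both URLs answer 200, i.e. the same page is reachable twice."""
    return root[0] == 200 and index[0] == 200


# Example (placeholder host, uncomment to run against a real site):
# root = head("www.yoursite.com", "/")
# index = head("www.yoursite.com", "/index.html")
# print(root, index, "duplicate" if is_duplicate(root, index) else "ok")
```

If the redirect is working, the request for /index.html should come back with status 301 and a Location header pointing at the root URL, rather than 200.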
-
It is a problem which you need to fix. You need to canonicalize your pages.
Those are all various URLs which most likely lead to the same web page. I say "most likely" because these URLs can actually lead to different pages.
You need to tell crawlers and search engines how you organize your site. There are several ways to achieve canonicalization. The method I prefer is to add the following line of code to each page:
The URL provided should be the preferred URL for your page.
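The line of code itself does not appear in the post as shown here. The usual mechanism for this is a link element with rel="canonical" placed in the page's head, so the sketch below assumes that is what was meant. It uses Python's standard `html.parser` to check which canonical URL a page declares. The sample page, the class name `CanonicalFinder`, and the URL are placeholders, not taken from the original answer.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # rel may hold several space-separated values, so split before checking.
        rel = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel:
            self.canonicals.append(attrs.get("href"))


def find_canonical(html):
    """Return the list of canonical URLs declared in the given HTML text."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


# Hypothetical home page served at both / and /index.html.
SAMPLE_PAGE = """
<html>
  <head>
    <title>Home</title>
    <link rel="canonical" href="http://www.yoursite.com/" />
  </head>
  <body>Welcome</body>
</html>
"""

if __name__ == "__main__":
    print(find_canonical(SAMPLE_PAGE))
```

A page should normally declare exactly one canonical URL, so an empty list or more than one entry is worth a closer look.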