Duplicate Content for index.html
-
In the Crawl Diagnostics Summary, it says that I have two pages with duplicate content which are:
I read in a Dreamweaver tutorial that you should name your home page "index.html" and then let www.mywebsite.com automatically direct the user to index.html. Is this a bug in SEOmoz's crawler, or is it a real problem with my site?
Thank you,
Dan
-
The code should definitely go into the .htaccess file in the website's root directory. However, .htaccess can be weird: a few days ago I ran into a similar issue with a client's website, and I was able to remedy it with a variation of the code.
# index Redirect
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.(php|html|htm|asp)\ HTTP/
RewriteRule ^(([^/]+/)*)index\.(php|html|htm|asp)$ http://yoursite.com/$1 [R=301,L]
If you give me the URL for the site I will take a look at it and let you know what would be feasible.
-
Hi Daniel, can you share with us the URL of your site? We can take a look at it and give you a more precise answer that way. Thanks!
-
I eventually figured out that your method was a 301 redirect, and I definitely broke my site trying to use the code you posted... haha. It's OK though. I just removed the code and it went back to normal. At first, I was editing the .htaccess file in the public_html folder, which wasn't working. Then I tried the root folder for the site (I created the .htaccess file since it did not exist). Neither of those worked. (I am using Bluehost, so I do not think that I have root access, and I am not sure whether it is a Linux server.)
If there is an easy way to explain what I am doing wrong, please do so. Otherwise, I will use canonical.
Thanks for everything!
-
@Dan
Thanks for your reply. It seems like there are lots of different ways to solve this problem. I just watched this video on Matt Cutts' blog where he discusses his preference for 301 redirects over the rel canonical tag.
Where would you say your solution fits in?
Sorry about the delay in responding; I didn't realize right away that you were asking me a question. The code I provided in my previous answer causes a 301 permanent redirect to the original URL. That's what the [R=301,L] portion of the code states: R means redirect, 301 is the status code, and L marks it as the last rule to process. After reviewing the Matt Cutts video, I realize that I should have asked whether you were on a Linux server that you had root access to. We actually use both redirects and canonical tags, since that was recommended by the on-page optimization reports. Heck, Google uses them, I would assume because it's easier for the user to be referred to a single page URL. Obviously, though, if you don't have server header access and are not familiar with .htaccess (you can accidentally break your site), then the canonical solution is appropriate.
-
Josh,
Thanks for your reply. It seems like there are lots of different ways to solve this problem. I just watched this video on Matt Cutts' blog where he discusses his preference for 301 redirects over the rel canonical tag.
Where would you say your solution fits in?
Thanks,
Dan
-
I use the link rel="canonical" tag on all my homepages, pointing to http://www.yoursite.com
-
Oddly enough, I just recently answered this question. The SEOmoz crawler is correct: without a redirect, both versions of the page are accessible in your browser.
To resolve this issue, simply rewrite index.html to the root URL by placing the following code in the .htaccess file in your root directory.
Options +FollowSymlinks
RewriteEngine on
# Index Rewrite
RewriteRule ^index.(htm|html|php) http://www.yoursite.com/ [R=301,L]
RewriteRule ^(.*)/index.(htm|html|php) http://www.yoursite.com/$1/ [R=301,L]
You can do the same with the index file in any subdirectories you create by placing an .htaccess file in those subdirectories and using variations of the above code. This is how you create nice, tight URLs without the duplicate content issue, like http://www.semclix.com/design/business/
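If you want to see what the two RewriteRule lines are meant to do before touching a live .htaccess, here is a rough Python sketch that mimics them. It is only an approximation of Apache's behavior, not a replacement for testing on the server: the function name redirect_target is made up for illustration, the host is the placeholder from the post, and the dots are escaped here for strictness.

```python
import re

SITE = "http://www.yoursite.com"  # placeholder host from the post

# Patterns modelled on the two RewriteRule lines (dots escaped for strictness).
ROOT_INDEX = re.compile(r"^index\.(htm|html|php)")
SUBDIR_INDEX = re.compile(r"^(.*)/index\.(htm|html|php)")


def redirect_target(path):
    """Return the 301 target for a request path, or None if no rule matches."""
    # In .htaccess context the rule sees the path without its leading slash.
    path = path.lstrip("/")
    if ROOT_INDEX.match(path):
        return SITE + "/"
    match = SUBDIR_INDEX.match(path)
    if match:
        # Keep the directory part and drop the index file name.
        return f"{SITE}/{match.group(1)}/"
    return None


if __name__ == "__main__":
    for p in ["/index.html", "/design/business/index.php", "/about.html"]:
        print(p, "->", redirect_target(p))
```

Paths that do not end in an index file fall through both rules and are served as usual, which is why the function returns None for them.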
-
It is a problem which you need to fix. You need to canonicalize your pages.
Those are all various URLs which most likely lead to the same web page. I say "most likely" because these URLs can actually lead to different pages.
You need to tell crawlers and search engines how you organize your site. There are several ways to achieve canonicalization. The method I prefer is to add the following line of code to each page:
The URL provided should be the preferred URL for your page.
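The line of code is not shown above. It is most likely a rel="canonical" link element placed in the head of each page, but that is an assumption. As a rough sketch, this Python helper (canonical_link is a hypothetical name) strips a trailing index file from a URL and prints the element you would paste into the page:

```python
import re
from html import escape
from urllib.parse import urlsplit, urlunsplit

# Matches a trailing index file only when it is a whole path segment.
INDEX_FILE = re.compile(r"(?<=/)index\.(html?|php)$")


def canonical_link(url):
    """Build a rel="canonical" link element pointing at the preferred URL."""
    parts = urlsplit(url)
    # Drop the index file name; an empty path becomes the site root.
    path = INDEX_FILE.sub("", parts.path) or "/"
    # Query string and fragment are dropped here, which is a simplification.
    preferred = urlunsplit((parts.scheme, parts.netloc, path, "", ""))
    return f'<link rel="canonical" href="{escape(preferred, quote=True)}" />'


if __name__ == "__main__":
    print(canonical_link("http://www.yoursite.com/index.html"))
```

Both http://www.yoursite.com/index.html and http://www.yoursite.com end up pointing at the same preferred URL, which is the whole point of canonicalization.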