Duplicate Content for index.html
-
In the Crawl Diagnostics Summary, it says that I have two pages with duplicate content which are:
I read in a Dream Weaver tutorial that you should name your home page "index.html" and then you can let www.mywebsite.com automatically direct the user to index.html. Is this a bug in SEOMoz's crawler or is it a real problem with my site?
Thank you,
Dan
-
The code should definitely go into the websites root directory's .htaccess, however .htaccess can be weird, a few days ago I ran into a similar issue with a client's website, and I was able to remedy the issue with a variation of the code.
index Redirect RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)index.(php|html|htm|asp)\ HTTP/ RewriteRule ^(([^/]+/))index.(php|html|htm|asp)$ http://yoursite.com/$1 [R=301,L]
If you give me the URL for the site I will take a look at it and let you know what would be feasible.
-
Hi Daniel, can you share with us the URL of your site? We can take a look at it and give you a more precise answer that way. Thanks!
-
I eventually figured out that your method was a 301 redirect and I definitely broke my site trying to use the code you posted. .. haha. Its ok though. I just removed the code and it went back to normal. At first, I was editing the .htaccess file in the public_html folder which wasnt working. Then I tried the root folder for the site (I created the .htaccess file since it did not exist.) Neither of those worked. (I am using Bluehost so I do not think that I have root access and I am not sure if it is a Linux server or not.)
If there is an easy way to explain what I am doing wrong, please do so. Otherwise, I will use canonical.
Thanks for everything!
-
@Dan
Thanks for your reply. It seems like there are lots of different ways to solve this problem. I just watched this video on Matt Cutt's blog where he discusses his preference for 301 redirects over rel canonical tag.
Where would you say your solution fits in?
sorry about the delay of this response, i didn't realize the that you were asking me a question right away. When placing the code I provided in my previous answer this will cause a 301 perminant redirect to the original URL. That's actually what the
[R=301,L]
portion of the code is stating (R) redirect (301) status is referring to. After reviewing the Matt Cutts video, I realize that I should have asked you if you were operating on a Linux server that you had root access to. We actually utilize both redirects and canonical tags since it was recommended by the on-page optimization reports. Heck Google uses them, I would assume because it's easier for the user to be referred to a single page URL. Obviously though if you don't have server header access, and are not familiar with .htaccess (you can accidentally break your site) then the canonical solution is appropriate
-
Josh,
Thanks for your reply. It seems like there are lots of different ways to solve this problem. I just watched this video on Matt Cutt's blog where he discusses his preference for 301 redirects over rel canonical tag.
Where would you say your solution fits in?
Thanks,
Dan -
use the link rel tag for all my homepages for the http://www.yoursite.com
-
Odd enough I just recently answered this question. The SEOmoz crawler is correct, because without a redirect you will be able to access both versions of the page in your browser.
To resolve this issue simply rewrite the index.html to the root url by placing the following code into your .htaccess file into your root directory.
Options +FollowSymlinks RewriteEngine on
Index Rewrite RewriteRule ^index.(htm|html|php) http://www.yoursite.com/ [R=301,L] RewriteRule ^(.*)/index.(htm|html|php) http://www.yoursite.com/$1/ [R=301,L]
You can also do the same with the index file in any subdirectories that you might create, by simply placing a .htaccess into those sub directories and using variations of the above code. This is how you create nice tight URLs without the duplicate content issue that look like - http://www.semclix.com/design/business/
-
It is a problem which you need to fix. You need to canonicalize your pages.
Those are all various URLs which most likely lead to the same web page. I say "most likely" because these URLs can actually lead to different pages.
You need to tell crawlers and search engines how you organize your site. There are several ways to achieve canonicalization. The method I prefer is to add the following line of code to each page:
The URL provided should be the preferred URL for your page.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Why Is Google Showing My Images Upside Down in the Index?
Hi, My client has PDFs of their catalog on the site which google is indexing. However, it seems that google is taking an image from the catalog and then showing it upside in the index for images/search results. The images are not upside down on the site. Has anyone heard of this happening before or does anyone know a way to fix it? Thanks
Web Design | | AliMac260 -
Duplicate items across different pages?
On our new website we have a testimonials page which you can cycle through them. We also have the testimonial on the our work / project page. Essentially this is duplicate content from another page, what's the best thing to do here? In the sake of SEO, remove the duplicate content and only have one? Or won't it make much difference?
Web Design | | vortexuk0 -
How to add SEO Content to this site
Hi Great community and hope you guys can help! I have just started on a SEO project for http://bit.ly/clientsite , the clients required initial KPI is Search Engine Rankings at a fairly low budget. The term I use for the site is a "blurb site", the content is thin and the initial strategy I want to employ to get the keyword rankings is to utilize content. The plan is to: add targeted, quality (user experience & useful) and SEO content on the page itself by adding a "read more" link/button to the "blurb" on the right of the page (see pink text in image) when someone clicks on the "read more", a box of content will slide out styled much the same as the blurb itself and appear next to and/or overlay over the blurb and most of the page (see pink rectangle in image) Question: Is this layer of targeted , quality (user experience & useful) and SEO content (which requires an extra click to get to it) going to get the same SEO power/value as if it were displayed traditionally on the initial display? If not, would it be better to create a second page (2<sup>nd</sup> layer) and have the read more link to that and then rel-canonical the blurb to that 2<sup>nd</sup> page, so that all the SEO passes to this expanded content and the second page/layer is what will show up in the rankings? Thanks in advance qvDgZNE
Web Design | | Torean0 -
Using content from other sites without duplicate content penalties?
Hi there, I am setting up a website, where i believe it would substantially benefit users experience if i setup a database of information on artists. I am torn because to feasibly do this correctly, i would have content that is built from multiple sources, but has no real unique content. It would have parts from Wikipedia, parts from other websites etc. All would be sourced of-course. My concern is that if i do this, am i risking in devaluing my website because of this. Is there a way i can handle this without taking a hit?
Web Design | | BorisD0 -
Duplicate content issue
I have recently built a site that has a main page intended to rank for national coverage. This site also has a number of pages targeted at local searches, these pages are slight variations of each other with town specific keywords. Does anyone know if google will see this as spam and quarantine my site from ranking? Thanks
Web Design | | stebutty0 -
Panda and Penquin Fall - Could HTML Design an Issue?
Hi, We were hit hard by Panda 3.4 on March 23rd 2012. Then Penguin came along and slapped us down a little farther on April 24th. White hat SEO for 13 years on the site. I have been trying to discover the reason we got hit so hard, to date 90% down. We ae wiped. I have a couple of keywords still #2 and #3 and we see up and down changes in Google webmaster tools, i.e. a keyword is supposedly up 50 points then another down 50. All other 150 keywords that we used to rank on the first page for are not even showing up. I have a person that is about to do a full link analysis but since we never went after links I just never had the feeling that is where our problem is at, but definitely going to explore it. The reason for my post is that last night I spoke with an SEO person that has some pretty good credentials (9 years experience and works currently at large online marketing company with seo with clients like Honda) and he was nice enough to just take a quick look at the site. He said he saw nothing really wrong and did not think that we were hit for any of the normal issues people are listing, i.e. duplicate content, backlinks. His first impression was that we were knocked down because the site is "hard to index". He said the site still uses tables and a lot of our Doc Statements were for HTML 4.01 from 1999. As we all know, there are 'many' experts in this industry. So I wanted a little feedback from the community. Our main site was built in Dreamweaver using tables. We do have a Wordpress blog that is very small and just now posting to add fresh content. (posts seem to rank pretty good, this is why I thought, you know he may be right) Would an older site be penalized like this for using tables? What would you do at this stage if you had a site that is not recovering? I have now reached panic mode and have to do something, just not sure of the next step. I will be happy to post the URL if anyone wants to help with advice. Thanks,
Web Design | | Force7
Force70 -
Google indexing Quickview popups
Hi Guys I can't seem to find any info on this. Maybe you can help. We are using xcart as our shopping cart. When you land on a product page you have the option to "Quickview" the item. Google is picking up the quickview urls" and the vote on product urls. I have added the following to the robots.txt file but not sure if this will work. Any help on this would be great. Disallow: /?popup=Y Disallow: /?mode=add Undesired URL Examples: <colgroup><col width="735"></colgroup>
Web Design | | fasctimseo
| http://www.funlove.com/store/6_Pack_Shooter_Beer_Belt/?mode=add_vote&vote=60 | <colgroup><col width="735"></colgroup>
| http://www.funlove.com/store/6_pack_shooter_beer_belt/?popup=Y |0 -
Outsourcing Content - Finding Superior Providers...
I am looking for content writers. Not textbroker.com, I want content written that isnt scraped and reworded from information already in google. Can anyone recommend a company which isnt afraid to read a book or a magazine, dig up old information to write something truly unique? This should likely be in a fresh thread, but ill put it here as a side note. If you also can recommend a wordpress or joomla theme designer who has his own creative ideas and is highly skilled...
Web Design | | getbigyadig0