Duplicate Content for index.html
-
In the Crawl Diagnostics Summary, it says that I have two pages with duplicate content which are:
I read in a Dream Weaver tutorial that you should name your home page "index.html" and then you can let www.mywebsite.com automatically direct the user to index.html. Is this a bug in SEOMoz's crawler or is it a real problem with my site?
Thank you,
Dan
-
The code should definitely go into the websites root directory's .htaccess, however .htaccess can be weird, a few days ago I ran into a similar issue with a client's website, and I was able to remedy the issue with a variation of the code.
index Redirect RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)index.(php|html|htm|asp)\ HTTP/ RewriteRule ^(([^/]+/))index.(php|html|htm|asp)$ http://yoursite.com/$1 [R=301,L]
If you give me the URL for the site I will take a look at it and let you know what would be feasible.
-
Hi Daniel, can you share with us the URL of your site? We can take a look at it and give you a more precise answer that way. Thanks!
-
I eventually figured out that your method was a 301 redirect and I definitely broke my site trying to use the code you posted. .. haha. Its ok though. I just removed the code and it went back to normal. At first, I was editing the .htaccess file in the public_html folder which wasnt working. Then I tried the root folder for the site (I created the .htaccess file since it did not exist.) Neither of those worked. (I am using Bluehost so I do not think that I have root access and I am not sure if it is a Linux server or not.)
If there is an easy way to explain what I am doing wrong, please do so. Otherwise, I will use canonical.
Thanks for everything!
-
@Dan
Thanks for your reply. It seems like there are lots of different ways to solve this problem. I just watched this video on Matt Cutt's blog where he discusses his preference for 301 redirects over rel canonical tag.
Where would you say your solution fits in?
sorry about the delay of this response, i didn't realize the that you were asking me a question right away. When placing the code I provided in my previous answer this will cause a 301 perminant redirect to the original URL. That's actually what the
[R=301,L]
portion of the code is stating (R) redirect (301) status is referring to. After reviewing the Matt Cutts video, I realize that I should have asked you if you were operating on a Linux server that you had root access to. We actually utilize both redirects and canonical tags since it was recommended by the on-page optimization reports. Heck Google uses them, I would assume because it's easier for the user to be referred to a single page URL. Obviously though if you don't have server header access, and are not familiar with .htaccess (you can accidentally break your site) then the canonical solution is appropriate
-
Josh,
Thanks for your reply. It seems like there are lots of different ways to solve this problem. I just watched this video on Matt Cutt's blog where he discusses his preference for 301 redirects over rel canonical tag.
Where would you say your solution fits in?
Thanks,
Dan -
use the link rel tag for all my homepages for the http://www.yoursite.com
-
Odd enough I just recently answered this question. The SEOmoz crawler is correct, because without a redirect you will be able to access both versions of the page in your browser.
To resolve this issue simply rewrite the index.html to the root url by placing the following code into your .htaccess file into your root directory.
Options +FollowSymlinks RewriteEngine on
Index Rewrite RewriteRule ^index.(htm|html|php) http://www.yoursite.com/ [R=301,L] RewriteRule ^(.*)/index.(htm|html|php) http://www.yoursite.com/$1/ [R=301,L]
You can also do the same with the index file in any subdirectories that you might create, by simply placing a .htaccess into those sub directories and using variations of the above code. This is how you create nice tight URLs without the duplicate content issue that look like - http://www.semclix.com/design/business/
-
It is a problem which you need to fix. You need to canonicalize your pages.
Those are all various URLs which most likely lead to the same web page. I say "most likely" because these URLs can actually lead to different pages.
You need to tell crawlers and search engines how you organize your site. There are several ways to achieve canonicalization. The method I prefer is to add the following line of code to each page:
The URL provided should be the preferred URL for your page.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Regarding rel=canonical on duplicate pages on a shopping site... some direction, please.
Good morning, Moz community: My name is David, and I'm currently doing internet marketing for an online retailer of marine accessories. While many product pages and descriptions are unique, there are some that have the descriptions duplicated across many products. The advice commonly given is to leave one page as is / crawlable (probably best for one that is already ranking/indexed), and use rel=canonical on all duplicates. Any idea for direction on this? Do you think it is necessary? It will be a massive task. (also, one of the products that we rank highest for, we have tons of duplicate descriptions.... so... that is sort of like evidence against the idea?) Thanks!
Web Design | | DavidCiti0 -
How to deal with 100s of Wordpress media link pages, containing images, but zero content
I have a Wordpress website with well over 1000 posts. I had a SEO audit done and it was highlighted that every post had clickable images. If you click the image a new webpage opens containing nothing but the image. I was told these image pages with zero content are very bad for SEO and that I should get them removed. I have contacted several Wordpress specialists on People Per Hour. I have basically been offered two solutions. 1 - redirect all these image pages to a 404, so they are not found by Google 2 - redirect each image page to the main post page the image is from. What's my best option here? Is there a better option? I don't care if these pages remain, providing they are not crawled by Google and classified as spam etc. All suggestions greatly received!
Web Design | | xpers0 -
Will numbers & data be considered as user generated content by Google OR naturally written text sentences only refer to user generated content.
Hi, Will numbers & data be considered as user generated content by Google OR naturally written text sentences only refer to user generated content. Regards
Web Design | | vivekrathore0 -
How much does on-site duplicated content affect SERPs?
Hi, We've recently gotten into Moz, with our E-commerce websites, and discovered that it's crawler takes note of about 2500 pages which it thinks are the same (duplicated). We've now begun to completely rewrite every description of every product (including Meta Title/Description) so that this number may be reduced. Since this is the biggest issue Moz spots I'm wondering what the effect of fixing it will be on our position in the SERP (mainly Google). Does anybody have some stories or experience about this topic? Thanks in Advance! 🙂 Alexander
Web Design | | WebmasterAlex0 -
Does using role="heading" instead of H1 in HTML code affects SEO?
Does using role="heading" instead of affect SEO? http://www.w3.org/WAI/GL/wiki/Headings_using_role%3Dheading
Web Design | | LNEseo0 -
Crawl Diagnostics Summary - Duplicate Content
Hello SEO Experts, I am a developer at www.bowanddrape.com and we are working on improving the SEO of the website. The SEOMoz Crawl Diagnostics Summary shows that following 2 URL have duplicate content. http://www.bowanddrape.com/clothing/Tan+Accessories+Calfskin+Belt/50_5142 http://www.bowanddrape.com/clothing/Black+Accessories+Calfskin+Belt/50_5143 Can you please suggest me ways to fix this problem? Is the duplicate content error because of same "The Details", "Size Chart" and "The Silhouette" and "You may also like" ? Thanks, Chirag
Web Design | | ChiragNirmal0 -
AJAX & JQuery Tabs: Indexation & Navigation
Hi I've two questions about indexing Tabs. 1. Let's say I have tabs, or an accordion that is triggered with Jquery. That means that all HTML is accessible and indexed by search engines. But let's say a search query is relevant to the content in Tab#3, while Tab#1 is the one that's open by default. Is there any way that Tab#3 would be open directly if it's more relevant to the search query? 2. AJAX Tabs: We have pages that have Tabs triggered by AJAX (example: http://www.swisscom.ch/en/residential/help/loesung/entfernen-sie-sim-lock.html). I'm wondering about the current best practice. Google recommends HTML Snapshots. A newer SEOMoz Article talks about pushState(). What's the way to go here? Or in other words: How to get Tabs & Accordion content indexed and allow users to navigate directly to it?
Web Design | | zeepartner0 -
Im having duplicate content issues in wordpress
all of my pages are working fine. but i added my sitemap to my footer in my website and when i click on my blog from my footer it takes me to the homepage. so now im having duplicate content for two diff urls. ive tried adding a rel=canonical and a 301 redirect to the blog page but it doesnt resolve the problem. also, when i go to my footer and click blog. after it brings me to the homepage ill try to click on my pages from the original bar at the top of my screen and it will bring me to the right pages. but it will have the same blog url in the search bar even when im on other pages. other than that all of my pages in my footer and in my homepage toolbar work fine. its just that one particular problem with the blog page in the footer and how it stays with the same blog url on every page after i click the blog in the footer. can someone please help. im using yoast and idk if i should disable it or what.
Web Design | | ClearVisionDesign0