Duplicate Content for index.html
-
In the Crawl Diagnostics Summary, it says that I have two pages with duplicate content which are:
I read in a Dream Weaver tutorial that you should name your home page "index.html" and then you can let www.mywebsite.com automatically direct the user to index.html. Is this a bug in SEOMoz's crawler or is it a real problem with my site?
Thank you,
Dan
-
The code should definitely go into the websites root directory's .htaccess, however .htaccess can be weird, a few days ago I ran into a similar issue with a client's website, and I was able to remedy the issue with a variation of the code.
index Redirect RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)index.(php|html|htm|asp)\ HTTP/ RewriteRule ^(([^/]+/))index.(php|html|htm|asp)$ http://yoursite.com/$1 [R=301,L]
If you give me the URL for the site I will take a look at it and let you know what would be feasible.
-
Hi Daniel, can you share with us the URL of your site? We can take a look at it and give you a more precise answer that way. Thanks!
-
I eventually figured out that your method was a 301 redirect and I definitely broke my site trying to use the code you posted. .. haha. Its ok though. I just removed the code and it went back to normal. At first, I was editing the .htaccess file in the public_html folder which wasnt working. Then I tried the root folder for the site (I created the .htaccess file since it did not exist.) Neither of those worked. (I am using Bluehost so I do not think that I have root access and I am not sure if it is a Linux server or not.)
If there is an easy way to explain what I am doing wrong, please do so. Otherwise, I will use canonical.
Thanks for everything!
-
@Dan
Thanks for your reply. It seems like there are lots of different ways to solve this problem. I just watched this video on Matt Cutt's blog where he discusses his preference for 301 redirects over rel canonical tag.
Where would you say your solution fits in?
sorry about the delay of this response, i didn't realize the that you were asking me a question right away. When placing the code I provided in my previous answer this will cause a 301 perminant redirect to the original URL. That's actually what the
[R=301,L]
portion of the code is stating (R) redirect (301) status is referring to. After reviewing the Matt Cutts video, I realize that I should have asked you if you were operating on a Linux server that you had root access to. We actually utilize both redirects and canonical tags since it was recommended by the on-page optimization reports. Heck Google uses them, I would assume because it's easier for the user to be referred to a single page URL. Obviously though if you don't have server header access, and are not familiar with .htaccess (you can accidentally break your site) then the canonical solution is appropriate
-
Josh,
Thanks for your reply. It seems like there are lots of different ways to solve this problem. I just watched this video on Matt Cutt's blog where he discusses his preference for 301 redirects over rel canonical tag.
Where would you say your solution fits in?
Thanks,
Dan -
use the link rel tag for all my homepages for the http://www.yoursite.com
-
Odd enough I just recently answered this question. The SEOmoz crawler is correct, because without a redirect you will be able to access both versions of the page in your browser.
To resolve this issue simply rewrite the index.html to the root url by placing the following code into your .htaccess file into your root directory.
Options +FollowSymlinks RewriteEngine on
Index Rewrite RewriteRule ^index.(htm|html|php) http://www.yoursite.com/ [R=301,L] RewriteRule ^(.*)/index.(htm|html|php) http://www.yoursite.com/$1/ [R=301,L]
You can also do the same with the index file in any subdirectories that you might create, by simply placing a .htaccess into those sub directories and using variations of the above code. This is how you create nice tight URLs without the duplicate content issue that look like - http://www.semclix.com/design/business/
-
It is a problem which you need to fix. You need to canonicalize your pages.
Those are all various URLs which most likely lead to the same web page. I say "most likely" because these URLs can actually lead to different pages.
You need to tell crawlers and search engines how you organize your site. There are several ways to achieve canonicalization. The method I prefer is to add the following line of code to each page:
The URL provided should be the preferred URL for your page.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Problems preventing Wordpress attachment pages from being indexed and from being seen as duplicate content.
Hi According to a Moz Crawl, it looks like the Wordpress attachment pages from all image uploads are being indexed and seen as duplicate content..or..is it the Yoast sitemap causing it? I see 2 options in SEO Yoast: Redirect attachment URLs to parent post URL. Media...Meta Robots: noindex, follow I set it to (1) initially which didn't resolve the problem. Then I set it to option (2) so that all images won't be indexed but search engines would still associate those images with their relevant posts and pages. However, I understand what both of these options (1) and (2) mean, but because I chose option 2, will that mean all of the images on the website won't stand a chance of being indexed in search engines and Google Images etc? As far as duplicate content goes, search engines can get confused and there are 2 ways for search engines
Web Design | | SEOguy1
to reach the correct page content destination. But when eg Google makes the wrong choice a portion of traffic drops off (is lost hence errors) which then leaves the searcher frustrated, and this affects the seo and ranking of the site which worsens with time. My goal here is - I would like all of the web images to be indexed by Google, and for all of the image attachment pages to not be indexed at all (Moz shows the image attachment pages as duplicates and the referring site causing this is the sitemap url which Yoast creates) ; that sitemap url has been submitted to the search engines already and I will resubmit once I can resolve the attachment pages issues.. Please can you advise. Thanks.0 -
Spanish website indexed in English, redirect to spanish or english version if i do a new website design?
Hi MOZ users, i have this problem. We have a website in Spanish Language but Google crawls it on English (it is not important the reasons). We re made the entire website and now we are planning the move. The new website will have different language versions, english, spanish and portuguese. Somebody tells me that we have to redirect the old urls (crawled on english) to the new english versions, not to the spanish (the real language of the firsts). Example: URL1 Language: Spanish - Crawled on English --> redirect to Language English version. the other option will be redirect to the spanish new version, which the visitor is waiting to find. URL1 Language: Spanish - Crawled on English --> redirect to Language Spanish version. What do you think? Which is the better option?
Web Design | | NachoRetta0 -
Why is Google Webmaster suddenly started showing hundreds of HTML Improvements
Why is Google Webmaster suddenly started showing hundreds of HTML Improvements I mean to ask, my hundreds pages are been shown as duplicate - despite canonical marked correctly Below are sample url - which are been crawled in own way. I have rechecked canonical tag - which is correct as URL - 1, in all 3 url Do i need to worry about anything or shall i presume its a flaw from search engine to report this as an issue (This only pertain to Forum section) http://www.mycarhelpline.com/index.php?option=com_easydiscuss&view=post&id=1683&Itemid=78 http://www.mycarhelpline.com/?id=1683&Itemid=78&option=com_easydiscuss&view=post http://www.mycarhelpline.com/index.php?option=com_easydiscuss&view=post&id=1683 ps - i know these are dynamic url and not sef friendly url, but its been 3 yrs and , due to our ignorance and site builder took advantage of this. now - nothing can be done much to make them sef friendly as site has several thousand pages and touchwood - these dynamic url are not impacting much
Web Design | | Modi0 -
Google HTML, CSS and javascript styleguides ?
Who's following the Google style guides especially in HTML, CSS and javascript? What are the benefits of following the style guides? I am thinking of sending the style guides to our web development team before we launch our new site but I think there might be some conflicts. I'm an SEO and not programmer or web developer and I'm sure there are some "rules" that these web dev guys should follow and break as well. Thanks in advance! 🙂
Web Design | | esiow20130 -
Converting to WP - Should I add .html or 301?
Moving my site to WP and the old url structure pages end in ".html". I have seen there are plugins that allow you to add .html to the WP pages to preserve links. I am hosting on Synthesis and they do not support htaccess, although you can submit 301 re-directs through the help ticket system. My question is what is the best way to proceed? I have read that 301s "leak" some link juice, but I sure do like those pretty urls. Advice appreciated!
Web Design | | Chris6611 -
AJAX & JQuery Tabs: Indexation & Navigation
Hi I've two questions about indexing Tabs. 1. Let's say I have tabs, or an accordion that is triggered with Jquery. That means that all HTML is accessible and indexed by search engines. But let's say a search query is relevant to the content in Tab#3, while Tab#1 is the one that's open by default. Is there any way that Tab#3 would be open directly if it's more relevant to the search query? 2. AJAX Tabs: We have pages that have Tabs triggered by AJAX (example: http://www.swisscom.ch/en/residential/help/loesung/entfernen-sie-sim-lock.html). I'm wondering about the current best practice. Google recommends HTML Snapshots. A newer SEOMoz Article talks about pushState(). What's the way to go here? Or in other words: How to get Tabs & Accordion content indexed and allow users to navigate directly to it?
Web Design | | zeepartner0 -
Using "#" anchors to display different content
If I have a page that has an area on the page that acts like a widget and has three different tabs. These tabs provide 3 different types of information relevant to the page subject matter. By default when someone goes to the page one of the tabs is showing but you have to click on the others to see the info on them. Is it OK to use domain.com/topic#TAB1, domain.com/topic#TAB2, domain.com/topic#TAB3 to create shortcut links so that people can land on the page and have that predetermined tab showing. I'm wondering what search engines might think. Essentially all the content of all three tabs is there for people to see but they'd have to click to see the other tabs. I don't consider the content to be hidden. But I'd like to hear people's thoughts.
Web Design | | Business.com0 -
Are my duplicate meta titles and descriptions an issue ?
HelloMy website http://www.gardenbeet.com has been rebuilt using prestacart and there are 158 duplicate title and meta descriptions being reported by google.My developer advised the following Almost all the duplicates are due to the same page being accessible at the root and following the category heading. e.g; /75-vegetable-patio-planter-turquoise.html
Web Design | | GardenBeet
/patio-planters/75-vegetable-patio-planter-turquoise.html This is hard-wired into PrestaShop. Was the Canonical module (now disabled) responsible for the confusion by not including the category name? The Googlebot shouldn't be scanning the root versions now. I don't believe this to be a serious issue but I'd recommend a second opinion from someone more SEO savvy just to be sure.Opinions??0