Duplicate Content for index.html
-
In the Crawl Diagnostics Summary, it says that I have two pages with duplicate content which are:
I read in a Dream Weaver tutorial that you should name your home page "index.html" and then you can let www.mywebsite.com automatically direct the user to index.html. Is this a bug in SEOMoz's crawler or is it a real problem with my site?
Thank you,
Dan
-
The code should definitely go into the websites root directory's .htaccess, however .htaccess can be weird, a few days ago I ran into a similar issue with a client's website, and I was able to remedy the issue with a variation of the code.
index Redirect RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)index.(php|html|htm|asp)\ HTTP/ RewriteRule ^(([^/]+/))index.(php|html|htm|asp)$ http://yoursite.com/$1 [R=301,L]
If you give me the URL for the site I will take a look at it and let you know what would be feasible.
-
Hi Daniel, can you share with us the URL of your site? We can take a look at it and give you a more precise answer that way. Thanks!
-
I eventually figured out that your method was a 301 redirect and I definitely broke my site trying to use the code you posted. .. haha. Its ok though. I just removed the code and it went back to normal. At first, I was editing the .htaccess file in the public_html folder which wasnt working. Then I tried the root folder for the site (I created the .htaccess file since it did not exist.) Neither of those worked. (I am using Bluehost so I do not think that I have root access and I am not sure if it is a Linux server or not.)
If there is an easy way to explain what I am doing wrong, please do so. Otherwise, I will use canonical.
Thanks for everything!
-
@Dan
Thanks for your reply. It seems like there are lots of different ways to solve this problem. I just watched this video on Matt Cutt's blog where he discusses his preference for 301 redirects over rel canonical tag.
Where would you say your solution fits in?
sorry about the delay of this response, i didn't realize the that you were asking me a question right away. When placing the code I provided in my previous answer this will cause a 301 perminant redirect to the original URL. That's actually what the
[R=301,L]
portion of the code is stating (R) redirect (301) status is referring to. After reviewing the Matt Cutts video, I realize that I should have asked you if you were operating on a Linux server that you had root access to. We actually utilize both redirects and canonical tags since it was recommended by the on-page optimization reports. Heck Google uses them, I would assume because it's easier for the user to be referred to a single page URL. Obviously though if you don't have server header access, and are not familiar with .htaccess (you can accidentally break your site) then the canonical solution is appropriate
-
Josh,
Thanks for your reply. It seems like there are lots of different ways to solve this problem. I just watched this video on Matt Cutt's blog where he discusses his preference for 301 redirects over rel canonical tag.
Where would you say your solution fits in?
Thanks,
Dan -
use the link rel tag for all my homepages for the http://www.yoursite.com
-
Odd enough I just recently answered this question. The SEOmoz crawler is correct, because without a redirect you will be able to access both versions of the page in your browser.
To resolve this issue simply rewrite the index.html to the root url by placing the following code into your .htaccess file into your root directory.
Options +FollowSymlinks RewriteEngine on
Index Rewrite RewriteRule ^index.(htm|html|php) http://www.yoursite.com/ [R=301,L] RewriteRule ^(.*)/index.(htm|html|php) http://www.yoursite.com/$1/ [R=301,L]
You can also do the same with the index file in any subdirectories that you might create, by simply placing a .htaccess into those sub directories and using variations of the above code. This is how you create nice tight URLs without the duplicate content issue that look like - http://www.semclix.com/design/business/
-
It is a problem which you need to fix. You need to canonicalize your pages.
Those are all various URLs which most likely lead to the same web page. I say "most likely" because these URLs can actually lead to different pages.
You need to tell crawlers and search engines how you organize your site. There are several ways to achieve canonicalization. The method I prefer is to add the following line of code to each page:
The URL provided should be the preferred URL for your page.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Index Page Redirect to Home Page? Best Practices...
Hi, I am wondering what the best practice is when a site has an index page and a home page? I have two pages, listed below, and want to know if I should 301 redirect my "index" page to my standard home page. The home page is where I would like all traffic to fall on for our website. Additionally, I used the rel=canonical tag years ago on the index page to indicate that the home page is the main content. Home Page - https://www.1099pro.com/ (PA 45) Home Page Canonical: rel="canonical" href="https://www.1099pro.com/"/> Index Page - https://www.1099pro.com/index.asp (PA - 33) Index Page Canonical: rel="canonical" href="https://www.1099pro.com/"/> It seems to me that there is some extra juice that could be passed to my home page (which is the page that ranks highly for our major keywords) by 301 redirecting the index page. Is there any reason why I should not do that? Really appreciate any help - especially with extra explanations - for the simple minded like me ;)! -Michael
Web Design | | Stew2220 -
When rel canonical tag used, which page does Google considers for ranking and indexing? A/B test scenario!
Hi Moz community, We have redesigned our website and launched for A/B testing using canonical tags from old website to new website pages, so there will be no duplicate content issues and new website will be shown to the half of the website visitors successfully to calculate the metrics. However I wonder how actually Google considers it? Which pages Google will crawl and index to consider for ranking? Please share your views on this for better optimisation. Thanks
Web Design | | vtmoz0 -
Sitemap and Privacy Policy marked for duplicate content?
On a recent crawl, Moz flagged a page of our site for duplicate content. However, the pages listed are our sitemap and our privacy policy -- both very different: http://elearning.smp.org/sitemap/ http://elearning.smp.org/privacy-policy/ What is our best option to address this issue? I had considered a noindex tag on the privacy policy page, but since we have enabled user insights in Google Analytics we need to have the privacy policy displayed and I worry that putting a noindex on the page would cause problems later.
Web Design | | calliek0 -
Fixing Render Blocking Javascript and CSS in the Above-the-fold content
We don't have a responsive design site yet, and our mobile site is built through Dudamobile. I know it's not the best, but I'm trying to do whatever we can until we get around to redesigning it. Is there anything I can do about the following Page Speed Insight errors or are they just a function of using Dudamobile? Eliminate render-blocking JavaScript and CSS in above-the-fold content Your page has 3 blocking script resources and 5 blocking CSS resources. This causes a delay in rendering your page.None of the above-the-fold content on your page could be rendered without waiting for the following resources to load. Try to defer or asynchronously load blocking resources, or inline the critical portions of those resources directly in the HTML.Remove render-blocking JavaScript: http://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js http://mobile.dudamobile.com/…ckage.min.js?version=2015-04-02T13:36:04 http://mobile.dudamobile.com/…pts/blogs.js?version=2015-04-02T13:36:04 Optimize CSS Delivery of the following: http://fonts.googleapis.com/…:400|Great+Vibes|Signika:400,300,600,700 http://mobile.dudamobile.com/…ont-pack.css?version=2015-04-02T13:36:04 http://mobile.dudamobile.com/…kage.min.css?version=2015-04-02T13:36:04 http://irp-cdn.multiscreensite.com/kempruge/files/kempruge_0.min.css?v=6 http://irp-cdn.multiscreensite.com/…mpruge/files/kempruge_home_0.min.css?v=6 Thanks for any tips, Ruben
Web Design | | KempRugeLawGroup0 -
Increasing content, adding rich snippets... and losing tremendous amounts of organic traffic. Help!
I know dramatic losses in organic traffic is a common occurrence, but having looked through the archives I'm not sure that there's a recent case that replicates my situation. I've been working to increase the content on my company's website and to advise it on online marketing practices. To that end, in the past four months, I've created about 20% more pages — most of which are very high quality blog posts; adopted some rich snippets (though not all that I would like to see at this point); improved and increased internal links within the site; removed some "suspicious" pages as id'd by Moz that had a lot of links on it (although the content was actually genuine navigation); and I've also begun to guest blog. All of the blog content I've written has been connected to my G+ account, including most of the guest blogging. And... our organic traffic is preciptiously declining. Across the board. I'm befuddled. I can see no warnings (redirects &c) that would explain this. We haven't changed the site structure much — I think the most invasive thing we did was optimize our title tags! So no URL changes, nothing. Obviously, we're all questioning all the work I've done. It just seems like we've sunk SO much energy into "doing the right thing" to no effect (this site was slammed before for its shady backlink buying — though not from any direct penalty, just as a result of the Penguin update). We noticed traffic taking a particular plunge at the beginning of June. Can anyone offer insights? Very much appreciated.
Web Design | | Novos_Jay0 -
Effects of HTML layout on Arabic websites and SEO
Hi all, I was hoping someone may be able to help. We are putting together an Arabic website and due to reading right to left as opposed to left to right, the site HTML layout is mirrored compared to normal with everything flipped over. What we are wondering is, will this effect SEO and what are the SEO implications of this? Do the search engine bots automaticlaly know to read the content etc differently and understand that everything is purposely mirrored / the HTML is in a different location compared to a site in the UK / US etc? Any help on this would be most appreciated. Cheers!
Web Design | | marcelo-2753980 -
How will engines deal with duplicate head elements e.g. title or canonicals?
Obviously duplicate content is never a good thing...on separate URL's. Question is, how will the engines deal with duplicate meta tags on the same page. Example Head Tag: <title>Example Title - #1</title> <title>Example Title - #2</title> My assumption is that Google (and others) will take the first instance of the tag, such that "Example Title - #1" and canonical = "http://www.example.com" would be considered for ranking purposes while the others are disregarded. My assumption is based on how SE's deal with duplicate links on a page. Is this a correct assumption? We're building a CMS-like service that will allow our SEO team to change head tag content on the fly. The easiest solution, from a dev perspective, is to simply place new/updated content above the preexisting elements. I'm trying to validate/invalidate the approach. Thanks in advance.
Web Design | | PCampolo0 -
Why is site not being indexed by Google, and not showing on a crawl test??
On a site we developed of which .com is forwarded to .net domain, we quit getting crawled by google on about the 20th of Feb. Now when we try to run a crawl test on either url, we get There was an error fetching this page. Error description For some reason the page returned did not describe itself as an html page. It could be possible that the url is serving an image, rss feed, pdf, or xml file of some sort. The crawl tool does not currently report metrics on this type of data. Our other sites are fine and this was up to this date. We took out noodp, noydir today as the only thing we could think of. Site is on WP cms.
Web Design | | RobertFisher0