Duplicate Content for index.html
-
In the Crawl Diagnostics Summary, it says that I have two pages with duplicate content which are:
I read in a Dreamweaver tutorial that you should name your home page "index.html", and then www.mywebsite.com will automatically direct the user to index.html. Is this a bug in SEOmoz's crawler, or is it a real problem with my site?
Thank you,
Dan
-
The code should definitely go into the .htaccess file in the website's root directory. However, .htaccess can be weird: a few days ago I ran into a similar issue with a client's website, and I was able to remedy it with a variation of the code.
# index Redirect
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.(php|html|htm|asp)\ HTTP/
RewriteRule ^(([^/]+/)*)index\.(php|html|htm|asp)$ http://yoursite.com/$1 [R=301,L]
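For context, here is a minimal sketch of how that variation might sit in a complete root .htaccess file. It assumes an Apache server with mod_rewrite enabled, uses yoursite.com as a placeholder host, and treats the `*` quantifiers in the pattern as an assumption about what the snippet was meant to contain. The RewriteCond tests the raw request line, so the redirect should only fire when the visitor actually asked for an index file, not when Apache serves index.html internally for a directory URL.

```apache
# Hypothetical root .htaccess; yoursite.com is a placeholder host.
RewriteEngine On

# %{THE_REQUEST} holds the original request line sent by the browser,
# e.g. "GET /index.html HTTP/1.1". It is not altered by internal
# rewrites, so the condition only matches explicit requests for an
# index file at the root or inside any subdirectory.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.(php|html|htm|asp)\ HTTP/

# $1 captures the directory path before the index file (empty for the
# root), so /design/index.html is redirected to /design/.
RewriteRule ^(([^/]+/)*)index\.(php|html|htm|asp)$ http://yoursite.com/$1 [R=301,L]
```

Without the condition, a plain rule that redirects index.html to the directory URL can end up in a redirect loop, which may be one way a site appears broken after adding such code.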
If you give me the URL for the site I will take a look at it and let you know what would be feasible.
-
Hi Daniel, can you share with us the URL of your site? We can take a look at it and give you a more precise answer that way. Thanks!
-
I eventually figured out that your method was a 301 redirect, and I definitely broke my site trying to use the code you posted... haha. It's OK though. I just removed the code and it went back to normal. At first, I was editing the .htaccess file in the public_html folder, which wasn't working. Then I tried the root folder for the site (I created the .htaccess file since it did not exist). Neither of those worked. (I am using Bluehost, so I do not think that I have root access, and I am not sure if it is a Linux server or not.)
If there is an easy way to explain what I am doing wrong, please do so. Otherwise, I will use canonical.
Thanks for everything!
-
@Dan
Sorry about the delay in responding; I didn't realize right away that you were asking me a question. The code I provided in my previous answer causes a 301 permanent redirect to the original URL. That's what the [R=301,L] portion of the code states: R means redirect, and 301 is the HTTP status code returned. After reviewing the Matt Cutts video, I realize that I should have asked whether you were running on a Linux server that you had root access to. We actually use both redirects and canonical tags, since that was recommended by the on-page optimization reports. Heck, Google uses them, I would assume because it's easier for the user to be referred to a single page URL. Obviously, though, if you don't have server header access and are not familiar with .htaccess (you can accidentally break your site), then the canonical solution is appropriate.
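To make the flags concrete, here is a small sketch of how they behave. It assumes Apache with mod_rewrite; the file names old-page.html, new-page.html and sale.html are hypothetical and only there for illustration.

```apache
RewriteEngine On

# R=301 sends an external redirect with status 301 (Moved Permanently).
# L tells mod_rewrite to stop processing further rules for this request.
# old-page.html and new-page.html are made-up names.
RewriteRule ^old-page\.html$ http://www.yoursite.com/new-page.html [R=301,L]

# A bare R with no status code defaults to 302 (a temporary redirect),
# which is usually not what you want when consolidating duplicate URLs.
RewriteRule ^sale\.html$ http://www.yoursite.com/ [R,L]
```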
-
Josh,
Thanks for your reply. It seems like there are lots of different ways to solve this problem. I just watched this video on Matt Cutts' blog where he discusses his preference for 301 redirects over the rel canonical tag.
Where would you say your solution fits in?
Thanks,
Dan
-
I use the link rel canonical tag on all my homepages, pointing to http://www.yoursite.com
-
Oddly enough, I just recently answered this question. The SEOmoz crawler is correct: without a redirect, both versions of the page are accessible in your browser.
To resolve this issue, simply redirect index.html to the root URL by placing the following code in the .htaccess file in your root directory.
Options +FollowSymLinks
RewriteEngine on
# Index Rewrite
RewriteRule ^index\.(htm|html|php) http://www.yoursite.com/ [R=301,L]
RewriteRule ^(.*)/index\.(htm|html|php) http://www.yoursite.com/$1/ [R=301,L]
You can also do the same with the index file in any subdirectories that you might create, by simply placing an .htaccess file into those subdirectories and using variations of the above code. This is how you create nice, tight URLs without the duplicate content issue, like http://www.semclix.com/design/business/
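As a rough illustration of such a variation, here is what an .htaccess file placed inside a subdirectory might look like. The directory name /design/ and the host www.yoursite.com are placeholders, and the THE_REQUEST condition is an addition that is not in the code above; it is included as a guard so that the rule only reacts to index files the visitor explicitly requested.

```apache
# Hypothetical file: /design/.htaccess (directory name is a placeholder)
RewriteEngine On

# Match only when the browser's own request line named the index file,
# e.g. "GET /design/index.html HTTP/1.1".
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /design/index\.(htm|html|php)\ HTTP/

# Inside a subdirectory .htaccess the pattern is matched against the
# path relative to that directory, so no "design/" prefix appears here.
RewriteRule ^index\.(htm|html|php)$ http://www.yoursite.com/design/ [R=301,L]
```

One thing to keep in mind: as far as I know, rewrite rules in a subdirectory's .htaccess replace the ones inherited from the parent directory unless inheritance is switched on with RewriteOptions, so it is worth testing each directory after adding a file like this.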
-
It is a problem which you need to fix. You need to canonicalize your pages.
Those are all various URLs which most likely lead to the same web page. I say "most likely" because these URLs can actually lead to different pages.
You need to tell crawlers and search engines how you organize your site. There are several ways to achieve canonicalization. The method I prefer is to add the following line of code to each page:
<link rel="canonical" href="http://www.yoursite.com/" />
The URL provided should be the preferred URL for your page.