Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Duplicate Content for index.html
-
In the Crawl Diagnostics Summary, it says that I have two pages with duplicate content which are:
I read in a Dream Weaver tutorial that you should name your home page "index.html" and then you can let www.mywebsite.com automatically direct the user to index.html. Is this a bug in SEOMoz's crawler or is it a real problem with my site?
Thank you,
Dan
-
The code should definitely go into the websites root directory's .htaccess, however .htaccess can be weird, a few days ago I ran into a similar issue with a client's website, and I was able to remedy the issue with a variation of the code.
index Redirect RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)index.(php|html|htm|asp)\ HTTP/ RewriteRule ^(([^/]+/))index.(php|html|htm|asp)$ http://yoursite.com/$1 [R=301,L]
If you give me the URL for the site I will take a look at it and let you know what would be feasible.
-
Hi Daniel, can you share with us the URL of your site? We can take a look at it and give you a more precise answer that way. Thanks!
-
I eventually figured out that your method was a 301 redirect and I definitely broke my site trying to use the code you posted. .. haha. Its ok though. I just removed the code and it went back to normal. At first, I was editing the .htaccess file in the public_html folder which wasnt working. Then I tried the root folder for the site (I created the .htaccess file since it did not exist.) Neither of those worked. (I am using Bluehost so I do not think that I have root access and I am not sure if it is a Linux server or not.)
If there is an easy way to explain what I am doing wrong, please do so. Otherwise, I will use canonical.
Thanks for everything!
-
@Dan
Thanks for your reply. It seems like there are lots of different ways to solve this problem. I just watched this video on Matt Cutt's blog where he discusses his preference for 301 redirects over rel canonical tag.
Where would you say your solution fits in?
sorry about the delay of this response, i didn't realize the that you were asking me a question right away. When placing the code I provided in my previous answer this will cause a 301 perminant redirect to the original URL. That's actually what the
[R=301,L]
portion of the code is stating (R) redirect (301) status is referring to. After reviewing the Matt Cutts video, I realize that I should have asked you if you were operating on a Linux server that you had root access to. We actually utilize both redirects and canonical tags since it was recommended by the on-page optimization reports. Heck Google uses them, I would assume because it's easier for the user to be referred to a single page URL. Obviously though if you don't have server header access, and are not familiar with .htaccess (you can accidentally break your site) then the canonical solution is appropriate
-
Josh,
Thanks for your reply. It seems like there are lots of different ways to solve this problem. I just watched this video on Matt Cutt's blog where he discusses his preference for 301 redirects over rel canonical tag.
Where would you say your solution fits in?
Thanks,
Dan -
use the link rel tag for all my homepages for the http://www.yoursite.com
-
Odd enough I just recently answered this question. The SEOmoz crawler is correct, because without a redirect you will be able to access both versions of the page in your browser.
To resolve this issue simply rewrite the index.html to the root url by placing the following code into your .htaccess file into your root directory.
Options +FollowSymlinks RewriteEngine on
Index Rewrite RewriteRule ^index.(htm|html|php) http://www.yoursite.com/ [R=301,L] RewriteRule ^(.*)/index.(htm|html|php) http://www.yoursite.com/$1/ [R=301,L]
You can also do the same with the index file in any subdirectories that you might create, by simply placing a .htaccess into those sub directories and using variations of the above code. This is how you create nice tight URLs without the duplicate content issue that look like - http://www.semclix.com/design/business/
-
It is a problem which you need to fix. You need to canonicalize your pages.
Those are all various URLs which most likely lead to the same web page. I say "most likely" because these URLs can actually lead to different pages.
You need to tell crawlers and search engines how you organize your site. There are several ways to achieve canonicalization. The method I prefer is to add the following line of code to each page:
The URL provided should be the preferred URL for your page.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Making html table as 'seofriendly' as possible
Hi, On my website I have a table with a list of products, on every row I have a different product and a different property on each column. The table is made with css so the html code is clean. The problem is (I guess) that google doesn't 'understand' what its inside on the table. So if I do a google search that page appears on the page 87, there is any way to improve my SEO without changing the table? Or to improve my SEO I must change the format of my content? In resume, I want to improve the SEO page of a page that contains information organized inside a table. I don't know if there is a specific answer to this question. Any help is welcome. Regards
Web Design | | jcobo0 -
Any risks involved in removing a sub-domain from search index or completely taking down? Ranking impact?
Hi all, One of our sub-domains has thousands of indexed pages but traffic is very less and irrelevant. There are links between this sub-domain to other sub domains of ours. We are planning to take this subdomain completely. What happens if so? Google responds for this with a ranking change? Thanks
Web Design | | vtmoz0 -
Hiding content until user scrolls - Will Google penalize me?
I've used: "opacity:0;" to hide sections of my content, which are triggered to show (using Javascript) once the user scrolls over these sections. I remember reading a while back that Google essentially ignores content which is hidden from your page (it mentioned they don't index it, so it's close to impossible to rank for it). Is this still the case? Thanks, Sam
Web Design | | Sam.at.Moz0 -
Https pages indexed but all web pages are http - please can you offer some help?
Dear Moz Community, Please could you see what you think and offer some definite steps or advice.. I contacted the host provider and his initial thought was that WordPress was causing the https problem ?: eg when an https version of a page is called, things like videos and media don't always show up. A SSL certificate that is attached to a website, can allow pages to load over https. The host said that there is no active configured SSL it's just waiting as part of the hosting package just in case, but I found that the SSL certificate is still showing up during a crawl.It's important to eliminate the https problem before external backlinks link to any of the unwanted https pages that are currently indexed. Luckily I haven't started any intense backlinking work yet, and any links I have posted in search land have all been http version.I checked a few more url's to see if it’s necessary to create a permanent redirect from https to http. For example, I tried requesting domain.co.uk using the https:// and the https:// page loaded instead of redirecting automatically to http prefix version. I know that if I am automatically redirected to the http:// version of the page, then that is the way it should be. Search engines and visitors will stay on the http version of the site and not get lost anywhere in https. This also helps to eliminate duplicate content and to preserve link juice. What are your thoughts regarding that?As I understand it, most server configurations should redirect by default when https isn’t configured, and from my experience I’ve seen cases where pages requested via https return the default server page, a 404 error, or duplicate content. So I'm confused as to where to take this.One suggestion would be to disable all https since there is no need to have any traces to SSL when the site is even crawled ?. I don't want to enable https in the htaccess only to then create a https to http rewrite rule; https shouldn't even be a crawlable function of the site at all.RewriteEngine OnRewriteCond %{HTTPS} offor to disable the SSL completely for now until it becomes a necessity for the website.I would really welcome your thoughts as I'm really stuck as to what to do for the best, short term and long term.Kind Regards
Web Design | | SEOguy10 -
Lots of Listing Pages with Thin Content on Real Estate Web Site-Best to Set them to No-Index?
Greetings Moz Community: As a commercial real estate broker in Manhattan I run a web site with over 600 pages. Basically the pages are organized in the following categories: 1. Neighborhoods (Example:http://www.nyc-officespace-leader.com/neighborhoods/midtown-manhattan) 25 PAGES Low bounce rate 2. Types of Space (Example:http://www.nyc-officespace-leader.com/commercial-space/loft-space)
Web Design | | Kingalan1
15 PAGES Low bounce rate. 3. Blog (Example:http://www.nyc-officespace-leader.com/blog/how-long-does-leasing-process-take
30 PAGES Medium/high bounce rate 4. Services (Example:http://www.nyc-officespace-leader.com/brokerage-services/relocate-to-new-office-space) High bounce rate
3 PAGES 5. About Us (Example:http://www.nyc-officespace-leader.com/about-us/what-we-do
4 PAGES High bounce rate 6. Listings (Example:http://www.nyc-officespace-leader.com/listings/305-fifth-avenue-office-suite-1340sf)
300 PAGES High bounce rate (65%), thin content 7. Buildings (Example:http://www.nyc-officespace-leader.com/928-broadway
300 PAGES Very high bounce rate (exceeding 75%) Most of the listing pages do not have more than 100 words. My SEO firm is advising me to set them "No-Index, Follow". They believe the thin content could be hurting me. Is this an acceptable strategy? I am concerned that when Google detects 300 pages set to "No-Follow" they could interpret this as the site seeking to hide something and penalize us. Also, the building pages have a low click thru rate. Would it make sense to set them to "No-Follow" as well? Basically, would it increase authority in Google's eyes if we set pages that have thin content and/or low click thru rates to "No-Follow"? Any harm in doing this for about half the pages on the site? I might add that while I don't suffer from any manual penalty volume has gone down substantially in the last month. We upgraded the site in early June and somehow 175 pages were submitted to Google that should not have been indexed. A removal request has been made for those pages. Prior to that we were hit by Panda in April 2012 with search volume dropping from about 7,000 per month to 3,000 per month. Volume had increased back to 4,500 by April this year only to start tanking again. It was down to 3,600 in June. About 30 toxic links were removed in late April and a disavow file was submitted with Google in late April for removal of links from 80 toxic domains. Thanks in advance for your responses!! Alan0 -
How to put 'Link to this article' HTML code at bottom of article & is it helpful?
Hello, I was thinking about putting a box down at the bottom of my client's main articles that let's the reader easily copy the html code it takes to link to the article they're reading. Maybe I'd put it after the author bio. Do any of you do this? If so, what format do you use? It has to look nice of course. This is a non-techie industry. Thanks.
Web Design | | BobGW0 -
Facebook code being duplicated? (Any developers mind taking a peek?)
I'm using a few different plug ins to give me various Facebook functions on my site. I'm curious there are any developers out there would could take a look at my source code and see if it looks there is some code being duplicated that's slowing down my site. Thanks so much!
Web Design | | NoahsDad0 -
Indexing Dynamic Pages
Hi, I am having an issues among others, regarding indexing dynamic pages. Our website, www.me-by-melia, was just put live and I am concerned the bottom naviagtion pages (http://www.me-by-melia.com/#store, http://www.me-by-melia.com/#facebook, etc) will not be indexed and create duplicate pages. Also, when you open these pages in a new tab, it takes you to homepage. The website was created in HTML5. Please advise.
Web Design | | Melia0