Duplicate content issue index.html vs non index.html
-
Hi,
I have an issue. In my client's link profile, I found that the "index.html" URLs are mostly more authoritative than the non-"index.html" URLs, and that the www. version is more authoritative than the non-www version. The problem is that I also find the opposite situation, where a non-"index.html" URL is more authoritative than its "index.html" counterpart, or the non-www version is more authoritative than the www version.
My logic tells me to still redirect the non-"index.html" URLs to "index.html". Am I right?
And in the cases where I find the opposite happening, does it matter if I still redirect the non-"index.html" URL to "index.html"?
The same question for the www vs non-www versions?
Thank you
-
Yes, I like using rewrites in an .htaccess file, which is covered in the links above.
-
I fixed the 2 URLs.
In this case domain.com/index.html is the page served at domain.com/.
Do you mean to use mod_rewrite and create a 301 redirect from domain.com/index.html to domain.com/ ?
Thank you for your time.
-
Hi Taysir, first of all you should get an overview of these topics: What is duplicate content? Solving the canonical problems with www. Duplicate Content Issues in www & non-www. I hope this resolves your query.
-
It's very likely that the "index.html" version is more authoritative because you're using it in internal links. The problem is that that often creates a duplication issue - you refer to the root (non-index.html) version in inbound links, social, etc. (and people tend to link and bookmark the root version), but then link internally to "index.html", so Google will end up indexing both.
If the authority is coming from internal links, and you:
(1) Switch the internal links to the root ("/")
(2) 301-redirect "index.html" to the root ("/")
...you shouldn't lose any authority, as you'll have re-routed it by doing step (1). You'll also consolidate your signals and be better off all-around, IMO.
Kane's right, though - it's a bit tough to tell without knowing the specifics.
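
The two steps above can be sketched in code. This is only an illustration in Python, not something from the thread: the function names, the `site_host` default, and the sample HTML are all hypothetical, and a real site would more likely fix its links in the CMS or templates and do the redirect in the web server.

```python
import re
from urllib.parse import urlsplit, urlunsplit

INDEX_FILE = "index.html"  # assumption: the default document name on the site
HREF_RE = re.compile(r'href="([^"]*)"')


def to_root_url(url):
    """Return url with a trailing index.html dropped from its path."""
    parts = urlsplit(url)
    path = parts.path
    if path == INDEX_FILE or path.endswith("/" + INDEX_FILE):
        # A bare relative "index.html" becomes "./" (the current folder).
        path = path[: -len(INDEX_FILE)] or "./"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


def rewrite_internal_links(html, site_host="domain.com"):
    """Step (1): point internal hrefs at the folder URL instead of index.html."""
    def fix(match):
        url = match.group(1)
        host = urlsplit(url).netloc
        if host and host != site_host:
            return match.group(0)  # external link: leave untouched
        return 'href="%s"' % to_root_url(url)
    return HREF_RE.sub(fix, html)


# Hypothetical page fragment with one internal and one external link.
page = '<a href="/index.html">Home</a> <a href="http://other.com/index.html">Other</a>'
print(rewrite_internal_links(page))
```

`to_root_url` is also the mapping that step (2), the 301 redirect, has to apply on the server side, so the internal links and the redirect targets end up agreeing.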
-
Redirecting the authoritative link to the less authoritative URL is not ideal.
However, in my opinion being consistent with URLs throughout the site takes precedence.
Implementing 301 redirects will indicate that there has been a permanent relocation of that page's content, and you will get most of the link value from the authoritative link. That said, if you feel comfortable emailing the person who created that authoritative link, it's worth a little effort to ask them to change it, but if it's a hassle to do so, don't push it.
-
How to redirect domain.com/index.html to domain.com/index.html?
Those two URLs are the same, so there is nothing to change. If you wanted to redirect domain.com/index.html to domain.com/ then you would do so with 301 redirects. Here's a guide on getting started:
http://www.seomoz.org/learn-seo/redirection
http://www.seomoz.org/blog/url-rewrites-and-301-redirects-how-does-it-all-work
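
To make the behaviour concrete, here is a rough sketch of what such a redirect does, written as a tiny Python WSGI app rather than the .htaccess rewrite the guides above describe. The names `redirect_target` and `app` are made up for this example, and it assumes the default document is called index.html.

```python
def redirect_target(path):
    """Return the path to 301-redirect to, or None if no redirect is needed."""
    if path.endswith("/index.html"):
        return path[: -len("index.html")]  # keep the trailing slash
    return None


def app(environ, start_response):
    """Minimal WSGI app: 301 any .../index.html request to its folder URL."""
    target = redirect_target(environ.get("PATH_INFO", "/"))
    if target is not None:
        query = environ.get("QUERY_STRING", "")
        if query:
            target += "?" + query  # carry the query string over
        start_response("301 Moved Permanently", [("Location", target)])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"page content\n"]
```

A request for /index.html gets a 301 with `Location: /`, while a request for / is served normally, so only one of the two URLs ever returns content.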
-
I personally would rewrite & redirect everything using the 2nd option above.
Can you explain to me how to do that, please?
How to redirect domain.com/index.html to domain.com?
Thanks
-
Thank you for your detailed answer, but one more thing: does it matter if I redirect a more authoritative URL to a weaker one for the benefit of staying consistent, and vice versa?
Let's say I redirect a non-index.html URL to an index.html URL, or vice versa, for the sake of consistency?
-
You should stick with one format across the site:
domain.com/index.html and domain.com/subfolder/index.html
**OR**
domain.com/ and domain.com/subfolder/
I typically choose the second option because it is agnostic of CMS or file type, and it looks better in my opinion. I would not mix the two across the site because it causes a confusing user experience.
So, to answer your questions directly:
**My logic would tell me to still redirect the non-"index.html" to "index.html". Am I right?**
No, not necessarily. Since you've found examples where the .html version is more authoritative and examples where it isn't, it's impossible for us to say which is the better choice. I personally would rewrite & redirect everything using the 2nd option above.
**The same question for www vs non www versions?**
I believe that WWW vs non-WWW is less important. You could decide based upon which format has more links or which one has been historically used. Consistency (using the same across the entire site), proper 301 redirects, and proper rel canonical tags are your priorities here.
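
One way to act on "which format has more links" is sketched below. It is a hypothetical Python example: the list of linked URLs is invented, and in practice it would come from whatever link report you have. It picks the host variant that appears most often, then rewrites every URL on that domain to use it.

```python
from collections import Counter
from urllib.parse import urlsplit, urlunsplit


def bare(host):
    """Strip a leading 'www.' so both variants compare equal."""
    return host[4:] if host.startswith("www.") else host


def pick_canonical_host(linked_urls):
    """Return the host variant (www or non-www) that is linked most often."""
    hosts = Counter(urlsplit(u).netloc.lower() for u in linked_urls)
    return hosts.most_common(1)[0][0]


def canonicalize(url, canonical_host):
    """Rewrite url to the canonical host if it is the same site."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if bare(host) == bare(canonical_host):
        host = canonical_host
    return urlunsplit((parts.scheme, host, parts.path or "/", parts.query, parts.fragment))


# Hypothetical inbound links: two point at www, one at the bare domain.
links = ["http://www.domain.com/", "http://www.domain.com/a", "http://domain.com/"]
host = pick_canonical_host(links)
print(host, canonicalize("http://domain.com/about", host))
```

Whichever host wins, the same value should drive the 301 redirects, the internal links, and the rel canonical tags, which is the consistency point made above.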
-