How to resolve Duplicate Page Content issue for root domain & index.html?
-
SEOMoz returns a Duplicate Page Content error for a website's index page, with both domain.com and domain.com/index.html listed separately. We had a rewrite in the .htaccess file, but for some reason it had no effect, and we have since removed it. What's the best way (in an HTML website) to ensure all index.html links are automatically redirected to the root domain so that these aren't seen as two separate pages?
-
Great code, Josh, but after I saved it in .htaccess, a "?" appeared in the link:
http://www.domain.com/?/example/file.html
Is this OK? Please advise.
Thank you,
-
You touched on a good point here: "We set up our site to utilize an index redirect for all of our sub directories as well, so with this method you simply name your sub directories to match the URL path that you desire. Each sub directory has its own index which you redirect with a variation of the above code. By doing this you can have nice clean URL paths like http://www.semclix.com/design/ecommerce/ - and mitigate the duplicate content issue. We hope that this helps."
Too often I see sites where they get the home page right but miss the re-write on the directories.
-
Here's the .htaccess rewrite that you can use for the index.html redirect:
Options +FollowSymlinks
RewriteEngine on
# Index Rewrite
RewriteRule ^index\.(htm|html|php)$ http://www.amarasoftware.com/ [R=301,L]
RewriteRule ^(.*)/index\.(htm|html|php)$ http://www.amarasoftware.com/$1/ [R=301,L]
We set up our site to utilize an index redirect for all of our sub directories as well, so with this method you simply name your sub directories to match the URL path that you desire. Each sub directory has its own index which you redirect with a variation of the above code. By doing this you can have nice clean URL paths like http://www.semclix.com/design/ecommerce/ - and mitigate the duplicate content issue. We hope that this helps.
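One caveat on index redirects placed in .htaccess, offered as a sketch rather than a definitive fix: Apache's DirectoryIndex handling internally maps a request for a directory onto its index file, and a bare RewriteRule on index.html can then redirect that internal request back to the directory, which may loop. A common workaround is to test %{THE_REQUEST}, the raw request line sent by the browser. The host below is the one used in the post above; the THE_REQUEST condition and the single combined pattern are assumptions on my part, not part of the original answer.

```apache
# Sketch only: replace www.amarasoftware.com with your own host.
Options +FollowSymLinks
RewriteEngine On

# Act only when the browser literally asked for .../index.(htm|html|php),
# so the internal DirectoryIndex lookup does not trigger a redirect loop.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/([^?\s]*/)?index\.(html?|php)[?\s]

# $1 is the directory part (empty for the site root), so one rule covers
# both the home page and every sub directory.
RewriteRule ^(.*/)?index\.(html?|php)$ http://www.amarasoftware.com/$1 [R=301,L]
```

With something like this in place, a request for /design/ecommerce/index.html should answer with a 301 to /design/ecommerce/, while a request for /design/ecommerce/ itself should be served without any redirect.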
-
I'd check it with some other software too, e.g. a Raven Tools free trial or something similar, which will tell you if there are canonicalization problems. Of course I'm not advocating Raven Tools over the SEOmoz tools (I'm a member here and not there for good reasons); I just think it's best to try a few different tests before deciding whether it's a problem. There might just be an issue with the SEOmoz campaign tool for the moment, which I'm sure they'll fix as soon as they realise.
Hey, aren't you the tutor I had in my SEC usability course?
-
Unfortunately I can't speak for how SEOmoz handles rewrites like this if it's already crawled the page.
The rewrite rule you're using looks like it's only rewriting the www portion of the URL, not index.html. So alone it wouldn't do anything to solve dupe content issues. (someone please correct me if I'm misreading the rewrite rule)
Here's a link to what I used to write a redirect for index.html on another site.
http://www.webmasterworld.com/forum92/6375.htm
I think it is a fairly safe assumption that SEOmoz is smart enough to realize you've got a redirect in there (provided that it's working). I'd still recommend taking a look to see if Google has cached or indexed an index.html version, though.
Edit: my personal, highly technical, acid-test for an index.html redirect is just going there and manually entering the url with index.html on the end, rather than waiting for a recrawl to see if you're heading in the right direction.
-
RewriteEngine on
RewriteCond %{HTTP_HOST} ^([a-z.]+)?amarasoftware\.com$ [NC]
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule .? http://www.%1amarasoftware.com%{REQUEST_URI} [R=301,L]
This is what I use. In SEOmoz this leads to both www.amarasoftware.com and index.html, so two different URLs, each with different incoming links and a different authority, which has an impact on my ranking if correct. In SEOmoz this returns duplicate title and meta tag errors. If SEOmoz finds two pages instead of one, I may assume that Google agrees with this.
-
As you did, I'd normally handle this with a 301 from index.html to the root domain. When you say that it's "not had an impact" do you mean that the SEOmoz dashboard continues to show an error after it re-crawls, or that the search engines are not picking up the redirect?
The SEOmoz dashboard does a great job, but I'd also check how the search engines are actually indexing yourdomain.com/index.html vs. yourdomain.com. If the search engines are indexing it as you want them to, then I'd be inclined to ignore the dashboard error.
I apologize if this is a stupid question, but I assume you manually checked that the redirect worked?
-
You wish to canonicalize the pages. That is the SEO word which describes exactly what you are trying to achieve.
Above are 5 URLs which can possibly lead to the exact same page. If you add the following HTML in the code then the pages will be canonicalized.