How do I deal with duplicate content in a post-Panda world?
-
I want to fix the duplicate content issues on my eCommerce website.
I have read a very valuable blog post on SEOmoz about duplicate content in a post-Panda world and applied all of its strategies to my website.
Let me give one example to explain the problem.
http://www.vistastores.com/outdoor-umbrellas
Non-www version:
http://vistastores.com/outdoor-umbrellas redirects to the home page.
For HTTPS pages:
https://www.vistastores.com/outdoor-umbrellas
I have created a robots.txt file that blocks crawling of all HTTPS pages:
https://www.vistastores.com/robots.txt
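One common way to serve a blocking robots.txt only over HTTPS is an htaccess rewrite like this (a sketch; robots_ssl.txt is just an example filename):
# .htaccess: serve a separate robots file for HTTPS requests
RewriteEngine On
RewriteCond %{HTTPS} on
RewriteRule ^robots\.txt$ robots_ssl.txt [L]
# robots_ssl.txt then blocks everything:
User-agent: *
Disallow: /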
And I have set rel=canonical pointing to the HTTP page:
http://www.vistastores.com/outdoor-umbrellas
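The canonical tag in the <head> of the HTTPS pages looks like this:
<link rel="canonical" href="http://www.vistastores.com/outdoor-umbrellas" />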
Narrow-by-search pages:
My website has a narrow-by-search feature that generates pages with identical meta info, such as:
http://www.vistastores.com/outdoor-umbrellas?cat=7
http://www.vistastores.com/outdoor-umbrellas?manufacturer=Bond+MFG
http://www.vistastores.com/outdoor-umbrellas?finish_search=Aluminum
I have blocked all of the dynamic pages generated by narrow-by-search in robots.txt:
http://www.vistastores.com/robots.txt
And I have set rel=canonical to the base URL on each of these dynamic pages.
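Rules of this general form in robots.txt block those parameters (the patterns here are illustrative, not copied from my live file):
User-agent: *
Disallow: /*?cat=
Disallow: /*?manufacturer=
Disallow: /*?finish_search=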
Order-by pages:
http://www.vistastores.com/outdoor-umbrellas?dir=asc&order=name
I have blocked all of these pages with robots.txt and set rel=canonical to the base URL.
For pagination pages:
http://www.vistastores.com/outdoor-umbrellas?dir=asc&order=name&p=2
I have blocked all of these pages with robots.txt and set rel=next and rel=prev on all paginated pages.
I have also set rel=canonical to the base URL.
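On page 2, for example, the pagination tags look like this:
<link rel="prev" href="http://www.vistastores.com/outdoor-umbrellas?dir=asc&order=name" />
<link rel="next" href="http://www.vistastores.com/outdoor-umbrellas?dir=asc&order=name&p=3" />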
I have applied all of these SEO suggestions to my website, but Google is still crawling and indexing 21K+ pages. My website has only 9K product pages.
Google search results:
Over the last 7 days, my website's impressions and CTR have dropped by 75%.
I want to recover and perform as well as before.
I have explained my question at length because I want to recover my traffic as soon as possible.
-
Not a complete answer, but instead of rel-canonicaling your dynamic pages you may just want to block them in robots.txt with something like:
Disallow: /*?
This will prevent Google from crawling any version of the page that includes a ? in the URL. Canonical is a suggestion, whereas robots.txt is more of a command.
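In a complete robots.txt that rule sits under a user-agent line (Googlebot supports the * wildcard):
User-agent: *
Disallow: /*?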
As you can see from this query, Google has indexed 132 versions of that single page rather than following your rel=canonical suggestion.
To further enforce this, you may be able to use a fancy bit of PHP code to detect whether the URL is dynamic and apply a robots noindex, noarchive to only the dynamic renderings of the page.
This could be done with something like the following (a rough sketch; it assumes your pages are rendered through a PHP template and simply checks for a query string):
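<?php
// Rough sketch: if the request carries a query string, mark this
// rendering as noindex, noarchive so only the dynamic versions of
// the page are kept out of the index. Place in the page's <head>.
if (!empty($_SERVER['QUERY_STRING'])) {
    echo '<meta name="robots" content="noindex, noarchive">' . "\n";
}
?>
The clean, parameter-free URL is unaffected and stays indexable.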
I also believe there are URL parameter filtering tools for this right within Google Webmaster Tools. Worth a peek if your site is registered.
Additionally, where you are redirecting non-www subpages to the home page, you may instead want to redirect them to their www versions.
This can be done in .htaccess like this:
# Redirect non-www to www
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_HOST} ^yourdomain.com [NC]
RewriteRule ^(.*)$ http://www.yourdomain.com/$1 [L,R=301]
This will likely provide both a better user experience and a better solution in Google's eyes.
I'm sure some other folks will come in with some other great suggestions for you as well.