How to compete with duplicate content in a post-Panda world?
-
I want to fix duplicate content issues on my eCommerce website.
I have read a very valuable blog post on SEOmoz about duplicate content in a post-Panda world and applied all of its strategies to my website.
Here is one example page to illustrate:
http://www.vistastores.com/outdoor-umbrellas
Non-WWW version:
http://vistastores.com/outdoor-umbrellas redirects to the home page.
For HTTPS pages:
https://www.vistastores.com/outdoor-umbrellas
I have created a robots.txt file for all HTTPS pages, as follows.
https://www.vistastores.com/robots.txt
And I have set rel=canonical to the HTTP page, as follows.
http://www.vistastores.com/outdoor-umbrellas
Narrow by search:
My website has a narrow-by-search feature, which generates pages with the same meta info, as follows.
http://www.vistastores.com/outdoor-umbrellas?cat=7
http://www.vistastores.com/outdoor-umbrellas?manufacturer=Bond+MFG
http://www.vistastores.com/outdoor-umbrellas?finish_search=Aluminum
I have used robots.txt to block all dynamic pages generated by narrow-by-search.
http://www.vistastores.com/robots.txt
And I have set rel=canonical to the base URL on each dynamic page.
Order by pages:
http://www.vistastores.com/outdoor-umbrellas?dir=asc&order=name
I have restricted all of these pages with robots.txt and set rel=canonical to the base URL.
For pagination pages:
http://www.vistastores.com/outdoor-umbrellas?dir=asc&order=name&p=2
I have restricted all of these pages with robots.txt and set rel=next and rel=prev on all paginated pages.
I have also set rel=canonical to the base URL.
I have applied all of these SEO suggestions to my website, but Google is still crawling and indexing 21K+ pages. My website has only 9K product pages.
Google search result:
Over the last 7 days, my website has seen a 75% drop in impressions and CTR.
I want to recover and get back to my previous performance.
I have explained my question at length because I want to recover my traffic as soon as possible.
-
Not a complete answer, but instead of rel-canonicaling your dynamic pages you may just want to block them in robots.txt with something like:
Disallow: /*?
This will prevent Google from crawling any version of the page that includes a ? in the URL. Canonical is a suggestion, whereas robots.txt is more of a command.
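To see which URLs a rule like that would catch, here is a minimal Python sketch of wildcard matching as Google documents it for robots.txt (`*` matches any run of characters, a trailing `$` anchors the end). The function names are made up for illustration, and it only checks a single Disallow pattern, not a full robots.txt file with user-agent groups or Allow rules.

```python
import re
from urllib.parse import urlsplit


def robots_pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate one robots.txt path pattern into a compiled regex.

    '*' matches any run of characters, a trailing '$' anchors the end,
    and everything else is treated as a literal prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal chunk, then join the chunks with '.*' for each '*'.
    body = ".*".join(re.escape(chunk) for chunk in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(url: str, pattern: str = "/*?") -> bool:
    """Return True if the URL's path plus query string matches the pattern."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return robots_pattern_to_regex(pattern).match(target) is not None
```

With the default pattern, the filtered, sorted and paginated URLs from the question all match, while the base category URL does not.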
As you can see from this query:
Google has indexed 132 versions of that single page rather than following your rel=canonical suggestion.
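To further enforce this, you may be able to use a bit of PHP code to detect whether the URL is dynamic and output a robots noindex, noarchive meta tag on only the dynamic renderings of the page. Keep in mind that Google has to be able to crawl a page to see this tag, so it will not be read on URLs that robots.txt blocks.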
This could be done like this:
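A minimal sketch of the idea, written in Python rather than PHP. It assumes that "dynamic" simply means the URL carries a query string. Both function names are hypothetical, and the returned tag would still need to be printed into the page's head by your template.

```python
from urllib.parse import urlsplit


def is_dynamic_url(url: str) -> bool:
    """Treat any URL that carries a query string as a dynamic rendering."""
    return bool(urlsplit(url).query)


def robots_meta_tag(url: str) -> str:
    """Return the robots meta tag for dynamic URLs, or '' for the base page."""
    if is_dynamic_url(url):
        return '<meta name="robots" content="noindex, noarchive">'
    return ""
```

The base URL gets an empty string, so only the filtered, sorted and paginated renderings would carry the tag.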
I also believe there are some filtering tools for this right within Webmaster Tools. Worth a peek if your site is registered.
Additionally, where you are redirecting non-www subpages to the home page, you may instead want to redirect them to their www versions.
This can be done in .htaccess like this:
# Redirect non-www to www
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_HOST} ^yourdomain.com [NC]
RewriteRule ^(.*)$ http://www.yourdomain.com/$1 [L,R=301]
This will likely provide both a better user experience and a better solution in Google's eyes.
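To make the intended mapping concrete, here is a rough Python sketch of what that redirect should do to a URL. The function name and the `bare_host` parameter are made up for illustration. Unlike the rule above, it keeps the request's scheme instead of always sending visitors to http, and it matches the bare host exactly.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def www_redirect_target(url: str, bare_host: str = "yourdomain.com") -> Optional[str]:
    """Return the www URL to 301 to, or None if no redirect is needed."""
    parts = urlsplit(url)
    # Host comparison is case-insensitive, like the [NC] flag in the rule.
    if (parts.hostname or "").lower() != bare_host:
        return None
    # Swap only the host; path and query string are carried over unchanged.
    return urlunsplit((parts.scheme, "www." + bare_host, parts.path, parts.query, ""))
```

The point is that the subpage path survives the redirect, so a non-www category URL lands on its www twin rather than on the home page.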
I'm sure some other folks will come in with some other great suggestions for you as well.