Removing Duplicate Page Content
-
Since joining SEOmoz four weeks ago, I've been busy tweaking our site, a Magento eCommerce store, and have successfully removed a significant portion of the errors.
Now I need to remove or hide duplicate pages from the search engines, and I'm wondering what the best way to approach this is.
Can I solve this in one central location, or do I need to do something in the Google & Bing webmaster tools?
Here is a list of the duplicate content:
http://www.unitedbmwonline.com/?dir=asc&mode=grid&order=name
http://www.unitedbmwonline.com/?dir=asc&mode=list&order=name
http://www.unitedbmwonline.com/?dir=asc&order=name
http://www.unitedbmwonline.com/?dir=desc&mode=grid&order=name
http://www.unitedbmwonline.com/?dir=desc&mode=list&order=name
http://www.unitedbmwonline.com/?dir=desc&order=name
http://www.unitedbmwonline.com/?mode=grid
http://www.unitedbmwonline.com/?mode=list
Thanks in advance,
Steve
-
Thank you, Cyrus. I will certainly read the blog post and consider the noindex, nofollow on content with a canonical tag that differs from the currently served page's URI.
I am still a little confused as to why the SEOmoz crawl is highlighting duplicate pages when the canonical tag is present and pointing to the primary content.
Take the following page, for example:
http://www.planksclothing.com/planks-classic-t-shirt-black-multi.html
Firstly, the page has a canonical tag. There is no search on the site, and the product is viewed at root level without the directory structure that is the common cause of duplicate content in a Magento instance...
At the time of writing, SEOmoz is updating my duplicate report, so I can't find out what the duplicate content is. Maybe it is updating to say it is not
Thanks
Amendment: After reading the supplied blog post (http://www.seomoz.org/blog/duplicate-content-in-a-post-panda-world) I have learnt that the above page is just not different enough and probably falls into the area of "Thin Content".
-
There are many, many different types of duplicate content, and how you handle it depends on the specific type of duplicate content and your needs.
If you haven't already, I highly suggest you read Dr. Pete's excellent post on dupe content here: http://www.seomoz.org/blog/duplicate-content-in-a-post-panda-world
In your specific case it looks like you have multiple parameters serving the same basic content as your homepage. Is this correct?
In this case, you should set a canonical on every page pointing to the homepage. This also has the benefit of solving the errors in the SEOmoz PRO app.
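As a rough illustration of what "a canonical on every page pointing to the homepage" amounts to, here is a minimal Python sketch. It assumes that `dir`, `mode` and `order` only change sorting and layout, and the helper names (`canonical_url`, `canonical_tag`) are made up for this example; in a real Magento store the tag would come from the theme or configuration rather than a script like this.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these parameters only affect sorting/layout, never the content.
DISPLAY_ONLY_PARAMS = {"dir", "mode", "order"}


def canonical_url(url: str) -> str:
    """Return the URL with the display-only query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in DISPLAY_ONLY_PARAMS
    ]
    path = parts.path or "/"
    # Drop the fragment; keep any parameters that do change the content.
    return urlunsplit((parts.scheme, parts.netloc, path, urlencode(kept), ""))


def canonical_tag(url: str) -> str:
    """Render the <link> element that would go in the page's <head>."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}" />'


print(canonical_tag("http://www.unitedbmwonline.com/?dir=asc&mode=grid&order=name"))
# prints: <link rel="canonical" href="http://www.unitedbmwonline.com/" />
```

Every parameterized variant of the homepage ends up with the same tag, which is the signal the search engines need to fold the variants together.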
It also sounds like you've addressed the issue in Google's Webmaster Tools. Unfortunately, Google doesn't let SEOmoz sync with Webmaster Tools, so anything you set there won't show up in the Web App.
Finally, don't forget about Bing Webmaster. They have similar parameter settings you can submit.
By the way, some SEOs would suggest putting meta robots "NOINDEX, FOLLOW" tags on those duplicate pages. While this may potentially send conflicting signals when coupled with the canonical tag, it is a potentially valid approach.
Hope this helps! Best of luck with your SEO.
-
This is exactly my current situation...
As a result of the SEOmoz duplicate content report I set about resolving these issues...
In the first instance I configured URL parameters via Google Webmaster Tools. It instantly occurred to me that whilst this fixes the potential duplicate content in Google, the configuration does not affect other search engines, and the work is unlikely to be reflected in future SEOmoz crawls of the site.
I'm interested in creating an overarching method of removing the potential duplication caused by the URL parameters required to paginate, sort and filter content. The majority of these URL parameters are standardized across web applications. But is it actually required?
In my case each Magento store uses the canonical tag correctly and has an updated robots.txt to restrict the crawling of areas of the site that should be excluded... In a sense this is the overarching method of removing potential duplicate content. So why is SEOmoz reporting duplicate content?
I suppose the big question is... Is SEOmoz crawling the site correctly? Do these results reflect robots.txt and canonical tags?
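One way to sanity-check the robots.txt half of that question is to test the reported URLs against your rules yourself. The sketch below uses Python's standard `urllib.robotparser`; the rules, the `www.example.com` URLs and the `is_crawlable` helper are all hypothetical, and `rogerbot` is used as the crawler's user-agent token on the assumption that this is what the Moz crawler identifies as.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; substitute your store's real robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /checkout/
Disallow: /catalogsearch/
"""


def is_crawlable(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


for url in (
    "http://www.example.com/checkout/cart/",
    "http://www.example.com/?dir=asc&mode=grid",
):
    print(url, "->", is_crawlable(ROBOTS_TXT, "rogerbot", url))
```

Under these assumed rules the checkout URL is blocked but the parameterized homepage URL is still crawlable, so a crawler would still fetch it and compare it with the homepage. Whether that explains a particular report depends on the real robots.txt and on how the crawler treats canonical tags.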
-
Thank you for your thoughts.
As mentioned in my response above, canonical tags have already been configured for the site; it's just this home page that remains the issue.
-
Thanks for your response.
I looked in URL Parameters and see dir & mode are already defined.
Then I searched the http://www.unitedbmwonline.com page source for canonical links and none are defined, though I do have canonical tags set up for the rest of the site.
Any other thoughts on how to remove these duplicates?
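The page-source check described above can also be scripted, which helps when there are many pages to look at. This is a minimal sketch using Python's standard `html.parser`; the `CanonicalFinder` class, the `find_canonicals` helper and the sample markup are invented for illustration, and fetching the page is left out.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values, in any letter case.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values and attributes.get("href"):
            self.canonicals.append(attributes["href"])


def find_canonicals(html: str) -> list[str]:
    """Return all canonical URLs declared in the given HTML source."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


# Hypothetical page source, not copied from the real site.
sample = '<html><head><link rel="canonical" href="http://www.unitedbmwonline.com/" /></head></html>'
print(find_canonicals(sample))
```

An empty list for the homepage would confirm that the tag is missing there; more than one entry would point to conflicting tags.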
-
You can also tell Google to ignore certain query string variables through Webmaster Tools.
For instance, indicate that "dir" and "mode" have no impact on content.
Other search engines have similar controls.
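To see what "no impact on content" does to the URLs listed in the question, here is a small Python sketch that groups them after dropping the ignored parameters. The `strip_params` and `group_duplicates` helpers are made up for this example, and it only mimics the idea of a parameter setting; it is not how any search engine actually implements it.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# The duplicate URLs listed in the original question.
URLS = [
    "http://www.unitedbmwonline.com/?dir=asc&mode=grid&order=name",
    "http://www.unitedbmwonline.com/?dir=asc&mode=list&order=name",
    "http://www.unitedbmwonline.com/?dir=asc&order=name",
    "http://www.unitedbmwonline.com/?dir=desc&mode=grid&order=name",
    "http://www.unitedbmwonline.com/?dir=desc&mode=list&order=name",
    "http://www.unitedbmwonline.com/?dir=desc&order=name",
    "http://www.unitedbmwonline.com/?mode=grid",
    "http://www.unitedbmwonline.com/?mode=list",
]


def strip_params(url, ignored):
    """Return url without the query parameters named in ignored."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in ignored]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def group_duplicates(urls, ignored):
    """Group URLs that become identical once ignored parameters are dropped."""
    groups = defaultdict(list)
    for url in urls:
        groups[strip_params(url, ignored)].append(url)
    return dict(groups)


for ignored in ({"dir", "mode"}, {"dir", "mode", "order"}):
    groups = group_duplicates(URLS, ignored)
    print(sorted(ignored), "->", len(groups), "distinct page(s)")
```

With only `dir` and `mode` ignored, the eight URLs collapse to two (the homepage and `/?order=name`); adding `order` collapses them to one. So `order` is worth declaring alongside the other two.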
-
This is why the canonical tag was invented: to solve duplicate content issues when URL parameters are involved. Set a canonical tag on all these pages to point towards the version of the page you want to appear in search results. As long as the pages are identical, or close to it, the search engines will (most likely) respect the canonical tag and pass along the duplicate versions' link juice to the page you're pointing to.
Here's some info: http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html. If you Google "canonical tag", you'll find lots more!