Consolidating a Large Site with Duplicate Content
I will be restructuring a large website for an OEM. They provide products & services for multiple industries, and the product/service offering is identical across all industries.
I was looking at the site structure and ran a crawl test, and learned they have a LOT of duplicate content out there because of the way they set up their website.
They have a page in the navigation for “solution”, aka what industry you are in. Once that is selected, you are taken to a landing page, and from there, given many options to explore products, read blogs, learn about the business, and contact them. The main navigation is removed.
The URL structure is set up with folders, so no matter what you select after you go to your industry, the URL will be “domain.com/industry/next-page”.
The product offerings, available blogs, and contact-us pages do not vary by industry, so the content found on “domain.com/industry-1/product-1” is identical to the content found on “domain.com/industry-2/product-1”, and so on.
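To make the scale of the overlap concrete, the crawl export could be grouped by everything that follows the industry folder. The snippet below is only a rough sketch: the `group_by_shared_path` helper, the folder names in `INDUSTRY_FOLDERS`, and the sample URLs are all placeholders I made up for illustration, not anything taken from the actual site.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Hypothetical industry folder names; replace with the real ones from the crawl.
INDUSTRY_FOLDERS = {"industry-1", "industry-2", "industry-3"}


def group_by_shared_path(urls, industry_folders=INDUSTRY_FOLDERS):
    """Group URLs that differ only in their leading industry folder."""
    groups = defaultdict(list)
    for url in urls:
        segments = [s for s in urlparse(url).path.split("/") if s]
        # Skip industry landing pages and anything outside an industry folder.
        if len(segments) < 2 or segments[0] not in industry_folders:
            continue
        shared_path = "/" + "/".join(segments[1:])
        groups[shared_path].append(url)
    # Keep only paths that show up under more than one industry folder.
    return {path: found for path, found in groups.items() if len(found) > 1}


# Made-up sample standing in for a crawl export.
crawled = [
    "https://domain.com/industry-1/product-1",
    "https://domain.com/industry-2/product-1",
    "https://domain.com/industry-1/contact-us",
    "https://domain.com/industry-2/contact-us",
    "https://domain.com/industry-3/",
    "https://domain.com/about/team",
]

duplicates = group_by_shared_path(crawled)
for path, found in sorted(duplicates.items()):
    print(path, len(found))
```

This only compares URL paths, so it assumes that the same path under two industry folders really does serve the same content; a content hash from the crawl would be a safer check before acting on the list.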
This is a large site with a fair amount of traffic because it’s a pretty substantial OEM. Most of their content, however, is competing with itself because most of the pages on their website have duplicate content.
I won’t begin my work until I can dive into their GA and have more in-depth conversations with them about what kind of activity they’re tracking and why they set up the website this way. However, I don’t know how strategic they were in this setup, and I don’t think they were aware that they had duplicate content.
My first thought would be to consolidate the site structure so that we don’t split the link equity of the “product-1” content across several URLs: direct all industries to one page and track conversion paths a different way. However, I’ve never dealt with a site structure of this magnitude, and I don’t want to risk hurting their domain authority, missing redirect or URL-mapping opportunities, or undermining the fact that the site is still performing well even though multiple pages have the same content (most of which have high page authority and search visibility).
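As a starting point for the URL mapping, a redirect map could be drafted from the crawl before any changes go live. This is a minimal sketch under my own assumptions: the consolidated page is assumed to live at the same path with the industry folder removed (for example “domain.com/product-1”), and the folder names and the `consolidated_url` and `build_redirect_map` helpers are hypothetical.

```python
from urllib.parse import urlparse, urlunparse

# Hypothetical industry folder names; replace with the real ones.
INDUSTRY_FOLDERS = {"industry-1", "industry-2"}


def consolidated_url(url, industry_folders=INDUSTRY_FOLDERS):
    """Return the assumed single target for an industry-scoped URL, or None."""
    parts = urlparse(url)
    segments = [s for s in parts.path.split("/") if s]
    # Leave industry landing pages and non-industry pages untouched.
    if len(segments) < 2 or segments[0] not in industry_folders:
        return None
    new_path = "/" + "/".join(segments[1:])
    return urlunparse(parts._replace(path=new_path))


def build_redirect_map(urls):
    """Map each duplicate URL to the one page it should 301 to."""
    redirects = {}
    for url in urls:
        target = consolidated_url(url)
        if target is not None:
            redirects[url] = target
    return redirects


# Made-up sample standing in for a crawl export.
crawled = [
    "https://domain.com/industry-1/product-1",
    "https://domain.com/industry-2/product-1",
    "https://domain.com/industry-1/",
]

redirect_map = build_redirect_map(crawled)
for old, new in redirect_map.items():
    print(f"301 {old} -> {new}")
```

Having the full old-to-new list in one place makes it easier to review with the client, check that every high-authority URL has a target, and confirm nothing was missed after launch.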
Has anyone dealt with this before, and do you have any recommendations for tackling something like this?
Related Questions
-
On Site Errors
Hi folks, I'm monitoring a small Australian site, bluetea.com.au. I currently have an SEO specialist who does monthly onsite maintenance work for this site. However, each month I continue to see errors in Webmaster Tools. For example, Webmaster Tools currently reports 21 short meta descriptions and 26 duplicate title tags. Examples given are:
Short meta descriptions:
/cleaning-products-and-our-health/toxic-cleaners/
/colour-consultant/fushia-door/
/portfolio/parisian-apartment-black-kitchen/parisian-apartment-black-kitchen/
Duplicate title tags:
concrete-kitchen | Blue Tea Kitchen Designs
/kitchen-trends-and-material-innovation/concrete-kitchen-2/
/kitchen-trends-for-2013/concrete-kitchen/
Potts Point Kitchen | Blue Tea Kitchen Designs
/portfolio/potts-point-kitchen/
/portfolio/potts-point-kitchen/pott-point-kitchen/
My SEO tells me that he has solved all these issues, but after one or two months they still remain in Webmaster Tools. Can anybody help me understand why? Thank you
On-Page Optimization | | PHD0 -
Duplicate content issues - page content and store URLs
Hi, I'm seeing a lot of duplicate content crawl errors in Moz for www.redrockdecals.com, and I really need some help. It brings up different connections between products and I'm having a hard time figuring out what it means. It is listing the same products as duplicate content, but they have different URL endings. For example:
http://www.redrockdecals.com/car-graphics/chevrolet-silverado?___store=nl&___from_store=us
&
http://www.redrockdecals.com/car-graphics/chevrolet-silverado?___store=d&___from_store=us
It seems like Moz considers the copy-pasted parts in the Full Description (scrolled a bit down on product pages) as duplicate content. For example, the general text found on this page: http://www.redrockdecals.com/caution-tow-limited-turning-radius-decal Or this page: http://www.redrockdecals.com/if-you-don-t-succeed-first-time-then-skydiving-isn-t-for-you-bumper-sticker
I am planning to write new and unique descriptions for all products, but what do you suggest: should I remove the long identical descriptions, or just shorten them so they don't outweigh the short but unique descriptions above? I've heard search engines understand that some parts of a page can be the same as on other pages, but I wonder if in my case this has gone too deep... Thanks so much!
On-Page Optimization | | speedbird1229 0 -
Ratings pages are Duplicate Content
This brought up another question: should the review page (which now has a canonical to the item page) be index,follow? My item review pages are showing up with Duplicate Content errors in Moz. Here are two examples:
http://www.americanmusical.com/ItemReview--i-HAM-SK1-LIST
http://www.americanmusical.com/ItemReview--i-MAC-203680902-LIST
Is the problem that the pages contain the same code and questions with very little customer-created info?
On-Page Optimization | | dianeb1520 -
Duplicate Page Content
Hi, I am new to the Moz Pro community. I got the message below for many of my pages. We have a video site so all content in the page except the video link would be different. How can I handle such pages? Can we place AdSense ads on these pages? The message reads: "Duplicate Page Content: Code and content on this page looks similar or identical to code and content on other pages on your site. Search engines may not know which pages are best to include in their index and rankings. Common fixes for this issue include 301 redirects, using the rel=canonical tag, and using the Parameter handling tool in Google Webmaster Central. For more information on duplicate content, visit http://moz.com/learn/seo/duplicate-content." Please help me understand how to handle this. Regards
On-Page Optimization | | Nettv0 -
Duplicate Content
I'm currently working on a site that sells appliances. Currently, there are thousands of "issues" with this site, many of them dealing with duplicate content. Now, the product pages can be viewed in "List" or "Grid" format. As Lists, they have very little in the way of content. My understanding is that the duplicate content arises from different URLs going to the same page. For instance, the site might have a different URL when told to display 9 items than when told to display 15. This could then be solved by inserting rel=canonical. Is there a way to take a site and get a list of all possible duplicates? This would be much easier than slogging through every iteration of the options and copying down the URLs. Also, is there anything I might be missing in terms of why there is duplicate content? Thank you.
On-Page Optimization | | David_Moceri0 -
Boat broker - issues with duplicate content and indexing search results
Hello, I have read a lot about optimising product pages and not indexing search results or category pages, as ideally a person should be directed straight to a product page. I am interested in how best to approach a site that lists second-hand products for sale, essentially a marketplace of second-hand goods (in my case, www.boatshed.com, international boat brokers). For example, we currently have 5 Colvic Sailer 26 boats for sale across the world: that is 5 boats of the same make and model but differing years, locations, sellers and prices. My concern is with search results and 'category' pages. Unlike typical e-commerce sites, when someone searches for a 'Colvic sailer 26 for sale' I want them to go to a search-results-style page, as it is more useful for them to see a list of boats than one random one that Google decides is most important (or possibly one it can match by location). Currently we have 3 different URL types to show search-results-style pages (i.e. paginated lists of boats that include name, image and short description):
manufacturer URLs e.g. http://www.boatshed.com/colvic-manufacturer-145.html
category URLs e.g. barges http://www.boatshed.com/barges-category-55.html
and normal search results e.g. dosearch.php?form_boattype_textbox=&....
I have noindexed the search results pages, but our category and manufacturer URLs show up in search results, and ultimately these are pages I want people to land on. I am, however, getting duplicate content warnings in Moz. Most boats are in several categories, and all will come up on one manufacturer page and one manufacturer-and-model page. Both sets of URLs are in my opinion needed; lots of users search for exact makes / models and lots of users just search for the type of boat, e.g. 'barge for sale', so both sets of landing pages are useful. Any suggestions or thoughts greatly appreciated. Thanks, Ben
On-Page Optimization | | pbscreative 0 -
Solve duplicate content issues by using robots.txt
Hi, I have a primary website, and besides that I also have some secondary websites that have the same content as the primary website. This leads to duplicate content errors. Because so many URLs have duplicate content, I want to use the robots.txt file to prevent Google from indexing the secondary websites and so fix the duplicate content issue. Is that OK? Thanks for any help!
On-Page Optimization | | JohnHuynh0 -
Duplicate Content - Best Practice Usage of the Canonical URL
Canonical URLs stop self-competition from duplicate content. So instead of two pages with a rank of 5 out of 10, there is one page with a rank of 7 out of 10.
However, what disadvantages come from using canonical URLs? For example, am I excluding some products, like green widget or blue widget? I have a customer with 2 e-commerce websites (selling different manufacturers of a type of jewellery). Both websites have massive duplicate content issues.
It is a hosted CMS with very little SEO functionality, no plugins, etc. The crawl report comes back with 1000s of pages that are duplicates. It seems that almost every page on the website has a duplicate partner or more. The problem starts in that they have 2 categories for each product type, instead of one category for each product type: a wholesale category and a small pack category. So I have considered using a canonical URL or de-optimizing the small pack category, as I believe it receives less traffic than the wholesale category. On the original website I tried de-optimizing one of the pages that gets less traffic. I did this by changing the order of the meta title (keyword at the back, not the front, by using "small" to start off with). I also removed content from the page. This helped a bit. Or I was thinking about just using a canonical URL on the page that gets less traffic.
However, what are the implications of this? What happens if someone searches for "small packs" of the product: will this no longer be indexed as a page? The next problem I have is the other 1000s of pages that are showing as duplicates. These are all the different products within the categories. The CMS does not have a front office that allows canonical URLs to be inserted. Instead, it would have to be done by going into the HTML of the pages. This would take ages. Another issue is that these product pages are not actually duplicates, but I think it is because they have so little content that Roger (the SEOmoz crawler, and probably Google's one too) can't tell the difference.
Also, even if I did use the canonical URL, what happens if people search for the product by attributes (the variations of each product type), like blue widget, black widget, brown widget? Would these all be excluded from Google's index?
On the one hand I want to get rid of the duplicate content, but I also want to have these pages included in search. Perhaps I am taking too idealistic an approach, trying to optimize a website for too many keywords. Should I just focus on the category keywords and forget about product variations? Perhaps I should look into Google Analytics to determine the top landing pages and which ones should have a canonical applied. Also, this website (hosted CMS) seems to have more duplicate content issues than I have seen with other e-commerce sites that I have applied SEOmoz to. One final related question: the first website has 2 landing pages, which I think is a technical issue. For example, www.test.com and www.test.com/index. I realise I should use a canonical URL on the page that gets less traffic. How do I determine this? (Or should I just use the SEOmoz page rank tool?)
On-Page Optimization | | WMA 0