Long list of companies spread out over several pages - duplicate content?
-
Hi all,
I am currently working with a company formation agent. They have a list of every limited company spread over hundreds of pages. What do you guys think? Is there a need for Canonicals? The website is ranking pretty well but I want to make sure there aren't any problems in the future.
Here are two pages as examples:
http://www.formationsdirect.com/companysearchlist.aspx?start=MULLAGHBOY+CONSTRUCTION+LIMITED&next=1#
http://www.formationsdirect.com/companysearchlist.aspx?start=%40a+company+limited&next=1#
Also what about the actual company pages? See an example below
Thanks in advance
Aaron
-
Thanks George,
I think I'll take your advice and hold off for now.
Aaron
-
Hi Aaron,
First off, since your rankings haven't been affected, I would definitely hold off changing anything in WMT unless you're sure, as it might cause more harm than good. If you mark what looks like potentially thousands of pages as pagination, I'm not convinced Google will look on this fondly. The URLs will probably also change regularly as more companies are incorporated, because the pages are set to show fixed list lengths.
Resolving the duplicate content onsite is definitely the best course of action. The fact that Moz is crawling these duplicate pages indicates that it's picking up links to them from somewhere on your site. If you are able to stop exposing these links and only link to the "preferred version", i.e. the canonical, then this will give you some control and a better understanding of the site's information architecture.
Regarding setting up canonicals, I suspect that this will be a harder job because, of the 3 duplicate URLs you provided, it's not immediately clear which one should be the canonical. There are probably also thousands of duplicate groups like this one across the other company lists, and Google will probably have chosen for itself, more or less arbitrarily, which URL it treats as the canonical in each group. Marking a different URL in the group as the canonical stands to cause a drop in rankings and SEO visibility (at least temporarily) if done across thousands of pages simultaneously.
If I were you and I felt compelled to address the issue, I would pick a sample of ~10% of the duplicate groups, set a canonical on each of them and see what happens in terms of rankings over 3-6 weeks. I would also add the canonicals to a sitemap and try to update any links on your website to make sure only the canonical is referenced.
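To make the ~10% sample repeatable, something like the following could be used. It is only a minimal sketch: the function name `pick_test_groups`, the fixed seed and the shape of the input (a list of duplicate groups, each a list of URLs) are assumptions for illustration, not anything the site or WMT provides.

```python
import random


def pick_test_groups(groups, fraction=0.10, seed=42):
    """Return a reproducible random sample of duplicate groups.

    `groups` is a list of duplicate groups, each a list of URLs.
    The seed is fixed so the same groups are chosen on every run,
    which makes it possible to compare rankings later.
    """
    if not groups:
        return []
    # Always test at least one group, even on a very small site.
    sample_size = max(1, round(len(groups) * fraction))
    return random.Random(seed).sample(groups, sample_size)


# Hypothetical input: 50 duplicate groups of two URLs each.
example_groups = [[f"list-{i}-variant-a", f"list-{i}-variant-b"] for i in range(50)]
test_groups = pick_test_groups(example_groups)
```

Keeping the untouched groups as a control makes it easier to tell whether a ranking change over the 3-6 weeks came from the canonicals or from something else.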
It's risky though, as your rankings are good, although I understand the principle of what you're trying to achieve. When I've done things like this, it has tended to be when a website had nothing to lose.
George
-
Hi George,
Thanks for your clear answer.
The reason I am worried is that Moz is flagging up thousands of these links as duplicates. Looking at it again today, I noticed that it is mainly the list pages that are duplicates, e.g.
http://www.formationsdirect.com/companysearchlist.aspx?start=%40a+company+limited&next=1
http://www.formationsdirect.com/companysearchlist.aspx?start=AAA+AUTOMOTIVE+LTD&back=1
http://www.formationsdirect.com/companysearchlist.aspx?start=A+LIMITED&next=1
These 3 bring up exactly the same page and it seems that every page in the list has 3 or 4 of these variations.
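One way to see how many of these duplicate groups exist is to group the crawled URLs by a hash of the page body. The following is a rough sketch only: it assumes the pages have already been fetched by some crawler into a dict of URL to HTML, the bodies shown are placeholders rather than real page content, and `group_duplicates` is a made-up helper name.

```python
import hashlib
from collections import defaultdict


def group_duplicates(pages):
    """Group URLs whose response bodies are exactly identical.

    `pages` maps each URL to the HTML body that was fetched for it.
    Returns a list of groups (sorted lists of URLs) with 2+ members.
    """
    groups = defaultdict(list)
    for url, body in pages.items():
        digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
        groups[digest].append(url)
    # A group with a single URL is not a duplicate, so drop it.
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


base = "http://www.formationsdirect.com/companysearchlist.aspx"
# Placeholder bodies: the first three URLs stand in for the identical list page.
pages = {
    base + "?start=%40a+company+limited&next=1": "<html>list page 1</html>",
    base + "?start=AAA+AUTOMOTIVE+LTD&back=1": "<html>list page 1</html>",
    base + "?start=A+LIMITED&next=1": "<html>list page 1</html>",
    base + "?start=MULLAGHBOY+CONSTRUCTION+LIMITED&next=1": "<html>another list page</html>",
}
duplicate_groups = group_duplicates(pages)
```

An exact hash only catches pages that are byte-for-byte the same, so pages that differ by a timestamp or a session token would need the body normalised first.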
I did a check in WMT and it seems that the 'companysearchlist' parameter has been listed but it is not actually affecting any URLs. Would changing the status to 'pagination' help with this? I imagine that it would then be completely ignored by Google. Or would it be better to set a canonical for each duplicate group so that each page only gets indexed once?
PS I left the '#' in the last URL by mistake. It is just a tracking parameter that is being used by the company.
Aaron
-
Hi Aaron,
The search experience on the website is a bit unconventional in that you search for a company name and it returns pages of results alphabetically listed with the name you are searching for hopefully in there somewhere!
You could make changes to the pagination using rel="next"/"prev", but what you're displaying isn't really "true" results pagination. I would therefore be cautious about changing it if the site is ranking well.
Canonicals would only be required if you were showing the same content on different URLs. A quick "site:" search like the below only returns one result, so either Google isn't showing the duplicate URLs (very likely given your question) or it isn't a problem for you:
site:www.formationsdirect.com inurl:companysearchlist.aspx?name=AMNA+CONSTRUCTION+LTD
You can look in Webmaster Tools to see which query string parameters Google is picking up and configure the behaviour you want Googlebot to take. You can also get some sense of the duplication there, if it is an issue.
Regarding the company page URL you gave, anything after the # in the URL won't get crawled so you don't need to worry about canonicalising those.
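If you want to tidy a crawl export so that URLs differing only by the # part are counted once, the fragment can be dropped with Python's standard library. This is just a small sketch; `strip_fragment` is a made-up helper name.

```python
from urllib.parse import urldefrag


def strip_fragment(url):
    """Return the URL without the '#...' part, leaving the query string alone."""
    return urldefrag(url).url


with_hash = "http://www.formationsdirect.com/companysearchlist.aspx?start=%40a+company+limited&next=1#"
cleaned = strip_fragment(with_hash)
```

Here `cleaned` is the same URL without the trailing '#', which is the form worth comparing when looking for duplicates.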
Again, if it's ranking well, be very careful about trying to solve a problem that doesn't exist. If you can find duplicate content then definitely redirect or canonicalise it and see what kind of impact it has. I would do this before taking on anything more significant like the website information architecture and navigation.
George