Long list of companies spread out over several pages - duplicate content?
-
Hi all,
I am currently working with a company formation agent. They have a list of every limited company spread over hundreds of pages. What do you guys think? Is there a need for Canonicals? The website is ranking pretty well but I want to make sure there aren't any problems in the future.
Here are two pages as examples:
http://www.formationsdirect.com/companysearchlist.aspx?start=MULLAGHBOY+CONSTRUCTION+LIMITED&next=1#
http://www.formationsdirect.com/companysearchlist.aspx?start=%40a+company+limited&next=1#
Also what about the actual company pages? See an example below
Thanks in advance
Aaron
-
Thanks George,
I think I'll take your advice and hold off for now.
Aaron
-
Hi Aaron,
First off, since your rankings haven't been affected I would definitely hold off changing anything in WMT unless you're sure, as it might cause more harm than good. If you paginate what looks like potentially thousands of pages, I'm not convinced Google will look on this fondly. The URLs will probably also change regularly as more companies are incorporated, because the pages are set to show fixed list lengths.
Resolving the duplicate content onsite is definitely the best course of action. The fact that Moz is crawling these duplicate pages indicates that it's picking up links from somewhere on your site. If you are able to stop exposing these links and link only to the "preferred version" (i.e. the canonical), then this will give you some control and a better understanding of the site's information architecture.
Regarding setting up canonicals, I suspect this will be a harder job: of the 3 duplicate URLs you provided, it's not immediately clear which one should be the canonical. There are probably also thousands of similar duplicate groups across other company lists, and Google will have picked one URL in each group, effectively at random, to treat as the canonical. Marking a different URL in the group as the canonical stands to (at least temporarily) cause a drop in rankings and SEO visibility if done across thousands of pages simultaneously.
If I were you and I felt compelled to address the issue, I would pick a sample of ~10% of the duplicate groups, set a canonical on each of them and see what happens to rankings over 3-6 weeks. I would also add the canonicals to a sitemap and try to update any links on your website so that only the canonical is referenced.
It's risky though, as your rankings are good, even though I understand the principle of what you're trying to achieve. When I've done things like this, it has tended to be when a website has had nothing to lose.
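To make the sampling idea concrete, here is a minimal Python sketch. It assumes you already have the duplicate groups as lists of URLs; the function names (`pick_sample`, `canonical_tag`, `sitemap_xml`) and the example.com URLs are hypothetical, and treating the first URL in each group as the canonical is purely an assumption for illustration.

```python
import html
import random
from xml.sax.saxutils import escape


def pick_sample(groups, fraction=0.1, seed=0):
    """Return a reproducible random sample of duplicate groups (at least one)."""
    if not groups:
        return []
    size = max(1, round(len(groups) * fraction))
    return random.Random(seed).sample(groups, size)


def canonical_tag(url):
    """Build the <link> element to put in the <head> of every URL in a group."""
    return '<link rel="canonical" href="%s" />' % html.escape(url, quote=True)


def sitemap_xml(canonical_urls):
    """Build a minimal XML sitemap that lists only the canonical URLs."""
    entries = "".join(
        "  <url><loc>%s</loc></url>\n" % escape(u) for u in canonical_urls
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        + entries
        + "</urlset>\n"
    )


if __name__ == "__main__":
    # Hypothetical duplicate groups; in practice these come from your crawl data.
    groups = [
        ["https://example.com/list?start=A%d&next=1" % i,
         "https://example.com/list?start=B%d&back=1" % i]
        for i in range(30)
    ]
    sample = pick_sample(groups)
    # Assumption: the first URL in each group is the one you want indexed.
    canonicals = [group[0] for group in sample]
    for url in canonicals:
        print(canonical_tag(url))
    print(sitemap_xml(canonicals))
```

Fixing the seed keeps the sample stable between runs, so you know exactly which groups to watch over the 3-6 week test window.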
George
-
Hi George,
Thanks for your clear answer.
The reason I am worried is that Moz is flagging up thousands of these links as duplicates. Looking at it again today I noticed that it is mainly the list pages that are duplicates, e.g.:
http://www.formationsdirect.com/companysearchlist.aspx?start=%40a+company+limited&next=1
http://www.formationsdirect.com/companysearchlist.aspx?start=AAA+AUTOMOTIVE+LTD&back=1
http://www.formationsdirect.com/companysearchlist.aspx?start=A+LIMITED&next=1
These 3 bring up exactly the same page and it seems that every page in the list has 3 or 4 of these variations.
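One rough way to confirm which URLs really serve the same page is to hash the response bodies and group the URLs by hash. The sketch below is only an illustration: it assumes the pages have already been fetched into a dict of URL to HTML (the fetching step is left out), and `group_duplicates` is a hypothetical helper name. It only catches byte-for-byte identical bodies, whereas a crawler such as Moz's may also flag near-duplicates.

```python
import hashlib
from collections import defaultdict


def group_duplicates(pages):
    """Group URLs whose bodies are identical.

    `pages` maps URL -> HTML body, already fetched by whatever crawler you use.
    Returns a list of URL groups, keeping only groups with more than one URL.
    """
    by_hash = defaultdict(list)
    for url, body in pages.items():
        digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
        by_hash[digest].append(url)
    return [sorted(urls) for urls in by_hash.values() if len(urls) > 1]


if __name__ == "__main__":
    base = "http://www.formationsdirect.com/companysearchlist.aspx"
    # Placeholder bodies standing in for the real responses.
    pages = {
        base + "?start=%40a+company+limited&next=1": "<html>list page 1</html>",
        base + "?start=AAA+AUTOMOTIVE+LTD&back=1": "<html>list page 1</html>",
        base + "?start=A+LIMITED&next=1": "<html>list page 1</html>",
        base + "?start=MULLAGHBOY+CONSTRUCTION+LIMITED&next=1": "<html>other</html>",
    }
    for group in group_duplicates(pages):
        print(group)
```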
I did a check in WMT and it seems that the 'companysearchlist' parameter has been listed but it is not actually affecting any URLs. Would changing the status to 'pagination' help with this? I imagine that it would then be completely ignored by Google. Or would it be better to set a canonical for each duplicate group so each page gets indexed only once?
PS I left the '#' in the last URL by mistake. It is just a tracking parameter that is being used by the company.
Aaron
-
Hi Aaron,
The search experience on the website is a bit unconventional in that you search for a company name and it returns pages of results alphabetically listed with the name you are searching for hopefully in there somewhere!
You could make changes to the pagination using rel=next/prev, but what you're displaying isn't really "true" results pagination. I would therefore be cautious about changing it if the site is ranking well.
Canonicals would only be required if you were showing the same content on different URLs. A quick "site:" search like the one below only returns one result, so either Google isn't showing the duplicate URLs (very likely given your question) or it isn't a problem for you:
site:www.formationsdirect.com inurl:companysearchlist.aspx?name=AMNA+CONSTRUCTION+LTD
You can look in Webmaster Tools to see which query string parameters it is picking up and configure the behaviour you want Googlebot to take. You can also get some sense of the duplication, if it is an issue.
Regarding the company page URL you gave, anything after the # in the URL won't get crawled so you don't need to worry about canonicalising those.
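If you want to compare URLs without the part after the #, a small Python sketch like this can strip the fragment before you deduplicate a URL list. `strip_fragment` is a hypothetical helper name; `urldefrag` is part of the standard library.

```python
from urllib.parse import urldefrag


def strip_fragment(url):
    """Return the URL without its #fragment (and without a trailing bare '#')."""
    return urldefrag(url).url


if __name__ == "__main__":
    url = ("http://www.formationsdirect.com/companysearchlist.aspx"
           "?start=%40a+company+limited&next=1#")
    print(strip_fragment(url))
```

The query string is left untouched, so two URLs that differ only in what follows the # collapse to the same string.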
Again, if it's ranking well, be very careful about trying to solve a problem that doesn't exist. If you can find duplicate content then definitely redirect or canonicalise it and see what kind of impact it has. I would do this before taking on anything more significant like the website information architecture and navigation.
George