Long list of companies spread out over several pages - duplicate content?
-
Hi all,
I am currently working with a company formation agent. They have a list of every limited company spread over hundreds of pages. What do you guys think? Is there a need for Canonicals? The website is ranking pretty well but I want to make sure there aren't any problems in the future.
Here are two pages as examples:
http://www.formationsdirect.com/companysearchlist.aspx?start=MULLAGHBOY+CONSTRUCTION+LIMITED&next=1#
http://www.formationsdirect.com/companysearchlist.aspx?start=%40a+company+limited&next=1#
Also what about the actual company pages? See an example below
Thanks in advance
Aaron
-
Thanks George,
I'll think I'll take your advice and hold off for now.
Aaron
-
Hi Aaron,
First off, since your rankings haven't been affected I would definitely hold off changing anything in WMT unless you're sure as it might cause more harm than good. If you paginate what looks like potentially thousands of pages I'm not convince Google will look on this fondly. The URLs will probably also change regularly as more companies are incorporated because the pages are set to show fixed list lengths.
Resolving the duplicate content onsite is definitely the best course of action. The fact that Moz is crawling these duplicate pages indicates that it's picking up links from somewhere on your site. If you are able to stop exposing these links and only linking to the "preferred version" i.e. canonical then this will give you some control and a better understanding of the site's information architecture.
Regarding setting up of canonicals, I suspect that this will be a harder job as of the 3 duplicate URLs you provide, it's not immediately clear which one would be the canonical. There are probably also thousands of instances similar to this duplicate group across other company lists and Google will have picked at random which one it sees as the canonical on each one. Marking another URL in the group as the canonical stands to (at least temporarily) cause a drop in rankings and SEO visibility if done across thousands of pages simultaneously.
If I was you and I felt compelled to address the issue I would pick a sample ~10% of the duplicate groups, set a canonical on each of them and see what happens in terms of rankings over 3-6 weeks. I would also add the canonicals to a sitemap and try update any links on your website to make sure only the canonical is referenced.
It's risky though, as your rankings are good even though I understand the principle of what you're trying to achieve. When I've tended to do things like this it's when a website has had nothing to lose.
George
-
Hi George,
Thanks for your clear answer.
The reason I am worried is that MOZ is flagging up thousands of these links as duplicate. Looking at it again today I noticed that it is mainly the list pages that are duplicates. EG
http://www.formationsdirect.com/companysearchlist.aspx?start=%40a+company+limited&next=1
http://www.formationsdirect.com/companysearchlist.aspx?start=AAA+AUTOMOTIVE+LTD&back=1
http://www.formationsdirect.com/companysearchlist.aspx?start=A+LIMITED&next=1
These 3 bring up exactly the same page and it seems that every page in the list has 3 or 4 of these variations.
I did a check in WT and it seems that the 'companysearchlist' parameter has been listed but it is not actually affecting any URLs. Would changing the status to 'pagination' help with this? I imagine that it would be then completely ignored by Google. Or would it better to make a canonical for each duplicate issue so each page gets in once?
PS I left the '#' in the last URL by mistake. It is just a tracking parameter that is being used by the company.
Aaron
-
Hi Aaron,
The search experience on the website is a bit unconventional in that you search for a company name and it returns pages of results alphabetically listed with the name you are searching for hopefully in there somewhere!
You could make changes to the pagination using rel=next/previous, but what you're displaying isn't really "true" results pagination. I would therefore be cautious about changing it if the site is ranking well.
Canonicals would only be required if you were showing the same content on different URLs. A quick "site:" search like the below only returns one result, so either Google isn't showing the duplicate URLs (very likely given your question) or it isn't a problem for you:
site:www.formationsdirect.com inurl:companysearchlist.aspx?name=AMNA+CONSTRUCTION+LTD
You can look in webmaster tools to see which query string parameters it is picking up and configure the behaviour you want GoogleBot to take. You can also get some sense of the duplication if it is an issue.
Regarding the company page URL you gave, anything after the # in the URL won't get crawled so you don't need to worry about canonicalising those.
Again, if it's ranking well, be very careful about trying to solve a problem that doesn't exist. If you can find duplicate content then definitely redirect or canonicalise it and see what kind of impact it has. I would do this before taking on anything more significant like the website information architecture and navigation.
George
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How do I fix duplicate page issue on Shopify with duplicate products because of collections.
I'm working with a new client with a site built on Shopify. Most of their products appear in four collections. This is creating a duplicate content challenge for us. Can anyone suggest specific code to add to resolve this problem. I'm also interested in other ideas solutions, such as "don't use collections" if that's the best approach. I appreciate your insights. Thank you!
On-Page Optimization | | quiltedkoala0 -
Duplicate Content - Blog Rewriting
I have a client who has requested a rewrite of 250 blog articles for his IT company. The blogs are dispersed on a variety of platforms: his own website's blog, a business innovation website, and an IT website. He wants to have each article optimised with keyword phrases and then posted onto his new website thrice weekly. All of this is in an effort to attract some potential customers to his new site and also to establish his company as a leader in its field. To what extent would I need to rewrite each article so as to avoid duplicating the content? Would there even be an issue if I did not rewrite the articles and merely optimised them with keywords? Would the articles need to be completely taken by all current publishers? Any advice would be greatly appreciated.
On-Page Optimization | | StoryScout0 -
Links to Paywall from Content Pages
Hi, My site is funded by subscriptions. We offer lengthy excerpts, and then direct people to a single paywall page, something like domain.com/subscribe/ This means that most pages on the site links to /subscribe, including all of the high value pages that bring people in from Google. This is a page with an understandably high bounce rate, as most users are not interested in paying for content on the web. My question is are we being penalized in Google for having so many internal links to a page with a very high bounce rate? If anyone has worked with paywall sites before and knows the best practices for this, I'd be really grateful to learn more.
On-Page Optimization | | enotes0 -
Exponentially Increasing Duplicate Content On Blogs
Most of the clients that I pick up are either new to SEO best practices, or have worked with sketchy SEO providers in the past, who did little more than build spammy links. Most of them have deployed little if any on-site SEO best practices, and early on I spend a lot of time fixing canonical and duplicate content issues alla 301 redirects. Using SEOMOZ, however, I see a lot of duplicate content issues with blogs that live on the sites I work on. With every new blog article we publish, more duplicate content builds up. I feel like duplicate content on blogs grows exponentially, because every time you write a blog article, it exists provisionally on the blog homepage, the article link, a category page, maybe a tag page, and an author page. I have a two-part question: Is duplicate content like this a problem for a blog -- and for the website that the blog lives on? Are search engines able to parse out that this isn't really duplicate content? If it is a problem, how would you go about solving it? Thanks in advance!
On-Page Optimization | | RCNOnlineMarketing0 -
Duplicate Title & Content in WordPress
I'm getting a lot of Crawl Errors due to duplicate content and duplicate title because of category and tag posts in WordPress. I rebuilt the sitemap and said to exclude category and tags, should that clear up the issue? I've also went through and did NO INDEX and NO FOLLOW for all categories and posts. Any thoughts on this issue?
On-Page Optimization | | seantgreen0 -
301 redirects from several sub-pages to one sub-page
Hi! I have 14 sub-pages i deleted earlier today. But ofcourse Google can still find them, and gives everyone that gives them a go a 404 error. I have come to the understading that this wil hurt the rest of my site, at least as long as Google have them indexed. These sub-pages lies in 3 different folders, and i want to redirect them to a sub-page in a folder number 4. I have already an htaccess file, but i just simply cant get it to work! It is the same file as i use for redirecting trafic from mydomain.no to www.mydomain.no, and i have tried every kind of variation i can think of with the sub-pages. Has anyone perhaps had the same problem before, or for any other reason has the solution, and can help me with how to compose the htaccess file? 🙂 You have to excuse me if i'm using the wrong terms, missing something i should have seen under water while wearing a blindfold, or i am misspelling anything. I am neither very experienced with anything surrounding seo or anything else that has with internet to do, nor am i from an englishspeaking country. Hope someone here can light up my path 🙂 Thats at least something you can say in norwegian...
On-Page Optimization | | MarieA1 -
Where to Put Content For Product Pages - How To Structure Website?
Currently we have 300+ products. We do not have a CMS or Ecommerce site at the time being for certain reasons. Currently our site is set up with content on almost every single page. The main catagory page, explains everything on the main page, then our products page has a lot of text too. But right now, it seems as if our main pages are only ranking. In the near future I will be using a cms and purchasing a template. I noticed most Ecommerce style websites have just the product with the name and price, then when they click on the product it brings you to that page with a brief product description and some photos. My question is, does each page need content? Or can just the product page itself have content? For example, say we have a link to SHOES. Then the shoes page displays dress, casual and athletic. Then the athletic page brings you to a page with, running, tennis, cross training shoes, and so forth. Is it best to write content on this main catagory page? If so, how much? Or should we focus on putting content on the actual page of the individual product? Along with pictures and specifications? I know Content is Key and we are doing pretty well at that, however, I am starting to wondering if we have to much content or too similar content. What is the best structure to try and recieve GREAT organic rankings?
On-Page Optimization | | hfranz0 -
Duplicate content Issue
I'm getting a report of duplicate title and content on: http://www.website.com/ http://www.website.com/index.php Of course, they're the same pages but does this need to be corrected somehow. Thanks!
On-Page Optimization | | dbaxa-2613380