How does Google decide what content is "similar" or "duplicate"?
-
Hello all,
I have a massive duplicate content issue at the moment with a load of old employer detail pages on my site. We have 18,000 pages that look like this:
http://www.eteach.com/Employer.aspx?EmpNo=26626
http://www.eteach.com/Employer.aspx?EmpNo=36986
and Google is classing all of these pages as similar content which may result in a bunch of these pages being de-indexed. Now although they all look rubbish, some of them are ranking on search engines, and looking at the traffic on a couple of these, it's clear that people who find these pages are wanting to find out more information on the school (because everyone seems to click on the local information tab on the page). So I don't want to just get rid of all these pages, I want to add content to them.
But my question is...
If I were to make up, say, 5 templates of generic content, with fields replaced by the school's name, location and head teacher's name so that each page varies from the others, would this be enough for Google to realise that they are not similar pages and stop classing them as duplicates?
e.g. [School name] is a busy and dynamic school led by [headteachers name] who achieve excellence every year from Ofsted. Located in [location], [school name] offers a wide range of experiences both in the classroom and through extra-curricular activities, we encourage all of our pupils to "Aim Higher". We value all our teachers and support staff and work hard to keep [school name]'s reputation to the highest standards.
Something like that...
Anyone know if Google would slap me if I did that across 18,000 pages (with 4 other templates to choose from)?
-
Hi Virginia,
Maybe this Whiteboard Friday can help you out.
-
Hey Virginia
That is essentially what we call near-duplicate content: the kind of content that is easily created by pulling fields out of a database, generating the pages dynamically and dropping the name, address etc. into placeholders.
Unique content is exactly that, unique, so this approach is probably not going to cut it. You could still pull certain elements from the database, such as the address, but you need to remove these duplicate blocks and keep the pages simpler (like a business directory), and ideally add some unique elements to each page.
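To see why swapping a few fields does not make templated pages look different, here is a rough sketch using word shingles and Jaccard similarity, a common textbook way to measure near-duplication. It is not Google's actual method, which is not public. The school names, head teacher names and locations are made up for illustration, and the template is a shortened version of the one in the question.

```python
import re

def shingles(text, k=3):
    """Return the set of overlapping k-word sequences in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Share of shingles two pages have in common (0.0 to 1.0)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Shortened version of the template from the question (illustrative only).
TEMPLATE = (
    "{school} is a busy and dynamic school led by {head}. "
    "Located in {location}, {school} offers a wide range of experiences "
    "both in the classroom and through extra-curricular activities."
)

# Hypothetical field values, not real schools.
page_a = TEMPLATE.format(school="Oakfield School", head="Jane Smith", location="Leeds")
page_b = TEMPLATE.format(school="Riverside Academy", head="John Brown", location="Bristol")

# A page with text written specifically for one school (also made up).
page_unique = (
    "Our science department runs a weekly astronomy club, "
    "and the school choir toured three cities last spring."
)

templated_similarity = jaccard(shingles(page_a), shingles(page_b))
unique_similarity = jaccard(shingles(page_a), shingles(page_unique))

print(f"templated vs templated: {templated_similarity:.2f}")
print(f"templated vs unique:    {unique_similarity:.2f}")
```

The two templated pages share a large fraction of their shingles even though every field differs, while the page with its own text shares almost none. The longer the fixed template text is relative to the swapped fields, the higher the overlap gets.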
These kinds of pages often still rank for very specific queries. Where pages like this have value for users but are not search friendly, a useful strategy can be to build well-thought-out landing pages that link to them.
So, assess how well these work as landing pages from search: are visitors arriving from search, or coming in from elsewhere? If they come in from elsewhere, you could noindex these pages or block them in robots.txt. Then target the bigger search terms higher up the tree and create good search landing pages that link to these other pages for users.
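As a rough illustration of the robots.txt option, the sketch below uses Python's standard `urllib.robotparser` to check what a rule would do before you deploy it. The `Disallow: /Employer.aspx` rule is an assumption about how you might write it for the URLs in the question, not something taken from the live site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: block every employer detail page.
ROBOTS_TXT = """\
User-agent: *
Disallow: /Employer.aspx
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Disallow rules match by path prefix, so the query-string variants are covered.
employer_allowed = parser.can_fetch(
    "Googlebot", "http://www.eteach.com/Employer.aspx?EmpNo=26626"
)
home_allowed = parser.can_fetch("Googlebot", "http://www.eteach.com/")

print("employer page crawlable:", employer_allowed)
print("home page crawlable:", home_allowed)
```

The noindex route is instead a `<meta name="robots" content="noindex">` tag in the head of each employer page. Pick one or the other for a given page: if robots.txt stops the page being crawled, the crawler never sees the noindex tag on it.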
This is a really good read for getting a better handle on the types of duplicate content and the relevant strategies:
http://moz.com/blog/fat-pandas-and-thin-content
Hope that helps
Marcus
-
Hi Virginia,
If you take your pages as a whole, code and all, the only slight difference in those pages is the
tag and the sidebar info with school address. The rest of the page code is exactly the same.
If you were to create 5 templates similar to:
[School name] is a busy and dynamic school led by [headteachers name] who achieve excellence every year from Ofsted. Located in [location], [school name] offers a wide range of experiences both in the classroom and through extra-curricular activities, we encourage all of our pupils to "Aim Higher". We value all our teachers and support staff and work hard to keep [school name]'s reputation to the highest standards.
If all you are doing is changing the [school name] and [location] etc., I'm sure Google will still flag these pages as duplicate content.
Unique content is the best way. If there's not a lot of competition for the school name and the page has enough content about each individual school, head teacher etc., then "templates" might work. You can try it out, but I'd say unique content is the best way. It's the nature of the beast with so many pages.
Hope this helps.
Robert