Could this be seen as duplicate content in Google's eyes?
-
Hi
I'm an in-house SEO and we've recently seen Panda-related traffic loss, along with some of our main keywords slipping down the SERPs.
While looking for possible Panda-related issues, I was wondering whether the following could be seen as duplicate content. We've got some very similar holidays (we're a travel company) on our website. While they are different, I'm concerned it may be seen as creating content that is too similar:
They do all have unique text but, as you can see from the titles, they are very similar (note that from an SEO point of view the tabbed content is all within the same page at source level).
At the top level of the holiday pages we have a filtered search:
http://www.naturalworldsafaris.com/destinations/africa-and-the-indian-ocean/kenya/suggested-holidays.aspx
These pages have a unique introduction, but the content snippets being pulled into the boxes are drawn from each of the individual holiday pages.
I'm just concerned that these could be introducing some duplication issues. Any thoughts?
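For what it's worth, here's roughly how I compared two of the pages - a quick sketch using Python's standard-library difflib, with placeholder copy rather than our actual page text:

```python
import difflib

def similarity(text_a: str, text_b: str) -> float:
    """Rough word-level similarity ratio between two blocks of page copy (0.0-1.0)."""
    return difflib.SequenceMatcher(None, text_a.split(), text_b.split()).ratio()

# Hypothetical opening copy from two similar holiday pages (invented for illustration)
page_one = "Explore the wildlife and beaches of Kenya on this luxury tailor-made safari."
page_two = "Explore the wildlife and beaches of Kenya on this family tailor-made safari."

print(round(similarity(page_one, page_two), 2))  # prints 0.92
```

A ratio this high for whole pages (rather than one sentence) is the kind of overlap I'm worried about.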
-
Hi Cyrus,
Thanks for taking the time to answer.
It seems there is no firm answer on this one - interesting to see you felt there wasn't necessarily a duplicate content issue, but that grouping these pages into themes with a hub page would be of benefit (assuming I've understood your suggestions correctly).
The issue is that in some ways the pages and content are similar: the trips are all focused on the beaches and wildlife of Kenya, and much of the difference lies in the accommodation and level of luxury, which is dealt with in the on-page copy. I think we will have to revisit how we handle page titles.
We only fairly recently changed those pages to ensure that all the content in the individual tabs is visible to search engines (previously they could only crawl the content in the overview tab; the content of the other tabs was effectively hidden). I have checked this in Google Webmaster Tools and it all displays fine - all the tabbed content is found within the HTML.
Many thanks
Kate -
I'm going to go against the grain and say this doesn't look like a duplicate content issue to me - at least based on text. There's enough unique content on those pages that you shouldn't be falling into those filters. No one can say for sure - that's simply based on my experience.
That said, there are other signals around these pages that are very similar. Namely things like title tags and anchor text.
Title Tags:
- The Wildlife & Beaches of Kenya - Natural World Safaris
- Ultimate Kenya Wildlife and Beaches Safari - Natural World Safaris
- Wildlife & Beach Family Safari - Natural World Safaris
From a topic perspective, are these differentiated enough? They seem to target very similar topics and keywords - and the anchor text to these pages follows similar patterns, mostly internal links from the sidebar.
So, long story short: these pages may not be differentiated enough, meaning they could be interpreted as duplicate content (or thin content, as it were), and there simply aren't enough external signals to keep these pages afloat.
The solution may be to consolidate or group these pages into themes. Make sure you have strong "hub" pages that link everything together (think Trip Advisor).
One other thing of note - I notice the page is JavaScript-dependent. Because of this, make sure to perform a "Fetch and Render" in Google Webmaster Tools and check that the page displays correctly. If it doesn't, be sure to address any issues.
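As a rough complement to Fetch and Render, you could also script a check that each tab's copy really is present in the raw HTML, i.e. before any JavaScript runs. A minimal sketch - the markup and snippets here are made up for illustration, not taken from the actual site:

```python
def content_in_source(raw_html: str, snippets: list[str]) -> dict[str, bool]:
    """For each expected snippet, report whether it appears in the raw (pre-JavaScript) HTML."""
    page = raw_html.lower()
    return {snippet: snippet.lower() in page for snippet in snippets}

# Hypothetical raw HTML: one tab's copy is present, the other tab is empty
raw_html = """
<div id="tab-overview">Track wildebeest across the Masai Mara.</div>
<div id="tab-accommodation"></div>
"""

print(content_in_source(raw_html, [
    "Track wildebeest across the Masai Mara.",
    "Relax at a luxury beachside lodge.",
]))
```

Any snippet that comes back False is effectively invisible to a crawler that doesn't execute JavaScript.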
-
Thanks for the replies, Andy and Amelia.
We cover around 30 destinations, and each one has a suggested-holidays page plus maybe 5-15 individual itineraries. Searching Google for copy from any of those itinerary pages shows multiple results, as the opening text is pulled into several other areas of the site.
However, individually a lot of these itinerary pages and the suggested-holidays overview pages rank reasonably well and account for quite a lot of traffic to the site. We can't really no-index or use canonicalisation, as each page does have unique content and is different - there is just quite a bit of crossover. At the same time, we saw a significant drop with Panda 4.0 and see smaller drops every month with each subsequent update.
Has anyone got any suggestions on how else we can handle this content?
Thanks
Kate -
Hi Kate,
Your assumption about duplicate / similar content appears to be well-founded. Just to test a sample, I took the following snippet from this page and searched for it in Google:
"Acacia House sits in Ol Chorro Losoit Valley, within the Lemak Hills"
Google returns 4 pages, so yes, there are issues here - and it isn't as straightforward as canonicalisation to fix, as that could mean other pages missing out on the chance to be indexed and returned. However, what you can't tell is to what degree Google objects to these kinds of issues. Some say Google is smart enough to understand what a snippet is and won't penalise based on it - others disagree. Myself, I try to ensure my clients have unique content on each page and always err on the side of caution.
I also took a snippet from an itinerary here and did the same - this time it came back with 5 different pages.
My opinion is that yes, you do have problems that need to be rectified. I know this was only a very quick look, but I shouldn't be seeing so many pages with the same snippets of content in Google. The odd one you can get away with, but I bet I would find lots.
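If you wanted to check this at scale rather than one snippet at a time, a rough sketch like this would surface pages sharing the same opening copy - the URLs below are hypothetical, with your Acacia House snippet standing in as the reused intro:

```python
from collections import defaultdict

def find_shared_openings(pages: dict[str, str], words: int = 10) -> dict[str, list[str]]:
    """Group pages by their first few words of body copy; any group
    with more than one URL is reusing the same intro."""
    groups = defaultdict(list)
    for url, text in pages.items():
        key = " ".join(text.lower().split()[:words])
        groups[key].append(url)
    return {opening: urls for opening, urls in groups.items() if len(urls) > 1}

# Hypothetical URL -> opening copy mapping
pages = {
    "/kenya/safari-a": "Acacia House sits in Ol Chorro Losoit Valley, within the Lemak Hills, offering...",
    "/kenya/safari-b": "Acacia House sits in Ol Chorro Losoit Valley, within the Lemak Hills, surrounded...",
    "/kenya/suggested-holidays": "Our hand-picked Kenya itineraries combine wildlife and beaches.",
}

print(find_shared_openings(pages))
```

You'd feed this the real opening paragraph of each page; every group it returns is a set of pages Google sees leading with identical text.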
How many unique pages with content like this do you think you have?
-Andy
-
If you're aggregating content from different pages into one, then you may want to look at canonical tags. I'm sure someone much smarter than me will tell you how to do it.
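For what it's worth, the tag itself is just a link element in the head of the duplicate (or near-duplicate) page, pointing at the version you want indexed - a sketch with a placeholder URL; which page should actually be the canonical one depends on which version you want ranking:

```html
<!-- Placed in the <head> of the duplicate page -->
<link rel="canonical" href="http://www.example.com/preferred-version-of-this-page" />
```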