PDFs - Dupe Content
-
Hi
I have some PDFs linked to from a page with little content, so I'm thinking it's best to extract the copy from the PDF and put it on the page as body text, with the PDF still linked too. Will this count as dupe content?
Or is it best to use a PDF plugin so the page opens the PDF automatically and gets its content that way?
Cheers
Dan
-
Should be different, but you would have to look at them to make sure.
-
PS - is a PDF-to-HTML converter different from a plugin that loads the PDF as an open page when you click it? Or the same thing?
-
That is what I was going to suggest: setting up a canonical in the HTTP header of the PDF, pointing back to the article.
https://support.google.com/webmasters/answer/139394?hl=en
As another option, you can just block access to the PDFs to keep them out of the index as well.
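To illustrate the two options above, here is a minimal sketch of the extra HTTP response headers a server could send with the PDF. It assumes "block" means a noindex directive via `X-Robots-Tag` rather than a robots.txt rule; the function name `pdf_response_headers` and the example.com URL are hypothetical, and how you attach the headers depends on your own server or CMS.

```python
def pdf_response_headers(canonical_url=None, noindex=False):
    """Build extra HTTP response headers to send along with a PDF file.

    Hypothetical helper for illustration only; wire the result into
    whatever your web server or framework uses to set response headers.
    """
    headers = {}
    if canonical_url:
        # HTTP-header form of rel="canonical", pointing at the HTML article.
        headers["Link"] = f'<{canonical_url}>; rel="canonical"'
    if noindex:
        # Asks search engines not to index this file.
        headers["X-Robots-Tag"] = "noindex"
    return headers


# Option 1: canonicalise the PDF to the HTML page that carries the same copy.
print(pdf_response_headers(canonical_url="https://www.example.com/articles/guide.html"))

# Option 2: keep the PDF out of the index altogether.
print(pdf_response_headers(noindex=True))
```

You would normally pick one option or the other, since a canonical on a file you are also asking search engines not to index sends mixed signals.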
-
Thanks Chris.
Yes, you can canonicalise the PDF to the HTML (according to the comments on that article I just linked to, anyway).
-
Hi Dan,
Yes, PDFs are crawlable (sorry for the confusion!). If you were to put it into, say, a .zip or .rar (or similar) it wouldn't be crawled, or you could noindex the link, I guess. You would need to stick the PDF (download) behind something that couldn't be crawled. You could try rel=canonical, but I've never tried it with a PDF so I'm not sure how that would go.
Hope that enlightens you a bit.
-
Thanks Chris, although I thought PDFs were crawlable? http://www.lunametrics.com/blog/2013/01/10/seo-pdfs/
That's why I'm worried about dupe content if I use the content of the PDF as body text too. Or are you saying I should nofollow the link to the PDF if I use its content as body text, because it is considered dupe content in that scenario?
Ideally I want both: the copy used as body text on the page and the PDF as a linkable download, or the page embedding the open PDF via a plugin.
-
What would give the user the best experience is really the question. I would say put it on the page; then if the user is lacking a plugin they can still read it. If you have it as a downloadable PDF it shouldn't be able to get crawled, thus avoiding the problem.
Hope that helps.