PDFs - Dupe Content
-
Hi
I have some PDFs linked to from a page with little content, so I'm thinking it's best to extract the copy from the PDF and put it on the page as body text, with the PDF still linked too. Will this count as dupe content?
Or is it best to use a PDF plugin so the page opens the PDF automatically and gets its content that way?
Cheers
Dan
-
Should be different, but you would have to look at them to make sure.
-
PS - is a PDF to HTML converter different from a plugin that loads the PDF as an open page when you click it? Or the same thing?
-
That is what I was going to suggest - setting up a canonical in the HTTP header of the PDF back to the article:
https://support.google.com/webmasters/answer/139394?hl=en
As another option, you can just block access to the PDFs to keep them out of the index as well.
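For what it's worth, here is a rough sketch of what those two options look like as response headers on the PDF. Python is used only for illustration; the function names and the example.com URL are made up, and how you actually attach headers depends on your server. The `Link` header with rel="canonical" points the PDF back to the article, and `X-Robots-Tag: noindex` is one way to keep the PDF out of the index.

```python
from typing import Tuple
from urllib.parse import urlsplit


def canonical_link_header(html_url: str) -> Tuple[str, str]:
    """Build the HTTP Link header that canonicalises a PDF to its HTML page."""
    parts = urlsplit(html_url)
    # Use an absolute URL for the canonical target to avoid ambiguity.
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError("canonical target must be an absolute http(s) URL")
    return ("Link", f'<{html_url}>; rel="canonical"')


def noindex_header() -> Tuple[str, str]:
    """Build the header asking search engines not to index the file."""
    return ("X-Robots-Tag", "noindex")


# Hypothetical article URL, for illustration only.
name, value = canonical_link_header("https://www.example.com/articles/white-paper.html")
print(f"{name}: {value}")
```

You would use one or the other, not both: send the canonical header if you want the HTML page to get the credit, or the noindex header if you just want the PDF kept out of search results.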
-
Thanks Chris
Yes, you can canonicalise the PDF to the HTML (according to the comments on that article I just linked to, anyway)
-
Hi Dan,
Yes, PDFs are crawlable (sorry for the confusion!). If you were to put it into, say, a .zip or .rar (or similar) it wouldn't be crawled, or you could noindex the link I guess. You would need to stick the PDF (download) behind something that couldn't be crawled. You could try rel=canonical, but I've never tried it with a PDF so I'm not sure how that would go.
Hope that enlightens you a bit.
-
Thanks Chris, although I thought PDFs were crawlable?: http://www.lunametrics.com/blog/2013/01/10/seo-pdfs/
Hence why I'm worried about dupe content if I use the content of the PDF as body text too. Or are you saying I should nofollow the link to the PDF if I use its content as body text, because it is considered dupe content in that scenario?
Ideally I want both: the copy used as body text on the page and the PDF as a linkable download, or the page as an embed of the open PDF via a plugin.
-
What would give the user the best experience is really the question. I would say put it on the page, then if the user is lacking a plugin they can still read it. If you have it as a downloadable PDF it shouldn't be able to get crawled, thus avoiding the problem.
Hope that helps.