PDF's - Dupe Content
-
Hi
I have some pdfs linked to from a page with little content. Hence thinking best to extract the copy from the pdf and have on-page as body text, and the pdf will still be linked too. Will this count as dupe content ?
Or is it best to use a pdf plugin so page opens pdf automatically and hence gives page content that way ?
Cheers
Dan
-
Should be different, but you would have to look at them to make sure.
-
ps - is a pdf to html coverter different from a plugin that loads the pdf as an open page when you click it ? or same thing ?
-
That is what I was going to suggest - setting up a canonical in the http header of the PDF back to the article
https://support.google.com/webmasters/answer/139394?hl=en
As another option, you can just block access to the PDFs to keep them out of the index as well.
-
thanks Chris
yes you can canonicalise the pdf to the html (according to the comments of that article i just linked to anyway)
-
Hi Dan,
Yes PDFs are crawlable (sorry for confusion!) if you were to put it into say a .zip or .rar (or similar) it wouldn't be crawled or you could no index the link i guess. You would need to stick the PDF (download) behind some thing that couldn't be crawled. You could try rel= canonical but I've never tried it with a PDF so i'm not sure how that would go.
Hope that enlightens you a bit.
-
Thanks Chris although i thought PDFS were crawlable??: http://www.lunametrics.com/blog/2013/01/10/seo-pdfs/
Hence why im worried about dupe content if use content of pdf as body text too OR are you saying should no-follow the link to the pdf if use its content as body text because it is considered dupe content in that scenario ?
Ideally i want both - the copy on it used as body text copy on page and the pdf a linkable download, or page as embed of open pdf via a plugin.
-
What would give the user the best experience is the really question,I would;d say put it on page then if the user is lacking a plugin they can still read it, if you have it as a downloadable PDF is shouldn't be able to get crawled and thus avoiding the problem.
Hope that helps.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Potential duplicate content issue?
We have a category on our website for PVC rolls to buy as standard 50m rolls (this includes 15 products in the category). We're also releasing PVC rolls to buy per metre (10m roll/25m roll etc...), again with 15 products, which we are adding as a separate category as it makes more sense for our customers and removes the risk of having too many options. Would using the same description be bad practice for SEO? The product is exactly the same just available in different roll sizes, but we definitely do not want to combine categories as it doesn't work for our customers. Any help or suggestions would be appreciated, thanks.
On-Page Optimization | | RayflexGroup0 -
Pages I don't want
Hello Friends, A few days ago, I set up my wordpress site and everything is great so far, there's just one small problem. Wenn I search for site:mysite.com I see a lot of pages in the index I don't want to have on my site. It looks like demo pages from the theme I bought. Examples: http://mysite.com/themefusion_es_groups/group1/ http://mysite.com/slide/youtube/ http://mysite.com/lorem-ipsum-2/ http://mysite.com/slide/demo-5 What would be the best way to handle it? Simply delete the pages, then a soft 404 shows. Delete the pages and 301 them to the start page. Are they disapearing from the infex then? I am very grateful for tips regarding this! Thanks in Advance:)
On-Page Optimization | | grobro0 -
What's the point of my blog?
My website, www.toplinecomms.com has a reasonably good blog that gets quite good interaction and sharing. I introduced the blog at the start of 2013 because the general sentiment from all the SEO books and articles I had read was that a good blog could be invaluable to a search marketing campaign. The posts on the blog are keyword optimised and they get great shares and social engagement. However, I have noticed that the blog is stealing my services' pages' thunder! There are some keywords that I am keen for our services pages to rank for, but the blog is beating them to it! So my question is: How should I be using my blog to get my services pages to rank higher?
On-Page Optimization | | HeatherBakerTopLine0 -
What Should I Do With Low Quality Content?
As my site has definitely got hit by Panda, I am in the process of cleaning my website of low quality content. Needless to say, shitty articles are completed being removed but I think lots of this content is now of low quality because it is obsolete and dated. So what should I do with this content? Should I rewrite those articles as completely new posts and link from the old posts to the new ones? Or should I delete the old posts and do a 301 redirect to the new post? Or should I rewrite the content of these articles in place so I can keep the old URL and backlinks? One thing is that I've got a lot more followers than I used to so publishing a new post gets a lot more views, like and shares and whatnot from social networks.
On-Page Optimization | | sbrault741 -
Home Page Content
Hello. i'm optimizing this website, > home page for one keyword phrase and i was wondering how many words article do i need with that keyword?and if i need it at all? as you can see if i add some content on my home page before the slider, it will ruin the look of the website, What is the right way to do it? Thank you!
On-Page Optimization | | KentR0 -
What's the best way to handle crawling of photo gallery?
When you have a photo gallery with many search filters and loads and loads of pages, is it best to block all the filters and use google's pagination code? Ex: http://photo.net/gallery/photocritique/filter This site has pages for many different queries. While the page titles are unique, the pages are showing duplicated content.
On-Page Optimization | | cakelady0 -
Quick question about bold italics keywords in today's SEO world
Hello guyz do you think that , **or **tags still help you in ranking better for some keyword or this method has become obsolete?****
On-Page Optimization | | ksbnok0 -
Duplicate Content Warning
Hi Mozers, I have a question about the duplicate content warnings I am recieving for some of my pages. I noticed that the below pattern of URLs are being flagged as duplicate content. I understand that these are seen as two different pages but I would like to know if this has an negative impact on my SEO? Why is this happening? How do I stop it from happening? http://www.XXXX.com/product1234.html?sef_rewrite=1 http://www.XXXX.com/product1234.html Thanks in advance!
On-Page Optimization | | mozmonkey0