PDFs and webpages
-
If a website provides PDF versions of the page as a download option, should the PDF be no-indexed in your opinion?
We have to offer PDF versions of the webpage as our customers want them, they are a group who will download/print the pdfs. I thought of leaving the pdfs alone as they site in a subdomain but the more I think about it, I should probably noindex them. My reasons
- They site in a subdomain, if users have linked to them, my main domain isn't getting the rank juice
- Duplication issues, they might be affecting the rank of the existing webpages
- I can't track the PDF as they are in a subdomain, I can see event clicks to them from the main site though
On the flipside
- I could lose out on the traffic the pdfs bring when a user loads it from an organic search and any link existing on the pdf
What are your experiences?
-
Cool. It's advisable to add canonical HTTP headers to the PDFs too, if you can.
-
Thanks Alex,
I do have canonical tags on the webpages to ensure they are seen as the main one. I'll look into tracking subdomains.
-
Google now class subdomains pretty much as part of your main domain: http://www.youtube.com/watch?v=_MswMYk05tk - so you will be getting some of that rank juice.
I'd think that the major search engines wouldn't have a problem knowing that an HTML version of a page is preferred over a PDF. However, you can use canonical HTTP headers to make sure there are no problems with duplicate content: http://moz.com/blog/how-to-advanced-relcanonical-http-headers
If you use Google Analytics you will be able to track the subdomain. You can do it as part of your existing profile or by setting up a separate one: https://developers.google.com/analytics/devguides/collection/gajs/gaTrackingSite (ensure this is the version of Analytics you have installed).
There's a short guide here on getting more data about PDFs through Google Analytics: http://moz.com/ugc/how-to-track-pdf-traffic-links-in-google-analytics-open-site-explorer
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
I have a same paragraph appearing on two webpages of my site!
is it gonna affect rankings if so what should be done thanks
Intermediate & Advanced SEO | | Sam09schulz0 -
What can we do to optimize / be mobile-friendly for PDFs?
I'm getting a "Your page is not mobile-friendly." notice in the SERPs for all of our PDFs. I check the pdf on the phone and it appears just fine. rFtLq
Intermediate & Advanced SEO | | johnnybgunn0 -
Best Practices for Converting PDFs to HTML
We're working with a client who gets about 80% of their organic, inbound search traffic from links to PDF files on their site. Obviously, this isn't ideal, because someone who just downloads a PDF file directly from a Google query is unlikely to interact with the site in any other way. I'm looking to develop a plan to convert those PDF files to HTML content, and try to get at least some of those visitors to convert into subscribers. What's the best way to go about this? My plan so far is: Develop HTML landing pages for each of the popular PDFs, with the content from the PDF, as well as the option to download the PDF with an email signup. Gradually implement 301 redirects for the existing PDFs, and see what that does to our inbound SEO traffic. I don't want to create a dip in traffic, although our current "direct to inbound" traffic is largely useless. Are their things I should watch out for? Will I get penalized by Google for redirecting a PDF to HTML content? Other things I should be aware of?
Intermediate & Advanced SEO | | atourgates0 -
Moving a lot of pdfs to main site. Worth trying to get them indexed?
On my main site we link to pdfs that are located on another one of our domains. The only thing that is on this other domain is the pdfs. It was setup really poorly so I am going to redesign everything and probably move it. Is it worthwhile trying to add these pdfs to our sitemap and to try and get them indexed? They are all connected to a current item, but the content is original.
Intermediate & Advanced SEO | | EcommerceSite0 -
Should I change my product infomation (PDF attatchments) into additional webpages with links ?
Hello All, On our eCommerce site some products have additional information which we currently show via a PDF link next to the product. I am thinking, is it more beneficial from an SEO point of view , If I was to put this additional pdf information to a webpage and have a link going from the product to this . From what I read, google cannot read contents of pdfs so if I was to have this as webpage via a link , then the product page would get more keywords and strength around it which would help improve it's seo etc. Just wondered if this is the best way forward or not ? thanks Peter
Intermediate & Advanced SEO | | PeteC120 -
How to make AJAX content crawlable from a specific section of a webpage?
Content is located in a specific section of the webpage that are being loaded via AJAX.
Intermediate & Advanced SEO | | zpm20140 -
Javascript to fetch page title for every webpage, is it good?
We have a zend framework that is complex to program if you ask me, and since we have 20k+ pages that we need to get proper titles to and meta descriptions, i need to ask if we use Javascript to handle page titles (basically the previously programming team had NOT set page titles at all) and i need to get proper page titles from a h1 tag within the page. current course of action which we can easily implement is fetch page title from that h1 tag being used throughout all pages with the help of javascript, But this does makes it difficult for engines to actually read what's the page title? since its being fetched with javascript code that we have put in, though i had doubts, is anyone one of you have simiilar situation before? if yes i need some help! Update: I tried the JavaScript way and here is what it looks like http://islamicencyclopedia.org/public/index/hadith/id/1/book_id/106 i know the fact that google won't read JavaScript like the way we have done with the website, But i need help on "How we can work around this issue" Knowing we don't have other options.
Intermediate & Advanced SEO | | SmartStartMediacom0 -
Exact Syntax for Canonical to PDFs for Windows Server
Hi There, I have got in my web several PDFs with the same content of the HTML version. Thus I need to set up a canonical for each of them in order to avoid duplicate content. In particular, I need to know how to write the exact syntax for the windows server (web.config) in order to implement the canonical to PDF. I surfed the web but it seems I cannot find this piece of info anywhere Thanks a lot!!
Intermediate & Advanced SEO | | Midleton0