Does Google scrape links from PDF files? Do these links pass link juice?
-
Title is pretty much the whole question.
-
I ran a test, and it seems that yes, links in PDFs do count for ranking.
The test is on my Romanian blog: http://seogan.ro/link-building-pdf-urile-o-sursa-de-linkuri-test
You can find an English translation here: http://www.seogan.com/pdf-link-building
Hope it helps.
-
Yes, it does, according to Google's Search Appliance documentation: http://code.google.com/apis/searchappliance/documentation/50/admin_crawl/Introduction.html
It specifically states: 'It follows HTML links in PDF files, Word documents, and Shockwave documents'. Google's own API docs carry more weight than a comment in a forum. If they are licensing this out as an application, it suggests that the same technology is available in the main engine, as does Dunamis's comment about a listing in a PDF document being found in search results.
You can test this for yourself by publishing a PDF with a link to an info page that is not linked from anywhere else. Include the PDF in your sitemap but not the test page, then check whether the test page shows up in Google's index (site:yoursite.com) the next time it crawls.
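For anyone who wants to set up that test, here is a rough sketch of the sitemap step in Python. The URLs and the build_sitemap helper are made up for illustration and are not from any Google tool; the only point is that the PDF is listed and the test page is deliberately left out, so the PDF link is the only path to it.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Hypothetical URLs, assumed for this sketch only.
PDF_URL = "http://yoursite.com/docs/link-test.pdf"
TEST_PAGE = "http://yoursite.com/pdf-only-test-page.html"


def build_sitemap(urls, excluded):
    """Return sitemap XML listing every URL in `urls` except those in `excluded`."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        if url in excluded:
            continue  # keep the test page out so only the PDF links to it
        loc = ET.SubElement(ET.SubElement(urlset, "url"), "loc")
        loc.text = url
    return ET.tostring(urlset, encoding="unicode")


sitemap_xml = build_sitemap(
    ["http://yoursite.com/", PDF_URL, TEST_PAGE],
    excluded={TEST_PAGE},
)
print(sitemap_xml)
```

If the test page later appears under site:yoursite.com, that suggests the link in the PDF was followed, although it says nothing on its own about whether the link passes PageRank.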
An interview with Matt Cutts also gives some insight: http://www.stonetemple.com/articles/interview-matt-cutts-012510.shtml
Eric Enge: What about PDF files?
Matt Cutts: We absolutely do process PDF files. I am not going to talk about whether links in PDF files pass PageRank. But, a good way to think about PDFs is that they are kind of like Flash in that they aren't a file format that's inherent and native to the web, but they can be very useful. In the same way that we try to find useful content within a Flash file, we try to find the useful content within a PDF file. At the same time, users don't always like being sent to a PDF. If you can make your content in a Web-Native format, such as pure HTML, that's often a little more useful to users than just a pure PDF file.
-
This person seems to think no: http://www.google.fr/support/forum/p/Webmasters/thread?tid=14c5fe970fe84361&hl=en
But I'm not sure how much I can trust a random comment from a random source. Is there any evidence for either argument?
EDIT: And this person seems to think they do pass link juice: http://www.whydowork.com/blog/link-building/274/
Could a mod remove the "marked as answered" status? I don't think I am able to remove it, and the question isn't really answered.
-
Yes, but do they crawl the links they find in these documents, or do they just index their contents?
-
Hmmm, although I thought you had answered my question, I actually feel that you have not... Yes, the links you provided state that Google scrapes PDFs and even OCRs them to get a better idea of what is in them, but I don't see anywhere that they mention crawling the URLs they find in these PDF documents.
-
Google definitely does index the contents of PDF files. I found this out the hard way: I had a real estate PDF on my site that I wanted to have listed in the index, but I didn't know that its contents would be crawled. The PDF contained some listings that I was not legally allowed to advertise on my site. (It was legal for me to give someone a report with the listings in it, though.)
When another realtor was searching for their own listing, my PDF came up. I got in trouble. I'm OK now, though.
-
Have a look at this article, which explains some of the doc library search for PDF files: http://searchenginewatch.com/article/2067225/Google-Does-PDF-Other-Changes
Google's statement is here: http://googleblog.blogspot.com/2008/10/picture-of-thousand-words.html