Does Google scrape links from PDF files? Do these links pass link juice?
-
Title is pretty much the whole question.
-
I ran a test, and it seems that yes, links in PDFs do count for ranking.
The test is on my Romanian blog http://seogan.ro/link-building-pdf-urile-o-sursa-de-linkuri-test
You can find an English translation here: http://www.seogan.com/pdf-link-building
Hope it helps.
-
Yes, it does, according to Google's technical documentation for the Search Appliance: http://code.google.com/apis/searchappliance/documentation/50/admin_crawl/Introduction.html
It specifically states: 'It follows HTML links in PDF files, Word documents, and Shockwave documents'. Google's own API docs carry more weight than a comment in a forum. If they are licensing this out as an application, it suggests that the same technology is available in the main engine, as does Dunamis's comment about a listing in a PDF document being found in search results.
You can test this for yourself by publishing a PDF with a link to an info page that is not linked from anywhere else. Include the PDF in your sitemap but leave the test page out, then after the next crawl check whether the test page shows up in Google's index with site:yoursite.com.
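If it helps, here is a minimal sketch of the sitemap part of that test, using only Python's standard library. All URLs and the `build_sitemap` helper are hypothetical placeholders I made up for illustration, so swap in your own pages. It writes every URL into the sitemap except the hidden test page, so the only path to that page is the link inside the PDF.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls, excluded):
    """Return sitemap XML listing every URL in `urls` except those in `excluded`."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        if url in excluded:
            continue  # keep the test page out so only the PDF links to it
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical URLs, for illustration only.
site_urls = [
    "https://yoursite.com/",
    "https://yoursite.com/files/link-test.pdf",
    "https://yoursite.com/pdf-link-test-page.html",
]
hidden_test_page = "https://yoursite.com/pdf-link-test-page.html"

sitemap_xml = build_sitemap(site_urls, excluded={hidden_test_page})
print(sitemap_xml)
```

Keep in mind this only sets up the experiment. If the test page later appears in the index, that suggests the link in the PDF was followed, but it says nothing by itself about whether the link passes PageRank.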
An interview with Matt Cutts also gives some insight: http://www.stonetemple.com/articles/interview-matt-cutts-012510.shtml
Eric Enge: What about PDF files?
Matt Cutts: We absolutely do process PDF files. I am not going to talk about whether links in PDF files pass PageRank. But, a good way to think about PDFs is that they are kind of like Flash in that they aren't a file format that's inherent and native to the web, but they can be very useful. In the same way that we try to find useful content within a Flash file, we try to find the useful content within a PDF file. At the same time, users don't always like being sent to a PDF. If you can make your content in a Web-Native format, such as pure HTML, that's often a little more useful to users than just a pure PDF file.
-
This person seems to think no: http://www.google.fr/support/forum/p/Webmasters/thread?tid=14c5fe970fe84361&hl=en
But I'm not sure how much I can trust a random comment from a random source. Is there any evidence for either argument?
EDIT: And this person seems to think they do pass link juice: http://www.whydowork.com/blog/link-building/274/
Could a mod remove the "marked as answered" status? I don't think I'm able to remove it myself, and the question isn't really answered.
-
Yes, but do they crawl the links they find in these documents, or do they just index their contents?
-
Hmmm, although I thought you had answered my question, I actually feel that you have not. Yes, the links you provided state that Google scrapes PDFs and even OCRs them to get a better idea of what is in them, but I don't see anywhere that they mention crawling the URLs they find in these PDF documents.
-
Google definitely does index the contents of PDF files. I found this out the hard way: I had a real estate PDF on my site that I wanted to have listed in the index, but I didn't know that its contents would be crawled. The PDF contained some listings that I was not legally allowed to advertise on my site. (It was legal for me to give someone a report with the listings in it, though.)
When another realtor was searching for their own listing, my PDF came up. I got in trouble. I'm OK now, though.
-
Have a look at this article: http://searchenginewatch.com/article/2067225/Google-Does-PDF-Other-Changes
It explains some of the doc library search for PDF files. See also Google's statement here: http://googleblog.blogspot.com/2008/10/picture-of-thousand-words.html