Does Google scrape links from PDF files? Do these links pass link juice?
-
Title is pretty much the whole question.
-
I ran a test and it seems that yes, links in PDFs do count for ranking.
The test is on my Romanian blog: http://seogan.ro/link-building-pdf-urile-o-sursa-de-linkuri-test
You can find an English translation here: http://www.seogan.com/pdf-link-building
Hope it helps.
-
Yes, it does, according to Google's tech spec: http://code.google.com/apis/searchappliance/documentation/50/admin_crawl/Introduction.html
It specifically states: 'It follows HTML links in PDF files, Word documents, and Shockwave documents'. Google's own API docs carry more weight than a comment in a forum. If they are licensing this out as an application, that suggests the same technology is available in the main engine, as does Dunamis's comment about a listing in a PDF document being found in search results.
You can test this for yourself by publishing a PDF with a link to an info page that is not linked from anywhere else. Include the PDF in your sitemap but not the test page, then check whether the test page shows up in Google's index (site:yoursite.com) the next time it crawls.
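The sitemap half of that test can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the URLs are placeholders on yoursite.com, and `build_sitemap` is a hypothetical helper written for this post, not part of any Google tool. It lists the PDF in the sitemap and deliberately leaves the test page out, so the PDF link is the only path to that page.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls, excluded=()):
    """Return sitemap XML listing `urls`, skipping anything in `excluded`."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        if url in excluded:
            continue  # keep the test page out of the sitemap
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical URLs, for illustration only.
PDF_URL = "https://yoursite.com/files/link-test.pdf"
TEST_PAGE = "https://yoursite.com/hidden-test-page.html"

sitemap_xml = build_sitemap(
    ["https://yoursite.com/", PDF_URL, TEST_PAGE],
    excluded={TEST_PAGE},
)
print(sitemap_xml)
```

If the test page later appears in a site:yoursite.com search, the link inside the PDF is the most likely way it was discovered, provided nothing else links to it.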
An interview with Matt Cutts also gives some insight: http://www.stonetemple.com/articles/interview-matt-cutts-012510.shtml
Eric Enge: What about PDF files?
Matt Cutts: We absolutely do process PDF files. I am not going to talk about whether links in PDF files pass PageRank. But, a good way to think about PDFs is that they are kind of like Flash in that they aren't a file format that's inherent and native to the web, but they can be very useful. In the same way that we try to find useful content within a Flash file, we try to find the useful content within a PDF file. At the same time, users don't always like being sent to a PDF. If you can make your content in a Web-Native format, such as pure HTML, that's often a little more useful to users than just a pure PDF file.
-
This person seems to think not: http://www.google.fr/support/forum/p/Webmasters/thread?tid=14c5fe970fe84361&hl=en
But I'm not sure how much I can trust a random comment from a random source. Is there any evidence for either argument?
EDIT: And this person seems to think they do pass link juice: http://www.whydowork.com/blog/link-building/274/
Could a mod remove the "marked as answered" status? I don't think I am able to remove it, and the question isn't really answered.
-
Yes, but do they crawl the links they find in these documents, or do they just index their contents?
-
Hmmm, although I thought you had answered my question, I actually feel that you have not... Yes, the links you provided state that Google scrapes PDFs and even OCRs them to get a better idea of what is in them, but I don't see anywhere that they mention crawling the URLs they find in these PDF documents.
-
Google definitely does index the contents of PDF files. I found this out the hard way: I had a real estate PDF on my site that I wanted to have listed in the index, but I didn't know that its contents would be crawled. The PDF contained some listings that I was not legally allowed to advertise on my site. (It was legal for me to give someone a report with the listings in it, though.)
When another realtor was searching for their own listing, my pdf came up. I got in trouble. I'm ok now though.
-
Have a look at this article: http://searchenginewatch.com/article/2067225/Google-Does-PDF-Other-Changes
It explains some of the doc library search for PDF files. See also Google's statement here: http://googleblog.blogspot.com/2008/10/picture-of-thousand-words.html