Does Google scrape links from PDF files? Do these links pass link juice?
-
Title is pretty much the whole question.
-
I ran a test, and it seems that yes, links in PDFs do count for ranking.
The test is on my Romanian blog: http://seogan.ro/link-building-pdf-urile-o-sursa-de-linkuri-test
You can find an English translation here: http://www.seogan.com/pdf-link-building
Hope it helps.
-
Yes, it does, according to Google's technical documentation for the Search Appliance: http://code.google.com/apis/searchappliance/documentation/50/admin_crawl/Introduction.html
It specifically states: 'It follows HTML links in PDF files, Word documents, and Shockwave documents.' Google's own API docs carry more weight than a comment in a forum. If they are licensing this out as an application, that suggests the same technology is available in the main engine, as does Dunamis's comment about a listing in a PDF document being found in search results.
You can test this for yourself by publishing a PDF that links to an info page that no other page links to. Include the PDF in your sitemap but not the test page, then check whether the test page shows up in Google's index (search site:yoursite.com) after the next crawl.
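For the sitemap part of that test, here is a minimal sketch in Python. It only builds a sitemap that lists the PDF and leaves out the test page; the URLs, file names, and the build_sitemap helper are hypothetical placeholders, not anything from Google's docs, so swap in your own.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls, excluded=()):
    """Return sitemap XML listing every URL in `urls` that is not in `excluded`."""
    skip = set(excluded)
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        if url in skip:
            continue  # leave the test page out so only the PDF link can reveal it
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical URLs for illustration only.
PDF_URL = "http://yoursite.com/test-report.pdf"
TEST_PAGE_URL = "http://yoursite.com/orphan-test-page.html"

sitemap_xml = build_sitemap(
    ["http://yoursite.com/", PDF_URL, TEST_PAGE_URL],
    excluded=[TEST_PAGE_URL],
)
print(sitemap_xml)
```

If the test page later appears in the index, the PDF link is the most likely way it was discovered, although that alone says nothing about whether the link passes PageRank.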
This interview with Matt Cutts also gives some insight: http://www.stonetemple.com/articles/interview-matt-cutts-012510.shtml
Eric Enge: What about PDF files?
Matt Cutts: We absolutely do process PDF files. I am not going to talk about whether links in PDF files pass PageRank. But, a good way to think about PDFs is that they are kind of like Flash in that they aren't a file format that's inherent and native to the web, but they can be very useful. In the same way that we try to find useful content within a Flash file, we try to find the useful content within a PDF file. At the same time, users don't always like being sent to a PDF. If you can make your content in a Web-Native format, such as pure HTML, that's often a little more useful to users than just a pure PDF file.
-
This person seems to think no: http://www.google.fr/support/forum/p/Webmasters/thread?tid=14c5fe970fe84361&hl=en
But I'm not sure how much I can trust a random comment from a random source. Is there any evidence for either argument?
EDIT: And this person seems to think they do pass link juice: http://www.whydowork.com/blog/link-building/274/
Could a mod remove the "marked as answered" status? I don't think I am able to remove it, and the question isn't really answered.
-
Yes, but do they crawl the links they find in these documents, or do they just index their contents?
-
Hmmm, although I thought you had answered my question, I actually feel that you have not... Yes, the links you provided state that Google scrapes PDFs and even OCRs PDFs to get a better idea of what is in them, but I don't see anywhere that they mention crawling the URLs they find in these PDF documents.
-
Google definitely does index the contents of PDF files. I found this out the hard way: I had a real estate PDF on my site that I wanted to have listed in the index, but I didn't know that the contents would be crawled. The PDF contained some listings that I was not legally allowed to advertise on my site. (It was legal for me to give someone a report with the listings in it, though.)
When another realtor was searching for their own listing, my pdf came up. I got in trouble. I'm ok now though.
-
Have a look at this article, which explains some of the doc library search for PDF files: http://searchenginewatch.com/article/2067225/Google-Does-PDF-Other-Changes
Google's statement is here: http://googleblog.blogspot.com/2008/10/picture-of-thousand-words.html