Does Google scrape links from PDF files? Do these links pass link juice?
-
Title is pretty much the whole question.
-
I ran a test, and it seems that yes, links in PDFs do count for ranking.
The test is on my Romanian blog: http://seogan.ro/link-building-pdf-urile-o-sursa-de-linkuri-test
You can find an English translation here: http://www.seogan.com/pdf-link-building
Hope it helps.
-
Yes, it does, according to Google's tech spec: http://code.google.com/apis/searchappliance/documentation/50/admin_crawl/Introduction.html
It specifically states: 'It follows HTML links in PDF files, Word documents, and Shockwave documents'. Google's own API docs carry more weight than a comment in a forum. If they are licensing this out as an application, it suggests that the same technology is available in the main engine, as does Dunamis's comment about a listing in a PDF document being found in search results.
You can test this for yourself by publishing a PDF with a link to an info page that is not linked from anywhere else. Include the PDF in your sitemap but not the test page, and after the next crawl check whether the test page shows up in Google's index with site:yoursite.com.
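If it helps, here is a rough sketch of the sitemap part of that test in Python. The URLs and the `build_sitemap` helper are made up for illustration; swap in your own pages. It only writes out a sitemap that lists the PDF and leaves the test page out, so the link inside the PDF is the only path to that page.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls, excluded):
    """Return sitemap XML listing every URL in `urls` except those in `excluded`."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        if url in excluded:
            # Leave the test page out so only the link in the PDF leads to it.
            continue
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical URLs, for illustration only.
PDF_URL = "https://yoursite.com/files/link-test.pdf"
TEST_PAGE = "https://yoursite.com/pdf-link-test-page.html"

sitemap_xml = build_sitemap(
    ["https://yoursite.com/", PDF_URL, TEST_PAGE],
    excluded={TEST_PAGE},
)
print(sitemap_xml)
```

If the test page later appears in a site: search, the crawler most likely reached it through the PDF, provided nothing else links to it.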
An interview with Matt Cutts also gives some insight: http://www.stonetemple.com/articles/interview-matt-cutts-012510.shtml
Eric Enge: What about PDF files?
Matt Cutts: We absolutely do process PDF files. I am not going to talk about whether links in PDF files pass PageRank. But, a good way to think about PDFs is that they are kind of like Flash in that they aren't a file format that's inherent and native to the web, but they can be very useful. In the same way that we try to find useful content within a Flash file, we try to find the useful content within a PDF file. At the same time, users don't always like being sent to a PDF. If you can make your content in a Web-Native format, such as pure HTML, that's often a little more useful to users than just a pure PDF file.
-
This person seems to think not: http://www.google.fr/support/forum/p/Webmasters/thread?tid=14c5fe970fe84361&hl=en
But I'm not sure how much I can trust a random comment from a random source. Is there any evidence for either argument?
EDIT: And this person seems to think they do pass link juice: http://www.whydowork.com/blog/link-building/274/
Could a mod remove the "marked as answered" status? I don't think I am able to remove it, and the question isn't really answered.
-
Yes, but do they crawl the links they find in these documents, or do they just index their contents?
-
Hmm, although I thought you had answered my question, I actually feel that you have not... Yes, the links you provided state that Google scrapes PDFs and even OCRs them to get a better idea of what is in them, but I don't see anywhere that they mention crawling the URLs they find in these PDF documents.
-
Google definitely does index the contents of PDF files. I found this out the hard way: I had a real estate PDF on my site that I wanted to have listed in the index, but I didn't know that its contents would be crawled. The PDF contained some listings that I was not legally allowed to advertise on my site. (It was legal for me to give someone a report with the listings in it, though.)
When another realtor was searching for their own listing, my PDF came up. I got in trouble. I'm OK now, though.
-
Have a look at this article: http://searchenginewatch.com/article/2067225/Google-Does-PDF-Other-Changes It explains some of the doc library search for PDF files. See also Google's statement here: http://googleblog.blogspot.com/2008/10/picture-of-thousand-words.html