Can't get Google to Index .pdf in wp-content folder
-
We created an in-depth case study/survey for a legal client and can't get Google to crawl the PDF, which is hosted on WordPress in the wp-content folder. It is linked to heavily from nearly all pages of the site by a global sidebar. Am I missing something obvious as to why Google won't crawl this PDF? We can't get much value from it unless it gets indexed. Any help is greatly appreciated. Thanks!
Here is the PDF itself:
http://www.billbonebikelaw.com/wp-content/uploads/2013/11/Whitepaper-Drivers-vs-cyclists-Floridas-Struggle-to-share-the-road.pdf
Here is the page it is linked from:
http://www.billbonebikelaw.com/resources/drivers-vs-cyclists-study/
-
Egol,
Thank you for your advice and for your commendation. I really appreciate that. We worked hard on it, and it is a fun niche to be in, with lots of active and involved cyclists who appreciate this kind of piece. I will edit the properties of the document. I hadn't thought about that. Thanks very much!
-
PDF documents can be stubborn to get into the index. If you are hitting it with plenty of links, be patient.
This is a really classy document. Very nice. Well done.
If this were my document, I would edit the "properties" and give it a title (the PDF equivalent of a title tag) that will enable it to compete in the SERPs a little better. There is also some blue underlined text in the document. Were those supposed to be hyperlinks? They are not working for me. However, I see that BillBoneBikeLaw.com in the last line of the document is a working hyperlink. That will allow PageRank to flow back into the main site.
This is a "best on the web" document. Nice work.
I really enjoyed reading the document. I used to be a hard core biker. I've been honked at, yelled at, cussed at, throwed at, spat at, and hit by both accident and intent. A long time ago I won one of the age groups at The Great Floridian. I still remember Sugarloaf and a mad dog attacking out of an orange grove on the downhill.
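On the original question of whether something obvious is blocking the crawl: one quick thing to rule out is a robots.txt rule covering the wp-content folder. Here is a minimal sketch of that check using Python's standard-library `urllib.robotparser`. The two robots.txt bodies below are hypothetical examples made up for illustration, not the site's actual file, and the function name `is_crawl_allowed` is my own.

```python
from urllib.robotparser import RobotFileParser


def is_crawl_allowed(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The PDF URL from the question.
PDF_URL = (
    "http://www.billbonebikelaw.com/wp-content/uploads/2013/11/"
    "Whitepaper-Drivers-vs-cyclists-Floridas-Struggle-to-share-the-road.pdf"
)

# Hypothetical robots.txt that would block everything under /wp-content/.
BLOCKING_RULES = """User-agent: *
Disallow: /wp-content/
"""

# Hypothetical robots.txt that only blocks the admin area.
OPEN_RULES = """User-agent: *
Disallow: /wp-admin/
"""

blocked_case = is_crawl_allowed(BLOCKING_RULES, PDF_URL)
open_case = is_crawl_allowed(OPEN_RULES, PDF_URL)
```

To test a live site instead of a pasted string, `RobotFileParser` also has `set_url()` and `read()` to download the file. Keep in mind this only covers robots.txt; a passing result does not prove the PDF will be indexed, so patience may still be the answer.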