Index PDF files but redirect to site
-
Hi,
One of our clients has tons of PDFs (manuals, etc.) and frequently gets good rankings for the direct PDF links. While we're happy that the PDFs attract users' attention, we'd like to redirect those users to the page on the site where the PDF link is published, so that people don't open the PDF directly.
In short, we'd like the PDFs to be indexed, but show users the PDF link within a page on the site. How should we proceed?
Thanks,
GM
-
Thanks for the follow-up ... if it weren't for phrases like
- The page displayed to all users who visit from Google must be identical to the content that is shown to Googlebot.
I'd be quite comfortable with that ... in the meantime, however, I might try some pdf2html conversion tools to see if there is a viable way to present the PDF information on an HTML page and block the PDF link for robots.
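A minimal sketch of the "block the PDF link for robots" part, assuming the PDFs sit under a /pdf/ folder (an assumption; adjust to the real path). It uses Python's standard urllib.robotparser only to check that the hypothetical robots.txt rules behave as intended; ROBOTS_TXT, is_crawlable and the example.com URLs are illustrative names, not anything from the client's site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules: keep crawlers out of an assumed /pdf/ folder.
ROBOTS_TXT = """\
User-agent: *
Disallow: /pdf/
"""


def is_crawlable(url: str, user_agent: str = "Googlebot") -> bool:
    """Check a URL against the rules above using the standard-library parser."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The PDF itself is blocked, the HTML version of the manual is not.
    print(is_crawlable("https://example.com/pdf/manual.pdf"))
    print(is_crawlable("https://example.com/manuals/manual.html"))
```

Note that robots.txt only stops crawling; a blocked URL can still show up in results if other pages link to it, so this would need testing before relying on it.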
Regards,
Gert
-
Hi Gert,
After further research, this might not be considered cloaking as such, since Google's First Click Free for Web Search system works the same way and checks the HTTP referer.
For more details, read the official Google Webmaster Central blog post about it here:
http://googlewebmastercentral.blogspot.com/2008/10/first-click-free-for-web-search.html
Best regards,
Guillaume Voyer.
-
Thanks for your detailed reply, Guillaume,
I guess the possible "cloaking troubles" with this strategy are probably too risky for our project. However, I like the "click here" idea, we'll check if we can automate that somehow to drag users reading the PDFs back to our site.
-
Hi Gert,
Technically, this is not possible unless you use cloaking to display the PDF to the search engines and redirect the users to a different page.
What you could do to avoid cloaking is include a banner at the top of each PDF with something like "Click here to see all our related PDFs" that links to your website. This way, users might be interested in going to your website.
Otherwise, you could detect the referer with .htaccess and redirect the user to a page on your site if they are coming from Google, but this might be considered cloaking. Here's an example:
RewriteEngine On
RewriteCond %{HTTP_REFERER} google\. [NC]
RewriteRule ^pdf/.*\.pdf$ /pdf-list [R=302,L]
If you are running an Apache server and put this in your .htaccess file, the first line activates mod_rewrite, the second line checks whether the referer contains "google." (case-insensitively), and the third line redirects all .pdf files in the pdf folder to the /pdf-list page if the referer matches.
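To make the decision logic of that rule easier to test outside Apache, here is a rough Python sketch of the same check. It is only an illustration under assumptions: redirect_target, GOOGLE_REFERER, PDF_PATH and LANDING_PAGE are hypothetical names, the patterns escape the literal dots, and the path is given without a leading slash, as mod_rewrite sees it in a .htaccess file.

```python
import re
from typing import Optional

# Hypothetical names for illustration; they mirror the rewrite rule quoted above.
GOOGLE_REFERER = re.compile(r"google\.", re.IGNORECASE)  # referer contains "google."
PDF_PATH = re.compile(r"^pdf/.*\.pdf$")                  # any .pdf inside the pdf folder
LANDING_PAGE = "/pdf-list"


def redirect_target(path: str, referer: Optional[str]) -> Optional[str]:
    """Return the redirect location, or None when the PDF should be served as-is."""
    if referer and GOOGLE_REFERER.search(referer) and PDF_PATH.match(path):
        return LANDING_PAGE
    return None


if __name__ == "__main__":
    # Visitor arriving from a Google results page is sent to the listing page.
    print(redirect_target("pdf/manual.pdf", "https://www.google.com/search?q=manual"))
    # Direct visit (no referer): the PDF is served normally.
    print(redirect_target("pdf/manual.pdf", None))
```

Keep in mind that the referer header is optional and easy to strip, so a check like this will miss some visitors who do come from Google.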
Best regards,
Guillaume Voyer.