I am really interested in hearing what others have to say about this.
I know that .pdfs can be very valuable content. They can be optimized, they rank in the SERPs, they accumulate PR and they can pass linkvalue. So, to me it would be a mistake to block them from the index...
However, I see your point about dupe content... they could also be thin content. Will panda whack you for thin and dupes in your PDFs?
How can canonical be used... what about author?
Anybody know anything about this?