Rel="canonical" for PDFs?
-
Hello there,
We have a lot of PDFs that seem to end up on other websites. I was wondering if there is a way to make sure that our website gets the credit/authority as the original creator. Besides linking directly from the PDF copy to our pages, is anyone aware of a strategy for letting Google know that we are the original publishers?
I know search engines can index HTML versions of PDFs, so is there any way to get them to index a rel="canonical" tag as well?
Thoughts/Ideas?
-
I stand corrected on that point.
Thank you Jassy for sharing the link. I was not aware Google made that change.
-
I'm not sure that statement about rel canonical only working within your own domain is right. If you have some test data or similar that shows this to be the case, I'd love to hear about it.
Matt Cutts specifically says that cross-domain rel canonical is supported; see the webmaster video at www.youtube.com/watch?v=zI6L2N4A0hA
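On the PDF part of the question: a PDF has no HTML head to hold a canonical tag, but as far as I know Google also documents support for sending rel="canonical" as an HTTP Link response header, which can be attached to non-HTML files. Below is a minimal Python sketch of building that header. The function name and the example.com URL are made up for illustration, and how you actually attach the header depends on your server or framework.

```python
from typing import Tuple
from urllib.parse import urlsplit


def canonical_link_header(canonical_url: str) -> Tuple[str, str]:
    """Build a (name, value) pair for an HTTP Link header that
    declares canonical_url as the canonical version of a resource.

    Hypothetical helper for illustration; not part of any library.
    """
    parts = urlsplit(canonical_url)
    # The canonical target should be an absolute http(s) URL.
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError("canonical URL must be an absolute http(s) URL")
    # Reject characters that would break the <...> delimiters or the header line.
    if any(ch in canonical_url for ch in '<>" \r\n'):
        raise ValueError("canonical URL contains characters not allowed here")
    return ("Link", f'<{canonical_url}>; rel="canonical"')


# Example: header to send with the response that serves the PDF file.
name, value = canonical_link_header("https://www.example.com/whitepaper.html")
print(f"{name}: {value}")
```

Note that this only helps for copies served from servers you control. A site that re-hosts your PDF would have to send the header itself, so the suggestions about publishing first and linking back from inside the PDF still apply.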
-
Canonical tags are only effective within your domain. They have no value if someone else were to take your work and share it elsewhere.
A few things you can do to establish yourself as the original content creator:
-
Publish it first on your site. Wait until you see your content in Google before actively distributing the PDF to others. This would be one indicator that can be used to demonstrate you are the original author.
-
As you shared, ensure there are links back to your site within the PDF. This would be another good indicator to Google that you are the content creator.
-
Lock the PDF so changes cannot be made to the content.
-
Earlier today Google announced that the new schema.org microdata offers an author tag so you can identify the original author. That system has been tested and is available to use now.