Rel=Canonical to Longer Page?
-
We've got a series of articles on the same topic, and we consolidated the content and pasted it all together on a single page. We linked from each individual article to the consolidated page. We put a noindex on the consolidated page.
The problem: Inbound links to individual articles in the series will only count toward the authority of those individual pages, and inbound links to the full article will be worthless.
I am considering removing the noindex from the consolidated article and putting rel=canonicals on each individual post pointing to the consolidated article. That should consolidate the PageRank. But I am concerned about pointing a rel=canonical to an article that is not an exact duplicate (although it does contain the full text of the original; it's just that it contains quite a bit of additional text).
An alternative would be not to use rel=canonicals, nor to place a noindex on the consolidated article. But then my concern would be duplicate content and unconsolidated PageRank.
Any thoughts?
-
Nice.
-
I am doubting it. Seems like an either/or thing. I'll probably do rel=canonical to the view-all.
Thanks everybody.
-
Now I just want to know if that usage of rel=canonical can coexist with rel=next & rel=prev.
-
"rel=”canonical” can specify the superset of content" -http://googlewebmastercentral.blogspot.com/2011/09/view-all-in-search-results.html
-
Rel=canonical is supposed to point to an identical page.
-
Consolidating for usability - I want to provide access to both formats. I hate the idea of noindexing a page that could be the target of inbound links.
-
Using rel=prev and rel=next would do the trick if it's still cool to point the rel=canonical to the view-all page. Anyone know?
-
Maybe you're better off noindexing the partial articles and linking from them to the main article with a "Read the full article" link, or something like that.
How many of these articles you have relative to the rest of your content could make a difference--a very small percentage probably wouldn't be an issue in the overall health of your site if you just left them as is.
Why are you consolidating?
-
"Should I make the consolidated article a PDF download? Let them all be indexed without canonicals or 301s? I just don't know the best practice here."
First off, I would not use a PDF unless you have the photographs and other content to make it flow and give a good end-user experience.
PDFs in Google search results
Thursday, September 01, 2011 at 7:23 AM
Webmaster level: All
Our mission is to organize the world’s information and make it universally accessible and useful. During this ambitious quest, we sometimes encounter non-HTML files such as PDFs, spreadsheets, and presentations. Our algorithms don’t let different filetypes slow them down; we work hard to extract the relevant content and to index it appropriately for our search results. But how do we actually index these filetypes, and—since they often differ so much from standard HTML—what guidelines apply to these files? What if a webmaster doesn’t want us to index them?
Google first started indexing PDF files in 2001 and currently has hundreds of millions of PDF files indexed. We’ve collected the most often-asked questions about PDF indexing; here are the answers:
Q: Can Google index any type of PDF file?
A: Generally we can index textual content (written in any language) from PDF files that use various kinds of character encodings, provided they’re not password protected or encrypted. If the text is embedded as images, we may process the images with OCR algorithms to extract the text. The general rule of the thumb is that if you can copy and paste the text from a PDF document into a standard text document, we should be able to index that text.
Q: What happens with the images in PDF files?
A: Currently the images are not indexed. In order for us to index your images, you should create HTML pages for them. To increase the likelihood of us returning your images in our search results, please read the tips in our Help Center.
Q: How are links treated in PDF documents?
A: Generally links in PDF files are treated similarly to links in HTML: they can pass PageRank and other indexing signals, and we may follow them after we have crawled the PDF file. It’s currently not possible to "nofollow" links within a PDF document.
Q: How can I prevent my PDF files from appearing in search results; or if they already do, how can I remove them?
A: The simplest way to prevent PDF documents from appearing in search results is to add an X-Robots-Tag: noindex in the HTTP header used to serve the file. If they’re already indexed, they’ll drop out over time if you use the X-Robots-Tag with the noindex directive. For faster removals, you can use the URL removal tool in Google Webmaster Tools.
Q: Can PDF files rank highly in the search results?
A: Sure! They’ll generally rank similarly to other webpages. For example, at the time of this post, [mortgage market review], [irs form 2011] or [paracetamol expert report] all return PDF documents that manage to rank highly in our search results, thanks to their content and the way they’re embedded and linked from other webpages.
Q: Is it considered duplicate content if I have a copy of my pages in both HTML and PDF?
A: Whenever possible, we recommend serving a single copy of your content. If this isn’t possible, make sure you indicate your preferred version by, for example, including the preferred URL in your Sitemap or by specifying the canonical version in the HTML or in the HTTP headers of the PDF resource. For more tips, read our Help Center article about canonicalization.
Q: How can I influence the title shown in search results for my PDF document?
A: We use two main elements to determine the title shown: the title metadata within the file, and the anchor text of links pointing to the PDF file. To give our algorithms a strong signal about the proper title to use, we recommend updating both.
If you want to learn more, watch Matt Cutts’s video about PDF files’ optimization for search, and visit our Help Center for information about the content types we’re able to index. If you have feedback or suggestions, please let us know in the Webmaster Help Forum.
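To make the two header-based options from the quoted Q&A concrete, here is a minimal Python sketch. The function name pdf_response_headers and the example URL are assumptions made up for illustration; only the header names (X-Robots-Tag with noindex, and Link with rel="canonical") come from the practices the quoted post describes. How the headers actually get attached depends on your server or framework.

```python
from typing import Dict, Optional


def pdf_response_headers(index_pdf: bool,
                         canonical_url: Optional[str] = None) -> Dict[str, str]:
    """Build HTTP response headers for serving a PDF (hypothetical helper).

    index_pdf=False     -> ask search engines not to index the PDF.
    canonical_url given -> name the preferred (for example HTML) version.
    """
    headers = {"Content-Type": "application/pdf"}
    if not index_pdf:
        # Keeps the PDF out of search results.
        headers["X-Robots-Tag"] = "noindex"
    elif canonical_url:
        # Canonical hint carried in the HTTP header, since a PDF has no <head>.
        headers["Link"] = f'<{canonical_url}>; rel="canonical"'
    return headers
```

For instance, pdf_response_headers(True, "http://www.example.com/article-all/") would include a Link header naming the HTML page as the preferred version, while pdf_response_headers(False) would include X-Robots-Tag: noindex instead.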
As an end user, my personal preference would be a regular webpage over a PDF. However, a PDF will of course be indexed and can be of great value. I personally would go with the individual page rather than the extra clicking it takes to download or view a PDF. If a person does not have the right plug-in installed, the PDF cannot appear in their browser and they will have to download it, and some people have apprehension about downloading anything at all. For instance, this PDF is something that does not rank as well as its website counterpart.
I hope this has been of help to you. Sincerely, Thomas
-
I apologize for that confusing post. I am using a voice recognition system, so I apologize for any errors.
I should have worded my question differently. Instead of asking "I assume that you're using WordPress correct?"
I should have asked "Are you using WordPress or not?"
If you were using WordPress, in all honesty it would be easier to reference a plug-in for pagination, and that's why I was asking.
What I said below was unfortunately an error. I apologize for that.
"I would of course index your house.
I don't know what this means."
What I meant was essentially "I would of course index the webpage." I'm sorry for the error.
"What exactly should I check using OSE, and what actions should I take in response to what findings?"
Please check the inbound links to your articles that you know come from quality websites. If you were to lose those links, whether they point to partial articles or to full articles, do you believe that your site would lose PageRank?
My reference to Open Site Explorer was in reply to your statement of the problem: that inbound links to individual articles count only toward the authority of those pages, and that inbound links to the full article will be worthless.
"The problem: Inbound links to individual articles in the series will only count toward the authority of those individual pages, and inbound links to the full article will be worthless."
Using Open Site Explorer, you can figure out how many inbound links of value are pointing to your articles. OSE gives you the totals for your root domain as well as for your page and subdomain. Good links, regardless of whether they point at articles or at a single page, make for a stronger website overall as long as they point to any page on your site at all. They would give you stronger domain trust along with stronger PageRank.
Please review the articles below about the new pagination options.
I would strongly recommend handling your pages the way these articles describe.
http://searchengineland.com/google-provides-new-options-for-paginated-content-92906
http://googlewebmastercentral.blogspot.com/2011/09/view-all-in-search-results.html
http://googlewebmastercentral.blogspot.com/2011/09/pagination-with-relnext-and-relprev.html
I am very sorry that my first answer was not very helpful to you, and I appreciate you letting me know that I had made an error.
I sincerely hope that this one is of more help and answers your question fully.
New Handling of View All Pages
Google has been evolving their detection of a series of component pages and the corresponding view all page. When you have a view all page and paginated URLs with a detectable pattern, Google clusters those together and consolidates the PageRank value and indexing relevance. Basically, all of the paginated URLs are seen as components in a series that rolls up to the view all page. In most cases, Google has found that the best experience for searchers is to rank the view all page in search results. (You can help this process along by using the rel=”canonical” attribute to point all pages to the view all version.)
If You Don’t Want The View All Page To Rank Instead of Paginated URLs
If you don’t want the view all version of your page shown and instead want individual paginated URLs to rank, you can block the view all version with robots.txt or meta noindex. You can also use the all new rel=”next”/rel=”prev” attributes, so read on!
New Pagination Options
If you don’t have a view all page, or you don’t want the view all page to be what appears in search results, you can use the new attributes rel=”next” and rel=”prev” to cluster all of the component pages into a single series. All of the indexing properties for all components in the series are consolidated and the most relevant page in the series will rank for each query. (Yay!)
You can use these attributes for article pagination, product lists, and any other types of pagination your site might have. The first page of the series has only a rel=”next” attribute and the last page of the series has only a rel=”prev” attribute, and all other pages have both. You can still use the rel=”canonical” attribute on all pages in conjunction.
Typically, in this setup, as Google sees all of these component pages as series, the first page of the series will rank, but there may be times when another page is more relevant and will rank instead. In either case, the indexing signals (such as incoming links) are consolidated and shared by the series.
Make sure that the value of rel=”next” and rel=”prev” match the URL (even if it’s non-canonical) as the rel/next values in the series have to match up (you likely will need to dynamically write the values based on the display URL).
There are lots of intricacies to consider here, and I’m working on an in-depth article that runs through everything that came up in the session, so if you have questions, post them here and I’ll add them in!
If you strongly desire your view-all page not to appear in search results: 1) make sure the component pages in the series don't include rel="canonical" to the view-all page, and 2) mark the view-all page as "noindex" using any of the standard methods.
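The two view-all setups described in the quoted passages can be sketched roughly as follows. This is only an illustration in Python: the function name view_all_head_tags is made up, and the article-all URL is the example address used elsewhere in this thread. It returns the head markup for the component pages and for the view-all page under each choice.

```python
from html import escape
from typing import List, Tuple


def view_all_head_tags(view_all_url: str,
                       view_all_should_rank: bool) -> Tuple[List[str], List[str]]:
    """Return (tags for each component page, tags for the view-all page).

    Hypothetical helper covering the two options quoted above.
    """
    if view_all_should_rank:
        # Option 1: component pages point rel="canonical" at the view-all page.
        component = [
            f'<link rel="canonical" href="{escape(view_all_url, quote=True)}">'
        ]
        view_all: List[str] = []
    else:
        # Option 2: no canonical to the view-all page, and the view-all
        # page itself is marked noindex.
        component = []
        view_all = ['<meta name="robots" content="noindex">']
    return component, view_all
```

Calling view_all_head_tags("http://www.example.com/article-all/", True) corresponds to what the original poster is considering: every article in the series carries a canonical link to the consolidated page, and the consolidated page carries no noindex.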
A few points to mention:
- The first page only contains rel="next" and no rel="prev" markup.
- Pages two to the second-to-last page should be doubly-linked with both rel="next" and rel="prev" markup.
- The last page only contains markup for rel="prev", not rel="next".
- rel="next" and rel="prev" values can be either relative or absolute URLs (as allowed by the <link> tag). And, if you include a <base> link in your document, relative paths will resolve according to the base URL.
- rel="next" and rel="prev" only need to be declared within the <head> section, not within the document <body>.
- We allow rel="previous" as a syntactic variant of rel="prev" links.
- rel="next" and rel="previous" on the one hand and rel="canonical" on the other constitute independent concepts. Both declarations can be included in the same page. For example, http://www.example.com/article?story=abc&page=2&sessionid=123 may contain a rel="canonical" link alongside its rel="prev" and rel="next" links.
- rel="prev" and rel="next" act as hints to Google, not absolute directives.
- When implemented incorrectly, such as omitting an expected rel="prev" or rel="next" designation in the series, we'll continue to index the page(s), and rely on our own heuristics to understand your content.
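The first/middle/last rules in the list above can be sketched in a few lines of Python. This is a rough illustration, not anyone's official implementation: the function name pagination_link_tags and its parameters are assumptions, and the optional canonical_url argument simply shows that a canonical link can sit next to the prev/next links on the same page.

```python
from html import escape
from typing import List, Optional


def pagination_link_tags(page_urls: List[str], index: int,
                         canonical_url: Optional[str] = None) -> List[str]:
    """Build the <head> link tags for the page at position `index` in a series."""
    if not 0 <= index < len(page_urls):
        raise IndexError("index is outside the series")

    def link(rel: str, url: str) -> str:
        # escape() turns "&" in query strings into "&amp;" for valid HTML.
        return f'<link rel="{rel}" href="{escape(url, quote=True)}">'

    tags = []
    if canonical_url is not None:
        tags.append(link("canonical", canonical_url))
    if index > 0:
        # Every page except the first points back to its predecessor.
        tags.append(link("prev", page_urls[index - 1]))
    if index < len(page_urls) - 1:
        # Every page except the last points forward to its successor.
        tags.append(link("next", page_urls[index + 1]))
    return tags
```

With a three-page series, index 0 yields only a next link, index 2 yields only a prev link, and index 1 yields both, plus a canonical link when canonical_url is supplied.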
Sincerely,
Thomas
-
-
Well, the individual articles are entries in a blog while the consolidated article is a separate page altogether. I could 301 the individual article URLs all over to the consolidated article I suppose, but I'd rather not for usability reasons. For the end users I think the ideal really is to have all the individual blog posts as they are, but provide access to the consolidated article.
Should I make the consolidated article a PDF download? Let them all be indexed without canonicals or 301s? I just don't know the best practice here.
-
Since you have consolidated the content and pasted it all together on a single page, why not 301 the old pages to the new consolidated page, which has all the content from the old ones?
If you can answer NO to all these then I would suggest doing a 301
- Can a user land on the old page and get any information he/she would not have received from the new page?
- Can you think of any reason why the user would want to see the old page ( as opposed to the new one ) ?
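If the answer to both questions is no, the 301 approach could look something like this minimal Python sketch. The paths are hypothetical, borrowed from the example URLs used elsewhere in this thread, and the function name resolve is made up; a real site would normally configure this in the web server or CMS rather than in application code.

```python
from typing import Dict, Optional, Tuple

# Hypothetical mapping: each old article path -> the consolidated article.
REDIRECTS: Dict[str, str] = {
    "/article-one/": "/article-all/",
    "/article-two/": "/article-all/",
}


def resolve(path: str) -> Tuple[int, Optional[str]]:
    """Return (status code, redirect target) for a requested path."""
    target = REDIRECTS.get(path)
    if target is None:
        # Not an old article: serve the page normally.
        return 200, None
    # Permanent redirect, so links to the old URL are credited to the new one.
    return 301, target
```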
-
Expresso, why not just 301 the individual pages to the consolidated article?
-
"I assume that you're using WordPress correct?"
Nope.
"Normally I would say use [rel=canonicals] 100% of the time. For instance if I had a blog post example.com/ABC/
I would obviously use a [rel=canonical]
Same thing if it is example.com/blog/ABC/
use rel=canonicals"
To clarify: I am not talking about placing a rel=canonical on a resource located by more than one URL (e.g. www.example.com/article-one/ and www.example.com/article-one/?fp=email). I am talking about placing a rel=canonical on each resource in a series, pointing to a distinct resource that contains all of the content from all of the resources (e.g. www.example.com/article-one/ -> www.example.com/article-all/, www.example.com/article-two/ -> www.example.com/article-all/, etc.).
"I would of course index your house."
I don't know what this means.
"In all honesty I would check using open site Explorer before you actually change anything. And make sure that your inbound links are not the problem that you're talking about."
What exactly should I check using OSE, and what actions should I take in response to what findings?
-
I assume that you're using WordPress, correct?
If you are willing to post the domain, I can tell you whether or not you should use rel=canonicals.
Normally I would say use rel=canonicals 100% of the time. For instance, if I had a blog post at example.com/ABC/
I would obviously use a rel=canonical.
Same thing if it is example.com/blog/ABC/
use rel=canonicals.
I would of course index your website. If it is only showing partial views of your content and you use rel=canonical, you will not have to worry about duplicate content issues.
If you're talking about simply changing the post so that it counts as a full page, I believe you can already do that. You don't have to worry about the page example.com/blog/ taking all the PageRank and leaving you with nothing; your articles will still rank. However, you can simply create new pages instead of new posts in WordPress, and that way you would be getting the complete inbound link juice to that one single page.
In all honesty, I would check using Open Site Explorer before you actually change anything, and make sure that your inbound links are not the problem that you're talking about.
http://www.opensiteexplorer.org/
I also recommend using WordPress hosting from WPEngine, ZippyKid, Web Synthesis, or Pagely. They truly are worth every cent, with the added speed and helpfulness of a WordPress-only host.
I hope this is of help. Sincerely,
Thomas