Are pages with a canonical tag indexed?
-
Hello there,
here are my questions about the canonical tag:
1. If I put online a new webpage with a canonical tag pointing to a different page, will this new page be indexed by Google and will I be able to find it in the index?
2. If instead I apply the canonical tag to a page already in the index, will this page be removed from the index?
Thank you in advance for any insights!
Fabrizio
-
Yes, I will look into doing that on GWT.
It was a nice and useful chat indeed! Thank you again.
-
Sorry Fabrizio, I got muddled in my old answer:
that canonical doesn't make sense together with a noindex (with noindex, follow). You're completely fine.
Summing up, I think that you have many parameters, so you should try to write them down and define the role of each one.
Then add them in GWT and choose there which ones don't add any value and which you want to "block" (instead of putting a noindex).
The valuable ones (those which add value and change content) should contain the self canonical and the paginated next/prev. If you can get rid of useless parameters, so much the better, as you'll have cleaner and shorter URLs.
Just be sure that you're mainly using the most important parameters so you're consistent with your strategy.
Hope this will clear your doubts, it was a nice chat!
-
Yes, actually I could get rid of the lpg parameter (it wasn't really needed!), so now the tag definitions are (for the 3rd page of the Guitar index):
<link rel="next" href="http://www.virtualsheetmusic.com/downloads/Indici/Guitar.html?cp=4">
<link rel="prev" href="http://www.virtualsheetmusic.com/downloads/Indici/Guitar.html?cp=2">
<link rel="canonical" href="http://www.virtualsheetmusic.com/downloads/Indici/Guitar.html?cp=3">
Now, the only doubt I still have is whether or not to add the noindex tag to the page when it is requested in a different display (such as the "table view" or a different item display order). In my opinion, if I stick with the canonical tag I don't need a noindex directive. What do you think?
-
Yeah, to be fair, I'm not clear on what all of the additional parameters (like "lpg=") do, so this can get tricky fast. Basically, look at it this way:
If the URL is:
example.com/page=3?param=x
Then the tags should point to:
Rel=prev:
example.com/page=2?param=x
Rel=next:
example.com/page=4?param=x
Rel=canonical:
example.com/page=3 (no parameters)
Some parameters may not be indexed and/or functional, though, so individual cases can vary. You may choose to ignore some parameters in Google Webmaster Tools, for example. It gets tricky as the parameter list grows.
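As a rough illustration of the pattern above, the head of page 3 might carry tags like the ones below. The host, path, and parameter names are placeholders written as an ordinary query string, not taken from any real site, so treat this as a sketch rather than a definitive setup:

```html
<!-- Hypothetical current page: http://example.com/list?page=3&param=x -->

<!-- prev/next keep the parameter that is currently in use -->
<link rel="prev" href="http://example.com/list?page=2&amp;param=x">
<link rel="next" href="http://example.com/list?page=4&amp;param=x">

<!-- canonical points at the same page number, without the extra parameter -->
<link rel="canonical" href="http://example.com/list?page=3">
```

The ampersand is written as `&amp;` because it sits inside an HTML attribute value.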
-
Mememax, after thinking it over I have some doubts about what you have suggested.
Why would I want to put a noindex tag on the page displaying the list in "table view" if I already have a canonical tag pointing to the "regular view" page? Wouldn't the canonical tag be enough to tell Google that the "real" canonical page is the "regular view" version? I am asking because if I apply a noindex tag to that kind of different view, I may want to do the same for the list displayed in a different order, and for any other way of displaying the list, etc. Hence just using the canonical tag would be appropriate, always pointing to the "regular list" view, no matter what kind of "filtering" or "different view" option is selected. What do you think?
In other words, I don't think I need to include a noindex tag for any different kind of view the user requests as long as I provide a canonical tag pointing to the regular view list.
Am I correct?
-
Yes, thank you Mememax, I agree with you 100%. That makes perfect sense and I will work on that tomorrow morning. I am eager to know Dr. Peter's thoughts and confirmation.
On my side, I think I got it cleared up now. Thank you very much again!
-
Thank you! That makes sense now.
-
Hey Fabrizio, I think that what Google states in their guidelines is that you have two choices:
- if you have a view all page, you should noindex, follow all your other pages so Google will deliver only that page
- if you don't have a view all page, or if you prefer to show a paginated series (e.g. to make pages lighter and quicker to deliver to users), you may consider using rel next/prev.
In this second case you may also be adding filters or session IDs to the URLs of those pages; in that case you should consider adding a self-referential canonical tag to avoid duplicates. But this applies only if you have that case. If you are just looking to canonicalize your paginated series correctly, you may skip the self canonical tag, because if it is not properly implemented it may cause you a bit of extra work.
In this page for example
I found this:
http://www.virtualsheetmusic.com/downloads/Indici/Guitar.html?cp=3&lpg=0
Which I don't think is what you want to do.
Also if you set the page to view as a table: your url changes to http://www.virtualsheetmusic.com/downloads/Indici/Guitar.html?cp=3&viewlistflag=1
and while the canonical should remain the same (well done but I think you should get rid of the lpg parameter in the canonical), the rel next prev should change accordingly IMO.
So instead of being:
prev: http://www.virtualsheetmusic.com/downloads/Indici/Guitar.html?cp=2&lpg=20
next: http://www.virtualsheetmusic.com/downloads/Indici/Guitar.html?cp=4&lpg=60
you should offer the next and prev page of the filtered url:
next: http://www.virtualsheetmusic.com/downloads/Indici/Guitar.html?cp=4&lpg=60&viewlistflag=1
prev: http://www.virtualsheetmusic.com/downloads/Indici/Guitar.html?cp=2&lpg=20&viewlistflag=1
Or in this case (since the content is almost the same) you may consider the list page as the canonical of the table one, putting a noindex there.
Summing up, IMO: on this page http://www.virtualsheetmusic.com/downloads/Indici/Guitar.html?cp=3
you'll have:
prev: http://www.virtualsheetmusic.com/downloads/Indici/Guitar.html?cp=2&lpg=20
next: http://www.virtualsheetmusic.com/downloads/Indici/Guitar.html?cp=4&lpg=60
(optional) a self canonical to http://www.virtualsheetmusic.com/downloads/Indici/Guitar.html?cp=3
On this page (and on other filtered pages, if you apply the same idea):
http://www.virtualsheetmusic.com/downloads/Indici/Guitar.html?cp=3&viewlistflag=1
you'll have:
noindex,follow and canonical to the list page:
http://www.virtualsheetmusic.com/downloads/Indici/Guitar.html?cp=3
Maybe Dr. Peter can correct me if I'm wrong, but I think it would be more consistent like this. Sorry for the huge answer.
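A rough sketch of what the summary above would put in the head of each page, taking prev as cp=2 and next as cp=4, as earlier in this post. It only illustrates the proposal as worded here. Whether noindex and canonical belong together on one page is questioned elsewhere in this thread, so treat it as an illustration rather than a tested setup:

```html
<!-- List view, page 3: Guitar.html?cp=3 -->
<link rel="prev" href="http://www.virtualsheetmusic.com/downloads/Indici/Guitar.html?cp=2&amp;lpg=20">
<link rel="next" href="http://www.virtualsheetmusic.com/downloads/Indici/Guitar.html?cp=4&amp;lpg=60">
<!-- optional self canonical -->
<link rel="canonical" href="http://www.virtualsheetmusic.com/downloads/Indici/Guitar.html?cp=3">

<!-- Table view, page 3: Guitar.html?cp=3&viewlistflag=1 -->
<meta name="robots" content="noindex, follow">
<link rel="canonical" href="http://www.virtualsheetmusic.com/downloads/Indici/Guitar.html?cp=3">
```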
-
Wow, yes - sorry about that. I've updated it. Google's original write-up actually covers this case, too (it's toward the end):
http://googlewebmastercentral.blogspot.com/2011/09/pagination-with-relnext-and-relprev.html
-
Please, have a look at the page below, I have modified the canonical tag as suggested:
http://www.virtualsheetmusic.com/downloads/Indici/Guitar.html?cp=3&lpg=40
Is that correct?
Thank you again very much.
-
Thank you Peter, I guess you meant to have the "canonical" tag as last tag in your example above, and also the previous rel=next and rel=prev definitions should be inverted:
Am I correct? That makes sense. If so, I will update my site to reflect this.
Thank you for the link!
-
This gets tricky fast. Google currently wants rel=prev/next to contain the parameters currently in use (like sorts) for the page you're on and then wants you to rel-canonical to the non-parameterized version. So, if the URL is:
http://www.virtualsheetmusic.com/downloads/Indici/Guitar.html?cp=3&lpg=40
...then the tags should be...
Yeah, it's a bit strange. They have suggested that it's ok to rel-canonical to a "View All" page, but with the kind of product volume you have, that's generally a bad idea (for users and search). They have specifically recommended against setting rel-canonical to Page 1 of search results, especially if you use rel=prev/next.
Rel=prev/next will still show pages in the index, but I've found it to work pretty well. The other option is the more classic approach: simply META NOINDEX, FOLLOW pages 2+. That can still be effective, but it's getting less common.
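For reference, the classic approach mentioned above comes down to one robots meta tag in the head of each page from page 2 onward. A minimal sketch:

```html
<!-- On pages 2 and beyond of the series; page 1 carries no noindex -->
<meta name="robots" content="noindex, follow">
```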
Adam Audette has generally strong posts about this topic - here's a good, recent one:
http://searchengineland.com/the-latest-greatest-on-seo-pagination-114284
-
Thank you for your post. I think you have just touched on a doubt I had, and that's exactly what concerned me too.
Have a look at this typical category page of ours:
http://www.virtualsheetmusic.com/downloads/Indici/Guitar.html
For that category pagination, I have implemented rel=prev/next as suggested by Google, but being afraid of being penalized for duplicate content, I also put in a canonical tag pointing to the first page of that index. Should I have pointed the canonical tag to each page of the series itself?
Something like:
http://www.virtualsheetmusic.com/downloads/Indici/Guitar.html?cp=2
for the second page instead of the general:
http://www.virtualsheetmusic.com/downloads/Indici/Guitar.html
as I am currently doing?
Thanks!
-
I have to disagree on this one. If Google honors a canonical tag, the non-canonical page will generally disappear from the index, at least inasmuch as we can measure it (with "site:", getting it to rank, etc.). It's a strong signal in many cases.
This is part of the reason Google introduced rel=prev/next for paginated content. With canonical, pages in the series aren't usually able to rank. Rel=prev/next allows them to rank without clogging up the index (theoretically). For search pagination, it's generally a better solution.
If your paginated content is still showing in large quantities in the index, Google may not be honoring the canonical tag properly, and they could be causing duplicate content issues. It depends on the implementation, but they recommend these days that you don't canonical to the first page of search results. Google may choose to ignore the tag in some cases.
-
Thank you very much, that makes perfect sense. In my case, I am talking exactly about paginated content, and that's probably why all pages are in the index despite being canonicalized to point to the main page. So, I guess that even if you have thousands of paginated pages indexed (mine is a pretty big e-commerce website), that's not going to be an issue. Am I right?
-
Normally the only thing which will prevent a page from ranking is a noindex tag. If you don't want a page indexed, just noindex it; if that page has already been indexed, add the noindex tag and delete it from the index using the GWT option.
Concerning the canonical tag, it will consolidate the SEO value in one page but it won't prevent those pages from appearing in rankings. However, you may have two cases:
- the two or more pages are identical. In that case Google may accept the canonicalization and always show the original page.
- the two or more pages are slightly different, as with paginated pages which are canonicalized using rel next/prev. In that case the whole value will be consolidated in page 1, but the page shown in the rankings will be the one which answers the query. For example, if someone is looking for blue glass, Google will return the page which shows the blue glass listing, if that's different from the first one.
Hope this may help you!