Rel=canonical + no index
-
We have been running an A/B test of our homepage, and although we placed a rel=canonical tag on the test page, it is still being indexed. In fact, at one point Google even showed it as a sitelink. We have this problem throughout our website. My question is:
What is the best practice for duplicate pages?
1. Put only a rel=canonical pointing to the "wanted original page"
2. Put a rel=canonical (pointing to the wanted original page) and a noindex on the duplicate version
Has anyone seen any detrimental effect from doing #2?
Thanks
-
Interesting - I've very rarely had issues with GWO, but if a new URL was created and someone linked to it, I can see where you might have a problem.
(1) None of these things are absolute, I'm afraid, but typically, yes - a rel=canonical to a different page should keep the first page out of the index.
(2) Usually, but it depends. The problem here may be that Google just isn't crawling the test variant very often, so they may not be processing the rel=canonical yet.
If it's just a couple of pages, I'd give it time - it's probably not an emergency situation. Again, you could just tell Google to remove them in GWT (Google Webmaster Tools). I think you're doing the right thing with the canonical tags, but in practice it can take Google time to process them the way you want them to.
-
To answer the second question:
We actually use Google's Website Optimizer to run our test -- the problem started when someone linked to the test page....
Not sure if these scenarios are different for Google -- but just trying to understand it
1. If a page was never indexed before and you put a rel=canonical on it (pointing to a different page), then will the rel=canonical keep it out of the index?
2. If a page was already in the index and you put a rel=canonical on it, is that a strong enough signal for Google to go and remove it from the index?
Obviously both these scenarios apply once the pages have been crawled.
-
I wouldn't mix those signals - it's nearly impossible to tell what's working if you do. If the canonical on the test page isn't working, there may be a couple of issues:
(1) It could just be taking time. Honestly, it's never as fast as you want it to be.
(2) It may be that the test versions got crawled originally, but now aren't being crawled (so the canonical isn't being processed). Check the cache date on the test page.
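One quick way to confirm what the test page is actually telling crawlers is to parse its HTML and list the canonical and robots signals side by side. The following is only a rough sketch using Python's standard `html.parser`; the `audit` function, the `HeadSignals` class, and the example.com URLs are made up for illustration and are not part of any Google or Moz tool.

```python
from html.parser import HTMLParser


class HeadSignals(HTMLParser):
    """Collects rel=canonical hrefs and meta robots directives from a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []
        self.robots = []

    def handle_starttag(self, tag, attrs):
        a = {k: (v or "") for k, v in attrs}
        if tag == "link" and "canonical" in a.get("rel", "").lower().split():
            self.canonicals.append(a.get("href", ""))
        elif tag == "meta" and a.get("name", "").lower() == "robots":
            # content is a comma-separated list such as "noindex, follow"
            self.robots.extend(
                d.strip().lower() for d in a.get("content", "").split(",") if d.strip()
            )


def audit(html, page_url):
    """Report the canonical target and whether noindex is combined with it."""
    parser = HeadSignals()
    parser.feed(html)
    canonical = parser.canonicals[0] if parser.canonicals else None
    noindex = "noindex" in parser.robots
    points_elsewhere = canonical is not None and canonical != page_url
    return {
        "canonical": canonical,
        "noindex": noindex,
        "points_elsewhere": points_elsewhere,
        # True when the page both canonicalizes to another URL and is noindexed
        "mixed_signals": noindex and points_elsewhere,
    }


# Hypothetical test page that carries both signals at once.
sample = (
    '<html><head>'
    '<link rel="canonical" href="https://example.com/">'
    '<meta name="robots" content="noindex, follow">'
    '</head><body>B version</body></html>'
)
report = audit(sample, "https://example.com/home-b")
```

This only shows what the page declares; it says nothing about whether Google has recrawled the page and processed the tag, which is still the cache-date check.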
The big question is how they got crawled in the first place. It's often better to use some sort of cookie-based implementation so that Google never even sees the B version. That's how most of the A/B test implementations work (specifically to avoid this problem).
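As a rough illustration of the cookie-based idea, here is a minimal sketch in Python. It assumes the server renders both versions at the same URL and that crawlers do not send cookies back; the cookie name `ab_variant` and the function `choose_variant` are hypothetical, not taken from Website Optimizer or any other testing tool.

```python
import random

COOKIE_NAME = "ab_variant"  # hypothetical cookie name


def choose_variant(cookies, rng=random.random, b_share=0.5):
    """Return (variant_to_render, cookie_value_to_set_or_None).

    A request with no valid cookie always gets the A version and is
    assigned a variant for later visits, so a client that never sends
    cookies back (the assumption made here about crawlers) only sees A.
    """
    current = cookies.get(COOKIE_NAME)
    if current in ("A", "B"):
        return current, None
    assigned = "B" if rng() < b_share else "A"
    return "A", assigned


# First visit, no cookie: A is rendered, and a variant is assigned via cookie.
first_render, cookie_to_set = choose_variant({})

# Later visit from a browser that was assigned B.
later_render, _ = choose_variant({COOKIE_NAME: "B"})
```

The trade-off in this sketch is that every visitor's first pageview is the A version, which skews the test slightly; there is also no separate B URL for anyone to link to.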
If it's just a couple of URLs and you can't shake them, you could request manual removal in GWT. That really depends on the scope and URL structure, though.
-
Good point, I was thinking of robots.txt, where the page would not be read.
But I had not thought about that situation. I am not sure what search engines would do.
But still, just the canonical is needed.
-
A page that has a noindex on it still gets crawled, and therefore the rel=canonical directive is still "seen" by the bot -- so why wouldn't the rel=canonical pass the credit over?
-
Just the rel canonical.
If you noindex the page, the rel canonical cannot be indexed and cannot work.
Rel canonical simply passes the credit for the content to the canonical page.
Noindex is like cutting off your hand because you have a splinter. Links pointing to a non-indexed page are pouring link juice into thin air.
You can use a meta noindex, follow so that some of the link juice is returned, but canonical is best for duplicate content.
Actually, getting rid of the duplicate content is best.