Rel=canonical + no index
-
We have been doing an a/b test of our hp and although we placed a rel=canonical tag on the testing page it is still being indexed. In fact at one point google even had it showing as a sitelink . We have this problem through out our website. My question is:
What is the best practice for duplicate pages?
1. put only a rel= canonical pointing to the "wanted original page"
2. put a rel= canonical (pointing to the wanted original page) and a no index on the duplicate version
Has anyone seen any detrimental effect doing # 2?
Thanks
-
Interesting - I've very rarely had issues with GWO, but if a new URL was created and someone linked to it, I can see where you might have a problem.
(1) None of these things are absolute, I'm afraid, but typically, yes - a rel=canonical to a different page should keep the first page out of the index.
(2) Usually, but it depends. The problem here may be that Google just isn't crawling the test variant very often, so they may not be processing the rel=canonical yet.
If it's just a couple of pages, I'd give it time - it's probably not an emergency situation. Again, you could just tell Google to remove them in GWT. I think you're doing the right thing with the canonical tags, but it can take Google time to process them the way you want to, in practice.
-
To answer the second question :
We actually use google's website optimizer to run our test -- the problem started when someone linked to the test page....
Not sure if these scenarios are different for google -- but just trying to understand it
1. if a page was never indexed before and you put a rel= canonical on it (pointing to a different page) than the rel = canonical will keep it out of the index?
2. If a page was already in the index and you put on rel=canonical is that a strong enough signal for google to go and remove it from the index?
obviously both these scenarios are once the pages have been crawled
-
I wouldn't mix those signals - it's nearly impossible to tell what's working if you do. If the canonical on the test page isn't working, there may be a couple of issues:
(1) It could just be taking time. Honestly, it's never as fast as you want it to be.
(2) It may be that the test versions got crawled originally, but now aren't being crawled (on the canonical isn't being processed). Check the cache date on the test page.
The big question is how they got crawled in the first place. It's often better to use some sort of cookie-based implementation so that Google never even sees the B version. That's how most of the A/B test implementations work (specifically to avoid this problem).
If it's just a couple of URLs and you can't shake them, you could request manual removal in GWT. That really depends on the scope and URL structure, though.
-
Good point, i was thinking of robots.txt, where the page would not eb read.
But I have not thought about that situation. i am not sure what search engines would do.
But still, just the canonical is needed.
-
A page that has a no index on it still gets crawled and therefore the rel=canonical directive is still "seen" by the bot --- so why wouldn't the rel=canonical pass the credit over?
-
Just the rel canonical
if you no index the page, the rel canonical can not be indexed and can not work
Rel canonical simply passes the credit for the content to the canonical page.
no index is like cutting off your hand because you have a splinter. links pointing to a non indexed page are puring link juice into thin air.
You can use a mete noindex , follow so that some of the link juice is returned, but canonical is best for duplicate content.
Actualy getting rid of the duplicate content is best
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Regarding Mobile first Indexing
My Site name GiftaLove.com Desktop version - https://www.giftalove.com/
Technical SEO | | Packersmove
Mobile version - https://m.giftalove.com/ How to enable mobile first Indexing in Desktop and Mobile version sites. Not found any message from both sites desktop and mobile version. Please resolve my Issue.0 -
Rel=canonical Weebly
My problem is with my website as it says I have duplicate page titles and contents because of a /index.html. It says the duplicate content is due to the fact that my homepage on my website is www.seacandytackle.com but it is also www.seacandytackle.com/index.html because I use weebly. How can I use the tag to fix this? It won't let me do a 301 redirect because it is a home page. How can I fix this? What code would I have to use and which url? Also it says that I have duplicate page content between http://www.seacandytackle.com/index.html and http://www.seacandytackle.comhttp://www.seacandytackle.com but I don't recall having any page that looks like http://www.seacandytackle.com http://www.seacandytackle.com from weebly. How can I fix this issue as well? Thank you for any help. Step by step implementation would be particularly helpful in using the rel= tags to fix these duplicate issues.
Technical SEO | | SeaCandyTackle0 -
CDN Being Crawled and Indexed by Google
I'm doing a SEO site audit, and I've discovered that the site uses a Content Delivery Network (CDN) that's being crawled and indexed by Google. There are two sub-domains from the CDN that are being crawled and indexed. A small number of organic search visitors have come through these two sub domains. So the CDN based content is out-ranking the root domain, in a small number of cases. It's a huge duplicate content issue (tens of thousands of URLs being crawled) - what's the best way to prevent the crawling and indexing of a CDN like this? Exclude via robots.txt? Additionally, the use of relative canonical tags (instead of absolute) appear to be contributing to this problem as well. As I understand it, these canonical tags are telling the SEs that each sub domain is the "home" of the content/URL. Thanks! Scott
Technical SEO | | Scott-Thomas0 -
WordPress post indexation speed
Has anyone noticed any increases in the length of time it takes for WP posts to get indexed by Google? I have a website with the following: domain.com - CMS with lots of pages/content blog.domain.com - subdomain for the blog using WP It's odd.. the pages on the main site are indexed almost immediately. The posts on the blog are taking between 2-5 days. The blog posts are all unique content, here's an example of a recent one: blog.looksfishy.co.uk/2013/three-rivers-angling/
Technical SEO | | edwardlewis0 -
Should I implement pagination(rel=next, rel=prev) if I have duplicate meta tags?
Hi, I just want to ask if it is necessary to implement pagination(rel=next, rel=prev) to my category pages because Google webmaster tools is telling me that these pages are having similar meta title and meta description. Ex. page1: http://www.site.com/iphone-resellers/1 meta title:Search for iphone resellers in US page2:http://www.site.com/iphone-resellers/2 meta title:Search for iphone resellers in US page3:http://www.site.com/iphone-resellers/3 meta title:Search for iphone resellers in US Thanks in advance. 🙂
Technical SEO | | esiow20130 -
Will I still get Duplicate Meta Data Errors with the correct use of the rel="next" and rel="prev" tags?
Hi Guys, One of our sites has an extensive number category page lsitings, so we implemented the rel="next" and rel="prev" tags for these pages (as suggested by Google below), However, we still see duplicate meta data errors in SEOMoz crawl reports and also in Google webmaster tools. Does the SEOMoz crawl tool test for the correct use of rel="next" and "prev" tags and not list meta data errors, if the tags are correctly implemented? Or, is it necessary to still use unique meta titles and meta descriptions on every page, even though we are using the rel="next" and "prev" tags, as recommended by Google? Thanks, George Implementing rel=”next” and rel=”prev” If you prefer option 3 (above) for your site, let’s get started! Let’s say you have content paginated into the URLs: http://www.example.com/article?story=abc&page=1
Technical SEO | | gkgrant
http://www.example.com/article?story=abc&page=2
http://www.example.com/article?story=abc&page=3
http://www.example.com/article?story=abc&page=4 On the first page, http://www.example.com/article?story=abc&page=1, you’d include in the section: On the second page, http://www.example.com/article?story=abc&page=2: On the third page, http://www.example.com/article?story=abc&page=3: And on the last page, http://www.example.com/article?story=abc&page=4: A few points to mention: The first page only contains rel=”next” and no rel=”prev” markup. Pages two to the second-to-last page should be doubly-linked with both rel=”next” and rel=”prev” markup. The last page only contains markup for rel=”prev”, not rel=”next”. rel=”next” and rel=”prev” values can be either relative or absolute URLs (as allowed by the tag). And, if you include a <base> link in your document, relative paths will resolve according to the base URL. rel=”next” and rel=”prev” only need to be declared within the section, not within the document . We allow rel=”previous” as a syntactic variant of rel=”prev” links. rel="next" and rel="previous" on the one hand and rel="canonical" on the other constitute independent concepts. Both declarations can be included in the same page. For example, http://www.example.com/article?story=abc&page=2&sessionid=123 may contain: rel=”prev” and rel=”next” act as hints to Google, not absolute directives. When implemented incorrectly, such as omitting an expected rel="prev" or rel="next" designation in the series, we'll continue to index the page(s), and rely on our own heuristics to understand your content.0 -
Rel="canonical" and rewrite
Hi, I'm going to describe a scenario on one of my sites, I was wondering if someone could tell me what is the correct use of rel="canonical" here. Suppose I have a rewrite rule that has a rule like this: RewriteRule ^Online-Games /main/index.php So, in the index file, do I set the rel="canonical" to Online-Games or /main/index.php? Thanks.
Technical SEO | | webtarget0 -
Is this 404 page indexed?
I have a URL that when searched for shows up in the Google index as the first result but does not have any title or description attached to it. When you click on the link it goes to a 404 page. Is it simply that Google is removing it from the index and is in some sort of transitional phase or could there be another reason.
Technical SEO | | bfinternet0