Include Cross Domain Canonical URL's in Sitemap - Yes or No?
-
I have several sites that have cross domain canonical tags setup on similar pages. I am unsure if these pages that are canonicalized to a different domain should be included in the sitemap. My first thought is no, because I should only include pages in the sitemap that I want indexed.
On the other hand, if I include ALL pages on my site in the sitemap, once Google gets to a page that has a cross domain canonical tag, I'm assuming it will just note that and determine if the canonicalized page is the better version. I have yet to see any errors in GWT about this. I have seen errors where I included a 301 redirect in my sitemap file. I suspect its ok, but to me, it seems that Google would rather not find these URL's in a sitemap, have to crawl them time and time again to determine if they are the best page, even though I'm indicating that this page has a similar page that I'd rather have indexed.
-
I looked at the sitemap, and they are including the http://www.seomoz.org/blog/the-story-of-seomoz but not the canonical page - http://www.masternewmedia.org/entrepreneurship-the-full-story-of-seomoz-told-by-rand-fishkin/
So based on this example, the page on SEOMoz is still included in the sitemap, regardless if it has a canonical or not.
This seems to make sense, since canonical links are used only as a hint and not an absolute directive.
I also noticed that Google is choosing to index and rank both pages, on Page 1.
SEOMoz is ranking higher on my browser for "the full story of seomoz". A few things going on here.
-
Why is google choosing to rank SEOMoz higher than Mastermedia.org for this page? There's a canonical setup, but google is choosing not to follow it. (again its a hint not an absolute) this doesn't always work.
-
I would think Google would be able to filter out the duplicate content easy. In this example, they are clearly not. SEOMoz is ranking #4 and Masternewmedia.org is ranking #5 for query "the full story of seomoz"
-
-
Right - as far as I know, you're supposed to put end URLs into a sitemap, not urls which 301 redirect. Cross domain canonical is still kind of new, but I would treat them as a 301 redirect and not include them in a sitemap.
Now, if you're curious, SEO Moz did a whiteboard Friday where they talked about this same exact issue (cross domain canonical), and as an experiment, re-posted a blog article from another blogger on SEO Moz.
http://www.seomoz.org/blog/cross-domain-canonical-the-new-301-whiteboard-friday
http://www.seomoz.org/blog-sitemap.xml
http://www.seomoz.org/blog/the-story-of-seomoz
The blog is still included in the blog sitemap. I think it probably won't 'hurt' to keep those pages in the sitemap, since a lot of sitemaps automatically generated CMS tools won't have been updated to deal with this yet.
-
There is no BIG problem if you add the pages that contain cross domain canonical tag on them. Why?
The reason why I can say this is because Google is not only indexing the pages from sitemap.xml file, Google have their own crawler and they have the ability to crawl and index the website no matter if you do not have an xml sitemap.
Google is very good at (in my opinion) picking the instructions that are available on the page so if you add the page in the xml sitemap, the crawler will read the instructions on the page and will only index the page that contain original content.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Ridding of taxonomies, so that articles enhance related page's value
Hello, I'm developing a website for a law firm, which offers a variety of services. The site will also feature a blog, which would have similarly-named topics. As is customary, these topics were taxonomies. But I want the articles to enhance the value of the service pages themselves and because the taxonomy url /category/divorce has no relationship to the actual service page url /practice-areas/divorce, I'm worried that if anything, a redundantly-titled taxonomy url would dilute the value of the service page it's related to. Sure, I could show some of the related posts on the service page but if I wanted to view more, I'm suddenly bounced over to a taxonomy page which is stealing thunder away from the more important service page. So I did away with these taxonomies all together, and posts are associatable with pages directly with a custom db table. And now if I visit the blog page, instead of a list of category terms, it would technically be a list of the service pages and so if a visitor clicks on a topic they are directed to /practice-areas/divorce/resources (the subpages are created dynamically) and the posts are shown there. I'll have to use custom breadcrumbs to make it all work. Just wondering if you guys had any thoughts on this. Really appreciate any you might have and thanks for reading
Intermediate & Advanced SEO | | utopianwp0 -
Is a Rel Canonical Sufficient or Should I 'NoIndex'
Hey everyone, I know there is literature about this, but I'm always frustrated by technical questions and prefer a direct answer or opinion. Right now, we've got recanonicals set up to deal with parameters caused by filters on our ticketing site. An example is that this: http://www.charged.fm/billy-joel-tickets?location=il&time=day relcanonicals to... http://www.charged.fm/billy-joel-tickets My question is if this is good enough to deal with the duplicate content, or if it should be de-indexed. Assuming so, is the best way to do this by using the Robots.txt? Or do you have to individually 'noindex' these pages? This site has 650k indexed pages and I'm thinking that the majority of these are caused by url parameters, and while they're all canonicaled to the proper place, I am thinking that it would be best to have these de-indexed to clean things up a bit. Thanks for any input.
Intermediate & Advanced SEO | | keL.A.xT.o0 -
Is this all that is needed for a 'canonical' tag?
Hello, I have a Joomla site. I have put in a plugin to make the page source show: eg. <link href="[http://www.ditalia.com.au/designer-fabrics-designer-fabric-italian-material-and-french-lace](view-source:http://www.ditalia.com.au/designer-fabrics-designer-fabric-italian-material-and-french-lace)" rel="<a class="attribute-value">canonical</a>" /> Is this all that is need to tell the search engines to ignore the any other links or indexed pages with a url which is created automatically by the system before the SEF urls are initiated?
Intermediate & Advanced SEO | | infinart0 -
Other domains hosted on same server showing up in SERP for 1st site's keywords
For the website in question, the first domain alphabetically on the shared hosting space, strange search results are appearing on the SERP for keywords associated with the site. Here is an example: A search for "unique company name" shows the results: www.uniquecompanyname.com as the top result. But on pages 2 and 3, we are getting results for the same content but for domains hosted on the same server. Here are some examples with the domain name replaced: UNIQUE DOMAIN NAME PAGE TITLE
Intermediate & Advanced SEO | | Motava
ftp.DOMAIN2.com/?action=news&id=63
META DESCRIPTION TEXT UNIQUE DOMAIN NAME PAGE TITLE 2
www.DOMAIN3.com/?action=news&id=120
META DESCRIPTION TEXT2 UNIQUE DOMAIN NAME PAGE TITLE 2
www.DOMAIN4.com/?action=news&id=120
META DESCRIPTION TEXT2 UNIQUE DOMAIN NAME PAGE TITLE 3
mail.DOMAIN5.com/?action=category&id=17
META DESCRIPTION TEXT3 ns5.DOMAIN6.com/?action=article&id=27 There are more but those are just some examples. These other domain names being listed are other customer domains on the same VPS shared server. When clicking the result the browser URL still shows the other customer domain name B but the content is usually the 404 page. The page title and meta description on that page is not displayed the same as on the SERP.As far as we can tell, this is the only domain this is occurring for.So far, no crawl errors detected in Webmaster Tools and moz crawl not completed yet.0 -
Canonical url issue
Canonical url issue My site https://ladydecosmetic.com on seomoz crawl showing duplicate page title, duplicate page content errors. I have downloaded the error reports csv and checked. From the report, The below url contains duplicate page content.
Intermediate & Advanced SEO | | trixmediainc
https://www.ladydecosmetic.com/unik-colours-lipstick-caribbean-peach-o-27-item-162&category_id=40&brands=66&click=brnd And other duplicate urls as per report are,
https://www.ladydecosmetic.com/unik-colours-lipstick-plum-red-o-14-item-157&category_id=40&click=colorsu&brands=66 https://www.ladydecosmetic.com/unik-colours-lipstick-plum-red-o-14-item-157&category_id=40 https://www.ladydecosmetic.com/unik-colours-lipstick-plum-red-o-14-item-157&category_id=40&brands=66&click=brnd But on every these url(all 4) I have set canonical url. That is the original url and an existing one(not 404). https://www.ladydecosmetic.com/unik-colours-lipstick-caribbean-peach-o-27-item-162&category_id=0 Then how this issues are showing like duplicate page content. Please give me an answer ASAP.0 -
Can some brilliant mozzer out there teach a moron/newbie like me how to 301 redirect several URL's I have?
Okay - I am a supermodel. I look pretty. My legs are amazing. My cheekbones are high. But when it comes to 301 redirects I am the ugliest supermodel on the block. Crap, here is the truth: I am not even a supermodel. I am just a middle-aged, goofy looking dude who is a newbie to fixing websites. I have inherited several sites from a friend and I have been helping by creating solid contextual links internally and externally for a while. But, when Roger the wondrous SEOMoz robot talks to me, he says, "oops, it looks like your foolish freak self has a site that has both a www. and a non-www, which can create competition for yourself." What do I do when he says that? I just whisper a "thank-you" but gently press the skip this step button and go on with my life because I do not know how to make my non-www.'s redirect into the www. sites... Now, I have sort of asked this question on the site before, but I was answered by someone who does not understand my level of ignorance. any use of the word canonical or just put this lfwjkshj.htp/php inside the left ear of your mom, does not tell me anything so, is there any willing and kind soul who can walk me through redirecting several of my sites to their proper home - kind of like Carl Chubbs Weathers did for Happy Gilmore in that Academy Award winning classic? Thanks for the help in advance best, dumbhead
Intermediate & Advanced SEO | | creativeguy0 -
What's going on with my organic traffic from Google?
I am working on eCommerce website Vista Stores. My website's traffic is going down due to certain reason. I have done R & D and have assumption with auto generated content which I have added on few product pages. You can find out attachment to know more about current situation of traffic. 6789134845_d1a1578960_b.jpg
Intermediate & Advanced SEO | | CommercePundit0 -
Canonical URL redirect to different domain - SEO benefits?
Hello Folks, We are having a SEO situation here, and hope your support will help us figure out that. Let's say there are two different domains www.subdomian.domianA.com and www.domainB.com. subdomain.domainA is what we want to promote and drive SEO traffic. But all our content lies in domainB. So one of the thoughts we had is to duplicate the domainB's content on subdomian.domainA and have a canonical URL redirect implemented. Questions: Will subdomain.domainA.com get indexed in search engines for the content in domainB by canonical redirect? Do we get the SEO benefits? So is there any other better way to attain this objective? Thanks in advance.
Intermediate & Advanced SEO | | NortonSupportSEO0