Include Cross-Domain Canonical URLs in Sitemap - Yes or No?
-
I have several sites that have cross-domain canonical tags set up on similar pages. I am unsure whether pages that are canonicalized to a different domain should be included in the sitemap. My first thought is no, because I should only include pages in the sitemap that I want indexed.
On the other hand, if I include ALL pages on my site in the sitemap, I assume that once Google reaches a page with a cross-domain canonical tag, it will just note the tag and decide whether the canonicalized page is the better version. I have yet to see any errors in GWT (Google Webmaster Tools) about this, although I have seen errors when I included a 301 redirect in my sitemap file. I suspect it's OK, but it seems to me that Google would rather not find these URLs in a sitemap and have to crawl them time and time again to determine whether they are the best page, when I'm already indicating that a similar page exists that I'd rather have indexed.
-
I looked at the sitemap, and they are including http://www.seomoz.org/blog/the-story-of-seomoz but not the canonical page, http://www.masternewmedia.org/entrepreneurship-the-full-story-of-seomoz-told-by-rand-fishkin/
So based on this example, the page on SEOMoz is still included in the sitemap, regardless of whether it has a canonical tag.
This seems to make sense, since canonical links are used only as a hint and not an absolute directive.
I also noticed that Google is choosing to index and rank both pages on page 1.
SEOMoz is ranking higher in my browser for "the full story of seomoz". A few things are going on here.
-
Why is Google choosing to rank SEOMoz higher than Masternewmedia.org for this page? There's a canonical set up, but Google is choosing not to follow it. (Again, it's a hint, not an absolute directive, so it doesn't always work.)
-
I would think Google would be able to filter out the duplicate content easily. In this example, it clearly is not: SEOMoz is ranking #4 and Masternewmedia.org is ranking #5 for the query "the full story of seomoz".
-
-
Right - as far as I know, you're supposed to put end URLs into a sitemap, not URLs that 301 redirect. Cross-domain canonical is still kind of new, but I would treat these pages like 301 redirects and not include them in a sitemap.
Now, if you're curious, SEOMoz did a Whiteboard Friday where they talked about this exact issue (cross-domain canonical) and, as an experiment, re-posted a blog article from another blogger on SEOMoz.
http://www.seomoz.org/blog/cross-domain-canonical-the-new-301-whiteboard-friday
http://www.seomoz.org/blog-sitemap.xml
http://www.seomoz.org/blog/the-story-of-seomoz
The blog post is still included in the blog sitemap. I think it probably won't 'hurt' to keep those pages in the sitemap, since a lot of CMS tools that generate sitemaps automatically won't have been updated to deal with this yet.
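If you do want to keep cross-domain canonical pages out of a generated sitemap, the filtering step could look roughly like the sketch below. It assumes you already have each page's URL and its canonical URL (or None) from your CMS; the function names and the example.com URLs are made up for illustration, and comparing hostnames exactly (so www and non-www count as different hosts) is a simplifying assumption.

```python
from urllib.parse import urljoin, urlsplit
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def is_cross_domain_canonical(page_url, canonical_url):
    """Return True when the canonical URL points at a different host."""
    if not canonical_url:
        return False
    # Resolve relative canonicals such as "/post" against the page URL.
    target = urljoin(page_url, canonical_url)
    return urlsplit(target).hostname != urlsplit(page_url).hostname


def build_sitemap(pages):
    """Build sitemap XML from (page_url, canonical_url_or_None) pairs.

    Pages canonicalized to another domain are skipped (hypothetical policy).
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page_url, canonical_url in pages:
        if is_cross_domain_canonical(page_url, canonical_url):
            continue
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = page_url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    example_pages = [
        ("http://www.example.com/a", None),
        ("http://www.example.com/b", "http://www.other-example.org/b"),
        ("http://www.example.com/c", "/c"),
    ]
    print(build_sitemap(example_pages))
```

Whether to apply a filter like this at all is the judgment call discussed in this thread; leaving the pages in does not appear to trigger sitemap errors.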
-
There is no BIG problem if you add pages that have a cross-domain canonical tag on them to the sitemap. Why?
Because Google does not index pages only from the sitemap.xml file. Google has its own crawler, which can crawl and index a website even if it has no XML sitemap at all.
Google is (in my opinion) very good at picking up the instructions that are available on the page, so if you add the page to the XML sitemap, the crawler will read the instructions on the page and should index only the page that contains the original content.
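To check what instruction a page actually carries, you can read the rel="canonical" link out of its HTML yourself. This is only a rough sketch using Python's standard library: it is not how Google's crawler works, and the helper names and example.com URLs are invented for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class CanonicalFinder(HTMLParser):
    """Collect the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        # rel can hold several space-separated tokens; values may be None.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr_map.get("href"):
            self.canonical = attr_map["href"]


def canonical_target(page_url, html):
    """Return (absolute_canonical_url, is_cross_domain) for a page.

    Returns (None, False) when the page declares no canonical link.
    """
    finder = CanonicalFinder()
    finder.feed(html)
    if finder.canonical is None:
        return None, False
    target = urljoin(page_url, finder.canonical)
    cross = urlsplit(target).hostname != urlsplit(page_url).hostname
    return target, cross


if __name__ == "__main__":
    sample_html = (
        '<html><head>'
        '<link rel="canonical" href="http://www.other-example.org/post">'
        '</head></html>'
    )
    print(canonical_target("http://www.example.com/post", sample_html))
```

Running this over the URLs in your sitemap gives you a list of the pages that point to another domain, which you can then keep or drop as you see fit.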