Include Cross Domain Canonical URL's in Sitemap - Yes or No?
-
I have several sites that have cross domain canonical tags setup on similar pages. I am unsure if these pages that are canonicalized to a different domain should be included in the sitemap. My first thought is no, because I should only include pages in the sitemap that I want indexed.
On the other hand, if I include ALL pages on my site in the sitemap, once Google gets to a page that has a cross domain canonical tag, I'm assuming it will just note that and determine if the canonicalized page is the better version. I have yet to see any errors in GWT about this. I have seen errors where I included a 301 redirect in my sitemap file. I suspect its ok, but to me, it seems that Google would rather not find these URL's in a sitemap, have to crawl them time and time again to determine if they are the best page, even though I'm indicating that this page has a similar page that I'd rather have indexed.
-
I looked at the sitemap, and they are including the http://www.seomoz.org/blog/the-story-of-seomoz but not the canonical page - http://www.masternewmedia.org/entrepreneurship-the-full-story-of-seomoz-told-by-rand-fishkin/
So based on this example, the page on SEOMoz is still included in the sitemap, regardless if it has a canonical or not.
This seems to make sense, since canonical links are used only as a hint and not an absolute directive.
I also noticed that Google is choosing to index and rank both pages, on Page 1.
SEOMoz is ranking higher on my browser for "the full story of seomoz". A few things going on here.
-
Why is google choosing to rank SEOMoz higher than Mastermedia.org for this page? There's a canonical setup, but google is choosing not to follow it. (again its a hint not an absolute) this doesn't always work.
-
I would think Google would be able to filter out the duplicate content easy. In this example, they are clearly not. SEOMoz is ranking #4 and Masternewmedia.org is ranking #5 for query "the full story of seomoz"
-
-
Right - as far as I know, you're supposed to put end URLs into a sitemap, not urls which 301 redirect. Cross domain canonical is still kind of new, but I would treat them as a 301 redirect and not include them in a sitemap.
Now, if you're curious, SEO Moz did a whiteboard Friday where they talked about this same exact issue (cross domain canonical), and as an experiment, re-posted a blog article from another blogger on SEO Moz.
http://www.seomoz.org/blog/cross-domain-canonical-the-new-301-whiteboard-friday
http://www.seomoz.org/blog-sitemap.xml
http://www.seomoz.org/blog/the-story-of-seomoz
The blog is still included in the blog sitemap. I think it probably won't 'hurt' to keep those pages in the sitemap, since a lot of sitemaps automatically generated CMS tools won't have been updated to deal with this yet.
-
There is no BIG problem if you add the pages that contain cross domain canonical tag on them. Why?
The reason why I can say this is because Google is not only indexing the pages from sitemap.xml file, Google have their own crawler and they have the ability to crawl and index the website no matter if you do not have an xml sitemap.
Google is very good at (in my opinion) picking the instructions that are available on the page so if you add the page in the xml sitemap, the crawler will read the instructions on the page and will only index the page that contain original content.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
After hack and remediation, thousands of URL's still appearing as 'Valid' in google search console. How to remedy?
I'm working on a site that was hacked in March 2019 and in the process, nearly 900,000 spam links were generated and indexed. After remediation of the hack in April 2019, the spammy URLs began dropping out of the index until last week, when Search Console showed around 8,000 as "Indexed, not submitted in sitemap" but listed as "Valid" in the coverage report and many of them are still hack-related URLs that are listed as being indexed in March 2019, despite the fact that clicking on them leads to a 404. As of this Saturday, the number jumped up to 18,000, but I have no way of finding out using the search console reports why the jump happened or what are the new URLs that were added, the only sort mechanism is last crawled and they don't show up there. How long can I expect it to take for these remaining urls to also be removed from the index? Is there any way to expedite the process? I've submitted a 'new' sitemap several times, which (so far) has not helped. Is there any way to see inside the new GSC view why/how the number of valid URLs in the indexed doubled over one weekend?
Intermediate & Advanced SEO | | rickyporco0 -
Move to new domain using Canonical Tag
At the moment, I am moving from olddomain.com (niche site) to the newdomain.com (multi-niche site). Due to some reasons, I do not want to use 301 right now and planning to use the canonical pointing to the new domain instead. Would Google rank the new site instead of the old site? From what I have learnt, the canonical tag lets Google know that which is the main source of the contents. Thank you very much!
Intermediate & Advanced SEO | | india-morocco0 -
Domain Change for a well positioned website... I'm a-scared
Hello, A few years ago I have "inherited" a website about a particular touristic area in Italy (the Langhe region) called langhe.net. The website is very well positioned, the domain has been registered in '97 and the overall SEO performance is pretty good (it ranks in the top #3 positions for all the main search queries in our niche). We are currently redesigning the whole thing, and one of the idea was to change the domain (and the name) of the website from langhe.net to lovelanghe.com (which we already registered). The reasons behind this decision are the following (most important first): Google prefer brands over keywords and "Langhe" is just a keyword LoveLanghe looks more memorable and "marketable" than just Langhe.net All our social presence is branded already as LoveLanghe (they were created years back under this name - I don't know why) We will do our due diligence work (301 everything, domain change in Search Console etc. etc.) but I'm still kind of worried that we will lose some ranking. So my question(s) are: do you think it's a good idea to change the domain when ranking is good and original domain is so old? how much ranking (approximately) are we going to lose? Thanks in advance 🙂 Best
Intermediate & Advanced SEO | | Enrico_Cassinelli0 -
Can I have multiple 301's when switching to https version
Hello, our programmer recently updated our http version website to https. Does it matter if we have TWO 301 redirects? Here is an example: http://www.colocationamerica.com/dedicated_servers/linux-dedicated.htm 301 https://www.colocationamerica.com/dedicated_servers/linux-dedicated.htm 301 https://www.colocationamerica.com/linux-dedicated-server We're getting pulled in two different directions. I read https://moz.com/blog/301-redirection-rules-for-seo and don't know if 2 301's suffice. Please let me know. Greatly appreciated!
Intermediate & Advanced SEO | | Shawn1240 -
We 410'ed URLs to decrease URLs submitted and increase crawl rate, but dynamically generated sub URLs from pagination are showing as 404s. Should we 410 these sub URLs?
Hi everyone! We recently 410'ed some URLs to decrease the URLs submitted and hopefully increase our crawl rate. We had some dynamically generated sub-URLs for pagination that are shown as 404s in google. These sub-URLs were canonical to the main URLs and not included in our sitemap. Ex: We assumed that if we 410'ed example.com/url, then the dynamically generated example.com/url/page1 would also 410, but instead it 404’ed. Does it make sense to go through and 410 these dynamically generated sub-URLs or is it not worth it? Thanks in advice for your help! Jeff
Intermediate & Advanced SEO | | jeffchen0 -
How much is the effect of redirecting an old URL to another URL under a new domain?
Example: http://www.olddomain.com/buy/product-type/region/city/area http://www.newdomain.com/product-type-for-sale/city/area Thanks in advance!
Intermediate & Advanced SEO | | esiow20130 -
Cross Domain Rel Canonical for Affiliates?
Hi We use the Cross Domain Rel Canonical for duplicate content between our own websites, but what about affiliates sites who want our XML feed, (descriptions of our products). We don´t mind being credited but would this present a danger for us? Who is controlling the use of that cross domain rel canonical, us in our feed or them? Is there another way around it?
Intermediate & Advanced SEO | | xoffie0 -
What's going on with my organic traffic from Google?
I am working on eCommerce website Vista Stores. My website's traffic is going down due to certain reason. I have done R & D and have assumption with auto generated content which I have added on few product pages. You can find out attachment to know more about current situation of traffic. 6789134845_d1a1578960_b.jpg
Intermediate & Advanced SEO | | CommercePundit0