Include Cross Domain Canonical URL's in Sitemap - Yes or No?
-
I have several sites that have cross domain canonical tags setup on similar pages. I am unsure if these pages that are canonicalized to a different domain should be included in the sitemap. My first thought is no, because I should only include pages in the sitemap that I want indexed.
On the other hand, if I include ALL pages on my site in the sitemap, once Google gets to a page that has a cross domain canonical tag, I'm assuming it will just note that and determine if the canonicalized page is the better version. I have yet to see any errors in GWT about this. I have seen errors where I included a 301 redirect in my sitemap file. I suspect its ok, but to me, it seems that Google would rather not find these URL's in a sitemap, have to crawl them time and time again to determine if they are the best page, even though I'm indicating that this page has a similar page that I'd rather have indexed.
-
I looked at the sitemap, and they are including the http://www.seomoz.org/blog/the-story-of-seomoz but not the canonical page - http://www.masternewmedia.org/entrepreneurship-the-full-story-of-seomoz-told-by-rand-fishkin/
So based on this example, the page on SEOMoz is still included in the sitemap, regardless if it has a canonical or not.
This seems to make sense, since canonical links are used only as a hint and not an absolute directive.
I also noticed that Google is choosing to index and rank both pages, on Page 1.
SEOMoz is ranking higher on my browser for "the full story of seomoz". A few things going on here.
-
Why is google choosing to rank SEOMoz higher than Mastermedia.org for this page? There's a canonical setup, but google is choosing not to follow it. (again its a hint not an absolute) this doesn't always work.
-
I would think Google would be able to filter out the duplicate content easy. In this example, they are clearly not. SEOMoz is ranking #4 and Masternewmedia.org is ranking #5 for query "the full story of seomoz"
-
-
Right - as far as I know, you're supposed to put end URLs into a sitemap, not urls which 301 redirect. Cross domain canonical is still kind of new, but I would treat them as a 301 redirect and not include them in a sitemap.
Now, if you're curious, SEO Moz did a whiteboard Friday where they talked about this same exact issue (cross domain canonical), and as an experiment, re-posted a blog article from another blogger on SEO Moz.
http://www.seomoz.org/blog/cross-domain-canonical-the-new-301-whiteboard-friday
http://www.seomoz.org/blog-sitemap.xml
http://www.seomoz.org/blog/the-story-of-seomoz
The blog is still included in the blog sitemap. I think it probably won't 'hurt' to keep those pages in the sitemap, since a lot of sitemaps automatically generated CMS tools won't have been updated to deal with this yet.
-
There is no BIG problem if you add the pages that contain cross domain canonical tag on them. Why?
The reason why I can say this is because Google is not only indexing the pages from sitemap.xml file, Google have their own crawler and they have the ability to crawl and index the website no matter if you do not have an xml sitemap.
Google is very good at (in my opinion) picking the instructions that are available on the page so if you add the page in the xml sitemap, the crawler will read the instructions on the page and will only index the page that contain original content.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Can a duplicate page referencing the original page on another domain in another country using the 'canonical link' still get indexed locally?
Hi I wonder if anyone could help me on a canonical link query/indexing issue. I have given an overview, intended solution and question below. Any advice on this query will be much appreciated. Overview: I have a client who has a .com domain that includes blog content intended for the US market using the correct lang tags. The client also has a .co.uk site without a blog but looking at creating one. As the target keywords and content are relevant across both UK and US markets and not to duplicate work the client has asked would it be worthwhile centralising the blog or provide any other efficient blog site structure recommendations. Suggested solution: As the domain authority (DA) on the .com/.co.uk sites are in the 60+ it would risky moving domains/subdomain at this stage and would be a waste not to utilise the DAs that have built up on both sites. I have suggested they keep both sites and share the same content between them using a content curated WP plugin and using the 'canonical link' to reference the original source (US or UK) - so not to get duplicate content issues. My question: Let's say I'm a potential customer in the UK and i'm searching using a keyword phrase that the content that answers my query is on both the UK and US site although the US content is the original source.
Intermediate & Advanced SEO | | JonRayner
Will the US or UK version blog appear in UK SERPs? My gut is the UK blog will as Google will try and serve me the most appropriate version of the content and as I'm in the UK it will be this version, even though I have identified the US source using the canonical link?2 -
Community Discussion: Are You Optimizing Your Brand's Content for Featured Snippets?
My latest post on the Moz Blog, Featured Snippets: A Dead-Simple Tactic for Making, explores how to keep Featured Snippets once you have them. I'm curious to know how many brands are actively working to get in the answer box, and for those who are, what's been the results?
Intermediate & Advanced SEO | | ronell-smith2 -
404's after pruning old posts
Hey all, So after reading about the benefits of pruning old content I decided to give it a try on our blog. After reviewing thousands of posts I found around 2500 that were simply not getting any traffic, or if they were there was 100% bounce & exit. Many of these posts also had content with relevance that had long ago expired. After deleted these old posts, I am now seeing the posts being reported as 404's in Google Search Console. But most of them are the old url with "trashed" appended to the url. My question is: are these 404's normal? Do I now have to go through and set up 301's for all of these? Is it enough to simply add the lot to my robots.txt file? Are these 404's going to hurt my blog? Thanks, Roman
Intermediate & Advanced SEO | | Dynata_panel_marketing0 -
Meta canonical or simply robots.txt other domain names with same content?
Hi, I'm working with a new client who has a main product website. This client has representatives who also sells the same products but all those reps have a copy of the same website on another domain name. The best thing would probably be to shut down the other (same) websites and redirect 301 them to the main, but that's impossible in the minding of the client. First choice : Implement a conical meta for all the URL on all the other domain names. Second choice : Robots.txt with disallow for all the other websites. Third choice : I'm really open to other suggestions 😉 Thank you very much! 🙂
Intermediate & Advanced SEO | | Louis-Philippe_Dea0 -
301 canonical'd pages?
I have an ecommerce site with many different URLs with the same product. Let's say the product is a hat. It's in: a a) mysite.com/products/hat b) mysite.com/collections/head-ware/hat c) mysite.com/collections/stuff-to-wear-on-your-head/hat Right now, A is the canonical page for B and C. I want to clean up my site, so that every product only has ONE unique URL, which is linked to from all the collections. So B and C URL will be broken. Is it necessary that I 301 them if they were already canonical'd? Based on the number of products I have, I would have to 301 1000+ URLs. I'm just trying to figure out what I need to do to avoid getting penalized. thanks
Intermediate & Advanced SEO | | birchlore0 -
Google's Exact Match Algorithm Reduced Our Traffic!
Google's first Panda de-valued our Web store, www.audiobooksonline.com, and our traffic went from 2500 - 3000 (mostly organic referrals) per month to 800 - 1000. Google's under-valuing of our Web store continued to reduce our traffic to 400-500 for the past few months. From 4/5/2013 to 4/6/2013 our traffic dropped 50% more, because (I believe) of Google's "exact domain match" algorithm implementation. We were, even after Panda and up to 4/5/2013 getting a significant amount of organic traffic for search terms such as "audiobooks online," "audio books online," and "online audiobooks." We no longer get traffic for these generic keywords. What I don't understand is why a UK company, www.audiobooksonline.co.uk/, with a very similar domain name, ranks #5 for "audio books online" and #4 for "audiobooks online" while we've almost disappeared from Google rankings. By any measurement I am aware of, our site should rank higher than audiobooksonline.co.uk. Market Samurai reports for "audio books online" and "audiobooks online" shows that our Web store is significantly "stronger" than audiobooksonline.co.uk but they show up on Google's first page and we are down several pages. I also checked a few titles on audiobooksonline.co.uk and confirmed they are using the same publisher descriptions we and many other online book / audiobook merchants do = duplicate content. We have never received notice that our Web store was being penalized. Why would audiobooksonline.co.uk rank so much higher than audiobooksonline.com? Does Google treat non-USA sites different than USA sites?
Intermediate & Advanced SEO | | lbohen0 -
Pinging SE's - is this spam?
Hi, Just read on ViperChill that Matt Cutts told Glen (owner of ViperChill) that ping services can help your blog posts. Now lets say you have a list of 10 that you ping and you put an article up everyday, thats 300 pings a month, is that not spammy? Here is the link to the post: http://www.viperchill.com/future-of-blogging/ If you scroll down your see a screen print of Google search box, its the para above and below this screen print.
Intermediate & Advanced SEO | | activitysuper1 -
Meeting Google's needs 100% with dynamic pages
We have bought into a really powerful search, very exciting We can define really detailed product based 'landing pages' by creating a search that pulles on required attributeseghttp://www.OURDOMAIN.com//search/index.php?sortprice=asc&followSearch=9673&q=red+coats+short-length Pop that in a link Short Red Coats on a previous page and wonderful, that gives a page of short red coats in price ascending order, one happy consumer, straight to a page that meets their needs Question 1 however unhappy Google right? Question 2 can we meet Google's needs 100% with a redirect permanent in an .htaccess file E.G redirect permanent /short-red-coats/ http://www.OURDOMAIN.com//search/index.php?sortprice=asc&followSearch=9673&q=red+coats+short-length
Intermediate & Advanced SEO | | GeezerG
Many thanks
CB0