Include Cross Domain Canonical URL's in Sitemap - Yes or No?
-
I have several sites that have cross domain canonical tags setup on similar pages. I am unsure if these pages that are canonicalized to a different domain should be included in the sitemap. My first thought is no, because I should only include pages in the sitemap that I want indexed.
On the other hand, if I include ALL pages on my site in the sitemap, once Google gets to a page that has a cross domain canonical tag, I'm assuming it will just note that and determine if the canonicalized page is the better version. I have yet to see any errors in GWT about this. I have seen errors where I included a 301 redirect in my sitemap file. I suspect its ok, but to me, it seems that Google would rather not find these URL's in a sitemap, have to crawl them time and time again to determine if they are the best page, even though I'm indicating that this page has a similar page that I'd rather have indexed.
-
I looked at the sitemap, and they are including the http://www.seomoz.org/blog/the-story-of-seomoz but not the canonical page - http://www.masternewmedia.org/entrepreneurship-the-full-story-of-seomoz-told-by-rand-fishkin/
So based on this example, the page on SEOMoz is still included in the sitemap, regardless if it has a canonical or not.
This seems to make sense, since canonical links are used only as a hint and not an absolute directive.
I also noticed that Google is choosing to index and rank both pages, on Page 1.
SEOMoz is ranking higher on my browser for "the full story of seomoz". A few things going on here.
-
Why is google choosing to rank SEOMoz higher than Mastermedia.org for this page? There's a canonical setup, but google is choosing not to follow it. (again its a hint not an absolute) this doesn't always work.
-
I would think Google would be able to filter out the duplicate content easy. In this example, they are clearly not. SEOMoz is ranking #4 and Masternewmedia.org is ranking #5 for query "the full story of seomoz"
-
-
Right - as far as I know, you're supposed to put end URLs into a sitemap, not urls which 301 redirect. Cross domain canonical is still kind of new, but I would treat them as a 301 redirect and not include them in a sitemap.
Now, if you're curious, SEO Moz did a whiteboard Friday where they talked about this same exact issue (cross domain canonical), and as an experiment, re-posted a blog article from another blogger on SEO Moz.
http://www.seomoz.org/blog/cross-domain-canonical-the-new-301-whiteboard-friday
http://www.seomoz.org/blog-sitemap.xml
http://www.seomoz.org/blog/the-story-of-seomoz
The blog is still included in the blog sitemap. I think it probably won't 'hurt' to keep those pages in the sitemap, since a lot of sitemaps automatically generated CMS tools won't have been updated to deal with this yet.
-
There is no BIG problem if you add the pages that contain cross domain canonical tag on them. Why?
The reason why I can say this is because Google is not only indexing the pages from sitemap.xml file, Google have their own crawler and they have the ability to crawl and index the website no matter if you do not have an xml sitemap.
Google is very good at (in my opinion) picking the instructions that are available on the page so if you add the page in the xml sitemap, the crawler will read the instructions on the page and will only index the page that contain original content.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Magento 1.9 SEO. I have product pages with identical On Page SEO score in the 90's. Some pull up Google page 1 some won't pull up at all. I am searching for the exact title on that page.
I have a website built on Magento 1.9. There are approximately 290,000 part numbers on the site. I am sampling Google SERP results. About 20% of the keywords show up on page 1 position 5 thru 10. 80% don't show up at all. When I do a MOZ page score I get high 80's to 90's. A page score of 89 on one part # may show up on page one, An identical page score on a different part # can't be found on Google. I am searching for the exact part # in the page title. Any thoughts on what may be going on? This seems to me like a Magento SEO issue.
Intermediate & Advanced SEO | | CTOPDS0 -
Could this be seen as duplicate content in Google's eyes?
Hi I'm an in-house SEO and we've recently seen Panda related traffic loss along with some of our main keywords slipping down the SERPs. Looking for possible Panda related issues I was wondering if the following could be seen as duplicate content. We've got some very similar holidays (travel company) on our website. While they are different I'm concerned it may be seen as creating content that is too similar: http://www.naturalworldsafaris.com/destinations/africa-and-the-indian-ocean/kenya/suggested-holidays/the-wildlife-and-beaches-of-kenya.aspx http://www.naturalworldsafaris.com/destinations/africa-and-the-indian-ocean/kenya/suggested-holidays/ultimate-kenya-wildlife-and-beaches.aspx http://www.naturalworldsafaris.com/destinations/africa-and-the-indian-ocean/kenya/suggested-holidays/wildlife-and-beach-family-safari.aspx They do all have unique text but as you can see from the titles, they are very similar (note from an SEO point of view the tabbed content is all within the same page at source level). At the top level of the holiday pages we have a filtered search:
Intermediate & Advanced SEO | | KateWaite
http://www.naturalworldsafaris.com/destinations/africa-and-the-indian-ocean/kenya/suggested-holidays.aspx These pages have a unique introduction but the content snippets being pulled into the boxes is drawn from each of the individual holiday pages. I'm just concerned that these could be introducing some duplicating issues. Any thoughts?0 -
Company name doesn't have keyword: use domains instead?
Good Morning! Now, I'll admit, I may be obsessing a little too much on this, and it may not make that big of an impact in the long run, but with Google being introduced to the world if I were to start a business today I would try and include my keyword into the title of my business. For example Dollar Shave Club, at least they got the word shave in there. My business doesn't have a keyword in our name, is it beneficial to structure our URLs to include a keyword so that all of our URLs include that word? So if I sell organic bananas, but my company is called Evananas, is it worth it to have all domains become a child of Evananas.com/organic_bananas? That way at least we have the keyword "Organic Bananas" in our title? So I could then have things like: evananas.com/organic_bananas/recipes evananas.com/organic_bananas/benefits evananas.com/organic_bananas/taste_really_freeking_good Vs. evananas.com/recipes evananas.com/benefits evananas.com/taste_really_freeking_good I'm not sure it makes a difference. The other problem is I want to keep our URL's as short as possible. I feel like less is always more, but I was always under the impression domain/URL based keywords were rather powerful. What is the best practice in this case? Thanks Guys! Evan(ana)
Intermediate & Advanced SEO | | HashtagHustler0 -
What are Soft 404's and are they a problem
Hi, I have some old pages that were coming up in google WMT as a 404. These had links into them so i thought i'd do a 301 back to either the home page or to a relevant category or page. However these are now listed in WMT as soft 404's. I'm not sure what this means and whether google is saying it doesn't like this? Any advice welcomed.
Intermediate & Advanced SEO | | Aikijeff0 -
If it's not in Webmaster Tools, is it Duplicate Title
I am showing a lot of errors in my SEOmoz reports for duplicate content and duplicate titles, many of which appear to be related to capitalization vs non-capitalization in the URL. Case in point, if a URL contains a lower character, such as: http://www.gallerydirect.com/art/product/allyson-krowitz/distinct-microstructure-i as opposed to the same URL having an upper character in the structure: http://www.gallerydirect.com/art/product/allyson-krowitz/distinct-microstructure-I I am finding that some of the internal links on the site use the former structure and other links use the latter structure. These show as duplicate title/content in the SEOmoz reports, but they don't appear as duplicate titles in Webmaster Tools. My question is, should I try to work with our developers to create a script to change all of the content with cap letters in the destination links internally on the site, or is this a non-issue since it doesn't appear in Webmaster Tools?
Intermediate & Advanced SEO | | sbaylor0 -
What would your Seo tactic's be for this
Hiya guys... Just a quicken, So my forum, talknightlife.co.uk is currently 10th on google for "nightlife forum" I have about 15 back links, 26 page autority. Now what i'm trying to do, which everyone else is doing, is trying to move it up a couple of spots maybe to 5th or something. What would your tactics be, I'm disregarding all the crap I read in the forums etc, you guys on here tend to have the best explanation. Let it rip 🙂 Cheers guys Luke.
Intermediate & Advanced SEO | | Lukescotty0 -
How canonical url harm our website???
Even though my website has no similar/copied content, i used rel=canonical for all my website pages. Is Google or yahoo make any harm to my SERP's?? EX: http://www.seomoz.org is my site, in that i used canonical as rel="<a class="attribute-value">canonical</a>" href="http://www.seomoz.org" to my home page like that similar to all pages, i created rel=canonical. Is search engine harm my website???
Intermediate & Advanced SEO | | MadhukarSV0 -
Ranking for our member's company names without giving them all away!
Hi, We have a directory of 25,000 odd companies who use our site. We have a strong PR site and want to rank a page for each company name. Some initial testing on one or two company names brings us to #2 after the company's own web site in the format: "Company Name Reviews and Feedback" - so it works well. We want to do this for all 25,000 of our members, however we do not wish to make it easy for our competitors to scrape through our member database!! e.g. using: www.ourdomain.com/randomstring/company-name-(profile).php unfortunately with the above performing a search on google for site:domain.com/()/()(profile).php would bring up all records. Are there any tried and tested ways of achieving what we're after here? Many Thanks.
Intermediate & Advanced SEO | | sssrpm0