Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Include Cross Domain Canonical URL's in Sitemap - Yes or No?
-
I have several sites that have cross-domain canonical tags set up on similar pages. I am unsure whether pages that are canonicalized to a different domain should be included in the sitemap. My first thought is no, because I should only include pages in the sitemap that I want indexed.
On the other hand, if I include ALL pages on my site in the sitemap, I assume that once Google gets to a page with a cross-domain canonical tag, it will just note that and determine whether the canonicalized page is the better version. I have yet to see any errors in GWT about this, although I have seen errors where I included a 301 redirect in my sitemap file. I suspect it's OK, but it seems to me that Google would rather not find these URLs in a sitemap and have to crawl them time and time again to determine whether they are the best page, even though I'm indicating that there is a similar page I'd rather have indexed.
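For anyone who wants to find which pages are affected before deciding, here is a minimal Python sketch that checks whether a page's rel="canonical" points at a different host. The class and function names are made up for illustration, and it assumes you already have the page's HTML as a string; it is a rough check, not how Google evaluates canonicals.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> in a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Only the first canonical link element is kept.
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr.get("href"):
            self.canonical = attr["href"]


def cross_domain_canonical(page_url, html):
    """Return the canonical URL if it points at a different host, else None."""
    finder = CanonicalFinder()
    finder.feed(html)
    if finder.canonical is None:
        return None
    page_host = urlsplit(page_url).hostname
    canonical_host = urlsplit(finder.canonical).hostname
    # A relative canonical has no hostname, so it stays on the same host.
    if canonical_host is None or canonical_host == page_host:
        return None
    return finder.canonical
```

Note that this compares hostnames exactly, so www.example.com and example.com count as different hosts; whether that is what you want depends on how your sites are set up.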
-
I looked at the sitemap, and it includes http://www.seomoz.org/blog/the-story-of-seomoz but not the canonical page, http://www.masternewmedia.org/entrepreneurship-the-full-story-of-seomoz-told-by-rand-fishkin/
So based on this example, the page on SEOMoz is still included in the sitemap, regardless of whether it has a canonical tag.
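If you want to repeat this kind of check on a larger sitemap without reading it by eye, a rough Python sketch follows. It assumes you have already downloaded the sitemap XML as a string and that it uses the standard sitemaps.org namespace; the function name is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap files, in ElementTree's {uri}tag form.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(sitemap_xml):
    """Return the set of <loc> values listed in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return {
        loc.text.strip()
        for loc in root.iter(SITEMAP_NS + "loc")
        if loc.text
    }
```

Checking a page is then a membership test, for example `page_url in sitemap_urls(xml_text)`.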
This seems to make sense, since canonical links are treated only as a hint and not as an absolute directive.
I also noticed that Google is choosing to index and rank both pages on page 1.
SEOMoz is ranking higher in my browser for "the full story of seomoz". A few things are going on here:
- Why is Google choosing to rank SEOMoz higher than Masternewmedia.org for this page? There's a canonical set up, but Google is choosing not to follow it. Again, it's a hint and not an absolute directive, so it doesn't always work.
- I would think Google would be able to filter out the duplicate content easily. In this example, it clearly is not: SEOMoz is ranking #4 and Masternewmedia.org is ranking #5 for the query "the full story of seomoz".
-
-
Right - as far as I know, you're supposed to put final URLs into a sitemap, not URLs that 301 redirect. Cross-domain canonical is still fairly new, but I would treat these pages like 301 redirects and not include them in a sitemap.
Now, if you're curious, SEO Moz did a Whiteboard Friday where they talked about this exact issue (cross-domain canonical) and, as an experiment, re-posted an article from another blogger on SEO Moz.
http://www.seomoz.org/blog/cross-domain-canonical-the-new-301-whiteboard-friday
http://www.seomoz.org/blog-sitemap.xml
http://www.seomoz.org/blog/the-story-of-seomoz
The blog post is still included in the blog sitemap. I think it probably won't 'hurt' to keep those pages in the sitemap, since many CMS tools that generate sitemaps automatically won't have been updated to deal with this yet.
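If your sitemap generator does let you hook in, the stricter approach (leave cross-domain canonicalized pages out) could look roughly like the Python sketch below. It assumes you can produce a mapping of each page URL to its canonical URL (or None); the function name and that input format are made up for illustration and are not taken from any particular CMS.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from a {page_url: canonical_url_or_None} mapping.

    Pages whose canonical points at another host are skipped.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page_url, canonical in sorted(pages.items()):
        if canonical is not None:
            canonical_host = urlsplit(canonical).hostname
            if canonical_host and canonical_host != urlsplit(page_url).hostname:
                continue  # cross-domain canonical: leave it out of the sitemap
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = page_url
    return ET.tostring(urlset, encoding="unicode")
```

This only filters on cross-domain canonicals. Pages that redirect, or that canonicalize to a different URL on the same host, would need their own checks.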
-
There is no BIG problem with including pages that carry a cross-domain canonical tag. Why?
Because Google does not index pages only from the sitemap.xml file. Google has its own crawler, which can crawl and index a website even if it has no XML sitemap.
Google is (in my opinion) very good at picking up the instructions that are available on the page, so if you add the page to the XML sitemap, the crawler will read the instructions on the page and will index only the page that contains the original content.