Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be viewable - we have locked both new posts and new replies.
Canonical URLs and Sitemaps
-
We are using canonical link tags on product pages where the on-site URLs contain category names and the canonical URL does not. For example, the same product page is reachable at www.example.com/clothes/skirts/skater-skirt-12345 and, in another category, at www.example.com/sale/clearance/skater-skirt-12345. On both of these pages, the canonical link tag references a third URL, www.example.com/skater-skirt-12345. This third URL is a valid page and displays the same content as the other two versions, but nothing links to this generic version, either on the site or externally.
Questions:
1. Does the generic URL referenced in the canonical link also need to be included as on-page links somewhere in the crawled navigation of the site, or is it okay to be just a valid URL not linked anywhere except for the canonical tags?
2. In our sitemap, is it okay to reference the non-canonical URLs, or does the sitemap have to reference only the canonical URL? In our case, the sitemap points to yet a 3rd variation of the URL, like www.example.com/product.jsp?productID=12345. This page retrieves the same content as the others, and includes a canonical link tag back to www.example.com/skater-skirt-12345. Is this a valid approach, or should we revise the sitemap to point to either the category-specific links or the canonical links?
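To make the setup above concrete, here is a minimal sketch of reading the canonical link tag out of a product page's HTML, so each URL variant can be checked against the URL it declares. The markup in `PAGE_HTML` and the names `CanonicalFinder` and `find_canonical` are made up for illustration; only the example.com URLs come from the question.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens, in any letter case
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr.get("href"):
            self.canonicals.append(attr["href"])


def find_canonical(html):
    """Return the one canonical URL declared in html, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    # No tag at all, or tags that disagree, both count as "no clear canonical"
    unique = set(finder.canonicals)
    return unique.pop() if len(unique) == 1 else None


# Hypothetical markup for one of the category-path product pages
PAGE_HTML = """
<html><head>
<title>Skater Skirt</title>
<link rel="canonical" href="http://www.example.com/skater-skirt-12345">
</head><body>...</body></html>
"""

print(find_canonical(PAGE_HTML))
```

Running the same check against the category URLs and the product.jsp URL would confirm whether they all declare the same target.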
-
Thanks. And since we've now implemented the aforementioned changes, I can give some findings back.
What we did: We changed our sitemap to point to the same canonical URLs as are referenced in the tags on our product pages (only one entry in sitemap per product).
What we didn't do: We didn't change the product pages themselves. They still have a canonical URL link reference, pointing to a URL with no category paths, which does not naturally occur in the navigation of the site (on the site, product pages all have category paths in the URL).
Findings: After submitting the new sitemap, the stats in Google Webmaster Tools indicate that almost all (> 96%) of our product pages are indexed. We believe that the pages were already indexed (for the most part) and now the sitemap is useful for metrics. From the timing, it's unlikely that the sitemap itself caused our index stats to get significantly better in just 1 day. Possible, but unlikely. In either case, since our product pages still reference canonical URLs which don't exist in the site's navigation, the evidence suggests that the canonical link itself is enough, and an actual navigation path to the canonical version of the page is not needed. That's just empirical evidence; we have no inside info on Google's methods, but this is what we believe now after monitoring.
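For anyone wanting to reproduce the sitemap change described above (one entry per product, pointing at the canonical URL), a rough sketch follows. It is not the code we ran; `build_sitemap` is a hypothetical helper, and the only fixed detail is the standard sitemaps.org namespace.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(canonical_urls):
    """Return sitemap XML with exactly one <url> entry per distinct URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    seen = set()
    for url in canonical_urls:
        if url in seen:
            continue  # skip duplicates so each product is listed once
        seen.add(url)
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# The duplicate is included on purpose to show it is dropped
sitemap_xml = build_sitemap([
    "http://www.example.com/skater-skirt-12345",
    "http://www.example.com/skater-skirt-12345",
])
print(sitemap_xml)
```

The input list would come from wherever the canonical URLs are generated for the product pages, so the sitemap and the tags cannot drift apart.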
-
With the canonical tag in place, I'm guessing that extra link would basically be ignored. It's probably harmless, but I'm not sure it will do anything. You could create an HTML "sitemap" (or even an XML sitemap) with the canonical URLs. It's not my first choice, but it at least would give Google an extra push.
-
We're in the process of updating our canonical tagging and our sitemap, based on the feedback here. I have a question for the group though. Unfortunately we can't follow Andy Smith's suggestion of creating a "By Brand" navigation section on the site, since this web site is all private label (they sell all products under their own brand name).
One possible solution is to create a user-accessible site map page, with an "all products" paginated section, where all these product page URLs would be the canonical version.
But another possible solution, easier to implement, would be to have a user accessible link on each product page to the canonical version of itself. That is, when the user is on www.example.com/clothes/skirts/skater-skirt-12345, there would be a link to www.example.com/skater-skirt-12345, which would also be the URL specified in the canonical tag.
This seems redundant, but our results so far suggest that a canonical tag pointing to a URL which doesn't really exist anywhere in the navigation isn't having the desired effect. So, the thought is that a combination of the canonical tag, plus a "real" link to that same URL referenced in the canonical tag, would better inform the search engine robots. Our hesitation is whether it would work for this link to be on the product page itself (i.e. the non-canonical version).
Any thoughts or feedback on approach?
-
Thanks for the responses. I've been monitoring for the past couple of weeks with the current sitemap and canonical structure, and so far the data seems consistent with the replies to this thread. In GWT, the sitemap stats show less than 1% of the URLs submitted are indexed so far. We have an action plan now to update the canonical structure and the sitemap to point to URLs which will be naturally crawled on the site as well.
-
There's no "have to" in most of these situations, but it boils down to this - the more canonical your canonical URL actually is, the better chance you have of Google honoring it. In other words, if you set a canonical tag but then never use that in internal links or your XML sitemap, odds are pretty good that Google may ignore the tag in some cases. You're basically saying "Hey, this URL is canonical! No, this one is! No, this one!" - it's a mixed message, and they're going to try to interpret it algorithmically.
I definitely think pointing to yet another version in the XML sitemap is a problem. Ideally, it would be great to unify your URLs, but if that's not possible, getting the canonical version in the sitemap would be a big help (and introducing yet another variant isn't good, so you'd kill two birds with one stone). As Andy said, if you could create some kind of internal link to the canonical version, even if it's not the main link, that could also help. I only hesitate on that one, because you don't want to end up with a weird, artificial linking structure (just creating links to have links).
Please note, this isn't necessarily a disaster the way you have it. Google could honor the tags properly and generally rank your site correctly. In my experience, though, it's a recipe for long-term problems, and it's worth fixing.
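One way to spot the "mixed message" is to line up the three signals (canonical tags, XML sitemap, internal links) and see where they disagree. This is only a sketch over made-up inputs: `audit_canonical_signals` is a hypothetical helper, and the data mirrors the setup described in the question rather than any real crawl.

```python
def audit_canonical_signals(declared, sitemap_urls, internal_links):
    """Report where canonical tags, sitemap and internal links disagree.

    declared: dict mapping each crawled page URL to the canonical it declares
    sitemap_urls: URLs listed in the XML sitemap
    internal_links: URLs that on-site links point to
    """
    canonicals = set(declared.values())
    sitemap = set(sitemap_urls)
    linked = set(internal_links)
    return {
        "canonical_not_in_sitemap": sorted(canonicals - sitemap),
        "canonical_never_linked": sorted(canonicals - linked),
        "sitemap_not_canonical": sorted(sitemap - canonicals),
    }


CANONICAL = "http://www.example.com/skater-skirt-12345"
JSP_URL = "http://www.example.com/product.jsp?productID=12345"
CATEGORY_URLS = [
    "http://www.example.com/clothes/skirts/skater-skirt-12345",
    "http://www.example.com/sale/clearance/skater-skirt-12345",
]

report = audit_canonical_signals(
    declared={url: CANONICAL for url in CATEGORY_URLS + [JSP_URL]},
    sitemap_urls=[JSP_URL],        # sitemap points at the .jsp variant
    internal_links=CATEGORY_URLS,  # navigation only uses category paths
)
for problem, urls in report.items():
    print(problem, urls)
```

With the original setup every bucket is non-empty. Putting the canonical URL in the sitemap empties the first and third, which is the fix suggested above.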
-
The purpose of the canonical tag is to tell Google which version of a page you would prefer it to index. So, on that note, I usually point the canonical tag at the strongest page in terms of PageRank, as this is the version that is linked to the best.
I'm also guessing you're using a framework/platform like Magento, which can make linking quite difficult. I often suggest creating Brand pages and linking to the product page, the "3rd URL", from there. Brand pages are also great for SEO, as most people search for brands first. They're a great place to get some fat head keywords in.
Also, make sure you put in the http:// as well; I think it is good practice to put the full URL in the canonical tag.
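To show why the full URL matters, here is a small sketch using standard relative-URL resolution from Python's `urllib.parse`. I'm assuming crawlers resolve a canonical href the same way they resolve any other link; `is_absolute` and `resolve_canonical` are just illustrative helper names.

```python
from urllib.parse import urljoin, urlparse


def is_absolute(href):
    """True when href carries both a scheme (http://) and a host."""
    parts = urlparse(href)
    return bool(parts.scheme and parts.netloc)


def resolve_canonical(page_url, href):
    """Resolve href against the page it appears on (standard URL joining)."""
    return urljoin(page_url, href)


PAGE = "http://www.example.com/clothes/skirts/skater-skirt-12345"

# Full URL: unambiguous, resolves to itself
print(is_absolute("http://www.example.com/skater-skirt-12345"))

# Root-relative path: resolves correctly, but depends on the page's host
print(resolve_canonical(PAGE, "/skater-skirt-12345"))

# Host without http:// is treated as a relative path and lands under
# /clothes/skirts/, which is not the intended canonical URL
print(is_absolute("www.example.com/skater-skirt-12345"))
print(resolve_canonical(PAGE, "www.example.com/skater-skirt-12345"))
```

The last case is the one to watch for: leaving off the scheme turns the canonical into a path nested under the current category.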