Canonical URLs and Sitemaps
-
We are using canonical link tags on product pages. The URLs used on the site contain category names, while the canonical URL does not. So the same product page appears at www.example.com/clothes/skirts/skater-skirt-12345 and, in another category, at www.example.com/sale/clearance/skater-skirt-12345. On both of these pages, the canonical link tag references a third URL, www.example.com/skater-skirt-12345. This third URL is a valid page and displays the same content as the other two versions, but nothing links to this generic version, either from within the site or from external sites.
Questions:
1. Does the generic URL referenced in the canonical link also need to appear as an on-page link somewhere in the crawled navigation of the site, or is it okay for it to be just a valid URL that is not linked anywhere except in the canonical tags?
2. In our sitemap, is it okay to reference the non-canonical URLs, or does the sitemap have to reference only the canonical URL? In our case, the sitemap points to yet another variation of the URL, like www.example.com/product.jsp?productID=12345. This page retrieves the same content as the others, and includes a canonical link tag back to www.example.com/skater-skirt-12345. Is this a valid approach, or should we revise the sitemap to point to either the category-specific links or the canonical links?
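To make the setup in the question concrete, here is a minimal Python sketch of how the category URLs map to the generic URL and what the resulting tag looks like. The function names are made up for illustration, the http:// scheme is assumed (the URLs in the question are written without one), and "keep only the last path segment" is an assumption about how the slug is derived, not a description of the actual platform. The product.jsp?productID=12345 variant would need a lookup by product ID instead and is not handled here.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_url(page_url: str) -> str:
    """Drop the category path segments, keeping only the final product slug.

    Assumption: the last path segment is the product slug.
    """
    parts = urlsplit(page_url)
    slug = parts.path.rstrip("/").rsplit("/", 1)[-1]
    return urlunsplit((parts.scheme, parts.netloc, "/" + slug, "", ""))


def canonical_tag(page_url: str) -> str:
    """Render the <link rel="canonical"> tag for a category-specific page URL."""
    href = escape(canonical_url(page_url), quote=True)
    return f'<link rel="canonical" href="{href}" />'


if __name__ == "__main__":
    for url in (
        "http://www.example.com/clothes/skirts/skater-skirt-12345",
        "http://www.example.com/sale/clearance/skater-skirt-12345",
    ):
        print(canonical_tag(url))
```

Both category URLs produce the same tag, which is the whole point: every variant declares the one generic URL.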
-
Thanks. And since we've now implemented the aforementioned changes, I can give some findings back.
What we did: We changed our sitemap to point to the same canonical URLs as are referenced in the tags on our product pages (only one entry in sitemap per product).
What we didn't do: We didn't change the product pages themselves. They still have a canonical URL link reference, pointing to a URL with no category paths, which does not naturally occur in the navigation of the site (on the site, product pages all have category paths in the URL).
Findings: After submitting the new sitemap, the stats in Google Webmaster Tools indicate that almost all (> 96%) of our product pages are indexed. We believe that the pages were already indexed (for the most part) and now the sitemap is useful for metrics. From the timing, it's unlikely that the sitemap itself caused our index stats to get significantly better in just 1 day. Possible, but unlikely. In either case, since our product pages still reference canonical URLs which don't exist in the site's navigation, the evidence suggests that the canonical link itself is enough, and an actual navigation path to the canonical version of the page is not needed. That's just empirical evidence; we have no inside info on Google's methods, but this is what we believe now after monitoring.
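For anyone wanting to make the same sitemap change (one entry per product, pointing at the canonical URL), a rough Python sketch follows. It uses only the standard library; `build_sitemap` is a hypothetical helper name, the http:// scheme is assumed, and the output contains only the required `<loc>` element of the sitemaps.org format, without an XML declaration or optional fields such as `<lastmod>`.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(canonical_urls):
    """Return sitemap XML with exactly one <url> entry per distinct canonical URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    seen = set()
    for url in canonical_urls:
        if url in seen:  # skip duplicates so each product appears once
            continue
        seen.add(url)
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap([
        "http://www.example.com/skater-skirt-12345",
        "http://www.example.com/skater-skirt-12345",  # same product reached via another category
    ]))
```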
-
With the canonical tag in place, I'm guessing that extra link would basically be ignored. It's probably harmless, but I'm not sure it will do anything. You could create an HTML "sitemap" (or even an XML sitemap) with the canonical URLs. It's not my first choice, but it at least would give Google an extra push.
-
We're in the process of updating our canonical tagging and our sitemap, based on the feedback here. I have a question for the group though. Unfortunately we can't follow Andy Smith's suggestion of creating a "By Brand" navigation section on the site, since this web site is all private label (they sell all products under their own brand name).
One possible solution is to create a user-accessible site map page, with an "all products" paginated section, where all the product page URLs would be the canonical versions.
But another possible solution, easier to implement, would be to have a user-accessible link on each product page to the canonical version of itself. That is, when the user is on www.example.com/clothes/skirts/skater-skirt-12345, there would be a link to www.example.com/skater-skirt-12345, which would also be the URL specified in the canonical tag.
This seems redundant, but our results so far suggest that a canonical tag pointing to a URL which doesn't really exist anywhere in the navigation isn't having the desired effect. So, the thought is that a combination of the canonical tag, plus a "real" link to that same URL referenced in the canonical tag, would better inform the search engine robots. Our hesitation is whether it would work for this link to be on the product page itself (i.e. the non-canonical version).
Any thoughts or feedback on approach?
-
Thanks for the responses. I've been monitoring for the past couple of weeks with the current sitemap and canonical structure, and so far the data seems consistent with the replies to this thread. In GWT, the sitemap stats show less than 1% of the URLs submitted are indexed so far. We have an action plan now to update the canonical structure and the sitemap to point to URLs which will be naturally crawled on the site as well.
-
There's no "have to" in most of these situations, but it boils down to this - the more canonical your canonical URL actually is, the better chance you have of Google honoring it. In other words, if you set a canonical tag but then never use that in internal links or your XML sitemap, odds are pretty good that Google may ignore the tag in some cases. You're basically saying "Hey, this URL is canonical! No, this one is! No, this one!" - it's a mixed message, and they're going to try to interpret it algorithmically.
I definitely think pointing to yet another version in the XML sitemap is a problem. Ideally, it would be great to unify your URLs, but if that's not possible, getting the canonical version in the sitemap would be a big help (and introducing yet another variant isn't good, so you'd kill two birds with one stone). As Andy said, if you could create some kind of internal link to the canonical version, even if it's not the main link, that could also help. I only hesitate on that one, because you don't want to end up with a weird, artificial linking structure (just creating links to have links).
Please note, this isn't necessarily a disaster the way you have it. Google could honor the tags properly and generally rank your site correctly. In my experience, though, it's a recipe for long-term problems, and it's worth fixing.
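The "mixed message" described above can be checked mechanically. Below is a minimal Python sketch that compares the canonical targets declared on pages against the URLs listed in the XML sitemap. Everything in it is an assumption for illustration: the function name, the input shape (a dict of page URL to declared canonical href, which you would have to collect from a crawl yourself), and the http:// scheme. It says nothing about how Google actually weighs these signals.

```python
def mixed_signals(page_canonicals, sitemap_urls):
    """Report disagreements between canonical tags and the sitemap.

    page_canonicals: dict mapping each crawled page URL to the canonical
                     href declared on that page (hypothetical input).
    sitemap_urls:    iterable of URLs listed in the XML sitemap.
    """
    canonical_targets = set(page_canonicals.values())
    listed = set(sitemap_urls)
    return {
        # Sitemap entries that no page declares as canonical.
        "sitemap_not_canonical": sorted(listed - canonical_targets),
        # Canonical targets that the sitemap never mentions.
        "canonical_not_in_sitemap": sorted(canonical_targets - listed),
    }


if __name__ == "__main__":
    canonical = "http://www.example.com/skater-skirt-12345"
    pages = {
        "http://www.example.com/clothes/skirts/skater-skirt-12345": canonical,
        "http://www.example.com/sale/clearance/skater-skirt-12345": canonical,
        "http://www.example.com/product.jsp?productID=12345": canonical,
    }
    sitemap = ["http://www.example.com/product.jsp?productID=12345"]
    print(mixed_signals(pages, sitemap))
```

With the setup from the original question, both lists come back non-empty: the sitemap lists the product.jsp variant, and the canonical URL appears nowhere in the sitemap. Once the sitemap lists only the canonical URLs, both lists are empty.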
-
The purpose of the canonical tag is to tell Google which version of a page you'd prefer it to index. So, on that note, I usually point the canonical tag at the strongest page in terms of PageRank, as that shows which page is linked to the best.
I'm also guessing you're using a framework/platform like Magento, which can make linking quite difficult. I often suggest creating Brand pages and linking to the product page, the "3rd URL", from there. Brand pages are also great for SEO, as most people search for brands first. Great place to get some fat head keywords in.
Also, make sure you put in the http:// as well; I think it is good practice to put in the full URL.
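On the full-URL point, a small Python sketch of how one might check that a canonical href is absolute, and resolve it against the page URL when it is not. The helper names are invented for illustration, and the resolution follows standard relative-URL rules as implemented by `urllib.parse.urljoin`; how any particular search engine treats a relative canonical href is not something this code can tell you.

```python
from urllib.parse import urljoin, urlsplit


def is_absolute(href: str) -> bool:
    """True when the href carries both a scheme (http://) and a host."""
    parts = urlsplit(href)
    return bool(parts.scheme and parts.netloc)


def absolute_canonical(page_url: str, href: str) -> str:
    """Resolve a canonical href against the page it appears on."""
    return urljoin(page_url, href)


if __name__ == "__main__":
    page = "http://www.example.com/clothes/skirts/skater-skirt-12345"
    for href in (
        "http://www.example.com/skater-skirt-12345",  # full URL
        "/skater-skirt-12345",                        # root-relative
        "www.example.com/skater-skirt-12345",         # scheme left out
    ):
        print(is_absolute(href), absolute_canonical(page, href))
```

The third case is the one to watch for: without the http://, the href is treated as a relative path and resolves underneath the category path rather than to the intended page.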