Best Practice Approaches to Canonicals vs. Indexing in Google Sitemap vs. No Follow Tags
-
Hi There,
I am working on the following website: https://wave.com.au/
I have become aware that there are different pages that are competing for the same keywords.
For example, I just started to update a core, category page - Anaesthetics (https://wave.com.au/job-specialties/anaesthetics/) to focus mainly around the keywords ‘Anaesthetist Jobs’.
But I have recognized that there are ongoing landing pages that contain pretty similar content:
We want to direct organic traffic to our core pages e.g. (https://wave.com.au/job-specialties/anaesthetics/).
This then leads me to have to deal with the duplicate pages with either a canonical link (content manageable) or maybe alternatively adding a no-follow tag or updating the robots.txt. Our resident developer also suggested that it might be good to use Google Index in the sitemap to tell Google that these are of less value?
What is the best approach? Should I add a canonical link to the landing pages pointing it to the category page? Or alternatively, should I use the Google Index? Or even another approach?
Any advice would be greatly appreciated.
Thanks!
-
This all sounds good. Just make sure that, before you proceed, you use GA to check what % of your SEO (segment: "Organic") traffic comes from these URLs. Don't act on a hunch, act on data!
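As a rough illustration of that check, here is a minimal Python sketch of the arithmetic. The session counts and the two `/landing/...` paths are made up for the example (they are not real figures for any site), and `organic_share` is a hypothetical helper name. In practice the numbers would come from an export of the Organic segment in GA.

```python
def organic_share(sessions_by_url, urls_of_interest):
    """Return the percentage of organic sessions that land on the given URLs."""
    total = sum(sessions_by_url.values())
    if total == 0:
        return 0.0
    selected = sum(sessions_by_url.get(url, 0) for url in urls_of_interest)
    return 100.0 * selected / total


# Made-up session counts, for illustration only (not real data for any site).
sessions = {
    "/job-specialties/anaesthetics/": 800,
    "/landing/example-a/": 150,
    "/landing/example-b/": 50,
}
landing_pages = ["/landing/example-a/", "/landing/example-b/"]

share = organic_share(sessions, landing_pages)
print(f"{share:.1f}% of organic sessions land on these URLs")
```

If the share turns out to be large, changing how those URLs are indexed carries more risk than if it is negligible.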
-
Thank you for the comprehensive response; it is greatly appreciated, my friend.
Yes, I agree. I have since read further and have completely ruled out blocking (robots.txt etc.) as an option.
I went back and read some more Moz/SEO articles and I think I have narrowed it down to either:
a) canonicals pointing from the landing pages to the core website category pages
b) NoIndex/Follow tags on the landing pages
Basically, I think the key contextual factors to keep in mind are that:
- The landing pages are basically just sent to people directly by our recruiters in emails and over the phone, so they are almost counted as direct traffic.
- It just contains a form and doesn't encourage click through into our core website beside logo etc. - we just want them to register directly on that page.
- Over the past year, the visits on the landing pages were much, much less, and the bounce rate and exit % was higher.
- my manager has told me to prioritise the SEO towards the core category pages as they see the landing pages as purely for UX/registrations/useful to internal business recruiting practices rather than encouraging organic traffic.
I think canonicals would probably work the best since in some cases the landing pages were ranking higher than the category pages and it should hopefully transfer a bit of ranking power to the category pages.
But perhaps you are right, and I can batch-apply canonicals, monitor the results and then progress.
Once again, thank you for your response.
-
First of all, keep in mind that Google has chosen the pages it is deciding to rank for one reason or another, and that canonical tags do not consolidate link equity (SEO authority) in the same way that 301 redirects do.
As such, it's possible that you could implement a very 'logical' canonical tag structure, but for whatever reason Google may not give your new 'canonical' URLs the same rankings which it ascribed to the old URLs. So there is a possibility here that you could lose some rankings! Google's acceptance of both the canonical tag and the 301 redirect depends upon the (machine-like) similarity of the content on both URLs.
Think of Boolean string similarity. You get two strings of text, whack them into a text similarity tool, and it tells you the 'percentage' of similarity between the two text strings. Google operate something similar yet infinitely more sophisticated. No one has told me that they do this; I have observed it over hundreds of site migration projects where sometimes Google gives the new site loads of SEO authority through the 301s and sometimes not much at all. For me, the two main causes of Google refusing to accept new canonical URLs are redirect chains (which could include soft redirect chains) and content 'dissimilarity'. Basically, content has won links and interactions on one URL which prove it is popular and authoritative. If you move that content somewhere else, or tell Google to go somewhere else instead, they have to be pretty certain that the new content is pretty much the same, otherwise it's a risk to them and an 'unknown quantity' in the SERPs (in terms of CTR and stuff).
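To make the 'percentage of similarity' idea concrete, here is a minimal Python sketch using the standard library's `difflib.SequenceMatcher`. This is only an illustration of comparing two blocks of page copy. It is not how Google measures similarity, and `similarity_percent` and the two sample strings are assumptions made up for the example.

```python
from difflib import SequenceMatcher


def similarity_percent(text_a, text_b):
    """Rough word-level similarity between two texts, as a percentage."""
    words_a = text_a.lower().split()
    words_b = text_b.lower().split()
    return 100.0 * SequenceMatcher(None, words_a, words_b).ratio()


# Placeholder copy standing in for a landing page and a category page.
landing_copy = "Register now for anaesthetist jobs across Australia"
category_copy = "Browse anaesthetist jobs across Australia and register today"

score = similarity_percent(landing_copy, category_copy)
print(f"Similarity: {score:.0f}%")
```

A low score between a landing page and the category page you want to canonicalise it to would be a hint that the two are not close enough for the canonical to be a safe bet.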
If you're pretty damn sure that you have loads of URLs which are essentially the same, read the same and reference the same prices for things (one isn't cheaper than the other), and that Google has really chosen the wrong page to rank in terms of Google-user click-through UX, then go ahead and lay out your canonical tag strategy.
Personally I'd pick sections of the site and do it one part at a time in isolation, so you can minimise losses from disturbing Google and also measure your efforts more effectively / efficiently.
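When rolling out one section at a time, it helps to confirm that each page in the batch really carries the tag you intended before you start measuring. The following is a minimal Python sketch of such a check using only the standard library. `HeadTagParser` and `check_page` are hypothetical names, the sample HTML is invented for the example, and fetching the live pages is left out.

```python
from html.parser import HTMLParser


class HeadTagParser(HTMLParser):
    """Collects the rel=canonical href and the robots meta content from HTML."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.robots = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel_values = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values:
            self.canonical = attrs.get("href")
        elif tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            self.robots = attrs.get("content")


def check_page(html, expected_canonical):
    """Report the canonical and robots tags found and whether the canonical matches."""
    parser = HeadTagParser()
    parser.feed(html)
    return {
        "canonical": parser.canonical,
        "robots": parser.robots,
        "canonical_ok": parser.canonical == expected_canonical,
    }


# Invented sample markup for a landing page pointing at the category page.
sample = (
    "<html><head>"
    '<link rel="canonical" href="https://wave.com.au/job-specialties/anaesthetics/">'
    "</head><body></body></html>"
)
result = check_page(sample, "https://wave.com.au/job-specialties/anaesthetics/")
print(result)
```

Running this over every landing page in a batch also surfaces any page that has picked up a stray robots meta tag, which would work against the canonical.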
If you no-index and robots-block URLs, it KILLS their SEO authority (dead) instead of moving it elsewhere (so steer clear of those except in extreme situations; they're really a last resort if you have the worst sprawling architecture imaginable). Canonical tags can shift ranking URLs and relevance, but don't pipe much authority. 301 redirects (if handled correctly) do all three things.
What you have to ask yourself is, if you flat out deleted the pages you don't want to rank (obviously you wouldn't do this, as it would cause internal UX issues on your site) - if you did that, would Google:
A) Rank the other pages in their place from your site, which you want Google to rank
B) Give up on you and just rank similar pages (to the ones you don't want to rank) from other, competing sites instead
If you think (A), take a measured, sectioned, small approach to canonical tag deployment and really test it before full roll-out. If you think (B), then you are admitting that there's something more Google-friendly on the pages you don't want to be ranking, and you just have to accept that no, your Google->conversion funnel will never be completely perfect like how you want it to be. You have to satisfy Google, not the other way around.
Hope that helps!