Skip to content
    Moz logo Menu open Menu close
    • Products
      • Moz Pro
      • Moz Pro Home
      • Moz Local
      • Moz Local Home
      • STAT
      • Moz API
      • Moz API Home
      • Compare SEO Products
      • Moz Data
    • Free SEO Tools
      • Domain Analysis
      • Keyword Explorer
      • Link Explorer
      • Competitive Research
      • MozBar
      • More Free SEO Tools
    • Learn SEO
      • Beginner's Guide to SEO
      • SEO Learning Center
      • Moz Academy
      • MozCon
      • Webinars, Whitepapers, & Guides
    • Blog
    • Why Moz
      • Digital Marketers
      • Agency Solutions
      • Enterprise Solutions
      • Small Business Solutions
      • The Moz Story
      • New Releases
    • Log in
    • Log out
    • Products
      • Moz Pro

        Your all-in-one suite of SEO essentials.

      • Moz Local

        Raise your local SEO visibility with complete local SEO management.

      • STAT

        SERP tracking and analytics for enterprise SEO experts.

      • Moz API

        Power your SEO with our index of over 44 trillion links.

      • Compare SEO Products

        See which Moz SEO solution best meets your business needs.

      • Moz Data

        Power your SEO strategy & AI models with custom data solutions.

      Track AI Overviews in Keyword Research
      Moz Pro

      Track AI Overviews in Keyword Research

      Try it free!
    • Free SEO Tools
      • Domain Analysis

        Get top competitive SEO metrics like DA, top pages and more.

      • Keyword Explorer

        Find traffic-driving keywords with our 1.25 billion+ keyword index.

      • Link Explorer

        Explore over 40 trillion links for powerful backlink data.

      • Competitive Research

        Uncover valuable insights on your organic search competitors.

      • MozBar

        See top SEO metrics for free as you browse the web.

      • More Free SEO Tools

        Explore all the free SEO tools Moz has to offer.

      NEW Keyword Suggestions by Topic
      Moz Pro

      NEW Keyword Suggestions by Topic

      Learn more
    • Learn SEO
      • Beginner's Guide to SEO

        The #1 most popular introduction to SEO, trusted by millions.

      • SEO Learning Center

        Broaden your knowledge with SEO resources for all skill levels.

      • On-Demand Webinars

        Learn modern SEO best practices from industry experts.

      • How-To Guides

        Step-by-step guides to search success from the authority on SEO.

      • Moz Academy

        Upskill and get certified with on-demand courses & certifications.

      • MozCon

        Save on Early Bird tickets and join us in London or New York City

      Unlock flexible pricing & new endpoints
      Moz API

      Unlock flexible pricing & new endpoints

      Find your plan
    • Blog
    • Why Moz
      • Digital Marketers

        Simplify SEO tasks to save time and grow your traffic.

      • Small Business Solutions

        Uncover insights to make smarter marketing decisions in less time.

      • Agency Solutions

        Earn & keep valuable clients with unparalleled data & insights.

      • Enterprise Solutions

        Gain a competitive edge in the ever-changing world of search.

      • The Moz Story

        Moz was the first & remains the most trusted SEO company.

      • New Releases

        Get the scoop on the latest and greatest from Moz.

      Surface actionable competitive intel
      New Feature

      Surface actionable competitive intel

      Learn More
    • Log in
      • Moz Pro
      • Moz Local
      • Moz Local Dashboard
      • Moz API
      • Moz API Dashboard
      • Moz Academy
    • Avatar
      • Moz Home
      • Notifications
      • Account & Billing
      • Manage Users
      • Community Profile
      • My Q&A
      • My Videos
      • Log Out

    The Moz Q&A Forum

    • Forum
    • Questions
    • Users
    • Ask the Community

    Welcome to the Q&A Forum

    Browse the forum for helpful insights and fresh discussions about all things SEO.

    1. Home
    2. SEO Tactics
    3. Intermediate & Advanced SEO
    4. Block in robots.txt instead of using canonical?

    Moz Q&A is closed.

    After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.

    Block in robots.txt instead of using canonical?

    Intermediate & Advanced SEO
    4
    9
    3006
    Loading More Posts
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as question
    Log in to reply
    This topic has been deleted. Only users with question management privileges can see it.
    • YairSpolter
      YairSpolter last edited by

      When I use a canonical tag for pages that are variations of the same page, it basically means that I don't want Google to index this page. But at the same time, spiders will go ahead and crawl the page. Isn't this a waste of my crawl budget? Wouldn't it be better to just disallow the page in robots.txt and let Google focus on crawling the pages that I do want indexed?

      In other words, why should I ever use rel=canonical as opposed to simply disallowing in robots.txt?

      1 Reply Last reply Reply Quote 0
      • RobertFisher
        RobertFisher @YairSpolter last edited by

        With this info, I would go with Robots.txt because, as you say, it outweighs any potential loss given the use of the pages and the absence of links.

        Thanks

        1 Reply Last reply Reply Quote 1
        • YairSpolter
          YairSpolter @RobertFisher last edited by

          Thanks Robert.

          The pages that I'm talking about disallowing do not have rank or links. They are sub-pages of a profile page. If anything, the main page will be linked to, not the sub-pages.

          Maybe I should have explained that I'm talking about a large site - around 400K pages. More than 1,000 new pages are created per  week. That's why I am concerned about managing crawl budget. The pages that I'm referring to are not linked to anywhere on the site. Sure, Google can potentially get to them if someone decides to link to them on their own site, but this is unlikely and certainly won't happen on a large scale. So I'm not really concerned about about losing pagerank on the main profile page if I disallow them. To be clear: we have many thousands of pages with content that we want to rank. The pages I'm talking about are not important in those terms.

          So it's really a question of balance... if these pages (there are MANY of them) are included in the crawl (and in our sitemap), potentially it's a real waste of crawl budget. Doesn't this outweigh the minuscule, far-fetched potential loss?

          I understand that Google designed rel=canonical for this scenario, but that does not mean that it's necessarily the best way to go considering the other options.

          RobertFisher 1 Reply Last reply Reply Quote 0
          • YairSpolter
            YairSpolter @TakeshiYoung last edited by

            Thanks Takeshi.

            Maybe I should have explained that I'm talking about a large site - around 400K pages. More than 1,000 new pages are created per  week. That's why I am concerned about managing crawl budget. The pages that I'm referring to are not linked to anywhere on the site. Sure, Google can potentially get to them if someone decides to link to them on their own site, but this is unlikely (since it's a sub-page of the main profile page, which is where people would naturally link to) and certainly won't happen on a large scale. So I'm not really concerned about about link-juice evaporation. According to AJ Kohn here, it's not enough to see in Webmaster Tools that Google has indexed all pages on our site. There is also the issue of how often pages are being crawled, which is what we are trying to optimize for.

            So it's really a question of balance... if these pages (there are MANY of them) are included in the crawl (and in our sitemap), potentially it's a real waste of crawl budget. Doesn't this outweigh the minuscule, far-fetched potential loss?

            Would love to hear your thoughts...

            1 Reply Last reply Reply Quote 0
            • TakeshiYoung
              TakeshiYoung last edited by

              I would go with the canonicals. If there are any links going to these duplicate pages, that will prevent any "link juice evaporation" from links which Google can see but can't crawl due to robots.txt. Best to let Google just crawl the page and see the canonical so that it understands that it is a duplicate page.

              Having canonicals on all your pages is good practice anyway, as it can prevent inadvertent duplicate content from things like query parameters.

              Crawl budget can be of some concern if you're talking about a massive number of pages, but start by first taking a look at Google Webmaster Tools and seeing how many of your pages are being crawled vs the total number of pages on your site. As long as this ration isn't small, you should be good. You can also get more crawl budget by building up your domain authority by building links.

              YairSpolter 1 Reply Last reply Reply Quote 0
              • RobertFisher
                RobertFisher @YairSpolter last edited by

                I don't disagree at all and I think AJ Kohn is a rock star. In SEO, I have learned over time that there are rarely absolutes like always do this or never do that. I based my answer on how you posited the question.

                If you read AJ's post you will note that the rel=canonical issue comes up with others commenting and not in the body of his post. Yes, if the page is superfluous like a cart page or a contact page, use the robots.txt to block the crawl. But, if you have a page with rank, links, etc. that help your canonical page, how are you helping yourself by forgoing rel=canon?

                I think his bigger point was that you want to be aware and to understand that the # of times you are crawled is at least partially governed by PR which is governed by all those other things we discussed. If you understand that and keep the crawl focused on better pages you help yourself.

                Does that clarify a bit?
                Best

                YairSpolter 1 Reply Last reply Reply Quote 1
                • Devanur-Rafi
                  Devanur-Rafi @YairSpolter last edited by

                  Hi, even if you use robots.txt file to block these pages, Google can still pick the references of these pages from third-party websites and can crawl from there. Such pages will not have the description snippet in the search results and instead will show text that reads:

                  A description of this result is not available because of this site's robots.txt.

                  So, to fully stop Google from crawling these pages, you can go in for the page-level meta robots tag along with the robots.txt method. The page-level robots meta tag complements robots.txt method.By the way, robots.txt file can definitely save you some crawl budget. I don't think you should be thinking much about crawl budget though, as long as your website is super-easy to crawl with simple text-based internal links and stuff like, super-fast servers etc.,

                  Those my my two cents my friend.

                  Best regards,

                  Devanur Rafi

                  1 Reply Last reply Reply Quote 1
                  • YairSpolter
                    YairSpolter @RobertFisher last edited by

                    Thanks for the response, Robert.

                    I have read lots of SEO advice on maximizing your "crawl budget" - making sure your internal link system is built well to send the bots to the right pages. According to my research, since bots only spend a certain amount of time on your site when they are crawling, it is important to do whatever you can to ensure that they don't "waste time" on pages that are not important for SEO. Just as one example, see this post from AJ Kohn.

                    Do you disagree with this whole approach?

                    Devanur-Rafi RobertFisher 2 Replies Last reply Reply Quote 0
                    • RobertFisher
                      RobertFisher last edited by

                      Yair

                      I think that the canonical is the better option. I am unsure as to your use of the term "crawl budget," in that there is no fixed number of times a page or a site will be crawled versus a second similar site for example. I have a huge reference site that is crawled every couple of days and I have small sites of ten pages that are crawled weekly or less. It is dependent on the traffic and behaviors of that traffic (which would include number of inbound links, etc.) and on things like you re-submitting sitemap, etc. 
                      The canonical tag was created to provide the clarification to the search engine as to what you considered to be the relevant page. Go ahead and use it.

                      Best

                      Robert

                      YairSpolter 1 Reply Last reply Reply Quote 1
                      • 1 / 1
                      • First post
                        Last post

                      Got a burning SEO question?

                      Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.


                      Start my free trial


                      Browse Questions

                      Explore more categories

                      • Moz Tools

                        Chat with the community about the Moz tools.

                      • SEO Tactics

                        Discuss the SEO process with fellow marketers

                      • Community

                        Discuss industry events, jobs, and news!

                      • Digital Marketing

                        Chat about tactics outside of SEO

                      • Research & Trends

                        Dive into research and trends in the search industry.

                      • Support

                        Connect on product support and feature requests.

                      • See all categories

                      Related Questions

                      • seoanalytics

                        Homepage appearing instead of subpage

                        Hi, I have my homepage which has links saying "bike tours and bike tours in France" because in the past I was only doing bike tours in France. I now do tours all over Europe and I have a page about "bike tours in Franc"  only. The issue I have is that my page about "bike tours in France" never appears in the search ranking it is always my homepage that does for tjhe keyword "bike tours in France". My guess is that it is due to the links that my homepage has ? How could I make sure my France page appears instead of my homepage ? Thank you,

                        Intermediate & Advanced SEO | | seoanalytics
                        0
                      • ostesmorbrod

                        Landing pages for paid traffic and the use of noindex vs canonical

                        A client of mine has a lot of differentiated landing pages with only a few changes on each, but with the same intent and goal as the generic version. The generic version of the landing page  is included in navigation, sitemap and is indexed on Google. The purpose of the differentiated landing pages is to include the city and some minor changes in the text/imagery to best fit the Adwords text. Other than that, the intent and purpose of the pages are the same as the main / generic page. They are not to be indexed, nor am I trying to have hidden pages linking to the generic and indexed one (I'm not going the blackhat way). So – I want to avoid that the duplicate landing pages are being indexed (obviously), but I'm not sure if I should use noindex (nofollow as well?) or rel=canonical, since these landing pages are localized campaign versions of the generic page with more or less only paid traffic to them. I don't want to be accidentally penalized, but I still need the generic / main page to rank as high as possible... What would be your recommendation on this issue?

                        Intermediate & Advanced SEO | | ostesmorbrod
                        0
                      • mellison

                        Should I use the on classified listing pages that have expired?

                        We have went back and forth on this and wanted to get some outside input.  I work for an online listing website that has classified ads on it.  These ads are generated by companies on our site advertising weekend events around the country. We have about 10,000 companies that use our service to generate their online ads. This means that we have thousands of pages being created each week. The ads have lots of content: pictures, sale descriptions, and company information. After the ads have expired, and the sale is no longer happening, we are currently placing the in the heads of each page. The content is not relative anymore since the ad has ended. The only value the content offers a searcher is the images (there are millions on expired ads) and the descriptions of the items for sale.  We currently are the leader in our industry and control most of the top spots on Google for our keywords. We have been worried about cluttering up the search results with pages of ads that are expired.  In our Moz account right now we currently have over 28k crawler warnings alerting us to the  being in the page heads of the expired ads. Seeing those warnings have made us nervous and second guessing what we are doing. Does anybody have any thoughts on this? Should we continue with placing the  in the heads of the expired ads, or should we be allowing search engines to index the old pages.  I have seen websites with discontinued products keeping the products around so that individuals can look up past information. This is the closest thing have seen to our situation. Any help or insight would be greatly appreciated! -Matt

                        Intermediate & Advanced SEO | | mellison
                        0
                      • SEO_Promenade

                        When is it recommended to use a self referencing rel "canonical"?

                        In what type of a situation is it the best type of practice to use a self referencing rel "canonical" tag? Are there particular practices to be cautious of when using a self referencing  rel "canonical" tag? I see this practice used mainly with larger websites but I can't find any information that really explains when is a good time to make use of this practice for SEO purposes. Appreciate all feedback. Thank you in advance.

                        Intermediate & Advanced SEO | | SEO_Promenade
                        0
                      • Netrepid

                        Do I need to use rel="canonical" on pages with no external links?

                        I know having rel="canonical" for each page on my website is not a bad practice... but how necessary is it for pages that don't have any external links pointing to them? I have my own opinions on this, to be fair - but I'd love to get a consensus before I start trying to customize which URLs have/don't have it included. Thank you.

                        Intermediate & Advanced SEO | | Netrepid
                        0
                      • HaCos

                        Use of subdomains, subdirectories or both?

                        Hello, i would like your advice on a dilemma i am facing. I am working a new project that is going to release soon,  thats a network of users with personal profiles seperated in categories for example lets say the categories are colors. So let say i am a member and i belong in red color categorie and i got a page where i update my personal information/cv/resume as well as a personal blog thats on that page. So the main site is giving the option to user to search for members by the criteria of color. My first idea is that all users should own a subdomain (and this is how its developed so far) thats easy to use and since the domain name is really small (just 3 letters) i believe subdomain worth since personal site will be easy to remember. My dilemma is should all users own a subdomain, a subdirectory or both and if both witch one should be the canonical? Since it said that search engines treat subdomains as different stand-alone sites, whats best for the main site? to show multiple search results with profiles in subdomains or subdirectories? What if i use both? meaning in search results i use search directory url for each profile while same time each profile owns a subdomains as well? and if so which one should be the canonical? Thanks in advance, C

                        Intermediate & Advanced SEO | | HaCos
                        0
                      • BrooklynCruiser

                        When using ALT tags - are spaces, hyphens or underscores preferred by Google when using multiple words?

                        when plugging ALT tags into images, does Google prefer spaces, hyphens, or underscores?  I know with filenames, hyphens or underscores are preferred and spaces are replaced with %20. Thoughts? Thanks!

                        Intermediate & Advanced SEO | | BrooklynCruiser
                        3
                      • nicole.healthline

                        Robots.txt & url removal vs. noindex, follow?

                        When de-indexing pages from google, what are the pros & cons of each of the below two options: robots.txt & requesting url removal from google webmasters Use the noindex, follow meta tag on all doctor profile pages Keep the URLs in the Sitemap file so that Google will recrawl them and find the noindex meta tag make sure that they're not disallowed by the robots.txt file

                        Intermediate & Advanced SEO | | nicole.healthline
                        0

                      Get started with Moz Pro!

                      Unlock the power of advanced SEO tools and data-driven insights.

                      Start my free trial
                      Products
                      • Moz Pro
                      • Moz Local
                      • Moz API
                      • Moz Data
                      • STAT
                      • Product Updates
                      Moz Solutions
                      • SMB Solutions
                      • Agency Solutions
                      • Enterprise Solutions
                      • Digital Marketers
                      Free SEO Tools
                      • Domain Authority Checker
                      • Link Explorer
                      • Keyword Explorer
                      • Competitive Research
                      • Brand Authority Checker
                      • Local Citation Checker
                      • MozBar Extension
                      • MozCast
                      Resources
                      • Blog
                      • SEO Learning Center
                      • Help Hub
                      • Beginner's Guide to SEO
                      • How-to Guides
                      • Moz Academy
                      • API Docs
                      About Moz
                      • About
                      • Team
                      • Careers
                      • Contact
                      Why Moz
                      • Case Studies
                      • Testimonials
                      Get Involved
                      • Become an Affiliate
                      • MozCon
                      • Webinars
                      • Practical Marketer Series
                      • MozPod
                      Connect with us

                      Contact the Help team

                      Join our newsletter
                      Moz logo
                      © 2021 - 2025 SEOMoz, Inc., a Ziff Davis company. All rights reserved. Moz is a registered trademark of SEOMoz, Inc.
                      • Accessibility
                      • Terms of Use
                      • Privacy

                      Looks like your connection to Moz was lost, please wait while we try to reconnect.