Rogerbot directives in robots.txt
-
I feel like I spend a lot of time setting false positives in my reports to ignore.
Can I prevent Rogerbot from crawling pages I don't care about with robots.txt directives? For example, I have some page types with meta noindex, and it reports these to me. Theoretically, I can block Rogerbot from these with a robots.txt directive and not have to deal with false positives.
-
Yes, you can definitely use the robots.txt file to prevent Rogerbot from crawling pages that you don’t want to include in your reports. This approach can help you manage and minimize false positives effectively.
To block specific pages or directories from being crawled, you would add directives to your robots.txt file. For example, if you have certain page types that you’ve already set with meta noindex, you can specify rules like this:
User-agent: Rogerbot
Disallow: /path-to-unwanted-page/
Disallow: /another-unwanted-directory/
This tells Rogerbot not to crawl the specified paths, which should reduce the number of irrelevant entries in your reports.
However, keep in mind that a rule addressed to Rogerbot only affects Moz's crawler. It does not change how search engines or other bots treat these pages, so they can still be crawled, indexed, and shown in search results.
For that reason, keep the meta noindex tags on pages that shouldn't appear in search results. Because the rule above targets only Rogerbot, search engine crawlers can still fetch those pages and see the noindex tag. Combining both methods, robots.txt to keep Rogerbot away and noindex to control indexing, covers both concerns.
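If it helps to see that a Rogerbot-only group leaves other crawlers alone, here is a minimal sketch using Python's standard-library urllib.robotparser. The domain and paths are placeholders, and Python's parser is only a stand-in: it is an assumption that Rogerbot reads these simple prefix rules the same way.

```python
from urllib.robotparser import RobotFileParser

# Rules from the example above; the paths are placeholders.
ROBOTS_TXT = """\
User-agent: Rogerbot
Disallow: /path-to-unwanted-page/
Disallow: /another-unwanted-directory/
"""

BASE = "https://www.example.com"  # hypothetical site


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if `user_agent` may fetch `url` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Rogerbot on a disallowed path, Rogerbot elsewhere, and another crawler.
    print(is_allowed("rogerbot", BASE + "/path-to-unwanted-page/item"))
    print(is_allowed("rogerbot", BASE + "/blog/"))
    print(is_allowed("Googlebot", BASE + "/path-to-unwanted-page/item"))
```

With this parser, only the first check should come back False. The Googlebot check stays True, which is why the noindex tag is still visible to search engines.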
-
Never mind, I found this. https://moz.com/help/moz-procedures/crawlers/rogerbot
-
@awilliams_kingston
Yes, you can use robots.txt directives to prevent Rogerbot from crawling certain pages or sections of your site, which can help reduce the number of false positives in your reports. By doing so, you can focus Rogerbot’s attention on the parts of your site that matter more to you and avoid reporting issues on pages you don't care about.
Here’s a basic outline of how you can use robots.txt to block Rogerbot:
Locate or Create Your robots.txt File: This file should be placed in the root directory of your website (e.g., https://www.yourwebsite.com/robots.txt).
Add Directives to Block Rogerbot: You’ll need to specify the user-agent for Rogerbot and define which pages or directories to block. The User-agent directive specifies which web crawlers the rules apply to, and Disallow directives specify the URLs or directories to block.
Here’s an example of what your robots.txt file might look like if you want to block Rogerbot from crawling certain pages:
User-agent: Rogerbot
Disallow: /path-to-block/
Disallow: /another-path/
If you want to block Rogerbot from accessing pages with certain parameters or patterns, you can use wildcards:
User-agent: Rogerbot
Disallow: /path-to-block/*
Disallow: /another-path/?parameter=
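To get a feel for what patterns like these match, here is a rough sketch of wildcard matching as described in the Robots Exclusion Protocol (RFC 9309): `*` matches any run of characters and a trailing `$` anchors the end of the URL. The helper name `rule_matches` is made up for illustration, and it is an assumption that Rogerbot interprets wildcards exactly this way, so check Moz's documentation before relying on it.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Check a robots.txt path pattern against a URL path (including query).

    Rules are prefix matches; "*" matches any sequence of characters and a
    trailing "$" requires the URL to end where the pattern ends.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with ".*" for each "*".
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start of the path, giving prefix semantics.
    return re.match(regex, path) is not None


if __name__ == "__main__":
    print(rule_matches("/path-to-block/*", "/path-to-block/page.html"))
    print(rule_matches("/another-path/?parameter=", "/another-path/page?parameter=1"))
    print(rule_matches("/another-path/*?parameter=", "/another-path/page?parameter=1"))
```

Two things stand out under these semantics. A trailing `*` adds nothing, because `/path-to-block/` already matches everything beneath it. And `/another-path/?parameter=` contains no wildcard, so it only matches URLs that begin with that exact string; to catch the parameter on any page under the directory you would need something like `/another-path/*?parameter=`.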
Verify the Changes: After updating the robots.txt file, you can use tools like Google Search Console or other site analysis tools to check if the directives are being applied as expected.
Monitor and Adjust: Keep an eye on your reports and site performance to ensure that blocking these pages is achieving the desired effect without inadvertently blocking important pages.
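One way to catch an overly broad rule before it reaches your reports is a small audit script. This is only a sketch: the function name `audit`, the URLs, and the rule are made up for illustration, and it uses Python's urllib.robotparser, which handles plain prefix rules but not wildcards and may not match Rogerbot's behavior in every case.

```python
from urllib.robotparser import RobotFileParser


def audit(robots_txt, should_block, should_allow, user_agent="rogerbot"):
    """Return a list of URLs whose crawl status differs from what you expect."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    problems = []
    for url in should_block:
        if parser.can_fetch(user_agent, url):
            problems.append(f"still crawlable: {url}")
    for url in should_allow:
        if not parser.can_fetch(user_agent, url):
            problems.append(f"unexpectedly blocked: {url}")
    return problems


# Hypothetical rule missing its trailing slash, so it also covers
# any other path that happens to start with the same characters.
RULES = """\
User-agent: Rogerbot
Disallow: /path-to-block
"""

if __name__ == "__main__":
    issues = audit(
        RULES,
        should_block=["https://www.example.com/path-to-block/page"],
        should_allow=["https://www.example.com/path-to-block-party/"],
    )
    for issue in issues:
        print(issue)
```

Here the rule without a trailing slash also catches `/path-to-block-party/`, which the audit reports as unexpectedly blocked. To run the same check against your live file, `RobotFileParser` can load it with `set_url()` and `read()` instead of `parse()`.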
By doing this, you should be able to reduce the number of irrelevant or false positive issues reported by Rogerbot and make your reporting more focused and useful.