Rogerbot directives in robots.txt
-
I feel like I spend a lot of time marking false positives in my reports as ignored.
Can I prevent Rogerbot from crawling pages I don't care about with robots.txt directives? For example, I have some page types with meta noindex, and it reports these to me. In theory, I could block Rogerbot from these with a robots.txt directive and not have to deal with the false positives.
-
Yes, you can definitely use the robots.txt file to prevent Rogerbot from crawling pages that you don’t want to include in your reports. This approach can help you manage and minimize false positives effectively.
To block specific pages or directories from being crawled, you would add directives to your robots.txt file. For example, if you have certain page types that you’ve already set with meta noindex, you can specify rules like this:
User-agent: Rogerbot
Disallow: /path-to-unwanted-page/
Disallow: /another-unwanted-directory/
This tells Rogerbot not to crawl the specified paths, which should reduce the number of irrelevant entries in your reports.
However, keep in mind that while robots.txt directives can prevent crawling, they do not guarantee that these pages won't show up in search results if they are linked from other sites or indexed by different bots.
Additionally, using meta noindex tags is still a good practice for pages that may occasionally be crawled but shouldn’t appear in search results. Combining both methods—robots.txt for crawling and noindex for indexing—provides a robust solution to manage your web presence more effectively.
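For example, a combined setup might look something like the following (the path is just a placeholder reused from the example above). In robots.txt:
User-agent: Rogerbot
Disallow: /path-to-unwanted-page/
And in the <head> of the pages themselves:
<meta name="robots" content="noindex">
One caveat: a crawler that is blocked by robots.txt can't fetch the page at all, so it never sees the noindex tag on it; decide per URL which of the two signals you actually need.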
-
Never mind, I found this. https://moz.com/help/moz-procedures/crawlers/rogerbot
-
@awilliams_kingston
Yes, you can use robots.txt directives to prevent Rogerbot from crawling certain pages or sections of your site, which can help reduce the number of false positives in your reports. By doing so, you can focus Rogerbot's attention on the parts of your site that matter most to you and avoid reporting issues on pages you don't care about. Here's a basic outline of how to use robots.txt to block Rogerbot:
Locate or Create Your robots.txt File: This file should be placed in the root directory of your website (e.g., https://www.yourwebsite.com/robots.txt).
Add Directives to Block Rogerbot: You’ll need to specify the user-agent for Rogerbot and define which pages or directories to block. The User-agent directive specifies which web crawlers the rules apply to, and Disallow directives specify the URLs or directories to block.
Here’s an example of what your robots.txt file might look like if you want to block Rogerbot from crawling certain pages:
User-agent: Rogerbot
Disallow: /path-to-block/
Disallow: /another-path/
If you want to block Rogerbot from accessing pages with certain parameters or patterns, you can use wildcards:
User-agent: Rogerbot
Disallow: /path-to-block/*
Disallow: /another-path/?parameter=
Verify the Changes: After updating the robots.txt file, you can use tools like Google Search Console or other site analysis tools to check that the directives are being applied as expected.
Monitor and Adjust: Keep an eye on your reports and site performance to ensure that blocking these pages is achieving the desired effect without inadvertently blocking important pages.
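If you'd rather spot-check the rules programmatically, Python's built-in urllib.robotparser can fetch your live robots.txt and report whether a given URL is allowed for a given user-agent. This is only a rough sketch (the domain and paths are placeholders), and the standard-library parser doesn't fully support wildcard patterns, so it's most reliable for plain path rules:

from urllib import robotparser

# Sketch: load a live robots.txt and test a few URLs against the rogerbot rules.
# The domain and paths below are placeholders -- swap in your own.
rp = robotparser.RobotFileParser()
rp.set_url("https://www.yourwebsite.com/robots.txt")
rp.read()

test_urls = [
    "https://www.yourwebsite.com/path-to-block/page.html",
    "https://www.yourwebsite.com/important-page/",
]

for url in test_urls:
    status = "allowed" if rp.can_fetch("rogerbot", url) else "blocked"
    print(f"{status}: {url}")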
By doing this, you should be able to reduce the number of irrelevant or false positive issues reported by Rogerbot and make your reporting more focused and useful.
Related Questions
-
Unsolved 503 Service Unavailable (temporary?) Rogerbot takes a break
A lot of my Moz duties seem to be setting hundreds of issues to ignore because my site was getting crawled while under maintenance. Why can't Rogerbot take a break after running into a few of these and then try again later? Is there an official code for temporary service unavailability that smart bots can use to pause crawls, so that they are not wasting compute, bandwidth, crawl budget, and my time?
Product Support | awilliams_kingston
Unsolved Google Analytics (GA4) recommendations for SEO analysis?
Guides on Moz and elsewhere mostly refer to Google Analytics' Universal Analytics (UA). However, UA is being replaced with GA4, and the interface, options, and reporting are very different. Can you recommend a clear, thorough, and effective walkthrough of how to set up useful SEO reports in GA4? Is there a simple tool you recommend that will help connect historical data from UA to GA4 when GA4 is the only option available? If there's no simple tool, what values do you recommend retaining from UA for effective historical reporting? How would you use them? At minimum for reporting, I'd want to show month-to-month changes and year-to-year changes (in percentages and in real numbers) for the following: all site visits; all organic visits; organic visits as a percentage of all site visits; organic visits that led to a specific goal completion; organic visits that led to any goal completion. Thanks in advance for your help!
Reporting & Analytics | Kevin_P
Increase in Direct Traffic plus Bounce Rate rise for all traffic sources
Hello, I work for an agency and we have seen a big rise in bounce rate for four of our clients, which happened on the exact same day. The rise in bounce rate is across all traffic sources. We are also seeing a big increase in direct traffic, starting on the same day. Is it possible for bot traffic to affect the bounce rate of all other traffic sources? We have ruled out double reporting in GA but can't explain how the bounce rate has increased for all traffic sources. How is this linked to the rise in direct traffic (in some cases as high as 500%)? Thanks
Reporting & Analytics | jenallen
Direct traffic spam on Google Analytics: how can you identify and filter it?
One of my smaller clients noticed a huge jump in direct traffic visits last month. The bounce rate was around 97% so I'm pretty certain that most of the traffic was illegitimate. I know how to filter out spam referrals and organic keywords in Google Analytics. However I'm not sure what to do about direct traffic spam. Are there recommendations for filtering this out? Can I identify spam IP addresses?
Reporting & Analytics | RosemaryB
Organic and direct traffic swap
We moved to a CMS (Webhook) and when we did, organic traffic and direct traffic swapped places. Since the move, organic traffic is down by about 400 visits and direct traffic is up by 400 visits. I went through the list below and confirmed everything is working. The HTTP referrer wasn't being passed for a couple of weeks, but that issue was resolved and the organic traffic issue is still ongoing. Is there anything else that may cause this? I confirmed the issue isn't one of the problems below: during an HTTP to HTTPS redirect (or vice versa) the referrer may not be passed; incorrect subdomain or cross-domain tracking can strip the referrer; 302 redirects sometimes cause the referrer to be dropped; cookies being lost or corrupted; JavaScript missing from certain entry pages (meaning any further page view looks like a direct visit).
Reporting & Analytics | BT2009
Universal Analytics: Why does Google Organic appear as Direct traffic?
Hi there, When I enter the site via Google Search and follow myself via Real-Time Analytics, I appear as an organic visitor (which is good). When I browse and visit the site, I am still an organic visitor. However, as soon as I fill in the contact form (Gravity Forms) and land on the "thank you" page, I appear as a direct visitor with Google as the source. Since I have the thank you page set up as a goal, Analytics incorrectly attributes these conversions to the direct medium instead of the organic medium. The tracking code has been installed on all the pages and all conversions are being recorded. What is going on?
Reporting & Analytics | Robbern
Huge Spike in Direct Traffic from IE7
Our site is seeing a huge spike in direct (none) traffic from IE 7 from July 8, 2014, onward. June 25 - July 7 showed 21 direct visits from IE 7; July 8 - July 20 is showing 5,889 (an increase of 27,943%). All traffic from the spike is going to our homepage. Other Google Analytics stats for this direct (none) IE 7 traffic: bounce rate 99.52%, avg. session duration 0:02, pages/session 1.01, mostly all new users. What's strange is that the traffic is from a variety of cities and networks. What could be causing this? Has anyone experienced this before?
Reporting & Analytics | SJVC_Susie