Rogerbot directives in robots.txt
-
I feel like I spend a lot of time setting false positives in my reports to ignore.
Can I prevent Rogerbot from crawling pages I don't care about with robots.txt directives? For example., I have some page types with meta noindex and it reports these to me. Theoretically, I can block Rogerbot from these with a robots,txt directive and not have to deal with false positives.
-
Yes, you can definitely use the robots.txt file to prevent Rogerbot from crawling pages that you don’t want to include in your reports. This approach can help you manage and minimize false positives effectively.
To block specific pages or directories from being crawled, you would add directives to your robots.txt file. For example, if you have certain page types that you’ve already set with meta noindex, you can specify rules like this:
User-agent: Rogerbot Disallow: /path-to-unwanted-page/ Disallow: /another-unwanted-directory/
This tells Rogerbot not to crawl the specified paths, which should reduce the number of irrelevant entries in your reports.
However, keep in mind that while robots.txt directives can prevent crawling, they do not guarantee that these pages won't show up in search results if they are linked from other sites or indexed by different bots.
Additionally, using meta noindex tags is still a good practice for pages that may occasionally be crawled but shouldn’t appear in search results. Combining both methods—robots.txt for crawling and noindex for indexing—provides a robust solution to manage your web presence more effectively.
-
Never mind, I found this. https://moz.com/help/moz-procedures/crawlers/rogerbot
-
@awilliams_kingston
Yes, you can use robots.txt directives to prevent Rogerbot from crawling certain pages or sections of your site, which can help reduce the number of false positives in your reports. By doing so, you can focus Rogerbot’s attention on the parts of your site that matter more to you and avoid reporting issues on pages you don't care about.Here’s a basic outline of how you can use robots.txt to block Rogerbot:
Locate or Create Your robots.txt File: This file should be placed in the root directory of your website (e.g., https://www.yourwebsite.com/robots.txt).
Add Directives to Block Rogerbot: You’ll need to specify the user-agent for Rogerbot and define which pages or directories to block. The User-agent directive specifies which web crawlers the rules apply to, and Disallow directives specify the URLs or directories to block.
Here’s an example of what your robots.txt file might look like if you want to block Rogerbot from crawling certain pages:
javascript
Disallow: /path-to-block/
Disallow: /another-path/
If you want to block Rogerbot from accessing pages with certain parameters or patterns, you can use wildcards:javascript
Disallow: /path-to-block/*
Disallow: /another-path/?parameter=
Verify the Changes: After updating the robots.txt file, you can use tools like Google Search Console or other site analysis tools to check if the directives are being applied as expected.Monitor and Adjust: Keep an eye on your reports and site performance to ensure that blocking these pages is achieving the desired effect without inadvertently blocking important pages.
By doing this, you should be able to reduce the number of irrelevant or false positive issues reported by Rogerbot and make your reporting more focused and useful.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Unsolved 503 Service Unavailable (temporary?) Rogerbot takes a break
A lot of my Moz duties seem to be setting hundreds of issues to ignore because my site was getting crawled while under maintenance. Why can't Rogerbot take a break after running into a few of these and then try again later? Is there an official code for Temporary Service Unavailability that can smart bots pause crawls so that they are not wasting compute, bandwidth, crawl budget and my time?
Product Support | | awilliams_kingston0 -
Increase in Direct Traffic plus Bounce Rate rise for all traffic sources
Hello, I work for an agency and we have seen a big rise in bounce rate for 4 of our clients which happened on the exact same day. This rise on bounce rate is across all traffic sources. We are also seeing a big increase in direct traffic, starting on the same day. Is it possible for bot traffic to affect the bounce rate of all other traffic sources? We have ruled out double reporting in GA but can explain how the bounce rate has increased for all traffic sources. How is this linked to the rise in direct traffic (in some cases as high as 500%)? Thanks
Reporting & Analytics | | jenallen0 -
Unsolved Building a report that provides all ranking keywords to specific pages on my website
I'd like to build a report that provides me all of the ranking keywords (tracked and un-tracked) for about 100 specific webpages on the rockhurst.edu site. I can pull the keywords from Keyword Explorer using the exact page url, but I don't want to have to do that individually for all 100 or so pages.
Keyword Explorer | | Dave_Hunt0 -
Google Analytics - Referral traffic being reported as Direct traffic
Hi, I have a link on my www homepage to another subdomain website. Both websites are served on https, and have independent Google Analytics properties. However, the traffic from the www site to the subdomain site is being reported as Direct, and not Referral traffic. Any ideas why? See image provided for context... RCn6HER
Reporting & Analytics | | RWesley1 -
Direct traffic spam on Google Analytics: how can you identify and filter it?
One of my smaller clients noticed a huge jump in direct traffic visits last month. The bounce rate was around 97% so I'm pretty certain that most of the traffic was illegitimate. I know how to filter out spam referrals and organic keywords in Google Analytics. However I'm not sure what to do about direct traffic spam. Are there recommendations for filtering this out? Can I identify spam IP addresses?
Reporting & Analytics | | RosemaryB0 -
Longevity of robot.txt files on Google rankings
This may be a difficult question to answer without a ton more information, but I'm curious if there's any general thought that could shed some light on the following scenario I've recently heard about and wish to be able to offer some sound advice: An extremely reputable non-profit site with excellent ranking had gone through a re-design and change-over into WordPress. A robots.txt file was used during development on the dev site on the dev server. Two months later it was noticed through GA that traffic was way down to the site. It was then discovered that the robot.txt file hadn't been removed and the new site (same content, same nav) went live with it in place. It was removed and a site index forced. How long might it take for the site to re-appear and regain past standing in the SERPs if rankings have been damaged. What would the expected recovery time be?
Reporting & Analytics | | gfiedel0 -
Get a list of robots.txt blocked URL and tell Google to crawl and index it.
Some of my key pages got blocked by robots.txt file and I have made required changes in robots.txt file but how can I get the blocked URL's list. My webmaster page Health>blocked URL's shows only number not the blocked URL's.My first question is from where can I fetch these blocked URL's and how can I get them back in searches, One other interesting point I see is that blocked pages are still showing up in searches.Title is appearing fine but Description shows blocked by robots.txt file. I need urgent recommendation as I do not want to see drop in my traffic any more.
Reporting & Analytics | | csfarnsworth0 -
Sudden drop in direct visit?
I'm having a sudden drop in direct visit on 12th January by 50% and never go up again since then. I dont change anything in my website, just regular posting. Could someone give me any idea about this? Thank you
Reporting & Analytics | | apps-foundry0