How to Block Rogerbot from Crawling UTM URLs
-
I am trying to block Rogerbot from crawling some UTM URLs we have created, but I'm having no luck. My robots.txt file looks like:
User-agent: rogerbot
Disallow: /?utm_source*
This does not seem to be working. Any ideas?
-
Shoot! There may be something else going on. Give us a shout at help@moz.com and we'll see if we can figure it out!
-
FYI - I tried this and it did not work. Rogerbot is still picking up URLs we don't need. It's making my crawl report a mess!
-
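The only difference there is the * wildcard. With the wildcard, Disallow: /*?utm_ should keep the crawler away from any URL that contains ?utm_ after any path. Without it, Disallow: /?utm_ only matches URLs that begin with /?utm_, which means UTM parameters on the home page only.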
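To make the effect of the wildcard concrete, here is a minimal Python sketch of how the two rules compare. It assumes Rogerbot follows the widely used wildcard convention (a rule is matched from the start of the path, `*` matches any run of characters, and a trailing `$` anchors the end); nothing in this thread confirms that. The function names `rule_to_regex` and `is_blocked` are made up for illustration.

```python
import re

def rule_to_regex(rule):
    """Turn a robots.txt path rule into a regex (assumed wildcard convention)."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces and let each '*' match any run of characters.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_blocked(rule, path):
    """True if the rule matches the path (plus query string) from its first character."""
    return rule_to_regex(rule).match(path) is not None

if __name__ == "__main__":
    for rule in ("/?utm_", "/*?utm_"):
        for path in ("/?utm_source=news", "/blog/post?utm_source=news"):
            print(rule, path, is_blocked(rule, path))
```

Under these assumptions, neither rule matches a URL where a UTM parameter follows another parameter (for example `/blog?page=2&utm_source=news`), because the literal `?utm_` never appears in it.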
-
What is the difference between Disallow: /*?utm_ and Disallow: /?utm_ ?
-
Hi there! Tawny from the Customer Support team here!
You should be able to add a disallow directive for that parameter and any others to block our crawler from accessing them. It would look something like this:
User-agent: Rogerbot
Disallow: ?utm
etc., until you have blocked all of the parameters that may be causing these duplicate content errors. It looks like the _source* might be what's giving our tools some trouble. It looks like Logan Ray has made an excellent suggestion - give that formatting a try and see if it helps!
You can also use the wildcard user-agent (User-agent: *) to block all crawlers from those pages, if you prefer. Here is a great resource about the robots.txt file that might be helpful: https://moz.com/learn/seo/robotstxt
We always recommend checking your robots.txt file with a handy Robots Checker Tool once you make changes to avoid any nasty surprises.
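As a rough local stand-in for that kind of check, the Python sketch below reads a robots.txt text, picks the Disallow rules for a given user-agent (falling back to the `*` group), and reports which sample URLs would be blocked. It is not Moz's checker: the helper names (`parse_disallows`, `rule_matches`, `blocked_urls`) and the example.com URLs are hypothetical, Allow rules and rule precedence are ignored, and the wildcard handling is an assumed convention rather than confirmed Rogerbot behavior.

```python
import re
from urllib.parse import urlsplit

def parse_disallows(robots_txt, agent):
    """Collect Disallow rules for `agent`, falling back to the '*' group."""
    groups = {}
    current = []      # user-agent names of the group being read
    in_rules = False  # True once the current group has seen a rule line
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (s.strip() for s in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                current, in_rules = [], False
            current.append(value.lower())
            groups.setdefault(value.lower(), [])
        else:
            in_rules = True
            if field == "disallow" and value:
                for name in current:
                    groups[name].append(value)
    return groups.get(agent.lower(), groups.get("*", []))

def rule_matches(rule, path):
    """Assumed convention: match from the start of the path, '*' is a wildcard."""
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.match(body, path) is not None

def blocked_urls(robots_txt, agent, urls):
    """Return the URLs whose path plus query string matches a Disallow rule."""
    rules = parse_disallows(robots_txt, agent)
    blocked = []
    for url in urls:
        parts = urlsplit(url)
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        if any(rule_matches(rule, path) for rule in rules):
            blocked.append(url)
    return blocked

if __name__ == "__main__":
    robots = "User-agent: rogerbot\nDisallow: /*?utm_\n"
    samples = [
        "https://example.com/",
        "https://example.com/?utm_source=news",
        "https://example.com/blog?utm_medium=email",
    ]
    print(blocked_urls(robots, "Rogerbot", samples))
```

Running a handful of real campaign URLs through a check like this before the next crawl can show whether a rule is too narrow, though the crawler's own behavior remains the final word.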
-
Skyler,
You're close. Give this a shot:
Disallow: /*?utm_
This will cover all UTM tags, regardless of what path comes before the tag or which UTM parameter comes first.