Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Block Moz (or any other robot) from crawling pages with specific URLs
-
Hello!
Moz reports that my site has around 380 duplicate page content. Most of them come from dynamic generated URLs that have some specific parameters. I have sorted this out for Google in webmaster tools (the new Google Search Console) by blocking the pages with these parameters. However, Moz is still reporting the same amount of duplicate content pages and, to stop it, I know I must use robots.txt. The trick is that, I don't want to block every page, but just the pages with specific parameters. I want to do this because among these 380 pages there are some other pages with no parameters (or different parameters) that I need to take care of. Basically, I need to clean this list to be able to use the feature properly in the future.
I have read through Moz forums and found a few topics related to this, but there is no clear answer on how to block only pages with specific URLs. Therefore, I have done my research and come up with these lines for robots.txt:
User-agent: dotbot
Disallow: /*numberOfStars=0User-agent: rogerbot
Disallow: /*numberOfStars=0My questions:
1. Are the above lines correct and would block Moz (dotbot and rogerbot) from crawling only pages that have numberOfStars=0 parameter in their URLs, leaving other pages intact?
2. Do I need to have an empty line between the two groups? (I mean between "Disallow: /*numberOfStars=0" and "User-agent: rogerbot")? (or does it even matter?)
I think this would help many people as there is no clear answer on how to block crawling only pages with specific URLs. Moreover, this should be valid for any robot out there.
Thank you for your help!
-
Hello!
Thanks a lot for your feedback and clearing this out! It worked well.
The robots.txt tester is a good tip!
Thanks!
-
Hi,
What you have there will work absolutely fine with a little tweak. And no need to leave spaces between lines.
Disallow: /numberOfStars=0
However, no need to add the wildcard at the end if there is nothing more after that.
The best way to test what works, is before you go and add it to live, use the Robots.txt test tool in Search Console (Webmaster Tools), add in the lines above and then check to make sure none of your other pages are blocked. They won't be, but it's a great way to test before going live.
I hope this helps

-Andy
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Pages with URL Too Long
Hello Mozzers! MOZ keeps kindly telling me the URLs are too long. However, this is largely due to the structure of E-commerce site, which has to include 'brand' 'range' and 'products' keyword. For example -
Moz Pro | | tigersohelll
https://www.choicefurnituresuperstore.co.uk/Devonshire-Rustic-Oak-Bedside-Cabinet-1-Drawer-p40668.html MOZ recommends no more than 75 characters. This means we have 25-30 characters for both the brand name and product name. Questions:
If it is an issue, how to fix it on my site?
If it's not an issue, how can we turn off this alert from MOZ?
Anyone know how big an issue URLs are as a ranking factor? I thought pretty low.0 -
MOZ Versus HUBSPOT??? If We Discontinue MOZ Subscription and Get HUBSPOT, Anything Lost?
How does Hubspot compare with MOZ? Doe it provide similar features or is the functionality very different? Is Hubspot complementary to MOZ or could it be used as a substitute? If we stopped our MOZ subscription and subscribed to Hubspot would we lose anything? Thanks, Alan
Moz Pro | | Kingalan10 -
Woocommerce filter urls showing in crawl results, but not indexed?
I'm getting 100's of Duplicate Content warnings for a Woocommerce store I have. The urls are
Moz Pro | | JustinMurray
etc These don't seem to be indexed in google, and the canonical is for the shop base url. These seem to be simply urls generated by Woocommerce filters. Is this simply a false alarm from Moz crawl?0 -
How long for authority to transfer form an old page to a new page via a 301 redirect? (& Moz PA score update?)
Hi How long aproximately does G take to pass authority via a 301 from an old page to its new replacement page ? Does Moz Page Authority reflect this in its score once G has passed it ? All Best
Moz Pro | | Dan-Lawrence
Dan3 -
Moz WordPress Plugin?
WordPress is currently 18% of the Internet. Given its huge footprint, wouldn't it make sense for Moz to develop a WP plugin that can not only report site metrics, but help fix and optimize site structure directly from within the site? Just curious - I can't be the only one who wonders if I'm implementing Moz findings/recommendations correctly given the myriad of WP SEO plugins, authors, implementations.
Moz Pro | | twelvetwo.net5 -
How to resolve Duplicate Content crawl errors for Magento Login Page
I am using the Magento shopping cart, and 99% of my duplicate content errors come from the login page. The URL looks like: http://www.site.com/customer/account/login/referer/aHR0cDovL3d3dy5tbW1zcGVjaW9zYS5jb20vcmV2aWV3L3Byb2R1Y3QvbGlzdC9pZC8xOTYvY2F0ZWdvcnkvNC8jcmV2aWV3LWZvcm0%2C/ Or, the same url but with the long string different from the one above. This link is available at the top of every page in my site, but I have made sure to add "rel=nofollow" as an attribute to the link in every case (it is done easily by modifying the header links template). Is there something else I should be doing? Do I need to try to add canonical to the login page? If so, does anyone know how to do it using XML?
Moz Pro | | kdl01 -
Duplicate page titles are the same URL listed twice
The system says I have two duplicate page titles. The page titles are exactly the same because the two URLs are exactly the same. These same two identical URLs show up in the Duplicate Page Content also - because they are the same. We also have a blog and there are two tag pags showing identical content - I have blocked the blog in robots.txt now, because the blog is only for writers. I suppose I could have just blocked the tags pages.
Moz Pro | | loopyal0 -
Use of the tilde in URLs
I just signed up for SEOMoz and sent my site through the first crawl. I use the tilde in my rewritten URLs. This threw my entire site into the Notice section 301 (permanent redirect) since each page redirects to the exact URL with the ~, not the %7e. I find conflicting information on the web - you can use the tilde in more recent coding guidelines where you couldn't in the old. It would be a huge thing to change every page in my site to use an underscore instead of a tilde int he URL. If Google is like SEOMoz and is 301 redirecting every page on the site, then I'll do it, but is it just an SEOMoz thing? I ran my site through Firebug and and all my pages show the 200 response header, not the 301 redirect. Thanks for any help you can provide.
Moz Pro | | fdb0