Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Block Moz (or any other robot) from crawling pages with specific URLs
-
Hello!
Moz reports that my site has around 380 duplicate page content. Most of them come from dynamic generated URLs that have some specific parameters. I have sorted this out for Google in webmaster tools (the new Google Search Console) by blocking the pages with these parameters. However, Moz is still reporting the same amount of duplicate content pages and, to stop it, I know I must use robots.txt. The trick is that, I don't want to block every page, but just the pages with specific parameters. I want to do this because among these 380 pages there are some other pages with no parameters (or different parameters) that I need to take care of. Basically, I need to clean this list to be able to use the feature properly in the future.
I have read through Moz forums and found a few topics related to this, but there is no clear answer on how to block only pages with specific URLs. Therefore, I have done my research and come up with these lines for robots.txt:
User-agent: dotbot
Disallow: /*numberOfStars=0User-agent: rogerbot
Disallow: /*numberOfStars=0My questions:
1. Are the above lines correct and would block Moz (dotbot and rogerbot) from crawling only pages that have numberOfStars=0 parameter in their URLs, leaving other pages intact?
2. Do I need to have an empty line between the two groups? (I mean between "Disallow: /*numberOfStars=0" and "User-agent: rogerbot")? (or does it even matter?)
I think this would help many people as there is no clear answer on how to block crawling only pages with specific URLs. Moreover, this should be valid for any robot out there.
Thank you for your help!
-
Hello!
Thanks a lot for your feedback and clearing this out! It worked well.
The robots.txt tester is a good tip!
Thanks!
-
Hi,
What you have there will work absolutely fine with a little tweak. And no need to leave spaces between lines.
Disallow: /numberOfStars=0
However, no need to add the wildcard at the end if there is nothing more after that.
The best way to test what works, is before you go and add it to live, use the Robots.txt test tool in Search Console (Webmaster Tools), add in the lines above and then check to make sure none of your other pages are blocked. They won't be, but it's a great way to test before going live.
I hope this helps
-Andy
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Ahrefs vs Moz
Hi! I noticed the Moz DA en the Ahrefs DA are very different. Where https://www.123opzeggen.nl/ has a DA of 10 at MOZ, the DA at Ahrefs is 26. Where does this big difference come from? Do you measure in different ways? I hope you can answer this question for me. Thank you in advance!
Moz Pro | | NaomiAdivare2 -
URL Length Issue
MOZ is telling me the URLs are too long. I did a little research and I found out that the length of the URLs is not really a serious problem. In fact, others recommend ignoring the situation. Even on their blog I found this explanation: "Shorter URLs are generally preferable. You do not need to take this to the extreme, and if your URL is already less than 50-60 characters, do not worry about it at all. But if you have URLs pushing 100+ characters, there's probably an opportunity to rewrite them and gain value. This is not a direct problem with Google or Bing - the search engines can process long URLs without much trouble. The issue, instead, lies with usability and user experience. Shorter URLs are easier to parse, copy and paste, share on social media, and embed, and while these may all add up to a fractional improvement in sharing or amplification, every tweet, like, share, pin, email, and link matters (either directly or, often, indirectly)." And yet, I have these questions: In this case, why do I get this error telling me that the urls are too long, and what are the best practices to get this out? Thank You
Moz Pro | | Cart_generation1 -
Robots.txt blocking Moz
Moz are reporting the robots.txt file is blocking them from crawling one of our websites. But as far as we can see this file is exactly the same as the robots.txt files on other websites that Moz is crawling without problems. We have never come up against this before, even with this site. Our stats show Rogerbot attempting to crawl our site, but it receives a 404 error. Can anyone enlighten us to the problem please? http://www.wychwoodflooring.com -Christina
Moz Pro | | ChristinaRadisic0 -
Pages with URL Too Long
Hello Mozzers! MOZ keeps kindly telling me the URLs are too long. However, this is largely due to the structure of E-commerce site, which has to include 'brand' 'range' and 'products' keyword. For example -
Moz Pro | | tigersohelll
https://www.choicefurnituresuperstore.co.uk/Devonshire-Rustic-Oak-Bedside-Cabinet-1-Drawer-p40668.html MOZ recommends no more than 75 characters. This means we have 25-30 characters for both the brand name and product name. Questions:
If it is an issue, how to fix it on my site?
If it's not an issue, how can we turn off this alert from MOZ?
Anyone know how big an issue URLs are as a ranking factor? I thought pretty low.0 -
Woocommerce filter urls showing in crawl results, but not indexed?
I'm getting 100's of Duplicate Content warnings for a Woocommerce store I have. The urls are
Moz Pro | | JustinMurray
etc These don't seem to be indexed in google, and the canonical is for the shop base url. These seem to be simply urls generated by Woocommerce filters. Is this simply a false alarm from Moz crawl?0 -
Page Authority is the same on every page of my site
I'm analyzing a site and the page authority is the exact same for every page in the site. How can this be since the page authority is supposed to be unique to each page?
Moz Pro | | azjayhawk0 -
Is there a tool to upload multiple URLs and gather statistics and page rank?
I was wondering if there is a tool out there where you can compile a list of URL resources, upload them in a CSV and run a report to gather and index each individual page. Does anyone know of a tool that can do this or do we need to create one?
Moz Pro | | Brother220 -
How to check Page Authority in bulk?
Hey guys, I'm on the free trial for SEOmoz PRO and I'm in love. One question, though. I've been looking all over the internet for a way to check Page Authority in bulk. Is there a way to do this? Would I need the SEOmoz API? And what is the charge? All I really need is a way to check Page Authority in bulk--no extra bells and whistles. Thanks, Brandon
Moz Pro | | thegreatpursuit0