Standard Syntax in robots.txt doesn't prevent Moz bot from crawling
-
A client is getting many false positive site crawl errors for things like duplicate titles and duplicate content on pages that include /tag/ in the URL. An example is https://needquest.com/place_tag/autism-spectrum-disorder/page/4/
To resolve this we have set up a disallow statement in the robots.txt file that says
Disallow: /page/
For some reason this appears not to work, as the site crawl errors continue to list pages like this. Does anyone understand why that would be and what we need to do to properly disallow crawling these pages?
-
Thanks, Tawny,
If you look at Duplicate titles, check the first one (https://needquest.com/place_tag/autism-spectrum-disorder/). All the URLs with a duplicate title have /page/ in them. I will suggest they move the Allow statement and see if that helps.
-
I'm not seeing that URL coming up with Duplicate Title or Duplicate Content issues — when I search by that URL I see no Content issues at that URL. I do see that URL in the All Crawled Pages section, but I can't find it bringing up Content issues in the app.
That said, I took a look at your robots.txt file, and I think this could be a result of having an Allow command before the rest of the Disallow commands. I think possibly if you put that Allow command at the end of the block of Disallow commands, rogerbot would see the disallow for /page/ and stop crawling those URLs.
If you're still running into trouble, I would suggest writing in to us at help@moz.com so we can take a closer look at the Campaign and what could be going on there.
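To illustrate why rule order might matter, here is a minimal Python sketch comparing two ways a crawler could resolve Allow and Disallow rules. The rule list is hypothetical (the client's actual robots.txt is not shown in this thread), and which strategy rogerbot uses is an assumption to confirm with Moz, not something established here.

```python
# Each rule is (is_allow, path_prefix). Hypothetical rules: an Allow placed
# before the Disallow, as described in the reply above.
RULES = [
    (True, "/"),        # Allow: /
    (False, "/page/"),  # Disallow: /page/
]

def first_match(rules, path):
    """Return the verdict of the first rule whose prefix matches the path."""
    for is_allow, prefix in rules:
        if path.startswith(prefix):
            return is_allow
    return True  # no rule matched: crawling is allowed

def longest_match(rules, path):
    """Return the verdict of the most specific (longest) matching prefix."""
    matches = [(len(prefix), is_allow)
               for is_allow, prefix in rules if path.startswith(prefix)]
    if not matches:
        return True
    # On equal length, True sorts above False, so Allow wins the tie.
    return max(matches)[1]

top_level = "/page/4/"
nested = "/place_tag/autism-spectrum-disorder/page/4/"

print(first_match(RULES, top_level))    # True: the earlier Allow shadows the Disallow
print(longest_match(RULES, top_level))  # False: the longer /page/ rule wins
print(first_match(RULES, nested))       # True: /page/ is not a prefix of this path
print(longest_match(RULES, nested))     # True: same reason
```

Under a first-match crawler, moving the Allow below the Disallow changes the result for /page/4/. For the nested example URL, neither strategy blocks the page under plain prefix matching, so reordering alone may not be enough.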
-
Any reason the Disallow: /page/ isn't preventing URLs like
https://needquest.com/place_tag/autism-spectrum-disorder/page/4/
from generating duplicate descriptions and title errors in our site crawl? It was my hope that those pages wouldn't be crawled at all.
-
Sorry, Tawny ... I did go back and correct my question. We did apply Disallow: /page/ to address this issue. The /place_tag/ segment is found in many pages we DO want to crawl and index ... here we only want to disallow those page 2, page 3, page 4, etc. pages.
(We also disallowed /tag/, /category/, and a few other common issues that generate false positives in the site crawl.)
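For anyone wanting to check a rule like this locally, here is a minimal sketch using Python's standard urllib.robotparser. It shows how a prefix-matching parser treats Disallow: /page/. The robots.txt content is a hypothetical one-rule file, not the site's real one, and rogerbot's own matching may differ from this parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical minimal robots.txt; the real file is not shown in this thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /page/
"""

def is_allowed(url, robots_txt=ROBOTS_TXT, agent="rogerbot"):
    """Return True if the parser would let the given agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

nested = "https://needquest.com/place_tag/autism-spectrum-disorder/page/4/"
top_level = "https://needquest.com/page/4/"

print(is_allowed(nested))     # True: the path starts with /place_tag/, not /page/
print(is_allowed(top_level))  # False: the path starts with /page/
```

Rules are matched from the start of the URL path, so /page/ only covers paths that begin with /page/. Crawlers that follow RFC 9309 treat * as a wildcard, so a pattern such as Disallow: /*/page/ would be the usual way to match /page/ at any depth. Whether rogerbot honors that pattern is worth confirming with Moz.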
-
Hey there!
Tawny from Moz's Help Team here.
Adding a disallow directive for /tag/ won't help with the example URL you've provided, because that URL doesn't have /tag/ in its path. To block us from seeing content like that URL you listed, you'd need a disallow directive for /place_tag/.
If you include that disallow directive, that should stop us from seeing duplicate content on pages with /place_tag/ in the URL.
Hope that helps! If you've still got questions, feel free to shoot us a note over at help@moz.com and we'll do our best to sort things out with you.