Standard Syntax in robots.txt doesn't prevent Moz bot from crawling
-
A client is getting many false positive site crawl errors for things like duplicate titles and duplicate content on pages that include /tag/ in the URL. An example is https://needquest.com/place_tag/autism-spectrum-disorder/page/4/
To resolve this, we have set up a disallow statement in the robots.txt file that says
Disallow: /page/
For some reason this appears not to work, as the site crawl errors continue to list pages like this. Does anyone understand why that would be and what we need to do to properly disallow crawling these pages?
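One likely cause, for anyone landing on this thread: robots.txt rules are compared against the start of the URL path, so Disallow: /page/ only covers paths that begin with /page/. Below is a minimal Python sketch of that matching, following the prefix and wildcard conventions described in RFC 9309. The helper rule_matches is hypothetical and written only for illustration; it is not Moz's crawler code, and whether rogerbot honors the * wildcard should be confirmed against Moz's own documentation.

```python
import re
from urllib.parse import urlparse


def rule_matches(pattern: str, url: str) -> bool:
    """Return True if a robots.txt path pattern matches the URL's path.

    Conventions assumed (RFC 9309): the pattern is compared against the
    start of the path, '*' matches any run of characters, and a trailing
    '$' anchors the match at the end of the path.
    """
    path = urlparse(url).path or "/"
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and turn each '*' into '.*'
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match only matches at the beginning of the path (prefix match)
    return re.match(regex, path) is not None


example = "https://needquest.com/place_tag/autism-spectrum-disorder/page/4/"

print(rule_matches("/page/", example))       # False: path starts with /place_tag/
print(rule_matches("/*/page/", example))     # True: wildcard covers the leading segments
print(rule_matches("/place_tag/", example))  # True, but also matches pages we want crawled
```

Under these assumptions, a rule such as Disallow: /*/page/ would target only the paginated URLs, while Disallow: /place_tag/ would also block the first page of each tag archive.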
-
Thanks, Tawny,
If you look at Duplicate titles, check the first one (https://needquest.com/place_tag/autism-spectrum-disorder/). All the URLs with a duplicate title have /page/ in them. I will suggest they move the Allow statement and see if that helps.
-
I'm not seeing that URL coming up with Duplicate Title or Duplicate Content issues — when I search by that URL I see no Content issues at that URL. I do see that URL in the All Crawled Pages section, but I can't find it bringing up Content issues in the app.
That said, I took a look at your robots.txt file, and I think this could be a result of having an Allow command before the rest of the Disallow commands. I think possibly if you put that Allow command at the end of the block of Disallow commands, rogerbot would see the disallow for /page/ and stop crawling those URLs.
If you're still running into trouble, I would suggest writing in to us at help@moz.com so we can take a closer look at the Campaign and what could be going on there.
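Whether the position of an Allow line matters depends on the parser. As a rough illustration, Python's standard urllib.robotparser applies the first rule that matches, so it is sensitive to order, whereas RFC 9309 says the longest matching rule should win regardless of order. How rogerbot resolves this is not stated in this thread. The Allow: / line and the /page/2/ URL below are hypothetical stand-ins, since the client's actual robots.txt is not shown here.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_lines, url, agent="rogerbot"):
    """Parse robots.txt lines and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(agent, url)


# Hypothetical URL whose path really does begin with /page/
url = "https://needquest.com/page/2/"

# Hypothetical rule sets that differ only in where the Allow line sits
allow_first = ["User-agent: *", "Allow: /", "Disallow: /page/"]
allow_last = ["User-agent: *", "Disallow: /page/", "Allow: /"]

print(allowed(allow_first, url))  # True: the Allow rule matches first
print(allowed(allow_last, url))   # False: the Disallow rule matches first
```

Note that reordering only helps for URLs the Disallow pattern matches in the first place.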
-
Any reason the Disallow: /page/ isn't preventing URLs like
https://needquest.com/place_tag/autism-spectrum-disorder/page/4/
from generating duplicate descriptions and title errors in our site crawl? It was my hope that those pages wouldn't be crawled at all.
-
Sorry, Tawny ... I did go back and correct my question. We did apply Disallow: /page/ to address this issue. The /place_tag/ segment appears in many pages we DO want crawled and indexed, so here we only want to disallow the paginated pages (page 2, page 3, page 4, etc.).
(We also disallowed /tag/, /category/, and a few other common issues that generate false positives in the site crawl.)
-
Hey there!
Tawny from Moz's Help Team here.
Adding a disallow directive for /tag/ won't help with the example URL you've provided — that URL doesn't have /tag/ in the URL pathway. To block us from seeing content like that URL you listed, you'd need a disallow directive for /place_tag/.
If you include that disallow directive, that should stop us from seeing duplicate content on pages with /place_tag/ in the URL.
Hope that helps! If you've still got questions, feel free to shoot us a note over at help@moz.com and we'll do our best to sort things out with you.