Standard Syntax in robots.txt doesn't prevent Moz bot from crawling
-
A client is getting many false positive site crawl errors for things like duplicate titles and duplicate content on pages that include /tag/ in the URL. An example is https://needquest.com/place_tag/autism-spectrum-disorder/page/4/
To resolve this we have set up a disallow statement in the robots.txt file that says
Disallow: /page/
For some reason this appears not to work, as the site crawl errors continue to list pages like this. Does anyone understand why that would be and what we need to do to properly disallow crawling these pages?
-
Thanks, Tawny,
If you look at Duplicate titles, check the first one (https://needquest.com/place_tag/autism-spectrum-disorder/). All the URLs with a duplicate title have /page/ in them. I will suggest they move the Allow statement and see if that helps.
-
I'm not seeing that URL coming up with Duplicate Title or Duplicate Content issues — when I search by that URL I see no Content issues at that URL. I do see that URL in the All Crawled Pages section, but I can't find it bringing up Content issues in the app.
That said, I took a look at your robots.txt file, and I think this could be a result of having an Allow command before the rest of the Disallow commands. I think possibly if you put that Allow command at the end of the block of Disallow commands, rogerbot would see the disallow for /page/ and stop crawling those URLs.
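Rule order aside, it may also be worth checking how the path in Disallow: /page/ gets matched. Under standard robots.txt syntax a rule path is compared against the beginning of the URL path, so /page/ would not be expected to match a URL whose path starts with /place_tag/. The sketch below illustrates this with Python's built-in urllib.robotparser, which does plain prefix matching. The one-rule robots.txt and the helper name is_allowed are made up for illustration, this is not the client's actual file, and rogerbot's own matching may differ in detail.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical one-rule robots.txt for illustration only,
# not the client's actual file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /page/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(url, agent="rogerbot"):
    """Return True if the parsed rules let `agent` fetch `url`."""
    return parser.can_fetch(agent, url)


# URL from the thread: /page/ appears in the middle of the path.
paginated_tag = "https://needquest.com/place_tag/autism-spectrum-disorder/page/4/"
# Hypothetical URL whose path actually begins with /page/.
top_level_page = "https://needquest.com/page/4/"

print(is_allowed(paginated_tag))   # rule path is not a prefix of this path
print(is_allowed(top_level_page))  # rule path is a prefix of this path
```

If the first check comes back as allowed, the rule as written is not covering the paginated tag URLs, whatever order the Allow and Disallow lines are in.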
If you're still running into trouble, I would suggest writing in to us at help@moz.com so we can take a closer look at the Campaign and what could be going on there.
-
Any reason the Disallow: /page/ isn't preventing URLs like
https://needquest.com/place_tag/autism-spectrum-disorder/page/4/
from generating duplicate descriptions and title errors in our site crawl? It was my hope that those pages wouldn't be crawled at all.
-
Sorry, Tawny ... I did go back and correct my question. We did apply Disallow: /page/ to address this issue. The /place_tag/ is found in many pages we DO want to crawl and index ... and here we only want to disallow the page 2, page 3, page 4, etc. pages.
(We also disallowed /tag/, /category/, and a few other common issues that generate false positives in the site crawl.)
-
Hey there!
Tawny from Moz's Help Team here.
Adding a disallow directive for /tag/ won't help with the example URL you've provided — that URL doesn't have /tag/ in the URL pathway. To block us from seeing content like that URL you listed, you'd need a disallow directive for /place_tag/.
If you include that disallow directive, that should stop us from seeing duplicate content on pages with /place_tag/ in the URL.
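One thing to weigh before adding that directive: a rule for /place_tag/ would cover every URL under that folder, including first pages, and not only the paginated ones. As a rough sketch, the snippet below compares that rule with a wildcard pattern such as /*/page/. The * and $ characters are described in the Robots Exclusion Protocol (RFC 9309), but whether rogerbot honors them is an assumption here that would need confirming. The function rule_matches is a simplified, hypothetical matcher written for this example, not any crawler's real implementation.

```python
import re


def rule_matches(pattern, path):
    """Return True if a robots.txt path pattern matches the start of `path`.

    Simplified sketch: `*` matches any run of characters and a
    trailing `$` anchors the pattern to the end of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with ".*" for each wildcard.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start of the path, like a prefix rule.
    return re.match(regex, path) is not None


first_page = "/place_tag/autism-spectrum-disorder/"
page_four = "/place_tag/autism-spectrum-disorder/page/4/"

for pattern in ("/place_tag/", "/*/page/"):
    print(pattern, rule_matches(pattern, first_page), rule_matches(pattern, page_four))
```

In this sketch /place_tag/ matches both paths, while /*/page/ matches only the paginated one. A path that begins with /page/ itself would still need the plain Disallow: /page/ rule.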
Hope that helps! If you've still got questions, feel free to shoot us a note over at help@moz.com and we'll do our best to sort things out with you.