Moz "Crawl Diagnostics" doesn't respect robots.txt
-
Hello, I've just had a new website crawled by the Moz bot. It's come back with thousands of errors saying things like:
- Duplicate content
- Overly dynamic URLs
- Duplicate Page Titles
The duplicate content & URLs it's found are all blocked in the robots.txt so why am I seeing these errors?
Here's an example of some of the robots.txt that blocks things like dynamic URLs and directories (which Moz bot ignored):
Disallow: /?mode=
Disallow: /?limit=
Disallow: /?dir=
Disallow: /?p=*&
Disallow: /?SID=
Disallow: /reviews/
Disallow: /home/
Many thanks for any info on this issue.
-
Hi Si, has this issue been resolved?
-
Hey Si,
Thanks for writing in. It doesn't seem that we are having an overarching issue with our crawler ignoring robots.txt files, so I did some research in Google Webmaster Tools. It looks like most crawlers require an asterisk in the disallow directive to recognize that all pages of a dynamic URL are being disallowed. The "Pattern Matching" section of this resource should give you more information about setting up the robots.txt with the correct disallow directives to block those pages: http://support.google.com/webmasters/bin/answer.py?hl=en&answer=156449
If you add the asterisk to the disallow directive and you are still seeing these pages crawled, it would help if you sent an email with your campaign information to our support desk at help@moz.com so our engineers can look into this more directly.
I hope this helps.
Chiaryn
-
If you have an "index,(no)follow" meta on those pages I think they will be crawled even though you have them blocked in robots.txt. So by adding "noindex" on those pages it might work as you want it to.
-
Is the / actually in the URL at that spot? Or is your link like http://www.example.com/abcd?p=147
If you give an example of a full URL that includes one of your blocked dynamic URLs, we can take a better look. If your robots.txt is set up correctly, the crawler shouldn't find that stuff, but give us more info if you're able.
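To make the question about the leading / concrete, here is a minimal Python sketch of how a Disallow pattern could be checked against a URL. It assumes Google-style matching (the pattern is anchored at the start of the path, * matches any run of characters, a trailing $ anchors the end). Whether the Moz crawler follows exactly these rules is an assumption, and rule_matches is a hypothetical helper name, not part of any Moz or Google tool.

```python
import re


def rule_matches(pattern: str, url_path: str) -> bool:
    """Return True if a Disallow pattern matches a URL path plus query string.

    Assumes Google-style matching: the pattern is anchored at the start of
    the path, '*' matches any run of characters, and a trailing '$' anchors
    the end. Real crawlers may differ.
    """
    anchored_end = pattern.endswith("$")
    if anchored_end:
        pattern = pattern[:-1]
    # Escape literal text and turn each '*' into '.*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored_end:
        regex += "$"
    # re.match anchors at the start, which gives the prefix behaviour.
    return re.match(regex, url_path) is not None


# Illustrative paths only; /abcd?p=147 is taken from the example URL above.
checks = [
    ("/?p=*&", "/?p=147&dir=asc"),        # query string directly on the root
    ("/?p=*&", "/abcd?p=147&dir=asc"),    # query string on a deeper path
    ("/*?p=*&", "/abcd?p=147&dir=asc"),   # same path, rule with a leading wildcard
    ("/reviews/", "/reviews/product-1"),  # plain directory prefix
]
for pattern, path in checks:
    print(f"{pattern!r} vs {path!r}: {rule_matches(pattern, path)}")
```

Under these assumptions, a rule such as Disallow: /?p=*& only covers URLs where the query string sits directly on the site root, while Disallow: /*?p=*& also covers paths like /abcd?p=147&dir=asc.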