Robots.txt blocking Moz
-
Moz is reporting that our robots.txt file is blocking it from crawling one of our websites.
But as far as we can see, this file is exactly the same as the robots.txt files on other websites that Moz is crawling without problems.
We have never come up against this before, even with this site.
Our stats show Rogerbot attempting to crawl the site, but it receives a 404 error.
Can anyone enlighten us as to the problem, please?
http://www.wychwoodflooring.com
-Christina
-
Hi Nigel
Neither; they use server-side filtering.
Regards
- David
-
Hi David
That's great news!
As a matter of interest, where did they block it? It's not in the robots.txt, so was it in .htaccess?
Regards
Nigel
-
Nigel,
Thanks for the reply. The cgi-bin folder is never used by any of my sites, but I put this in as a matter of course. The folder would normally contain old CGI scripts, so it would not usually affect a robot's crawling in any case.
The reason for the problem turns out to be that our host had blocked rogerbot along with several other malicious bots. They have now lifted this block and the site can be crawled.
- David
-
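A block applied at the host rather than in robots.txt, as described above, can sometimes be spotted by requesting the same URL with different User-Agent headers and comparing the status codes. The following is a minimal sketch, not a definitive diagnostic: the helper names `fetch_status` and `looks_filtered` are hypothetical, the plain token "rogerbot" stands in for the crawler's full user-agent string, and it assumes the host filters on the User-Agent header (a host that filters by IP address would not be detected this way).

```python
import urllib.error
import urllib.request


def fetch_status(url, user_agent, timeout=10):
    """Return the HTTP status code the server sends for this User-Agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses are raised as exceptions; keep the code.
        return error.code


def looks_filtered(status_by_agent, crawler="rogerbot"):
    """True if the crawler gets an error while some other agent gets 200."""
    crawler_status = status_by_agent[crawler]
    others_ok = any(
        status == 200
        for agent, status in status_by_agent.items()
        if agent != crawler
    )
    return crawler_status >= 400 and others_ok


# Example usage (makes real network requests, so it is left commented out):
# url = "http://www.wychwoodflooring.com/"
# agents = ["rogerbot", "Mozilla/5.0"]
# print(looks_filtered({agent: fetch_status(url, agent) for agent in agents}))
```

If the crawler token receives a 404 or 403 while a browser-like agent receives 200 for the same URL, that points to filtering on the server rather than to robots.txt. Connection-level blocks (refused or dropped connections) raise a different exception and are not handled in this sketch.
-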
Hi Christina
I don't know how your site is set up, but I can see that for some reason you are blocking access to the cgi-bin directory.
If that directory contains files that execute PHP or require other permissions, then that may well be your problem. It's the only directory you are blocking, and since I haven't seen other robots.txt files blocking it, I would hazard a guess that this is the root of your problem.
robots.txt:
User-agent: *
Disallow: /cgi-bin/
Sitemap: http://www.wychwoodflooring.com/sitemap.xml
Regards
Nigel
-
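The robots.txt rules quoted above can be checked offline with Python's standard `urllib.robotparser`. This is a small sketch rather than a statement of how Moz's crawler evaluates the file: it assumes the crawler identifies itself with the token "rogerbot" and so falls under the `*` group, and the `/cgi-bin/form.cgi` path is a made-up example.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt rules quoted in the thread, one directive per line.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Sitemap: http://www.wychwoodflooring.com/sitemap.xml
"""


def build_parser(robots_txt):
    """Parse robots.txt text without making any network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
SITE = "http://www.wychwoodflooring.com"

# "rogerbot" has no group of its own here, so the "*" group applies to it.
home_allowed = parser.can_fetch("rogerbot", SITE + "/")
# Hypothetical path inside the only disallowed directory.
cgi_allowed = parser.can_fetch("rogerbot", SITE + "/cgi-bin/form.cgi")

print("home page allowed:", home_allowed)
print("cgi-bin allowed:", cgi_allowed)
```

Under these rules the home page is allowed and only URLs under /cgi-bin/ are disallowed, so a crawler that is refused everywhere on the site is being stopped by something other than this file.
-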
Our hosting provider has banned Rogerbot because they see it as problematic!
They are a great hosting provider so this is going to be a difficult one.
Related Questions
-
50,000 4xx errors listed in MOZ report :(
Hi, I've just had a look at a customer's Moz report and discovered nearly 50,000 errors listed! The site is a non-dynamic, non-database-driven site, so where these have come from is beyond me. Example (both links open in a new window):
LIVE PAGE - http://lunnonwaste.com/licenses/Heatherland-Limited-H&S-Policy-Document-JAN-15.pdf
NON-EXISTENT PAGE - http://lunnonwaste.com/licenses/licenses/Heatherland-Limited-H&S-Policy-Document-JAN-15.pdf
If you hover over any links you'll see they're ALL .../licenses/licenses/..., which doesn't exist! The file system seems to be the problem: .../licenses/licenses/... gets another level added on (.../licenses/licenses/licenses/licenses/...), up to the 50,000 page errors. The problem is self-replicating. It's very odd and not something I've ever seen before. NONE of these 'extra' pages are listed on the server, so where are they coming from? Any suggestions or help would be gratefully appreciated.
Moz Pro | Skips | 1
-
Block Moz (or any other robot) from crawling pages with specific URLs
Hello! Moz reports that my site has around 380 pages with duplicate content. Most of them come from dynamically generated URLs that have some specific parameters. I have sorted this out for Google in Webmaster Tools (the new Google Search Console) by blocking the pages with these parameters. However, Moz is still reporting the same number of duplicate content pages and, to stop it, I know I must use robots.txt. The trick is that I don't want to block every page, just the pages with specific parameters. I want to do this because among these 380 pages there are some other pages with no parameters (or different parameters) that I need to take care of. Basically, I need to clean this list to be able to use the feature properly in the future. I have read through the Moz forums and found a few topics related to this, but there is no clear answer on how to block only pages with specific URLs. Therefore, I have done my research and come up with these lines for robots.txt:
User-agent: dotbot
Disallow: /*numberOfStars=0
User-agent: rogerbot
Disallow: /*numberOfStars=0
My questions:
1. Are the above lines correct, and would they block Moz (dotbot and rogerbot) from crawling only pages that have the numberOfStars=0 parameter in their URLs, leaving other pages intact?
2. Do I need an empty line between the two groups (I mean between "Disallow: /*numberOfStars=0" and "User-agent: rogerbot"), or does it even matter?
I think this would help many people, as there is no clear answer on how to block crawling only pages with specific URLs. Moreover, this should be valid for any robot out there. Thank you for your help!
Moz Pro | Blacktie | 0
-
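For the first question above, one way to see which URLs a rule such as `Disallow: /*numberOfStars=0` would cover is to translate the pattern into a regular expression. This is a rough sketch under stated assumptions: it assumes the crawler follows the common wildcard convention in which `*` matches any run of characters and a trailing `$` anchors the end of the URL, the function names are hypothetical, and the example paths are invented. Python's built-in `urllib.robotparser` does not interpret wildcards, which is why a small matcher is written by hand here.

```python
import re


def robots_pattern_to_regex(pattern):
    """Translate a robots.txt path pattern with '*' and '$' into a regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with ".*" for each wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(pattern, path_and_query):
    """True if the pattern matches from the start of the path and query."""
    return robots_pattern_to_regex(pattern).match(path_and_query) is not None


PATTERN = "/*numberOfStars=0"

# Hypothetical URLs (path plus query string) to try against the rule.
for candidate in [
    "/hotels?numberOfStars=0&page=2",
    "/hotels?numberOfStars=3",
    "/hotels",
]:
    print(candidate, "blocked" if is_blocked(PATTERN, candidate) else "allowed")
```

Because the match is a prefix match after the wildcard, the same rule would also cover a value such as numberOfStars=05, so a pattern like this is worth checking against real URLs from the site before relying on it.
-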
Screaming frog, Xenu, Moz giving wrong results
Hello guys and gals, this is a very odd one. I have a client's website, and most of the crawlers I'm using are giving me weird or wrong results. For now let's focus on Screaming Frog: when I crawl the site it lists e.g. meta titles as missing (not all of them, though), but when I go to the site the title is not missing, and Google seems to be indexing the site fine. The robots.txt is not affecting the site (I've also tried changing the user agent). The other odd thing is that SF gives a 200 code but as a status tells me "connection refused", even though it's giving me data. I'm unable to share the client's site. Has anyone else seen this very odd issue, and are there any solutions for it? Many thanks in advance for any help,
Moz Pro | GPainter | 0
-
[Moz Help] Re: Trying to add a valid URL into MOZ account
See below and please let us know what we have to do to solve this:
Joel Day (Moz Help), Mar 07 05:03 PM
Hey Tracy, It looks like there's a redirect loop on your site. greatwesternflooring.com redirects to www.greatwesternflooring.com/, which in turn 302 redirects back into itself. You'll likely need to fix the redirect before you can continue configuring the campaign. 🙂 Thanks!
Joel. Moz
t: @HelpWizard
Tracy, Mar 07 03:14 PM
I sent an email, and this is the response I got. The help forum sent me here, so here I am 🙂 An answer was posted to this question:
Question: I have a valid URL, greatwesternflooring.com, but when I try to add this campaign I get an "oops" message telling me it's not a valid URL. Can you help me?
Answer: This looks like a bug. Please reach out to us via support so that we can forward this along to our Developers for review. Thanks! (https://moz.com/help/contact)
See where this question was originally asked.
Moz Pro | Britewave | 0
-
Error in Moz duplicate content reports
Hi - I've run a Moz campaign on a client's site. Moz is saying that there are duplicate content errors, and when I look at the errors they all relate to the non-www URLs being duplicated in the www form of the URLs. However, this is not the case: all the non-www URLs are 301 redirected to the www URLs. Is this an error in the Moz tool? Has anybody experienced something similar?
Moz Pro | rorynatkiel | 0
-
Can I see when SEO Moz has crawled my website?
I would like to know if it's possible to see (maybe in my Google Analytics) if SEO Moz has crawled my website. I'm also curious if and where I can see when the robot of Google visited my website. Thanks!
Moz Pro | Spotler | 0
-
Does the SEOMoz weekly crawl that highlights no meta description tag, take into account if there is a meta robots noindex,follow tag on the pages it indicates the missing meta descriptions?
The weekly crawl report is telling me that there are pages with missing meta description tags, yet I've implemented meta robots 'noindex, follow' tags on those pages, and they are visible in the page source. As far as Google is concerned, surely this won't be a problem, since it is being instructed NOT to consider these specific pages for indexing. I am assuming that the weekly SEOmoz crawl simply puts the missing meta description findings into its report without observing that the particular URLs contain the meta robots 'noindex, follow' tag. I'd appreciate it if you could clarify whether this is the case. It would help me understand that (at least in terms of my efforts towards Google) your own crawl doesn't observe the meta robots tag instruction, hence the report flagging the discrepancy.
Moz Pro | callassist | 0
-
Any SEO moz users notice a HUGE change in OSE (Open Site Explorer) link data numbers?
Hi All, I have some serious concerns about recent OSE data for numerous clients. One client I want to talk about today has the following data from OSE for the month of August 2011 compared with July 2011:
Total links to the domain: (decrease of around 100,000+)
External followed links: (decrease of around 5,000)
Linking root domains: (decrease of over 60)
The crazy thing is that the domain authority has actually gone up by around 5 points for this client even though everything else has suddenly gone down. Also, the funny thing is that we have been link building quite strongly for this client over the last 12 months, using only high-quality sources from our niche. I am worried that there are serious issues with the data. I realise we saw some updates to OSE recently, yet I am surprised it can be this drastic. Kind regards. PSV
Moz Pro | ColumbusAustralia | 1