Different Errors Running 2 Crawls on Effectively the Same Setup
-
Our developers are moving away from utilising robots.txt files due to security risks, so we have been in the process of removing them from sites. However, we and our clients still want to run Moz crawl reports as they can highlight useful information.
The two sites in question sit on the same server with the same settings (in fact running on the same Magento install). We do not have robots.txt files present (they 404), and as per Chiaryn's response here https://moz.com/community/q/without-robots-txt-no-crawling this should work fine?
However for www.iconiclights.co.uk we got: 902 : Network errors prevented crawler from contacting server for page.
While for www.valuelights.co.uk we got: 612 : Page banned by error response for robots.txt.
These crawls were both run recently, and there was no robots.txt present. Not to mention, they are on the same setup/server etc as mentioned. Now, we have just tested this, by uploading a blank robots.txt file to see if it changed anything - but we get exactly the same errors.
I have had a look, but can't find anything that really matches this on here - help would really be appreciated!
Thanks!
-
Hey there! Tawny from the Customer Support team here!
This sounds like a juicy issue, and one I'd love to dive in and help you with! Unfortunately, without being able to take a look at your campaigns and account directly, it's tough to provide specific support for these issues.
That said, if you write in to help@moz.com and give us the details of what you're seeing - basically exactly what's in this question - we should be able to help investigate for you.
-
Having no Robots.txt, or a blank one, is perfectly fine (though honestly it's no more a security risk than your Sitemap.xml). But your current issue is that both of your sites are returning 403 status codes to crawlers while people are still able to land on your pages. This has nothing to do with the Robots.txt file being changed or removed; that's just an odd coincidence. It is most likely an issue in your .htaccess file.
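If you want to confirm the 403 theory yourself, one rough way is to request the same URL with a browser-style User-Agent and a crawler-style one and compare the status codes. The Python sketch below only illustrates that idea: the User-Agent strings are placeholders (the exact header Moz's crawler sends is an assumption here), and `fetch_status`, `blocked_for_crawler` and `check_site` are hypothetical helper names, not part of any Moz tool.

```python
import urllib.error
import urllib.request

# Placeholder User-Agent strings for illustration only; the exact header
# that Moz's crawler sends is an assumption, so check Moz's documentation.
USER_AGENTS = {
    "browser": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "crawler": "rogerbot",
}


def fetch_status(url, user_agent, timeout=10):
    """Return the HTTP status code the server sends for this User-Agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the code is still what we want.
        return err.code


def blocked_for_crawler(statuses):
    """True when the browser UA gets a 2xx but the crawler UA gets a 403."""
    return 200 <= statuses["browser"] < 300 and statuses["crawler"] == 403


def check_site(base_url, paths=("/", "/robots.txt")):
    """Compare status codes per User-Agent for a few paths on one site."""
    results = {}
    for path in paths:
        url = base_url.rstrip("/") + path
        statuses = {name: fetch_status(url, ua) for name, ua in USER_AGENTS.items()}
        results[url] = (statuses, blocked_for_crawler(statuses))
    return results
```

Calling `check_site("https://www.valuelights.co.uk")` and printing the result would show, for each path, the status per User-Agent and whether the pattern matches "fine for people, 403 for crawlers". A connection failure (as opposed to an HTTP error) is not caught here and will raise `urllib.error.URLError`. If the two User-Agents get different answers, the rules in .htaccess (or a firewall in front of the server) are the first place to look.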