Different Errors Running 2 Crawls on Effectively the Same Setup
-
Our developers are moving away from utilising robots.txt files due to security risks, so we have been in the process of removing them from sites. However, we and our clients still want to run Moz crawl reports, as they can highlight useful information.
The two sites in question sit on the same server with the same settings (in fact, they run on the same Magento install). We do not have robots.txt files present (they 404), and as per Chiaryn's response here https://moz.com/community/q/without-robots-txt-no-crawling this should work fine?
However for www.iconiclights.co.uk we got: 902 : Network errors prevented crawler from contacting server for page.
While for www.valuelights.co.uk we got: 612 : Page banned by error response for robots.txt.
These crawls were both run recently, and there was no robots.txt present. Not to mention, they are on the same setup/server, as mentioned. We have also just tested this by uploading a blank robots.txt file to see if it changed anything, but we get exactly the same errors.
I have had a look, but can't find anything that really matches this on here - help would really be appreciated!
Thanks!
-
Hey there! Tawny from the Customer Support team here!
This sounds like a juicy issue, and one I'd love to dive in and help you with! Unfortunately, without being able to take a look at your campaigns and account directly, it's tough to provide specific support for these issues.
That said, if you write in to help@moz.com and give us the details of what you're seeing - basically exactly what's in this question - we should be able to help investigate for you.
-
Having no robots.txt, or a blank one, is perfectly fine (though honestly it's no more a security risk than your sitemap.xml). But your current issue is that both of your sites are returning 403 status codes to crawlers while people are still able to land on your pages. This has nothing to do with the robots.txt file being changed or removed; that is just an odd coincidence. It is most likely an issue in your .htaccess file.
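If you want to check this yourself, one rough way is to request the same URL with a browser-style User-Agent and a crawler-style one and compare the status codes. Below is a minimal Python sketch using only the standard library. The user-agent strings are placeholders I picked for illustration (the exact string Moz's crawler sends is an assumption here), and `status_for` and `compare_user_agents` are hypothetical helper names, not part of any Moz tool.

```python
import urllib.error
import urllib.request

# Placeholder user-agent strings for illustration only.
# Swap in the exact string your crawler sends if you know it.
USER_AGENTS = {
    "browser": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "crawler": "rogerbot",
}


def status_for(url, user_agent, timeout=10):
    """Return the HTTP status code the server sends for this user agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx responses; the status code is still available.
        return error.code


def compare_user_agents(url, user_agents=USER_AGENTS):
    """Map each label to the status code returned for its user agent."""
    return {label: status_for(url, agent) for label, agent in user_agents.items()}
```

Calling `compare_user_agents("https://www.valuelights.co.uk/robots.txt")` and then the same for the home page would show whether the two user agents get different answers. If the browser entry comes back 200 (or 404 for the missing robots.txt) while the crawler entry comes back 403, that points at a server-side rule keyed on the user agent rather than at robots.txt itself. A network-level failure will raise `urllib.error.URLError` instead of returning a code, which is worth noting on its own.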