Odd crawl test issues
-
Hi all, first post, be gentle...
Just signed up for moz with the hope that it, and the learning will help me improve my web traffic. Have managed to get a bit of woe already with one of the sites we have added to the tool. I cannot get the crawl test to do any actual crawling. Ive tried to add the domain three times now but the initial of a few pages (the auto one when you add a domain to pro) will not work for me.
Instead of getting a list of problems with the site, i have a list of 18 pages where it says 'Error Code 902: Network Errors Prevented Crawler from Contacting Server'. Being a little puzzled by this, i checked the site myself...no problems. I asked several people in different locations (and countries) to have a go, and no problems for them either. I ran the same site through Raven Tool site auditor and got some results. it crawled a few thousand pages. I ran the site through screaming frog as google bot user agent, and again no issues. I just tried the fetch as Gbot in WMT and all was fine there.
I'm very puzzled then as to why moz is having issues with the site but everyone is happy with it. I know the homepage takes 7 seconds to load - caching is off at the moment while we tweak the design - but all the other pages (according to SF) take average of 0.72 seconds to load.
The site is a magento one so we have a lengthy robots.txt but that is not causing problems for any of the other services. The robots txt is below.
Google Image Crawler Setup
User-agent: Googlebot-Image
Disallow:Crawlers Setup
User-agent: *
Directories
Disallow: /ajax/
Disallow: /404/
Disallow: /app/
Disallow: /cgi-bin/
Disallow: /downloader/
Disallow: /errors/
Disallow: /includes/
#Disallow: /js/
#Disallow: /lib/
Disallow: /magento/
#Disallow: /media/
Disallow: /pkginfo/
Disallow: /report/
Disallow: /scripts/
Disallow: /shell/
Disallow: /skin/
Disallow: /stats/
Disallow: /var/
Disallow: /catalog/product
Disallow: /index.php/
Disallow: /catalog/product_compare/
Disallow: /catalog/category/view/
Disallow: /catalog/product/view/
Disallow: /catalogsearch/
#Disallow: /checkout/
Disallow: /control/
Disallow: /contacts/
Disallow: /customer/
Disallow: /customize/
Disallow: /newsletter/
Disallow: /poll/
Disallow: /review/
Disallow: /sendfriend/
Disallow: /tag/
Disallow: /wishlist/
Disallow: /catalog/product/gallery/Files
Disallow: /cron.php
Disallow: /cron.sh
Disallow: /error_log
Disallow: /install.php
Disallow: /LICENSE.html
Disallow: /LICENSE.txt
Disallow: /LICENSE_AFL.txt
Disallow: /STATUS.txtPaths (no clean URLs)
#Disallow: /.js$
#Disallow: /.css$
Disallow: /.php$
Disallow: /?SID=Pagnation
Disallow: /?dir=
Disallow: /&dir=
Disallow: /?mode=
Disallow: /&mode=
Disallow: /?order=
Disallow: /&order=
Disallow: /?p=
Disallow: /&p=If anyone has any suggestions then please i would welcome them, be it with the tool or my robots. As a side note, im aware that we are blocking the individual product pages. Too many products on the site at the moment (250k plus) which manufacturer default descriptions so we have blocked them and are working on getting the category pages and guides listed. In time we will rewrite the most popular products and unblock them as we go
Many thanks
Carl
-
Thanks for the hints re the robots, will tidy that up.
-
Network errors can be somewhere between us and your site and not necessarily directly with your server itself. The best bet would be to check with your ISP for any connectivity issues to your server. Since your issues are only the first time they are reported, the next crawl may be more successful.
One thing though you will want to keep your user-agent directives in a single block of code without spaces.
so
Crawlers Setup
User-agent: *
Directories
Disallow: /ajax/
Disallow: /404/
Disallow: /app/would need to look like:
Crawlers Setup
User-agent: *
Directories
Disallow: /ajax/
Disallow: /404/
Disallow: /app/ -
Many thanks for the reply. The server we use is a dedicated server which we set up ourselves inc OS and control panel. Just seems very odd that every other tool is working fine etc but moz won't. I cannot see how it would need anything special from, say, Raven's site crawler.
I will check out those other threads though to see if i missed anything, thanks for the links.
Just checked port 80 using http:// www.yougetsignal. com/tools/open-ports/ (not sure if links allowed) and no problems there.
-
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Moz Crawl only crawling the top level page (1 page)
For the past few mounts my weekly site crawl has been inconsistent. One week works fine, it crawls all of my 500 or so pages. The following week it only crawls 1 page (http://mydomain.com) and nothing else. A few weekly scan go by and the crawl is back up the the 500 or so pages.I went ahead and created several campaigns with duplicate settings and crawled the site. Most times but not all the new campaign's crawl works fine crawling all pages. But within a week or two the weekly crawl will fail again. (crawling 1 page). Currently i have four campaign's all with the same settings running weekly crawls. 2 campaign's crawled the 500 pages and two crawled only the single page. Any help will be greatly appreciated
Moz Bar | | dmaude0 -
My page cant be crawled by MOZ
Hi everyone, I have a page named zita.vn. Im using Moz to do SEO for the page, but when I use On-page grader in MOZ, I got an error that the page cant be accessed. Additionally, I got an issue with page crawling. Google search still crawl my page normally. Please help. Thanks a lot.
Moz Bar | | zita.vn0 -
Site Crawl report show strange duplicate pages
Beginning in early in Feb, we got a big bump in duplicate pages. The URLs of the pages are very odd: Example URL:
Moz Bar | | Neo4j
http://firstname.lastname@website.com/dir/page.php
is duplicate with http://website.com/dir/page.php I checked though the site, nginx conf files, and referral pages, and could not find what is prefixing the pages with 'http://firstname.lastname@'. Any ideas? The person whose name is 'Firstname Lastname' is stumped as well. Thanks.0 -
How does a non-traditional TLD impact Moz's crawl test?
I have a client who moved from a .com to .academy domain 6 months ago, and their current crawl tests are coming back with a universal page authority of 1, along with 0 indexed backlinks. The previous version of the site had an average page authority of 35-40, the site architecture and content are nearly identical, and there are no other errors or red flags in the crawl report that would hold back their organic rankings. In fact, looking at the site's analytics account, I can see dozens of sites that provide current and properly functioning backlinks, non of which are listed on the crawl test. So the question is - is Moz currently unable to properly crawl a .academy (or any other non-traditional TLD) site, or is there some deeper issue with the site's SEO that I'm not seeing? Thanks!
Moz Bar | | ThinkAOR1 -
Rogerbot will not crawl my site! Site URL is https but keep getting and error that homepage (http) can not be accessed. I set up a second campaign to alter the target url to the newer https version but still getting the same error! What can I do?
Site URL is https but keep getting and error that homepage (http://www.flogas.co.uk/) can not be accessed. I set up a second campaign to alter the target url to the newer https://www.flogas.co.uk/ version but still getting the same error! What can I do? I want to use Moz for everything rather than continuing to use a separate auditing tool!
Moz Bar | | digitalascend0 -
Is it possible to extend my crawling date in SEO Moz?
My web site was crawled by MOZ before week, next crawling date is tomorrow. Because of some reason I am not able to take any action on last week MOZ report.I want to extend MOZ next crawling date, Can I ?
Moz Bar | | ankit.rahevar0 -
Moz not crawling opencart product pages
hi, i have waited for over 2 weeks now and the crawler only got 8 pages, and is not getting all the open cart pages and products. any idea of what can be wrong? im using joomla 2.5.11 and mijoshop 2.0.5 (which uses opencart 1.5.5.1). thanks
Moz Bar | | marlvass10 -
Where is the Crawl Tool we had before Research?
Hello, I can't seem to find the crawl tool for other domains that aren't in our campaigns. We've lost it 😞 Thanks, Romeo.
Moz Bar | | RomeoMadrid1