Why can't RogerBot crawl the site https://unplag.com?
-
Hello
Please help me to solve the problem.
The On-Page Grader and Crawl Test are not working for the Unplag.com website. Both say that they can't access the URL. Yes, I've tried different variants such as unplag.com and http://unplag.com.
One more thing: RogerBot was disallowed in the robots.txt file. I deleted that rule from the file a week ago, so maybe the Moz index hasn't been refreshed yet.
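For anyone wanting to confirm that a robots.txt body no longer blocks rogerbot, here is a minimal sketch using Python's standard `urllib.robotparser`. The two rule sets are hypothetical stand-ins for the file before and after the change (the actual contents were not posted), and `is_allowed` is just an illustrative helper name.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if `user_agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical "before" file: rogerbot is disallowed everywhere.
OLD_RULES = """User-agent: rogerbot
Disallow: /
"""

# Hypothetical "after" file: the rogerbot rule is gone, everything is allowed.
NEW_RULES = """User-agent: *
Disallow:
"""

print(is_allowed(OLD_RULES, "rogerbot", "https://unplag.com/"))
print(is_allowed(NEW_RULES, "rogerbot", "https://unplag.com/"))
```

If the live file passes a check like this and the crawl still fails, the block is probably somewhere other than robots.txt.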
-
Thank you. I'll try to solve the problem
-
The trouble is not with your robots.txt. In the server config you block rogerbot completely and serve a 400 for every request it makes.
If you have a user agent switcher plugin in your browser and change the user agent to rogerbot's (rogerbot/1.1 (http://moz.com/help/guides/search-overview/crawl-diagnostics#more-help, rogerbot-crawler+pr2-crawler-101@moz.com)), the server returns a 400 Bad Request.
Dirk
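The same check can be done without a browser plugin. This is a rough sketch using only Python's standard library; `build_request` and `fetch_status` are illustrative helper names, and the user agent string is the one quoted above.

```python
import urllib.error
import urllib.request

# User agent string as quoted in this thread.
ROGERBOT_UA = (
    "rogerbot/1.1 (http://moz.com/help/guides/search-overview/"
    "crawl-diagnostics#more-help, "
    "rogerbot-crawler+pr2-crawler-101@moz.com)"
)


def build_request(url: str, user_agent: str = ROGERBOT_UA) -> urllib.request.Request:
    """Build a GET request that identifies itself with the given user agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_status(url: str, user_agent: str = ROGERBOT_UA, timeout: float = 10.0) -> int:
    """Return the HTTP status code the server sends for this user agent."""
    try:
        with urllib.request.urlopen(build_request(url, user_agent), timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx; the status code is still what we want.
        return err.code


# Example (needs network access):
#   fetch_status("https://unplag.com/")                 # as rogerbot
#   fetch_status("https://unplag.com/", "Mozilla/5.0")  # as a generic browser
```

Comparing the status for the rogerbot user agent with the status for a generic browser string shows whether the server treats the two differently.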
-
The logs are like this:
"GET / HTTP/1.0" 400 166 "-" "rogerbot/1.1 (http://moz.com/help/guides/search-overview/crawl-diagnostics#more-help, rogerbot-crawler+pr2-crawler-101@moz.com)" "-" - "https"
And of course rogerbot sometimes tries to fetch the robots file:
"GET /robots.txt HTTP/1.1" 400 166 "-" "rogerbot/1.1 (http://moz.com/help/guides/search-overview/crawl-diagnostics#more-help, rogerbot-crawler+pr2-crawler-101@moz.com)" "-" - "https"
To me it looks as if rogerbot is disallowed in robots.txt, but the file looks like this: https://unplag.com/robots.txt
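To pull every rogerbot request and its status out of a larger log, something like the following sketch could help. It assumes the fields appear in the order shown in the two excerpts above (request line, status, size, referer, user agent); the regex and the `rogerbot_statuses` helper are illustrative and may need adjusting to the real log format.

```python
import re

# Assumed field order, based on the excerpts above:
# "METHOD PATH PROTOCOL" STATUS SIZE "REFERER" "USER AGENT" ...
LOG_PATTERN = re.compile(
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def rogerbot_statuses(lines):
    """Return (path, status) for every log line whose user agent mentions rogerbot."""
    results = []
    for line in lines:
        match = LOG_PATTERN.search(line)
        if match and "rogerbot" in match.group("user_agent").lower():
            results.append((match.group("path"), int(match.group("status"))))
    return results


SAMPLE_LINES = [
    '"GET / HTTP/1.0" 400 166 "-" "rogerbot/1.1 (http://moz.com/help/guides/search-overview/crawl-diagnostics#more-help, rogerbot-crawler+pr2-crawler-101@moz.com)" "-" - "https"',
    '"GET /robots.txt HTTP/1.1" 400 166 "-" "rogerbot/1.1 (http://moz.com/help/guides/search-overview/crawl-diagnostics#more-help, rogerbot-crawler+pr2-crawler-101@moz.com)" "-" - "https"',
]

for path, status in rogerbot_statuses(SAMPLE_LINES):
    print(path, status)
```

On the two lines above, both requests come back as 400, including the one for /robots.txt, so the crawler never gets to read the robots file at all.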
-
thanks a lot!
-
Follow the advice from Jordan below and check your log files to see what the server responds with when Rogerbot tries to visit the site.
I noticed some DNS issues with your site (check http://dnscheck.pingdom.com/?domain=unplag.com): the nameservers don't seem to be OK. I also noticed that you have a 302 redirect from http to https, where it should be a 301. This is probably not related to your main issue, but it's worth checking.
-
Thanks.
The last crawl was after the robots.txt change.
And I don't see any errors in the dashboard.
-
After creating a fresh test campaign for the site, I'm still seeing a 400 response being served to rogerbot from https://unplag.com/. While I'm not able to pinpoint the exact setting that is causing the site to serve that response, I'd recommend checking your server logs to verify the response that is being served.
-
It's possible that your site hasn't been crawled yet (since you changed the robots.txt). You can see in your campaign dashboard (upper right corner) when the next crawl is scheduled.
Do you see any specific error codes on your dashboard?
Dirk