Why RogerBot can't crawl site https://unplag.com
-
Hello
Please help me to solve the problem.
The on-page grader and Crawl Test are not working for Unplag.com website. Both said that they can't access the url. Yes, I've tried different variants like unplag.com, http://unplag.com
One more thing - RogerBot was disallowed in robots.txt file. I deleted it from the file a week ago so maybe moz index haven't been renewed.
-
Thank you. I'll try to solve the problem
-
The trouble is not with your robot.txt - in the server config you block rogerbot completely and serve a 400 for each request it makes..
If you have a user agent switcher plugin in your browser & change the user agent to rogerbot (rogerbot/1.1 (http://moz.com/help/guides/search-overview/crawl-diagnostics#more-help, rogerbot-crawler+pr2-crawler-101@moz.com) - the server returns a 400 Bad Request.
Dirk
-
The logs are like this:
"GET / HTTP/1.0" 400 166 "-" "rogerbot/1.1 (http://moz.com/help/guides/search-overview/crawl-diagnostics#more-help, rogerbot-crawler+pr2-crawler-101@moz.com)" "-" - "https"
and of course sometimes rogerbot is trying to see the robots file:
"GET /robots.txt HTTP/1.1" 400 166 "-" "rogerbot/1.1 (http://moz.com/help/guides/search-overview/crawl-diagnostics#more-help, rogerbot-crawler+pr2-crawler-101@moz.com)" "-" - "https"
for me it looks like the rogerbot is disallowed in robots.txt but the file is like this https://unplag.com/robots.txt
-
thanks a lot!
-
Follow the advice from Jordan below and try to check your log files to see what the server response is when Rogerbot is trying to visit the site.
I noticed some DNS issues with your site - check http://dnscheck.pingdom.com/?domain=unplag.com - Nameservers don't seem to be ok. Also noticed that you have a 302 redirect from http -> https - while this should be 301. Probably not related to your main issue but worth checking.
-
Thanks.
The last crawl was after the robots.txt change.
And I don't see any errors in the dashboard.
-
After creating a fresh test campaign for the site, I'm still seeing a 400 response being served to rogerbot from https://unplag.com/. While I'm not able to pinpoint the exact setting that is causing the site to serve that response, I'd recommend checking your server logs to verify the response that is being served.
-
It's possible that your site hasn't been crawled yet (since you changed the robots.txt). You can see in your campaign dashboard (upper right corner) when the next crawl is scheduled.
Do you see any specific error codes on your dashboard?
Dirk
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Site Crawl 1-page 301 status error but httpstatus.io says its 403
I am trying to run a site crawl for my website and MOZ is only resulting in 1 page crawled with the home page URL Status Code of 301. However when I run it in httpstatus.io it is giving me a 403 status error. Im curious as to why MOZ is saying its a 301 and httpstatus.io is saying 403. Is there anything I can do in MOZ first to get the site crawled before asking my developers to look into the 403 error?
Moz Bar | | JohnConover0 -
Monthly Keyword Volume Differences Keyword Research V's Campaign Rankings
Why are there differences in the keyword volume data, is this a UK only issue? Am I missing something? Within Keyword Research;
Moz Bar | | GrouchyKids
suspended ceiling systems swindon (0-10) Within Campaign Rankings:
suspended ceiling systems swindon (no data) It's not just one instance either its across multiple keywords.0 -
Crawl report shows that it gets 4xx errors for pages that work fine. Why?
On the crawl report it has all these "Critical Crawler Issues". They all say "4xx Error", yet when i click on the link from the crawler report, it goes to a perfectly functioning page, not a 404 page or anything. If i click in it actually says it's a 403 error. It's all for pages generated by the IDX solution for our real estate website. Is Moz broken or am i missing something? Here are a couple examples: <dl class="crawl-page-details-list"> <dd class="crawl-page-details-list-emphasis">https://teamvivi.com/homes-for-sale-map-search/</dd> <dd class="crawl-page-details-list-emphasis"> <dl class="crawl-page-details-list"> <dd class="crawl-page-details-list-emphasis">https://teamvivi.com/email-alerts/</dd> </dl> </dd> </dl>
Moz Bar | | TeamViviRealEstate0 -
Error: 804 : HTTPS (SSL) error encountered when requesting page
In my crawl report I'm getting the error: 804 : HTTPS (SSL) error encountered when requesting page. How can I fix this? .
Moz Bar | | Yesi.Ortega0 -
Issues with Crawl Test and SSL Certificate
So I have having issues with the Crawl Test being able to crawl my site accurately due to what the tool is saying is a "SSL Certificate Error" (804 : HTTPS (SSL) error encountered when requesting page.) Only thing is that I have no warnings about this SSL issue in Search Console and when I check the SSL on https://www.sslshopper.com it comes back just fine. Anybody know why this might be happening or have encountered this issue before?
Moz Bar | | DRSearchEngOpt0 -
Canonical in Moz crawl report
I'm wondering if the moz bot is seeing my rel="canonical" on my pages. There are 2 notices that are bothering me: Overly Dynamic URL Rel Canonical Overly Dynamic URL - This notice is being generated by urls with query strings. On the main page I have the rel="canonical" tag in the header. So every page with the query string has the canonical tag that points to the page that should be indexed. So my question...Why the notice? Isn't this being handled properly with the canonical tag? I know I can use my robots.txt or the tool in Google search console but is it really necessary when I have the canonical on every page? Here is one of the links that has the "Overly Dynamic URL" notice, as you can see the the canonical in the header points to the page without the query string: https://www.vistex.com/services/training/traditional-classroom/registration-form/?values=true&course-title=DMP101 – Data Maintenance Pricing – Business Processes&date=March 14, 2016 Rel Canonical - Every page in my report has this notice "Using rel=canonical suggests to search engines which URL should be seen as canonical". I'm using the rel="canonical" tag on all of my pages by default. Is the report suggesting that I don't do this? Or is it suggesting that I should? Again...why the notice?
Moz Bar | | Brando160 -
Weird back link showed in moz crawl
Some time ago somebody from this site: http://dianibeach.com created a weird link to our site which had on the end db. Later we have realized that the link was coming from every footer on each page. I believe that the back links from footer does not have realy value and even the more of them the less value. We have asked the guy to remove that links as I thought it might harm our site more then help. Now I I was very surprised to find this link in moz crawl error as second top page on our site in current index??? Can somebody explain how is this possible?? The most ridiculous thing is that when I click on that link it realy opens our site! How is that possible, what is it? This is the link: http://villasdiani.com/?db Thank you very much for any help with this
Moz Bar | | Rebeca10 -
408 errors in crawl diagnostics
Best community, The Crawl Diagnostics Report of Moz gave our website a lot of 408 errors like below: <dl> <dt>Title</dt> <dd>408 : Error</dd> <dt>Meta Description</dt> <dd>408 Request Time-out</dd> <dt>Meta Robots</dt> <dd>Not present/empty</dd> <dt>Meta Refresh</dt> <dd>Not present/empty</dd> <dd>-----------------------------------------------------------------------</dd> <dd>The report has diagnosed a lot of these (around 320), even though we cannot reproduce the error (we cannot seem to find it ourself). </dd> <dd>2 questions relating to this: </dd> <dd>* Can you (the people of Moz) reproduce the errors manually? </dd> <dd>* Is it possible that it is a bug in the spider of Moz itself (too many spiders crawling at the same time)?</dd> </dl>
Moz Bar | | arjen.koedam0