Why won't rogerbot crawl my page?
-
How can I find out why rogerbot won't crawl an individual page I give it to crawl for page-grader? Google, Bing, and Yahoo all crawl pages just fine, but I put in one of the internal pages for page-grader to check for keywords and it gave me an F -- it isn't crawling the page, because the keyword IS in the title and it says it isn't. How do I diagnose the problem?
-
Very glad to see you got it working!
You can mark the question as answered to let others know it is fixed.
-
Thanks. The robots.txt file was the problem. It originally (yesterday) excluded rogerbot by default; then I remembered that and added an entry for rogerbot, but that didn't work. So I changed it to RogerBot and that didn't work either. Today I removed the robots.txt file completely and it worked. Then I put it back with rogerbot and it is working.
It APPEARS that it read the robots.txt yesterday before I put in rogerbot and for some reason didn't re-read it after I added the entry. Will never know, but it is now working.
Thanks for the help!
-
I know that in robots.txt any URLs are case sensitive. I am not sure about user agents (bots/crawlers), but you do have RogerBot spelled with a capital "B"; changing it to lower case (Rogerbot) may fix the issue.
Another thing to test would be to simply remove the mass exclusion just to see if Rogerbot somehow is being blocked by it. Let me know how it goes.
User-agent: *
Disallow: /
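One way to test a robots.txt offline before waiting for the next crawl is Python's standard-library parser. This is only a sketch: the robots.txt content below is a hypothetical example (a mass exclusion plus a group that re-allows rogerbot), the `http://` scheme on the page URL is assumed, and `is_allowed` is an illustrative helper name. It shows how Python's `urllib.robotparser` reads the rules, which may not match Rogerbot's own parser in every detail.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block every crawler, then re-allow rogerbot.
ROBOTS_TXT = """\
User-agent: *
Disallow: /

User-agent: rogerbot
Disallow:
"""

# Page from this thread; the http:// scheme is an assumption.
PAGE = "http://www.qjamba.com/local-coupons/wentzville/mo/all"


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Try both spellings of the crawler name plus an unrelated bot.
    for agent in ("rogerbot", "RogerBot", "SomeOtherBot"):
        print(agent, is_allowed(ROBOTS_TXT, agent, PAGE))
```

Python's parser compares user-agent names case-insensitively, so both spellings fall into the rogerbot group here, while any other bot falls back to the `User-agent: *` exclusion. If a crawler still reports the page as blocked after the file passes a check like this, a stale cached copy of robots.txt is one possible explanation.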
-
Hi sure, thanks. This page shouldn't have a speed issue but maybe you can see what the issue is:
www.qjamba.com/local-coupons/wentzville/mo/all
Thanks.
-
Hi Theodore,
Last time I looked at this issue for another community member, their site had huge images and slow scripts. This increased the page's load time and Roger just got frustrated. Rogerbot is not as sophisticated as the big search engines' crawlers and can easily be put off.
As Martijn asked, for us to help we really would have to look at the site to pick out possible issues.
-
Hi Theodore, could you share the specific URL with us so we could help you diagnose what the issue could be?
Related Questions
-
Unsolved What would the exact text be for robots.txt to stop Moz crawling a subdomain?
I need Moz to stop crawling a subdomain of my site, and am just checking what the exact text should be in the file to do this. I assume it would be:
User-agent: Moz
Disallow: /
But just checking so I can tell the agency who will apply it, to avoid paying for their time with the incorrect text! Many thanks.
Getting Started | | Simon-Plan0 -
Why does Moz only seem to be crawling a snap shot of the site I am working with?
I was wondering if anyone can help? I am using Moz to help improve the SEO on a website I am working with. The website contains thousands of pages, yet for some reason Moz only seems to be crawling a small snapshot of it. I know there are particular pages that I added a couple of weeks ago, about 300 in total, and none of these were showing on the first crawl, so I did another on-demand crawl and some of these showed up then. Despite this, it says it crawled 700ish pages, but there are close to 20-30ish thousand live pages on the site. Any thoughts and guidance as to why the crawling may be stopping?
Getting Started | | dsmith8020200 -
Moz not able to crawl our site - any advice?
When I try and crawl our site through Moz it gives this message: Moz was unable to crawl your site on Aug 7, 2019. Our crawler was banned by a page on your site, either through your robots.txt, the X-Robots-Tag HTTP header, or the meta robots tag. Update these tags to allow your page and the rest of your site to be crawled. If this error is found on any page on your site, it prevents our crawler (and some search engines) from crawling the rest of your site. Typically errors like this should be investigated and fixed by the site webmaster. I have been through all the help and doesn't seem to be any issues. You can check the site and robots.txt here: https://myfamilyclub.co.uk/robots.txt. Anyone got any advice on where I could go to get this sorted?
Getting Started | | MyFamilClubLtd1 -
Moz can't crawl my site.
Moz cannot carry out the site crawl on my online shop. Not really sure what the issue is; it has no problem getting onto my site when you use www. before the address, but it needs to be able to access bluerinsevintage.co.uk. Stuck as to what to do; we are a Shopify store. Anyone else had this problem, or know what I need to change so they can crawl the site? This is the page they are getting when trying to get on bluerinsevintage.co.uk, but if they use www.bluerinsevintage.co.uk the site comes up. Adam
Getting Started | | bluerinsevintage0 -
Moz Not Crawling Angular SPA
I have a client that just launched a redesigned website using Angular as a single page app. Google appears to be able to crawl the site just fine, but Moz crawl is only finding one page. We have updated the htaccess to allow for Rogerbot and Dotbot, but still unable to crawl any pages other than the home page. Does anyone have experience with this or ideas of why it won't crawl all pages, and how to allow for Moz to crawl all pages? There is a sitemap with approx. 390 pages. Thanks!
Getting Started | | PIN_Celler1 -
Standard Syntax in robots.txt doesn't prevent Moz bot from crawling
A client is getting many false positive site crawl errors for things like duplicate titles and duplicate content on pages that include /tag/ in the URL. An example is https://needquest.com/place_tag/autism-spectrum-disorder/page/4/ To resolve this we have set up a disallow statement in the robots.txt file that says
Disallow: /page/
For some reason this appears not to work, as the site crawl errors continue to list pages like this. Does anyone understand why that would be and what we need to do to properly disallow crawling these pages?
Getting Started | | btreloar0 -
Daily crawl reports, are they wasting my time?
I am relatively new here and have 5 campaigns. I get new crawl-complete reports almost every day for all of them. Wow, great, except when I check the reports nothing has changed. Even if I have gone in and changed things or fixed errors, the same ones are still there, and it takes 4-7 days for that work to show up. Every time I get one of these reports I open it up, go through it, and don't see the changes I implemented in the previous days. I'll spend 20-30 minutes going over these and checking details. So the question is: are these reports wasting my time? Are they actually new reports, or am I just getting spammed with repeat notices every day?
Getting Started | | RandyFriesen0 -
Page appearing multiple times in Warnings report
In reviewing my Moz warnings report, one page is appearing multiple times because the title is longer than recommended. Is this a bug in Moz? The page is appearing with a number of different URLs, despite there being a rel="canonical" tag. The page's canonical URL is: http://betablog.org/wishing-and-hoping-and-praying/ And in the warnings report I'm seeing variations like this: http://betablog.org/wishing-and-hoping-and-praying/?replytocom=26539 which are clearly links from the comments section.
Getting Started | | AlexBernardin0