Does Rogerbot respect the robots.txt file for wildcards?
-
Hi All,
Our robots.txt file has wildcards in it, which Googlebot recognizes. Can anyone tell me whether or not Rogerbot recognizes wildcards in the robots.txt file?
We've done a Rogerbot site crawl since updating the robots.txt file, and the pages that are disallowed via the wildcards are still showing up.
BTW, Googlebot is not crawling these pages according to Webmaster Tools.
Thanks in advance,
Robert
-
Thanks! RogerBot is now working. Perhaps it had a cached copy of the old robots.txt file. All is well now.
Thank you!
-
Yes, Rogerbot follows the Robots Exclusion Protocol - http://www.seomoz.org/dp/rogerbot
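For reference, wildcard rules in robots.txt are generally interpreted the way Googlebot documents them: * matches any sequence of characters and a trailing $ anchors the end of the URL path. Below is a minimal Python sketch of that matching logic (my own illustration, not Moz's code - it ignores Allow rules and rule precedence) that can help sanity-check whether a given path should be blocked by your wildcard patterns; the example rules are hypothetical.

import re

def compile_rule(pattern):
    # Turn a robots.txt path pattern (with * and $) into a regex:
    # '*' becomes '.*'; a trailing '$' anchors the match to the end of the path.
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    parts = [re.escape(chunk) for chunk in pattern.split("*")]
    return re.compile("^" + ".*".join(parts) + ("$" if anchored else ""))

def is_disallowed(path, disallow_rules):
    # True if any Disallow pattern matches the URL path.
    return any(compile_rule(rule).match(path) for rule in disallow_rules)

# Hypothetical wildcard Disallow rules, for illustration only.
rules = ["/*?sessionid=", "/private/*", "/*.pdf$"]
print(is_disallowed("/private/report.html", rules))       # True
print(is_disallowed("/search?sessionid=abc123", rules))    # True
print(is_disallowed("/public/page.html", rules))           # False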
-
Roger should obey wildcards. It sounds like he's not, so could you tattle on him to the help team and they'll see why he's not following directions? http://www.seomoz.org/help Thanks!
Related Questions
-
Ajax4SEO and rogerbot crawling
Has anyone had any experience with seo4ajax.com and Moz? The idea is that it points a bot to an HTML version of an AJAX page (sounds good) without the need for ugly URLs. However, I don't know how this will work with rogerbot and whether Moz can crawl this. There's a section to add in specific user agents and I've added "rogerbot". Does anyone know if this will work or not? Otherwise, it's going to create some complications. I can't check at the moment as the site is in development and the dev version is currently noindexed. Thanks!
Moz Pro | LeahHutcheon
-
Allow only Rogerbot, not Googlebot or other undesired access
I'm in the middle of site development and wanted to start crawling my site with Rogerbot, while preventing Googlebot or similar from crawling it. My site is currently protected with a login (basic Joomla offline site, username and password required), so I thought a good solution would be to remove that limitation and use .htaccess to password-protect the site for all users except Rogerbot. Reading here and there, it seems that practice is not recommended, as it could lead to security holes - any other user could see the allowed agents and emulate them. OK, maybe you'd need to be a hacker/cracker - or an experienced developer - to get that info, but I was not able to find clear information on how to proceed in a secure way. The other option was to keep using Joomla's access limitation for everyone except Rogerbot; I'm still not sure how feasible that would be. Mostly, my question is: how do you work on your site before you want it indexed by Google or similar, whether or not you use a CMS? Is there another way to do this? I would love to have my site ready and crawled before launching it and avoid fixing issues afterwards... Thanks in advance.
Moz Pro | MilosMilcom
-
Will pages marked noindex with the robots meta tag still be crawled and flagged as duplicate page content by SEOmoz?
When we mark a page as noindex with a robots meta tag like <meta name="robots" content="noindex" />, will it still be crawled and marked as duplicate page content? (It is already duplicate content within the site, i.e. two links pointing to the same page.) We marked both links as noindex so they would not be indexed by search engines, but after doing this the crawl reports show no change - they still count the noindex-marked pages as duplicates. Please help us solve this problem.
Moz Pro | trixmediainc
-
It's been over a month and Rogerbot hasn't crawled the entire website yet. Any ideas?
Rogerbot stopped crawling the website at 308 pages last week and has not crawled the full site, which has over 1,000 pages. Any ideas on what I can do to get this fixed and crawling again?
Moz Pro | TejaswiNaidu
-
Does the SEOmoz weekly crawl that highlights missing meta description tags take into account whether the pages it flags have a meta robots 'noindex,follow' tag?
The weekly crawl website report is telling me that there are pages with missing meta description tags, yet I've implemented meta robots tags to 'noindex, follow' those pages, which is visible in their page source. As far as Google is concerned, surely this won't be a problem, since it is being instructed NOT to consider these specific pages for indexing. I am assuming that the weekly SEOmoz website crawl is simply throwing the missing meta description findings into its report without observing that the particular URLs contain the meta robots 'noindex,follow' tag? I'd appreciate it if you could clarify whether this is the case. It would help me understand that (at least in terms of my efforts towards Google) your own crawl doesn't observe the meta robots tag instruction, hence the report flagging the discrepancy.
Moz Pro | callassist
-
SEOMoz's Crawl Diagnostics showing an error where the Title is missing on our Sitemap.xml file?
Hi Everyone, I'm working on our website Sky Candle and I've been running it as a campaign in SEOmoz. I've corrected a few errors we had with the site previously, but today it recrawled and found a new error: a missing title tag on the sitemap.xml file. Is this a little glitch in the SEOmoz system, or do I need to add a page title and meta description to my XML file? http://www.skycandle.co.uk/sitemap.xml Any help would be greatly appreciated - I didn't think I'd need to add this. Kind regards, Lewis
Moz Pro | LewisSellers
-
What is the full User Agent of Rogerbot?
What's the exact string that Rogerbot sends out as its User-Agent within the HTTP request? Does it ever differ?
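One way to answer this empirically is to pull the distinct User-Agent strings containing "rogerbot" out of your own access logs. A rough Python sketch follows; the log path and combined log format are assumptions about your server setup.

import re
from collections import Counter

# Match any quoted field containing "rogerbot" (case-insensitive) in a
# combined-format access log line; the User-Agent is one of those quoted fields.
ua_pattern = re.compile(r'"([^"]*rogerbot[^"]*)"', re.IGNORECASE)

def rogerbot_user_agents(log_path):
    # Count each distinct Rogerbot User-Agent string seen in the log.
    counts = Counter()
    with open(log_path) as log:
        for line in log:
            match = ua_pattern.search(line)
            if match:
                counts[match.group(1)] += 1
    return counts

# Hypothetical log location - adjust for your server.
for ua, hits in rogerbot_user_agents("/var/log/apache2/access.log").most_common():
    print(hits, ua)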
Moz Pro | rightmove
-
Is there a whitelist of the RogerBot IP Addresses?
I'm all for letting Roger crawl my site, but it's not uncommon for malicious spiders to spoof the User-Agent string. Having a whitelist of Roger's IP addresses would be immensely useful!
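In the absence of a published IP list, a common alternative is the reverse-DNS check used to verify Googlebot: reverse-resolve the requesting IP, check the hostname's domain, then forward-resolve it and confirm it maps back to the same IP. Whether Rogerbot's IPs resolve to a verifiable Moz hostname is an assumption to confirm against Moz's documentation; the suffix and IP below are placeholders. A minimal Python sketch:

import socket

def verify_crawler(ip, allowed_suffixes):
    # Reverse-resolve the IP to a hostname, check the hostname's domain,
    # then forward-resolve that hostname and confirm it maps back to the IP.
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)
    except socket.herror:
        return False
    if not hostname.endswith(allowed_suffixes):
        return False
    try:
        forward_ips = socket.gethostbyname_ex(hostname)[2]
    except socket.gaierror:
        return False
    return ip in forward_ips

# 203.0.113.10 is a documentation IP and ".moz.com" is a hypothetical suffix -
# substitute real values once confirmed.
print(verify_crawler("203.0.113.10", (".moz.com",)))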
Moz Pro | EricCholis