Allow only Rogerbot, not googlebot nor undesired access

MilosMilcom

I'm in the middle of site development and wanted to start crawling my site with Rogerbot, but avoid googlebot or similar to crawl it.

Actually mi site is protected with login (basic Joomla offline site, user and password required) so I thought that a good solution would be to remove that limitation and use .htaccess to protect with password for all users, except Rogerbot.

Reading here and there, it seems that practice is not very recommended as it could lead to security holes - any other user could see allowed agents and emulate them. Ok, maybe it's necessary to be a hacker/cracker to get that info - or experienced developer - but was not able to get a clear information how to proceed in a secure way.

The other solution was to continue using Joomla's access limitation for all, again, except Rogerbot. Still not sure how possible would that be.

Mostly, my question is, how do you work on your site before wanting to be indexed from Google or similar, independently if you use or not some CMS? Is there some other way to perform it?
I would love to have my site ready and crawled before launching it and avoid fixing issues afterwards...

Thanks in advance.

MilosMilcom

Great, thanks.

With those 2 recommendations I have more than enough for the next crawler. Thank you both!

MilosMilcom

Hi, thanks for answering

Well, it looks doable. Will try t do it on next programmed crawler, trying to minimize exposed time.

Hw, your idea seems very compatible with my first approach, maybe I could also allow rogerbot through htaccess, limiting others and only for that day remove the security user/password restriction (from joomla) and leave only the htaccess limitation. (I know maybe I'm a bit paranoid just want to be sure to minimize any collateral effect...)

*Maybe could be a good feature for Moz to be able to access restricted sites...

Carla_Dawson

Hi,

I ran into a similar issue while we were redesigning our site. This is what we did. We unblocked our site (we also had a user and password to avoid Google indexing it). We added the link to a Moz campaign. We were very careful not to share the URL (developing site) or put it anywhere where Google might find it quickly. Remember Google finds links from following other links. We did not submit the developing site to Google webmaster tools or Google analytics. We watched and waited for the Moz report to come in. When it did, we blocked the site again.

Hope this helps

Carla

Welcome to the Q&A Forum

Browse the forum for helpful insights and fresh discussions about all things SEO.

Allow only Rogerbot, not googlebot nor undesired access

Got a burning SEO question?

Browse Questions

Explore more categories

Related Questions

If i subscribed to the PRO version, do I have access to followerwonk?

Rogerbot does not catch all existing 4XX Errors

In alt tag of a image can we use #hashtag or domain.com ? Is that good SEO or not allowed ?

HTC access 301 redirect rules regarding pagination and striped category base (wp)

Its been over a month, rogerbot hasn't crawled the entire website yet. Any ideas?

Is it possible to allow a client ato access a campaign and see the information.

4xx status code a page that cannot be accessed..

Why can't we access metrics in the API that are available in the mozbar?