Allow only Rogerbot, not Googlebot or other undesired access
-
I'm in the middle of site development and wanted to start crawling my site with Rogerbot, while preventing Googlebot and similar crawlers from crawling it.
My site is currently protected by a login (a basic Joomla offline site, requiring a username and password), so I thought a good solution would be to remove that limitation and use .htaccess to password-protect the site for all users except Rogerbot.
From what I've read, that practice is not generally recommended, since it could open security holes: any visitor could discover which user agents are allowed and spoof them. Maybe you'd need to be a hacker/cracker, or an experienced developer, to get that information, but I wasn't able to find clear guidance on how to do this securely.
The other option was to keep using Joomla's access restriction for everyone except Rogerbot, but I'm still not sure whether that's possible.
My main question is: how do you work on your site before you want it indexed by Google or similar, regardless of whether you use a CMS? Is there some other way to do it?
I would love to have my site crawled and ready before launching it, so I can avoid fixing issues afterwards... Thanks in advance.
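For reference, a minimal sketch of the .htaccess approach described above, assuming Apache 2.4 syntax and that Moz's crawler identifies itself with "rogerbot" in its user-agent string (the `AuthUserFile` path is a placeholder). As noted, the user-agent header can be spoofed by anyone, so this is obscurity rather than real security:

```apache
# Flag requests whose user-agent contains "rogerbot" (case-insensitive).
# NOTE: user-agent strings are trivially spoofable.
SetEnvIfNoCase User-Agent "rogerbot" is_rogerbot

AuthType Basic
AuthName "Staging site"
# Placeholder -- point this at your actual .htpasswd file
AuthUserFile /home/example/.htpasswd

# Let a request through if it EITHER authenticates OR carries the flag.
<RequireAny>
    Require valid-user
    Require env is_rogerbot
</RequireAny>
```

On Apache 2.2 you would use the older `Order`/`Allow from env=` directives together with `Satisfy Any` instead of `<RequireAny>`.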
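As a lighter-weight complement to the question above, a robots.txt that invites only Moz's crawler might look like the sketch below. Keep in mind robots.txt is purely advisory: well-behaved bots honor it, but it does not block access, and Google can still index a URL it discovers elsewhere (just without crawling its content), so it is not a substitute for password protection:

```
# Allow Moz's crawler everywhere (empty Disallow = no restriction)
User-agent: rogerbot
Disallow:

# Ask all other crawlers to stay out
User-agent: *
Disallow: /
```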
-
Great, thanks.
With those two recommendations I have more than enough for the next crawl. Thank you both!
-
Hi, thanks for answering.
Well, it looks doable. I'll try it on the next scheduled crawl and keep the exposure time to a minimum.
However, your idea seems very compatible with my first approach: maybe I could also allow Rogerbot through .htaccess while limiting everyone else, and only on crawl day remove the username/password restriction (from Joomla), leaving just the .htaccess limitation. (I know I may be a bit paranoid; I just want to minimize any side effects...)
*Maybe the ability to crawl restricted sites would be a good feature for Moz...
-
Hi,
I ran into a similar issue while we were redesigning our site. Here's what we did: we unblocked our site (we also had a username and password to keep Google from indexing it) and added the URL to a Moz campaign. We were very careful not to share the development URL or put it anywhere Google might find it quickly; remember, Google discovers pages by following links from other pages. We did not submit the development site to Google Webmaster Tools or Google Analytics. We watched and waited for the Moz report to come in, and when it did, we blocked the site again.
Hope this helps
Carla