Crawling password protected sites such as dev or staging areas to look at sites b4 going live ?
-
Hi
Ive instructed clients to password protect dev areas so dont get crawled and indexed but how do we set up Moz crawl software so we can crawl theses sites for final check of any issues before going live ?
Is there an option i havnt seen to add logins/passwords for crawl software to access ?
cheers
dan
-
ok thanks Chiaryn
is that the actual name of the moz crawler (to allow in Robots) simply rogerbot ? or any other characters etc ?
Also is it not the case that even when blocked by robots.txt G can still crawl/index it once password removed, think i read few comments somewhere on Moz that can still happen somehow ?
Please advise asap ?
Many Thanks
Dan
-
Hey Dan,
Unfortunately, our crawler is not able to access password protected content on your site. If you create a staging subdomain that is not password protected, you could use the robots.txt file to allow rogerbot and block other crawlers, but I'm afraid our crawler will not crawl anything that a normal search engine crawl would not be able to crawl so we cannot crawl password protected pages.
I hope this helps.
Chiaryn
-
i dont suppose either of you are able to help at all with this related question:
http://moz.com/community/q/site-crawl-errors-download-list-of-all-urls
-
i dont suppose either of you are able to help at all with this related question:
http://moz.com/community/q/site-crawl-errors-download-list-of-all-urls
-
Hi Andy
Screaming Frog does have password access feature for your info i have just tried it
All Best
Dan
-
Thanks Matt
I have got screaming frog and can confirm that it has password access feature, but i really want Moz to be able to access too, i would have thought they should have this option somewhere. Are you saying Moz crawls have more info than SF (re 'moz level' analysis) ?
Dev site better password prtected than robots arnt they i think ?
Cheers
Dan
-
Hi Dan
I was about to ask the exact same question, so will keep an eye out for an answer.
I hope it is possible, but I couldn't work it out.
-
I don't know if there's a way to do this in Moz but you could always get Screaming Frog & tell it to ignore robots.txt - that will definitely crawl it. You can check titles, descriptions, canonicals, H1s, etc. that way. It doesn't give the Moz level analysis but it's a start that def works. You can also see if you have parameter issues that way.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
If we put the disavow links in google, does MOZ crawl the same links?
I have put bad or spam links in disavow file, but still showing in MOZ backlinks. So, I want to know that Why is MOZ not removing the spam links from their system?
Moz Bar | | insidewebanalytics0 -
Has anyone had to deal with Moz crawl issues on their Zendesk support site?
If so - how did you end up resolving them? For instance we have 85 "temporary redirect" errors from our Zendesk support site in our crawl error report and we don't have access to the robots.txt file through Zendesk.
Moz Bar | | zspace0 -
Canonical in Moz crawl report
I'm wondering if the moz bot is seeing my rel="canonical" on my pages. There are 2 notices that are bothering me: Overly Dynamic URL Rel Canonical Overly Dynamic URL - This notice is being generated by urls with query strings. On the main page I have the rel="canonical" tag in the header. So every page with the query string has the canonical tag that points to the page that should be indexed. So my question...Why the notice? Isn't this being handled properly with the canonical tag? I know I can use my robots.txt or the tool in Google search console but is it really necessary when I have the canonical on every page? Here is one of the links that has the "Overly Dynamic URL" notice, as you can see the the canonical in the header points to the page without the query string: https://www.vistex.com/services/training/traditional-classroom/registration-form/?values=true&course-title=DMP101 – Data Maintenance Pricing – Business Processes&date=March 14, 2016 Rel Canonical - Every page in my report has this notice "Using rel=canonical suggests to search engines which URL should be seen as canonical". I'm using the rel="canonical" tag on all of my pages by default. Is the report suggesting that I don't do this? Or is it suggesting that I should? Again...why the notice?
Moz Bar | | Brando160 -
4 days waiting for a Moz Crawl - How quick are yours?
Hi there Please could anyone say how long they have been waiting for crawl results. I requested a crawl on a 20 page website and I have been waiting 4 days since last weekend. I checked Moz Health and there have been no related issues there: http://health.moz.com/ Your response would be welcome. Thanks
Moz Bar | | SEOguy10 -
Weird 404 in Crawl Diagnostics
I'am getting a lot of 404 errors (196 to be precise ) - but their pattern is weird.
Moz Bar | | oorbo
The page that the crawler is trying to find is (e.g):
http://www.oorbo.com/item/asufa-israeli-design-shop**/www.oorbo.com.
the linking page is** http://www.oorbo.com/item/asufa-israeli-design-shop meaning it adds to the end of the link the root URL - /www.oorbo.com. This happens in all 196 cases - trying to find a page http://www.oorbo.com/some-page/www.oorbo.com from a refferer page http://www.oorbo.com/some-page. Obviously this pages do not exist, and it's getting a 404. I've look into the pages themselves and digged into their code - It doesn't seem that the bad link is any where on the page. Did anyone came across this kind of issue? any one can point me to a solution ?0 -
Moz Tool Bar Annoyance - How do I make the green keyword difficulty box go away?
A highly annoying light green keyword difficulty box just started showing up to the right of the search box. How do I make this go away? ulBzP9W
Moz Bar | | JenKeller0 -
Crawl Diagnostics: Exlude known errors and others that have been detected by mistake? New moz analytics feature?
I'm curious if the new moz analytics will have the feature (filter) to exclude known errors from the crwal diagnostics. For example, the attached screenshot shows the URL as 404 Error, but it works fine: http://en.steag.com.br/references/owners-engineering-services-gas-treatment-ogx.php To maintain a better overview which errors can't be solved (so I just would like to mark them as "don't take this URL into account...") I will not try to fix them again next time. On the other hand I have hundreds of errors generated by forums or by the cms that I can not resolve on my own. Also these kind of crawl errors I would like to filter away and categorize like "errors to see later with a specialist". Will this come with the new moz analytics? Anyway is there a list that shows which new features will still be implemented? knPGBZA.png?1
Moz Bar | | inlinear0 -
Where is the Crawl Tool we had before Research?
Hello, I can't seem to find the crawl tool for other domains that aren't in our campaigns. We've lost it 😞 Thanks, Romeo.
Moz Bar | | RomeoMadrid1