Why did Moz crawl our development site?
-
In our Moz Pro account we have one campaign set up to track our main domain. This week Moz threw up around 400 new crawl errors, 99% of which were meta noindex issues.
What happened was that somehow Moz found the development/staging site and decided to crawl that. I have no idea how it was able to do this - the robots.txt is set to disallow all and there is password protection on the site. It looks like Moz ignored the robots.txt, but I still don't have any idea how it was able to do a crawl - it should have received a 401 Unauthorized and not gone any further.
How do I a) clean this up without going through and manually ignoring each issue, and b) stop this from happening again?
Thanks!
-
@multitimemachine a noindex tag only really applies to Bing, Google, and other search engine crawlers. You said you blocked all robots via a wildcard; are you sure you don't have, e.g., meta robots tags that say something different?
help@moz.com might be your best bet for a quick resolution for 'cleaning' the report. I'm still slightly lost as to how your main domain and dev/staging were confused, though, as in my experience there is normally a subdomain in the way. It's even stranger because bots can't bypass passwords, unless it's your sitemap.xml? Sorry I can't give you a direct answer, but without seeing the site it's hard to diagnose. I'm sure the team at Moz can point you in the right direction.
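If it helps to rule out the robots.txt side first, here is a minimal sketch in Python that checks what a given set of rules actually says for a named crawler. The robots.txt body and the `staging.example.com` URL are made-up placeholders, and `rogerbot` is used on the assumption that it is the user agent of the Moz site crawler. This only evaluates the rules themselves; it says nothing about whether a crawler honoured them or how it got past the password prompt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body for a staging site: block every crawler.
STAGING_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Placeholder URL; swap in a real page from the staging site.
    url = "https://staging.example.com/some-page"
    # "rogerbot" is assumed to be the Moz crawler's user agent.
    for agent in ("rogerbot", "Googlebot"):
        print(agent, "allowed:", is_allowed(STAGING_ROBOTS_TXT, agent, url))
```

Pasting the live robots.txt from the staging site into `STAGING_ROBOTS_TXT` shows whether the wildcard rule really covers every crawler, or whether a more specific `User-agent` group overrides it.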
Related Questions
-
Unsolved How to contact moz.com
Hello,
I wanted to contact moz.com. I started my 30-day trial to test the service.
After 2-3 days I totally forgot about it. I remembered moz.com when I received an invoice saying they were going to charge me.
I immediately emailed them to say that I do not want this service. I forgot about the trial and only entered my credit card data because I was required to, and I would like a refund since this was 2-3 hours after the bill. Unfortunately, sending messages via their contact form is not an option: there is no confirmation email that they received my message, nor any reply from them.
Moz Pro | | GrzegorzZ0 -
Moz Pro subscription
We have Pro subscription in the name of Convonix and we are not able to perform more than 10 tests on https://analytics.moz.com/pro/link-explorer/ Also, we are not able to see all 10 rows for "Top Followed Links" Request you to help us in this case.
Moz Pro | | Convonix0 -
What to do with a site of >50,000 pages vs. crawl limit?
What happens if you have a site in your Moz Pro campaign that has more than 50,000 pages? Would it be better to choose a sub-folder of the site to get a thorough look at that sub-folder? I have a few different large government websites that I'm tracking to see how they are faring in rankings and SEO. They are not my own websites. I want to see how these agencies are doing compared to what the public searches for on technical topics and social issues that the agencies manage. I'm an academic looking at science communication. I am in the process of re-setting up my campaigns to get better data than I have been getting -- I am a newbie to SEO and the campaigns I slapped together a few months ago need to be set up better, such as all on the same day, making sure I've set it to include www or not for what ranks, refining my keywords, etc. I am stumped on what to do about the agency websites being really huge, and what all the options are to get good data in light of the 50,000 page crawl limit. Here is an example of what I mean: To see how EPA is doing in searches related to air quality, ideally I'd track all of EPA's web presence. www.epa.gov has 560,000 pages -- if I put in www.epa.gov for a campaign, what happens with the site having so many more pages than the 50,000 crawl limit? What do I miss out on? Can I "trust" what I get? www.epa.gov/air has only 1450 pages, so if I choose this for what I track in a campaign, the crawl will cover that subfolder completely, and I am getting a complete picture of this air-focused sub-folder ... but (1) I'll miss out on air-related pages in other sub-folders of www.epa.gov, and (2) it seems like I have so much of the 50,000-page crawl limit that I'm not using and could be using. (However, maybe that's not quite true - I'd also be tracking other sites as competitors - e.g. non-profits that advocate in air quality, industry air quality sites - and maybe those competitors count towards the 50,000-page crawl limit and would get me up to the limit? How do the competitors you choose figure into the crawl limit?) Any opinions on which I should do in general on this kind of situation? The small sub-folder vs. the full humongous site vs. is there some other way to go here that I'm not thinking of?
Moz Pro | | scienceisrad0 -
Get into Google : New Sites
I have a brand new website. It was created 10 days ago. How long would it take for it to show up in search results? I understand that since the site is new, there are no sites sending it backlinks. Also, I have optimized the page for my keyword "xyz" and it received an A grade. The site does not appear even in the top 50 results. Please help me out. It is a one-page web application that needs to drive traffic to survive.
Moz Pro | | dl_s0 -
Is it possible to block Moz from crawling sites?
Hi, is it possible to stop Moz from crawling a site at the server level? Not that I am looking to do this or anything, but here's why I'm asking. I have been crawling a site that is managed (currently by 2 parties), and I noticed that this week pages crawled went from 80 (last week) to 1 page!! I know, what? See my image attached... and the issues all went to zero "0"....! So is it possible that someone can prevent Moz from crawling the site at the server level? I checked the robots.txt file on the site, but nothing there. I'm curious.
Moz Pro | | co.mc0 -
Moz is officially demigod-approved!
So, I was reading Percy Jackson & the Olympians: The Battle of the Labyrinth tonight, and found a description of a graffiti tag in the Labyrinth that says "MOZ RULZ". What do you think? Is one of the Mozzers a demigod who has explored and survived the Labyrinth? 🙂 (See attachment for book quote.)
Moz Pro | | AdamThompson4 -
How long does a crawl take?
A crawl of my site started on the 8th July & is still going on - is there something wrong???
Moz Pro | | Brian_Worger1 -
Any tools for scraping blogroll URLs from sites?
This question is entirely in the whitehat realm... Let's say you've encountered a great blog - with a strong blogroll of 40 sites. The 40-site blogroll is interesting to you for any number of reasons, from link building targets to simply subscribing in your feedreader. Right now, it's tedious to extract the URLs from the site. There are some "save all links" tools, but they are also messy. Are there any good tools that will a) allow you to grab the blogroll (only) of any site into a list of URLs (yeah, ok, it might not be perfect since some sites call it "sites I like" etc.) b) same, but export as OPML so you can subscribe. Thanks! Scott
Moz Pro | | scottclark0