Why did Moz crawl our development site?
-
In our Moz Pro account we have one campaign set up to track our main domain. This week Moz threw up around 400 new crawl errors, 99% of which were meta noindex issues.
What happened was that Moz somehow found the development/staging site and decided to crawl it. I have no idea how it was able to do this - the robots.txt is set to disallow all and there is password protection on the site. It looks like Moz ignored the robots.txt, but I still don't understand how it was able to crawl at all - it should have received a 401 Unauthorized and not gone any further.
How do I a) clean this up without going through and manually ignoring each issue, and b) stop this from happening again?
Thanks!
-
@multitimemachine a noindex tag only really applies to search engine crawlers (Bing, Google, etc.). You said you blocked all robots via a wildcard; are you sure you don't have, e.g., meta robots tags or other rules that say something different?
help@moz.com might be your best bet for a quick resolution for 'cleaning' the report. I'm still slightly lost as to how your main domain and dev/staging site were confused, as in my experience there is normally a subdomain in the way. It's even stranger because bots can't bypass passwords, unless the staging URLs are in your sitemap.xml? Sorry I can't give you a direct answer, but without seeing the site it's hard to diagnose. I'm sure the team at Moz can point you in the right direction.
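If it helps with checking the robots.txt side of this, here is a minimal sketch using Python's standard `urllib.robotparser`. It assumes Moz's crawler identifies itself as `rogerbot`; the host `staging.example.com` and both robots.txt bodies are made up for illustration and are not taken from the site in question. It shows that a wildcard disallow blocks the crawler, but that a more specific group elsewhere in the file can quietly override it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks every crawler (the setup described above).
DISALLOW_ALL = """\
User-agent: *
Disallow: /
"""

# Hypothetical robots.txt where a specific group overrides the wildcard.
# An empty "Disallow:" means "nothing is disallowed" for that user agent.
WITH_OVERRIDE = """\
User-agent: rogerbot
Disallow:

User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://staging.example.com/some-page"  # placeholder URL
    for name, body in (("disallow-all", DISALLOW_ALL), ("override", WITH_OVERRIDE)):
        for agent in ("rogerbot", "Googlebot"):
            print(f"{name:13} {agent:10} allowed={is_allowed(body, agent, url)}")
```

Keep in mind that robots.txt is read per host, so the staging hostname needs its own file, and this only tells you what the rules say, not whether a given crawler chose to respect them.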