Still Can't Crawl My Site
-
I've removed all but two blocks from our .htaccess. Both are for amazonaws.com, to block Amazon from crawling us.
I ran Fetch as Google in Webmaster Tools on our robots.txt and it succeeded.
The SEOmoz crawler still hits our site and gets a 403. I've looked in our blocked-request logs and Amazon is the only one in there.
What is going on here?
-
Hey Joel,
Happy Friday!
Sha
-
Hi Dana,
No problem. Glad you have sorted the problem now.
Have an awesome weekend
Sha
-
Hey Dana,
We've been corresponding in email, but I just wanted to update your thread here as well.
We don't use Amazon's bot, we use Amazon Web Service to host our crawler. If you are no longer blocking AWS you should be able to crawl OK moving forward.
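To illustrate why a block on amazonaws.com also catches a crawler that is merely hosted on AWS, here is a minimal Python sketch of suffix-based host blocking. It is not the actual .htaccess logic or anything from Moz's side; the function name `is_blocked` and both hostnames are hypothetical, chosen only to show the matching behavior.

```python
def is_blocked(hostname: str, blocked_domains: list[str]) -> bool:
    """Return True if hostname equals, or is a subdomain of, a blocked domain."""
    host = hostname.lower().rstrip(".")
    for domain in blocked_domains:
        domain = domain.lower().strip(".")
        # Match the domain itself or anything ending in ".<domain>"
        if host == domain or host.endswith("." + domain):
            return True
    return False


# The block list described in the question (assumed to match on hostname suffix)
blocked = ["amazonaws.com"]

# Hypothetical reverse-DNS names, for illustration only
aws_hosted_crawler = "ec2-203-0-113-10.compute-1.amazonaws.com"
other_crawler = "crawl-66-249-66-1.googlebot.com"

print(is_blocked(aws_hosted_crawler, blocked))  # any client running on AWS matches
print(is_blocked(other_crawler, blocked))       # hosts outside AWS do not
```

Under these assumptions, the rule cannot tell an Amazon-owned bot from any other service running on AWS, which would explain a 403 for one crawler while Fetch as Google succeeds.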
Thanks!
Joel
-
Wish someone would've pointed that out days ago.
Thank you soooooo much for your great answer.
I don't understand, though, how or why SEOmoz is using Amazon's bot...
What if I don't want Amazon accessing our site (I don't)? Does that mean we can't use SEOmoz, then?
-
We'll see how this goes. I've removed the blocks for amazonaws...
Thanks.
-
Hi Dana,
I believe SEOmoz uses Amazon Web Services (amazonaws.com) for crawling, or at least they did a few months ago, so that may well be your problem.
The best (and quickest) way to confirm this is to go to the SEOmoz Help Hub and click the button at the top of the page to contact the Help Team directly.
Hope that helps,
Sha
-
What's the web address?
Issa