Crawl Diagnostics 403 on home page...
-
In the crawl diagnostics it says oursite.com/ has a 403. doesn't say what's causing it but mentions no robots.txt. There is a robots.txt and I see no problems. How can I find out more information about this error?
-
Hi Dana,
Thanks for writing in. The robots.txt file would not cause a 403 error. That type of error is actually related to the way the server responds to our crawler. Basically, this means the server for the site is telling our crawler that we are not allowed to access the site. Here is a resource that explains the 403 http status code pretty thoroughly: http://pcsupport.about.com/od/findbyerrormessage/a/403error.htm
I looked at both of the campaigns on your account and I am not seeing a 403 error for either site, though I do see a couple of 404 page not found errors on one of the campaigns, which is a different issue.
If you are still seeing the 403 error message on one of your crawls, you would just need to have the webmaster update the server to allow rogerbot to access the site.
I hope this helps. Please let me know if you have any other questions.
-Chiaryn
-
Okay, so I couldn't find this thread and started a new one. Sorry...
... The problem persists.
RECAP
I have two blocks in my htaccess both are for amazonaws.com.
I have gone over our server block logs and see only amazon addresses and bot names.
I did a fetch as google with our WM Tools and fetch it did. Success!
Why isn't thiscrawler able to access? Many other bots are crawling right now.
Why can I use the seomoz on-page feature to crawl a single page but the automatic crawler wont access the site? Just took a break from typing this to try the on-page on our robots.txt, worked fine. Use the keyword "Disallow" and it gave me a C. =0)
... now if we could just crawl the rest of the site...
any help on this would be greatly appreciated.
-
I think I do. I just (a few minutes ago) went through a 403 problem being reported by another site trying access an html file for verification. Apparently they are connecting with an ip that's blocked by our htaccess. I removed the blocks told them to try again and it worked no problem. I see that SEOMoz has only crawled 1 page. Off to see if I can trigger a re-crawl now...
-
hmmm... not sure why this is happening. maybe add this line to the top of your robots.txt and see if it fixes by next week. it certainly won't hurt anything:
User-agent: * Allow: /
-
No problem. Looking at my Google WM Tools , crawl stats don't show any errors.
Thanks
User-Agent: *
Disallow: /*?zenid=
Disallow: /editors/
Disallow: /email/
Disallow: /googlecheckout/
Disallow: /includes/
Disallow: /js/
Disallow: /manuals/ -
OH this is only in SEOmoz's crawl diagnostics that you're seeing this error. That explains why robots.txt could be affecting it. I misread this earlier and thought you were finding the 403 on your own in-browser.
Can you paste the robots.txt file into here so we can see it? I would imagine that has everything to do with it now that I've correctly read your post --my apologies
-
apache
-
a 403 is a Forbidden code usually pertaining to Security and Permissions.
Are you running your server in an Apache or IIS environment? Robots.txt shouldn't affect a site's visibility to the public it only talks to site crawlers.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
"On-Page Report Card"- why is still showing " F grade" after introducing the keyword in page and title.
Hello, "On-Page Report Card"- why is still showing " F grade" after introducing the keyword in page and title. After changing the title and putting the keyword inside the title, in this section, "Exact Keyword Usage in Page Title", it shows the first title, without updating my changes. I have updated several times. In some cases worked, in this case doesn't. For example "online project management software" grades F, and "project management software" grades A, even if I've put the "online" word in title an so on. Now I have the same issue with "stock management software" which grades F. "stock management" grades A, even if i've put exactly "stock management software" thanks.
Moz Pro | | directspark0 -
403 error for a member site
Perhaps a stupid question but SEOmoz registers 403 errors for pages behind a membersite (ie. they are restricted on purpose). Should I noindex these pages or just let SEOmoz register these "errors"?
Moz Pro | | Crunchii0 -
Rogerbot did not crawl my site ! What might be the problem?
When I saw the new crawl for my site I wondered why there are no errors, no warning and 0 notices anymore. Then I saw that only 1 page was crawled. There are no Error Messages or webmasters Tools also did not report anything about crawling problems. What might be the problem? thanks for any tips!
Moz Pro | | inlinear
Holger rogerbot-did-not-crawl.PNG0 -
Help with URL parameters in the SEOmoz crawl diagnostics Error report
The crawl diagnostics error report is showing tons of duplicate page titles for my pages that have filtering parameters. These parameters are blocked inside Google and Bing webmaster tools. I do I block them within the SEOmoz crawl diagnostics report?
Moz Pro | | SunshineNYC0 -
Crawl Diagnostics Summary
Is there a way to view the charts in the crawl diagnostics summary on a monthly view (or export the monthly figures)?
Moz Pro | | RikkiD220 -
Canonical tags and SEOmoz crawls
Hi there. Recently, we've made some changes to http://www.gear-zone.co.uk/ to implement canonical tags to some dynamically generated pages to stop duplicate content issues. Previously, these were blocked with robots.txt. In Webmaster Tools, everything looks great - pages crawled has shot up, and overall traffic and sales has seen a positive increase. However the SEOmoz crawl report is now showing a huge increase in duplicate content issues. What I'd like to know is whether SEOmoz registers a canonical tag as preventing a piece of duplicate content, or just adds to it the notices report. That is, if I have 10 pages of duplicate content all with correct canonical tags, will I still see 10 errors in the crawl, but also 10 notices showing a canonical has been found? Or, should it be 0 duplicate content errors, but 10 notices of canonicals? I know it's a small point, but it could potentially have a big difference. Thanks!
Moz Pro | | neooptic0 -
What the . . ! Duplicate Pages and Titles WAY up?
My duplicate pages went up 50 plus in the past week, and my duplicate page titles went over more then 100. We recently redesigned the website, but it has been up for several weeks now. The only change I made specifically last week or late the week before was to get my 301 redirects done to get the www. version and the non www version pointing to the same place (as well as a couple other sites that point to it). I'm sure this is not enough info to figure out what went wrong . . . I'd love some help in figuring this out though.
Moz Pro | | damon12120 -
On Page Optimization Reports - Huh?
I've been working hard to use this EXCELLENT tool for optimize some of what I consider my most important pages . . . But the automatic tool that pulls pages and grades them (the "summary" of the "on page" report) . . . I don't get it. It only graded three of my pages, and I don't understand how it chose what keywords to grade it for? I'm just very confused. I don't understand how it chose the pages to grade, not the words it chose to grade it against. 😞
Moz Pro | | damon12120