Crawl Diagnostics 403 on home page...
-
In the crawl diagnostics it says oursite.com/ has a 403. doesn't say what's causing it but mentions no robots.txt. There is a robots.txt and I see no problems. How can I find out more information about this error?
-
Hi Dana,
Thanks for writing in. The robots.txt file would not cause a 403 error. That type of error is actually related to the way the server responds to our crawler. Basically, this means the server for the site is telling our crawler that we are not allowed to access the site. Here is a resource that explains the 403 http status code pretty thoroughly: http://pcsupport.about.com/od/findbyerrormessage/a/403error.htm
I looked at both of the campaigns on your account and I am not seeing a 403 error for either site, though I do see a couple of 404 page not found errors on one of the campaigns, which is a different issue.
If you are still seeing the 403 error message on one of your crawls, you would just need to have the webmaster update the server to allow rogerbot to access the site.
I hope this helps. Please let me know if you have any other questions.
-Chiaryn
-
Okay, so I couldn't find this thread and started a new one. Sorry...
... The problem persists.
RECAP
I have two blocks in my htaccess both are for amazonaws.com.
I have gone over our server block logs and see only amazon addresses and bot names.
I did a fetch as google with our WM Tools and fetch it did. Success!
Why isn't thiscrawler able to access? Many other bots are crawling right now.
Why can I use the seomoz on-page feature to crawl a single page but the automatic crawler wont access the site? Just took a break from typing this to try the on-page on our robots.txt, worked fine. Use the keyword "Disallow" and it gave me a C. =0)
... now if we could just crawl the rest of the site...
any help on this would be greatly appreciated.
-
I think I do. I just (a few minutes ago) went through a 403 problem being reported by another site trying access an html file for verification. Apparently they are connecting with an ip that's blocked by our htaccess. I removed the blocks told them to try again and it worked no problem. I see that SEOMoz has only crawled 1 page. Off to see if I can trigger a re-crawl now...
-
hmmm... not sure why this is happening. maybe add this line to the top of your robots.txt and see if it fixes by next week. it certainly won't hurt anything:
User-agent: * Allow: /
-
No problem. Looking at my Google WM Tools , crawl stats don't show any errors.
Thanks
User-Agent: *
Disallow: /*?zenid=
Disallow: /editors/
Disallow: /email/
Disallow: /googlecheckout/
Disallow: /includes/
Disallow: /js/
Disallow: /manuals/ -
OH this is only in SEOmoz's crawl diagnostics that you're seeing this error. That explains why robots.txt could be affecting it. I misread this earlier and thought you were finding the 403 on your own in-browser.
Can you paste the robots.txt file into here so we can see it? I would imagine that has everything to do with it now that I've correctly read your post --my apologies
-
apache
-
a 403 is a Forbidden code usually pertaining to Security and Permissions.
Are you running your server in an Apache or IIS environment? Robots.txt shouldn't affect a site's visibility to the public it only talks to site crawlers.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Listing of all Google Indexed Pages
I started managing a site that has about 391,000 indexed pages. I want to get to the bottom of why there are so many in preparation for a ecommerce Migration and improving SEO. Anyone know of a tool? Many tools I have came across can only take 100 at a time. I would love to get them in excel or a database. I look forward to the suggestions.
Moz Pro | | dhands170 -
Crawl Diagnostics - Historical Summary
As we've been fixing errors on our website, the crawl diagnostic graphs have been showing great results (top left to bottom right for errors). The problem is the graphs themselves aren't very pretty. I can't use them in my internal reports (all internal reports are standardised colours/formats). Is there anyway of exporting the top level summary with historic data so the graphs can be recreated in company colours? I don't want the detailed CSV breakdown of what errors occurred, but rather than on X date there were Y errors, the next month Z errors and so forth. The data must already be in the SEOMoz system in order to create the graphs themselves - I was hoping this can be made available to us if it isn't already? Does anyone know if there is already a way of doing this? I've tried to 'inspect element' and find the underlying data in the source code but to no avail, and can't see any exports that would do this. Thanks in advance Dean
Moz Pro | | FashionLux0 -
Missing Page Titles On The Comptetive Link Comparison Page
Hello, When I do a Link Analysis using the SEOmoz tools I have noticed that most of the pages listed on the Top Pages tab show [No Data] for page title. Any idea why that could be? The page source of those pages have one and only one <title>tag.</p> <p>Thanks!</p></title>
Moz Pro | | andersvin0 -
Crawl Diagnostics - Canonical Question
On one of my sites I have 61 notices for Rel Canonical. Is it bad to have these or is this just something that's informative?
Moz Pro | | kadesmith0 -
Third crawl of my sites back to 250 pages
Hi all, I've been waiting some days for the third crawl of my sites, but SEOMOZ only crawled 277 pages. The next phrase appeared on my crawl report: Pages Crawled: 277 | Limit: 250 My last 2 crawls were of about 10K limit. Any idea? Kind regards, Simon.
Moz Pro | | Aureka0 -
Hyphens in Page Titles?
We are using a combination of keywords using our brand name. So the keyword is structure as: brand name - word (separated by a hyphen) When I run a report on the page for the keywords that have the above format, the report tells me that I need to use the keyword in the title of the page. Is it okay to have hyphens in Page Titles? I assume not, but I want to double check. Thanks, Alex
Moz Pro | | costarica.com0 -
Issue in number of pages crawled
i wanted to figure out how our friend Roger Bot works. On the first crawl of one of my large sites, the number of pages crawled stopped at 10000 (due to the restriction on the pro account). However after a few weeks, the number of pages crawled went down to about 5500. This number seemed to be a more accurate count of the pages on our site. Today, it seems that Roger Bot has completed another crawl and the number is up to 10000 again. I know there has been no downtime on our site, and the items that we fixed on our site did not reduce or increase the number of pages we had. Just making sure there are no known issues with Roger Bot before I look deeper into our site to see if there is an issue. Thanks!
Moz Pro | | cchhita0 -
Dismiss crawl diagnostics error
Hello everyone, Is there a way to dismiss some errors in the Crawl Diagnostics tool so they don't appear again? It happens so that some of the errors are never going to be fixed because of their nature. For example, 'Title too long' errors that point to some of the threads on my forum - it doesn't make sense to change the title of a thread posted by user just for the sake of the error disappearing from the 'Crawl Diagnostics' tool. 🙂 Otherwise the CD interface gets a little bit cluttered with errors which I will never fix anyway. I wonder how others deal with this problem. Thanks.
Moz Pro | | MaratM0