Crawl Diagnostics 403 on home page...
-
The crawl diagnostics report says oursite.com/ returns a 403. It doesn't say what's causing it, but it mentions that there is no robots.txt. There is a robots.txt, and I see no problems with it. How can I find out more about this error?
-
Hi Dana,
Thanks for writing in. The robots.txt file would not cause a 403 error. That type of error comes from the way the server responds to our crawler: the server for the site is telling our crawler that it is not allowed to access the site. Here is a resource that explains the 403 HTTP status code pretty thoroughly: http://pcsupport.about.com/od/findbyerrormessage/a/403error.htm
I looked at both of the campaigns on your account and I am not seeing a 403 error for either site, though I do see a couple of 404 page not found errors on one of the campaigns, which is a different issue.
If you are still seeing the 403 error message on one of your crawls, you would just need to have the webmaster update the server to allow rogerbot to access the site.
I hope this helps. Please let me know if you have any other questions.
-Chiaryn
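One way to see whether the server answers differently depending on who is asking is to request the same URL with different User-Agent headers and compare the status codes. The sketch below uses only Python's standard library. The helper names are made up for illustration, and the bare string "rogerbot" is an assumed stand-in for the crawler's real User-Agent string, which should be copied from the server's access log.

```python
import urllib.error
import urllib.request


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Build a GET request that identifies itself with the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def status_for(url: str, user_agent: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code the server sends for this User-Agent."""
    request = build_request(url, user_agent)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the status code is on the exception.
        return err.code


def compare_agents(url: str, agents: dict[str, str]) -> dict[str, int]:
    """Map each label to the status code returned for its User-Agent string."""
    return {label: status_for(url, ua) for label, ua in agents.items()}
```

Calling `compare_agents("http://oursite.com/", {"browser": "Mozilla/5.0", "rogerbot": "rogerbot"})` would return one status code per label. If both come back 200, the block is probably not keyed on the User-Agent. A block keyed on the client's IP address or hostname cannot be reproduced this way from your own machine, because only the header changes.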
-
Okay, so I couldn't find this thread and started a new one. Sorry...
... The problem persists.
RECAP
I have two blocks in my .htaccess; both are for amazonaws.com.
I have gone over our server block logs and see only Amazon addresses and bot names.
I did a Fetch as Google with our WM Tools and fetch it did. Success!
Why isn't this crawler able to access the site? Many other bots are crawling right now.
Why can I use the SEOmoz on-page feature to crawl a single page when the automatic crawler won't access the site? I just took a break from typing this to try the on-page tool on our robots.txt, and it worked fine. I used the keyword "Disallow" and it gave me a C. =0)
... now if we could just crawl the rest of the site...
Any help on this would be greatly appreciated.
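Going over the server logs, as described in the recap above, can be partly automated. The sketch below counts 403 responses per client host and User-Agent. It assumes the Apache "combined" log format, and the sample lines are made up for illustration (they use documentation IP ranges and a placeholder "rogerbot" agent string), so the pattern may need adjusting for a different log configuration.

```python
import re
from collections import Counter

# Apache "combined" format (assumed):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]+\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def forbidden_by_agent(lines):
    """Count 403 responses per (client host, User-Agent) pair."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("status") == "403":
            counts[(match.group("host"), match.group("agent"))] += 1
    return counts


# Made-up sample lines, for illustration only.
SAMPLE = [
    '203.0.113.7 - - [10/Oct/2012:13:55:36 -0700] "GET / HTTP/1.1" 403 199 "-" "rogerbot"',
    '198.51.100.4 - - [10/Oct/2012:13:55:40 -0700] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '203.0.113.7 - - [10/Oct/2012:13:56:01 -0700] "GET /robots.txt HTTP/1.1" 403 199 "-" "rogerbot"',
]

for (host, agent), hits in forbidden_by_agent(SAMPLE).most_common():
    print(f"{hits} x 403 for {agent!r} from {host}")
```

Against a real access log, pass the open file object instead of `SAMPLE`. Any host that shows up with 403s can then be compared with the .htaccess rules to see which block it is hitting.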
-
I think I do. I just (a few minutes ago) went through a 403 problem reported by another site that was trying to access an HTML file for verification. Apparently they are connecting with an IP that's blocked by our .htaccess. I removed the blocks, told them to try again, and it worked with no problem. I see that SEOmoz has only crawled 1 page. Off to see if I can trigger a re-crawl now...
-
Hmmm... not sure why this is happening. Maybe add these lines to the top of your robots.txt and see if that fixes it by next week. It certainly won't hurt anything:
User-agent: *
Allow: /
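Whether a robots.txt blocks a given URL for a given crawler can be checked offline with Python's standard-library `urllib.robotparser`. The sketch below feeds it the rules pasted elsewhere in this thread. The `allowed` helper is made up for illustration, and "rogerbot" is used as the crawler's user-agent token. Note that this parser does plain prefix matching and does not implement the `*` wildcard extension, so the `/*?zenid=` rule is not evaluated as a wildcard here.

```python
from urllib.robotparser import RobotFileParser

# The rules as pasted in this thread.
ROBOTS_TXT = """\
User-Agent: *
Disallow: /*?zenid=
Disallow: /editors/
Disallow: /email/
Disallow: /googlecheckout/
Disallow: /includes/
Disallow: /js/
Disallow: /manuals/
"""


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Home page versus a path under a disallowed directory.
print(allowed(ROBOTS_TXT, "rogerbot", "http://oursite.com/"))
print(allowed(ROBOTS_TXT, "rogerbot", "http://oursite.com/editors/x"))
```

With these rules the home page is allowed and only paths under the listed directories are disallowed, which is consistent with robots.txt not being the source of a 403 on oursite.com/.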
-
No problem. Looking at my Google WM Tools, the crawl stats don't show any errors.
Thanks
User-Agent: *
Disallow: /*?zenid=
Disallow: /editors/
Disallow: /email/
Disallow: /googlecheckout/
Disallow: /includes/
Disallow: /js/
Disallow: /manuals/
-
Oh, so you're only seeing this error in SEOmoz's crawl diagnostics. That explains why robots.txt could be affecting it. I misread this earlier and thought you were finding the 403 on your own in-browser.
Can you paste the robots.txt file here so we can see it? I would imagine that has everything to do with it now that I've correctly read your post. My apologies.
-
apache
-
A 403 is a Forbidden code, usually pertaining to security and permissions.
Are you running your server in an Apache or IIS environment? Robots.txt shouldn't affect a site's visibility to the public; it only talks to site crawlers.