Crawl Diagnostics 403 on home page...
-
Crawl Diagnostics says oursite.com/ returns a 403. It doesn't say what's causing it, but it mentions that there is no robots.txt. There is a robots.txt, and I see no problems with it. How can I find out more about this error?
-
Hi Dana,
Thanks for writing in. The robots.txt file would not cause a 403 error. That type of error is related to the way the server responds to our crawler: the server for the site is telling our crawler that it is not allowed to access the site. Here is a resource that explains the 403 HTTP status code pretty thoroughly: http://pcsupport.about.com/od/findbyerrormessage/a/403error.htm
I looked at both of the campaigns on your account and I am not seeing a 403 error for either site, though I do see a couple of 404 page not found errors on one of the campaigns, which is a different issue.
If you are still seeing the 403 error message on one of your crawls, you would just need to have the webmaster update the server to allow rogerbot to access the site.
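One way to see how the server treats different visitors is to request the same URL with different User-Agent headers and compare the status codes. The following is a minimal sketch, not a Moz tool: the function names are made up for illustration, and any User-Agent string you pass in for the crawler is a placeholder, not necessarily the exact header rogerbot sends. It also cannot detect a block that is based on the crawler's IP address or hostname, since the request still comes from your own machine.

```python
import urllib.error
import urllib.request


def status_for_user_agent(url: str, user_agent: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code the server sends for one User-Agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx responses; the code is what we want to see.
        return error.code


def compare_user_agents(url: str, user_agents: dict) -> dict:
    """Map each label (e.g. "crawler", "browser") to the status it receives."""
    return {
        label: status_for_user_agent(url, user_agent)
        for label, user_agent in user_agents.items()
    }
```

If a call such as compare_user_agents("http://oursite.com/", {"crawler": "rogerbot", "browser": "Mozilla/5.0"}) shows 403 for the crawler label and 200 for the browser label, the server is filtering on the User-Agent header. If both return 200, the block is more likely tied to where the request comes from.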
I hope this helps. Please let me know if you have any other questions.
-Chiaryn
-
Okay, so I couldn't find this thread and started a new one. Sorry...
... The problem persists.
RECAP
I have two blocks in my .htaccess; both are for amazonaws.com.
I have gone over our server block logs and see only Amazon addresses and bot names.
I did a Fetch as Google with our Webmaster Tools, and fetch it did. Success!
Why isn't this crawler able to access the site? Many other bots are crawling right now.
Why can I use the SEOmoz on-page feature to crawl a single page when the automatic crawler won't access the site? I just took a break from typing this to try the on-page tool on our robots.txt, and it worked fine. I used the keyword "Disallow" and it gave me a C. =0)
... now if we could just crawl the rest of the site...
Any help on this would be greatly appreciated.
-
I think I do. Just a few minutes ago I went through a 403 problem reported by another site that was trying to access an HTML file for verification. Apparently they were connecting from an IP that's blocked by our .htaccess. I removed the blocks, told them to try again, and it worked with no problem. I see that SEOmoz has only crawled 1 page. Off to see if I can trigger a re-crawl now...
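For anyone checking a similar setup: a hostname-based deny rule for a domain such as amazonaws.com roughly behaves like a suffix match on the visitor's reverse DNS name, matching whole name components only. The sketch below imitates that matching so you can test hostnames from your block logs against your own rules. The function names and the example hostname are hypothetical, and the code is an approximation of the idea, not Apache's actual implementation.

```python
import socket
from typing import Iterable, Optional


def is_blocked_hostname(hostname: str, blocked_domains: Iterable[str]) -> bool:
    """Return True if hostname equals, or is a subdomain of, a blocked domain."""
    host = hostname.lower().rstrip(".")
    for domain in blocked_domains:
        blocked = domain.lower().strip(".")
        # Match whole components only: "x.amazonaws.com" matches,
        # "notamazonaws.com" does not.
        if host == blocked or host.endswith("." + blocked):
            return True
    return False


def reverse_dns(ip_address: str) -> Optional[str]:
    """Look up the hostname for an IP, or None if there is no reverse record."""
    try:
        return socket.gethostbyaddr(ip_address)[0]
    except (socket.herror, socket.gaierror):
        return None
```

Feeding the IPs from the block log through reverse_dns and then is_blocked_hostname shows which visitors a given deny rule would catch, which helps decide whether a rule is broader than intended.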
-
Hmmm... not sure why this is happening. Maybe add these lines to the top of your robots.txt and see if it's fixed by next week. It certainly won't hurt anything:
User-agent: *
Allow: /
-
No problem. Looking at my Google Webmaster Tools, the crawl stats don't show any errors.
Thanks
User-Agent: *
Disallow: /*?zenid=
Disallow: /editors/
Disallow: /email/
Disallow: /googlecheckout/
Disallow: /includes/
Disallow: /js/
Disallow: /manuals/
-
Oh, so you're only seeing this error in SEOmoz's crawl diagnostics. That explains why robots.txt could be affecting it. I misread this earlier and thought you were finding the 403 on your own in-browser.
Can you paste the robots.txt file in here so we can see it? I would imagine that has everything to do with it, now that I've correctly read your post. My apologies.
-
Apache
-
A 403 is a Forbidden code, usually pertaining to security and permissions.
Are you running your server in an Apache or IIS environment? Robots.txt shouldn't affect a site's visibility to the public; it only talks to site crawlers.
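To check what a robots.txt actually tells a crawler, the rules can be parsed offline. Here is a small sketch using Python's standard urllib.robotparser with the directory rules pasted in this thread. The helper name allowed_paths is made up for illustration, and the wildcard rule (/*?zenid=) is left out on purpose because the standard-library parser does not implement wildcard matching. A result of True only means the rules permit crawling; it says nothing about whether the server will answer with a 403.

```python
from urllib.robotparser import RobotFileParser

# Directory rules from the robots.txt posted in this thread.
# The wildcard rule "/*?zenid=" is omitted: urllib.robotparser
# treats paths as plain prefixes and does not expand "*".
ROBOTS_TXT = """\
User-Agent: *
Disallow: /editors/
Disallow: /email/
Disallow: /googlecheckout/
Disallow: /includes/
Disallow: /js/
Disallow: /manuals/
"""


def allowed_paths(robots_txt: str, user_agent: str, paths: list) -> dict:
    """Map each path to True if the rules let user_agent crawl it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {path: parser.can_fetch(user_agent, path) for path in paths}
```

Calling allowed_paths(ROBOTS_TXT, "rogerbot", ["/", "/editors/index.html"]) reports the home page as allowed and the /editors/ page as disallowed, so these rules do not keep a crawler away from the home page.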