Google Webmaster Tools says access denied for 77 URLs
-
Hi, I am looking in Google Webmaster Tools and have seen a major problem which I hope people can help me sort out.
The problem is that I am being told 77 URLs are being denied access. When I look for more information, the message says:
Googlebot couldn't crawl your URL because your server either requires login to access the page, or is blocking Googlebot from accessing your site.
The response code is 403.
Here are a couple of examples:
http://www.in2town.co.uk/Entertainment-Magazine
http://www.in2town.co.uk/Weight-Loss-Hypnotherapy-helped-woman-lose-3-stone
I think the problem could be that I have sent them to another URL in my .htaccess file using the 403 re-direct, but why would that make Google report that Googlebot could not crawl them?
Any help would be great.
-
Yup, deleted.
-
I have now deleted the old version. Can you check and make sure you can no longer see it?
-
Hi, thanks for that. I will delete the old one now.
-
In Webmaster Tools, you can use "Fetch as Googlebot": enter one of those 77 URLs and see what Googlebot sees when it requests that URL.
You can also use:
http://www.dnsqueries.com/en/googlebot_simulator.php
For the URL: http://www.in2town.co.uk/Entertainment-Magazine
the Google Bot Simulator says:
HTTP CODE = HTTP/1.1 301 Moved Permanently
Location = http://www.in2town.co.uk/Showbiz-Gossip
and for: http://www.in2town.co.uk/Weight-Loss-Hypnotherapy-helped-woman-lose-3-stone
HTTP CODE = HTTP/1.1 301 Moved Permanently
Location = http://www.in2town.co.uk/Weight-Loss-Hypnotherapy
Interestingly, both the NEW URLs work fine although http://www.in2town.co.uk/Weight-Loss-Hypnotherapy doesn't look too good (at least in my web browser) but that's another issue.
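If you would rather run the same kind of check yourself, here is a rough Python sketch that requests a URL with a Googlebot-style user agent and reports the status code and Location header without following the redirect. The function names (fetch_head, summarize) are just made up for this example, and the user-agent string only imitates Googlebot, so a server that filters by IP address could still answer the real crawler differently.

```python
import http.client
from urllib.parse import urlsplit

# Googlebot's classic user-agent string. Sending it only imitates the crawler;
# the request still comes from your own IP address.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def fetch_head(url, user_agent=GOOGLEBOT_UA, timeout=10):
    """Make one GET request and return (status, location) without following redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path, headers={"User-Agent": user_agent})
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def summarize(status, location=None):
    """Turn a status code (and optional Location header) into a short description."""
    if status in (301, 302, 303, 307, 308):
        return f"{status} redirect -> {location}"
    if status == 403:
        return "403 Forbidden: server refused the request"
    return str(status)


# Example use (needs network access):
# status, location = fetch_head("http://www.in2town.co.uk/Entertainment-Magazine")
# print(summarize(status, location))
```

A 301 with a Location header here would match what the simulator shows, while a 403 would match what Webmaster Tools is reporting.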
You have a fairly complex .htaccess file (hint: I looked up your OLD .htaccess file - you should delete old .htaccess files, or otherwise make sure people can't access them via a web browser), so I'm guessing the problem is somewhere in your .htaccess file.
If possible, put a plain, simple .htaccess file in place, test it with Google Webmaster Tools and see whether the error persists.
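While testing .htaccess changes, it can also help to see whether the server answers a Googlebot user agent differently from a normal browser, since the error message mentions Googlebot being blocked. The sketch below is only an illustration: the function names are invented, the browser user-agent string is an arbitrary example, and it follows redirects, so it compares the final status code only.

```python
import urllib.error
import urllib.request

# Example user-agent strings. The browser one is arbitrary; any common one will do.
USER_AGENTS = {
    "googlebot": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "browser": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36",
}


def final_status(url, user_agent, timeout=10):
    """Follow redirects and return the final HTTP status code for one user agent."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; the code is still what we want.
        return err.code


def agents_treated_differently(url, fetch=final_status):
    """Return (differs, statuses), where statuses maps each agent name to its status code."""
    statuses = {name: fetch(url, ua) for name, ua in USER_AGENTS.items()}
    return len(set(statuses.values())) > 1, statuses


# Example use (needs network access):
# print(agents_treated_differently("http://www.in2town.co.uk/Entertainment-Magazine"))
```

If the two status codes differ, a rule keyed on the user agent is a likely suspect. If they match, the block (if any) is probably based on something else, such as the requesting IP address.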
Adam