Robots.txt being blocked
-
I think there is an issue with this website I'm working on. Here is the URL: http://brownieairservice.com/
In Google Webmaster Tools, I am seeing this in the robots.txt Tester:
User-agent: *
Crawl-delay: 1
Disallow: /wp-content/plugins/
Disallow: /wp-admin/
Also, when I look at "Blocked Resources" in Webmaster Tools, this is showing as blocked:
http://brownieairservice.com/wp-content/plugins/contact-form-7/includes/js/jquery.form.min.js?ver=3.51.0-2014.06.20
It looks like the contact form plugin is causing the issue, but I don't understand this.
There are no site errors or URL errors, so I don't understand what this Crawl-delay means or how to fix it. Any input would be greatly appreciated. Thank you.
-
Hi Matt,
Thank you for checking back. I did change the robots.txt in the dashboard as people suggested, but when I go here: http://brownieairservice.com/robots.txt
It is still showing the disallow. I need to load this:
User-agent: *
Disallow:
to the root folder, and I'm not sure how to do that (whether I need to FTP it or how), so that's where I'm at now.
Anybody have any thoughts? I have googled how to do this and I keep getting put into a loop of information that does not address the question directly.
Thank you
-
Hi Wendy! Did this get worked out?
-
Thanks, Dirk, for your input. I will look at this too and respond back.
-
Thank you for your answer. I went in and uploaded this plugin: WP Robots Txt. Now I can see the robots.txt content. This is what I see:
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
This is not what I see in Webmaster Tools, which shows:
User-agent: *
Crawl-delay: 1
Disallow: /wp-content/plugins/
Disallow: /wp-admin/
My question now is this: is Disallow: /wp-includes/ the same as Disallow: /wp-content/plugins/,
so if I do this: Allow: /wp-includes/ then that should solve my issue?
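To show what I mean, this is roughly the change I'm picturing (I'm not sure this is right):
User-agent: *
Disallow: /wp-admin/
Allow: /wp-includes/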
I'm still going through your other suggestions so will type back later on that. Thank you for your help.
Wendy
-
To add to the previous comment - Crawl-delay is ignored by Googlebot. Check http://tools.seobook.com/robots-txt/
It can be used to limit the crawl speed for bots; however, it is not part of the original robots.txt specification. Since this value is not part of the standard, its interpretation depends on the crawler reading it.
Yandex: https://yandex.com/support/webmaster/controlling-robot/robots-txt.xml#crawl-delay
I didn't find more info for Bing (they mention it here but do not provide additional detail): https://www.bing.com/webmaster/help/how-to-create-a-robots-txt-file-cb7c31ec
If you want to limit the crawl speed for Googlebot, you have to do it in Webmaster Tools.
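For example, if you only want a crawl delay for the bots that actually honor it, a robots.txt along these lines might work (a rough sketch; the delay values are just placeholders):

User-agent: Yandex
Crawl-delay: 2

User-agent: Bingbot
Crawl-delay: 2

User-agent: *
Disallow: /wp-admin/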
Dirk
-
Wendy,
Google likes to have access to all of your CSS and JS files. Plugins can contain these files, as your blocked-resources message shows.
The way to fix this would be to remove the Disallow: /wp-content/plugins/ line from your robots.txt file, thus allowing Google full access.
Another solution is provided by a useful article on Moz: https://moz.com/blog/why-all-seos-should-unblock-js-css
"How to unblock your JavaScript and CSS
For most users, it's just a case of checking the robots.txt and ensuring you're allowing all JavaScript and CSS files to be crawled. For Yoast SEO users, you can edit your robots.txt file directly in the admin area of Wordpress.
Gary Illyes from Google also shared some detailed robots.txt changes on Stack Overflow. You can add these directives to your robots.txt file in order to allow Googlebot to crawl all Javascript and CSS.
To be doubly sure you're unblocking all JavaScript and CSS, you can add the following to your robots.txt file, provided you don't have any directories being blocked in it already:
User-Agent: Googlebot
Allow: .js
Allow: .css
If you have a more specialized robots.txt file, where you're blocking entire directories, it can be a bit more complicated.
In these cases, you also need to allow the .js and .css for each of the directories you have blocked.
For example:
User-Agent: Googlebot
Disallow: /deep/
Allow: /deep/*.js
Allow: /deep/*.css
Repeat this for each directory you are blocking in robots.txt."
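Applied to your site, the end result might look roughly like this (a sketch only, assuming you want to keep /wp-admin/ and the plugins directory blocked in general but let Googlebot fetch the plugin JS and CSS):
User-agent: *
Crawl-delay: 1
Disallow: /wp-admin/
Disallow: /wp-content/plugins/
Allow: /wp-content/plugins/*.js
Allow: /wp-content/plugins/*.css
Googlebot gives the more specific Allow rules precedence over the broader Disallow, so the blocked jquery.form.min.js file would become crawlable.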
Hope this helps.
-
What problem is it causing, Wendy?