Google insists robots.txt is blocking... but it isn't.
-
I recently launched a new website. During development, I'd enabled the option in WordPress to prevent search engines from indexing the site.
When the site went public (over 24 hours ago), I cleared that option. At that point, I added a specific robots.txt file that only disallows a couple of directories. You can view the robots.txt at http://photogeardeals.com/robots.txt
Google (via Webmaster tools) is insisting that my robots.txt file contains a "Disallow: /" on line 2 and that it's preventing Google from indexing the site and preventing me from submitting a sitemap. These errors are showing both in the sitemap section of Webmaster tools as well as the Blocked URLs section.
Bing's webmaster tools are able to read the site and sitemap just fine.
Any idea why Google insists I'm disallowing everything even after telling it to re-fetch?
-
Hi Aaron - You have a couple of solid answers here. Has your issue been resolved in GWT?
-
24 hours is a short time, and Google probably has not reindexed the site or even looked at your new robots.txt yet.
Google Webmaster Tools is much slower to update than Bing's tools, so be patient.
As a rule of thumb, I wait at least a week with Google before worrying (my 2 cents).
-
Hi Aaron,
I identify with your frustration, but want to lead my response with the caveat that I am not a developer so there may be people here with much more technical SEO expertise than me who might have a better answer.
What I do know is that Google Webmaster Tools data is not real time and can often take days to weeks to update. It could be that GWT is showing something different about your robots.txt file because it's old information that hasn't updated yet.
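If you want to rule out the file itself, here is a rough Python sketch using the standard library's `urllib.robotparser`. It compares a stale copy against a current one for the same crawler. Both robots.txt bodies are assumptions for illustration: the "stale" one is the two-line file WordPress typically serves while the option to discourage search engines is on (which would match the "Disallow: /" on line 2 that Google reports), and the "current" one uses invented directory names, not your real file. The helper name `is_allowed` is made up too.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Assumption: what a crawler may have cached while the site was in development.
STALE_ROBOTS = """\
User-agent: *
Disallow: /
"""

# Hypothetical stand-in for the new file; these directory names are invented.
CURRENT_ROBOTS = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
"""

HOME = "http://photogeardeals.com/"

if __name__ == "__main__":
    print("stale copy allows Googlebot:", is_allowed(STALE_ROBOTS, "Googlebot", HOME))
    print("current copy allows Googlebot:", is_allowed(CURRENT_ROBOTS, "Googlebot", HOME))
```

If the current file passes a check like this while GWT still reports a block, that would point to a cached copy on Google's side rather than a problem in the file. Python's parser is not Googlebot, so treat the result as a sanity check only.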
When I looked at your robots.txt file, I found two sitemaps, one with 2 URLs and one with 8 URLs. This is pretty tiny. Even in the old days, conventional wisdom was that it took at least 20 content pages in order for Google to take note and index the site.
Have you tried posting the URLs of your new site on Google+? I have heard that this is a great indexing tool in addition to the Fetch as Googlebot in GWT. Just a thought!
You know, there was a time when it took 6-8 weeks for a new site to get indexed. Google has definitely sped up to the point where I think we are all expecting instant results and sometimes that just doesn't happen.
I think this just might be a matter of patience. However, I am always willing to admit that I could be wrong and am interested to know what others think!
Dana