Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Google insists robots.txt is blocking... but it isn't.
-
I recently launched a new website. During development, I'd enabled the option in WordPress to prevent search engines from indexing the site.
When the site went public (over 24 hours ago), I cleared that option. At that point, I added a specific robots.txt file that only disallowed a couple directories of files. You can view the robots.txt at http://photogeardeals.com/robots.txt
Google (via Webmaster tools) is insisting that my robots.txt file contains a "Disallow: /" on line 2 and that it's preventing Google from indexing the site and preventing me from submitting a sitemap. These errors are showing both in the sitemap section of Webmaster tools as well as the Blocked URLs section.
Bing's webmaster tools are able to read the site and sitemap just fine.
Any idea why Google insists I'm disallowing everything even after telling it to re-fetch?
-
Hi Aaron - You have a couple of solid answers here. Has your issue been resolved in GWT?
-
24 hours is a short time and probably google did not reindex or even looked at your new robot.txt
Webmaster tools is way slower than bing tools, so be patient.
As a rule of thumb, I wait at least a week with google before worrying (my 2 cents)
-
Hi Aaron,
I identify with your frustration, but want to lead my response with the caveat that I am not a developer so there may be people here with much more technical SEO expertise than me who might have a better answer.
What I do know id that Google Webmaster Tools data is not real time and can often take days to weeks to update. It could be that the reason GWT is showing something different about your robots.txt file is because it's old information that hasn't updated yet.
When I looked at your robots.txt file, I found two sitemaps, one with 2 URLs and one with 8 URLs. This is pretty tiny. Even in the old days, conventional wisdom was that it took at least 20 content pages in order for Google to take note and index the site.
Have you tried posting the URLs of your new site on Google+? I have heard that this is a great indexing tool in addition to the Fetch as Googlebot in GWT. Just a thought!
You know, there was a time when it took 6-8 weeks for a new site to get indexed. Google has definitely sped up to the point where I think we are all expecting instant results and sometimes that just doesn't happen.
I think this just might be a matter of patience. However, I am always willing to admit that I could be wrong and am interested to know what others think!
Dana
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Why isn't my homepage number #1 when searching my brand name?
Hi! So we recently (a month ago) lunched a new website, we have great content that updates everyday, we're active on social platforms, and we did all that's possible, at the moment, when it comes to on site optimization (a web developer will join our team this month and help us fix all the rest). When I search for our brand name all our social profiles come up first, after them we have a few inner pages from our different news sections, but our homepage is somewhere in the 2nd search page... What may be the reason for that? Is it just a matter of time or is there a problem with our homepage I'm unable to find? Thanks!
Technical SEO | | Orly-PP0 -
Error: Missing Meta Description Tag on pages I can't find in order to correct
This seems silly, but I have errors on blog URLs in our WordPress site that I don't know how to access because they are not in our Dashboard. We are using All in One SEO. The errors are for blog archive dates, authors and just simply 'blog'. Here are samples: http://www.fateyes.com/2012/10/
Technical SEO | | gfiedel
http://www.fateyes.com/author/gina-fiedel/
http://www.fateyes.com/blog/ Does anyone know how to input descriptions for pages like these?
Thanks!!0 -
Why are Google search results different if you are log'd into Google or not?
I get different results when I'm log'd into my Google account associated with my website than if I'm not. The same country is occurring. So how can I rely on the google results I'm seeing? For instance my site is page 1 with the improvements I made based on SEOMOZ if I'm log'd in. Yet I'm not on the first 25 pages if I'm not logged in.
Technical SEO | | Romana0 -
Javascript to manipulate Google's bounce rate and time on site?
I was referred to this "awesome" solution to high bounce rates. It is suppose to "fix" bounce rates and lower them through this simple script. When the bounce rate goes way down then rankings dramatically increase (interesting study but not my question). I don't know javascript but simply adding a script to the footer and watch everything fall into place seems a bit iffy to me. Can someone with experience in JS help me by explaining what this script does? I think it manipulates the reporting it does to GA but I'm not sure. It was supposed to be placed in the footer of the page and then sit back and watch the dollars fly in. 🙂
Technical SEO | | BenRWoodard1 -
OK to block /js/ folder using robots.txt?
I know Matt Cutts suggestions we allow bots to crawl css and javascript folders (http://www.youtube.com/watch?v=PNEipHjsEPU) But what if you have lots and lots of JS and you dont want to waste precious crawl resources? Also, as we update and improve the javascript on our site, we iterate the version number ?v=1.1... 1.2... 1.3... etc. And the legacy versions show up in Google Webmaster Tools as 404s. For example: http://www.discoverafrica.com/js/global_functions.js?v=1.1
Technical SEO | | AndreVanKets
http://www.discoverafrica.com/js/jquery.cookie.js?v=1.1
http://www.discoverafrica.com/js/global.js?v=1.2
http://www.discoverafrica.com/js/jquery.validate.min.js?v=1.1
http://www.discoverafrica.com/js/json2.js?v=1.1 Wouldn't it just be easier to prevent Googlebot from crawling the js folder altogether? Isn't that what robots.txt was made for? Just to be clear - we are NOT doing any sneaky redirects or other dodgy javascript hacks. We're just trying to power our content and UX elegantly with javascript. What do you guys say: Obey Matt? Or run the javascript gauntlet?0 -
Robots.txt and canonical tag
In the SEOmoz post - http://www.seomoz.org/blog/robot-access-indexation-restriction-techniques-avoiding-conflicts, it's being said - If you have a robots.txt disallow in place for a page, the canonical tag will never be seen. Does it so happen that if a page is disallowed by robots.txt, spiders DO NOT read the html code ?
Technical SEO | | seoug_20050 -
Is blocking RSS Feeds with robots.txt necessary?
Is it necessary to block an rss feed with robots.txt? It seems they are automatically not indexed (http://googlewebmastercentral.blogspot.com/2007/12/taking-feeds-out-of-our-web-search.html) And, google says here that it's important not to block RSS feeds (http://googlewebmastercentral.blogspot.com/2009/10/using-rssatom-feeds-to-discover-new.html) I'm just checking!
Technical SEO | | nicole.healthline0 -
Should I set up a disallow in the robots.txt for catalog search results?
When the crawl diagnostics came back for my site its showing around 3,000 pages of duplicate content. Almost all of them are of the catalog search results page. I also did a site search on Google and they have most of the results pages in their index too. I think I should just disallow the bots in the /catalogsearch/ sub folder, but I'm not sure if this will have any negative effect?
Technical SEO | | JordanJudson0