Do I have a robots.txt problem?
-
I have the little yellow exclamation point under my robots.txt fetch as you can see here- http://imgur.com/wuWdtvO
This version shows no errors or warnings- http://imgur.com/uqbmbug
Under the tester I can currently see the latest version. This site hasn't changed URLs recently, and we haven't made any changes to the robots.txt file for two years. This problem just started in the last month. Should I worry?
-
Today it has a green check mark, and absolutely no changes were made to the website since I asked this question.
-
It could be that your server had a hard time when Google tried to view your robots.txt file that's why it wouldn't be able to fetch it. As long as this issue doesn't prevent Google anymore in the future it's not much to worry about.
-
That would make me feel more confident of a false error being reported. Time to closely monitor the crawl logs, look at server stats, and keep an eye on GWT for a change in the reporting/indexing. I would also go into the GWT forums and post, see if anyone is reporting a similar error these past couple days.
-
I can't post the domain but I know it is accessible.
When I go to the tester it shows the live robots.txt with no problems. I also can look at the server logs and see that it is being crawled, but being crawled less then Bing Crawls. Also the Bing Webmaster Tools is showing no problems.
-
Can you post your domain? Manually checking the robots.txt file would help.
I've checked many of my GWT accounts and I am not showing a sudden robots.txt error. It could be a false error, but I would take anything with the robots.txt file seriously. You'll want to make sure that it is in fact accessible to all the crawlers desired.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Role of Robots.txt and Search Console parameters settings
Hi, wondering if anyone can point me to resources or explain the difference between these two. If a site has url parameters disallowed in Robots.txt is it redundant to edit settings in Search Console parameters to anything other than "Let Googlebot Decide"?
Technical SEO | | LivDetrick0 -
Duplicate content problem
Hi there, I have a couple of related questions about the crawl report finding duplicate content: We have a number of pages that feature mostly media - just a picture or just a slideshow - with very little text. These pages are rarely viewed and they are identified as duplicate content even though the pages are indeed unique to the user. Does anyone have an opinion about whether or not we'd be better off to just remove them since we do not have the time to add enough text at this point to make them unique to the bots? The other question is we have a redirect for any 404 on our site that follows the pattern immigroup.com/news/* - the redirect merely sends the user back to immigroup.com/news. However, Moz's crawl seems to be reading this as duplicate content as well. I'm not sure why that is, but is there anything we can do about this? These pages do not exist, they just come from someone typing in the wrong url or from someone clicking on a bad link. But we want the traffic - after all the users are landing on a page that has a lot of content. Any help would be great! Thanks very much! George
Technical SEO | | canadageorge0 -
Staging & Development areas should be not indexable (i.e. no followed/no index in meta robots etc)
Hi I take it if theres a staging or development area on a subdomain for a site, who's content is hence usually duplicate then this should not be indexable i.e. (no-indexed & nofollowed in metarobots) ? In order to prevent dupe content probs as well as non project related people seeing work in progress or finding accidentally in search engine listings ? Also if theres no such info in meta robots is there any other way it may have been made non-indexable, or at least dupe content prob removed by canonicalising the page to the equivalent page on the live site ? In the case in question i am finding it listed in serps when i search for the staging/dev area url, so i presume this needs urgent attention ? Cheers Dan
Technical SEO | | Dan-Lawrence0 -
We have set up 301 redirects for pages from an old domain, but they aren't working and we are having duplicate content problems - Can you help?
We have several old domains. One is http://www.ccisound.com - Our "real" site is http://www.ccisolutions.com The 301 redirect from the old domain to the new domain works. However, the 301-redirects for interior pages, like: http://www.ccisolund.com/StoreFront/category/cd-duplicators do not work. This URL should redirect to http://www.ccisolutions.com/StoreFront/category/cd-duplicators but as you can see it does not. Our IT director supplied me with this code from the HT Access file in hopes that someone can help point us in the right direction and suggest how we might fix the problem: RewriteCond%{HTTP_HOST} ccisound.com$ [NC] RewriteRule^(.*)$ http://www.ccisolutions.com/$1 [R=301,L] Any ideas on why the 301 redirect isn't happening? Thanks all!
Technical SEO | | danatanseo0 -
Rel Canonical problem or SEOmoz bug ?
Hello all, I hope that sombody out there could help me with my question. I am very new in SEO and in SEOmoz community. I am not familiar with coding. I am goining to start learning soon enough but till now I now only basics. At the website where I am trying to optimize for SEO I am reciving this Crawl Diagnostic Programme. Issue: Rel Canonical (Notice) not Error I searched and lerned what it is. So I contact the developers of the website. Build in wordpress and ask them how to corrected ? They told me that they are using Canonical Tags to all their pages and have no idea why SEOmoz keep identifining it as a "notice" They also tel me to check the source code of page to see the canonical tag. I did and their is actuall a canonical tag there. Cjeck please here www.costanavarinogolf.com So do you have any idea why this is happening ? could you help me explaiin to developers what they should do to overcome this ? Or it's just a bug of SEOmoz and not a reall problem exist ? Thank you very much for your time
Technical SEO | | grzontan0 -
How can you manually diagnose the canonical problem
Good Monrning from snow dusted minus 3 degrees C Wetherby UK... Is there a quick way to diagnose wether or not a website has a canonical problem or not? So far Ive been doing this for example: Typing a full web address then one without the w's and seeing if a 301 redirect has been set up. But I'm not confident this is the best way to diagnose if there is a canonical problem with a site. I would like to ad that I want to see if a canonical problem exists with any site and webmanster tools is not available. Any insights welcome 🙂
Technical SEO | | Nightwing1 -
Subdomain Removal in Robots.txt with Conditional Logic??
I would like to see if there is a way to add conditional logic to the robots.txt file so that when we push from DEV to PRODUCTION and the robots.txt file is pushed, we don't have to remember to NOT push the robots.txt file OR edit it when it goes live. My specific situation is this: I have www.website.com, dev.website.com and new.website.com and somehow google has indexed the DEV.website.com and NEW.website.com and I'd like these to be removed from google's index as they are causing duplicate content. Should I: a) add 2 new GWT entries for DEV.website.com and NEW.website.com and VERIFY ownership - if I do this, then when the files are pushed to LIVE won't the files contain the VERIFY META CODE for the DEV version even though it's now LIVE? (hope that makes sense) b) write a robots.txt file that specifies "DISALLOW: DEV.website.com/" is that possible? I have only seen examples of DISALLOW with a "/" in the beginning... Hope this makes sense, can really use the help! I'm on a Windows Server 2008 box running ColdFusion websites.
Technical SEO | | ErnieB0 -
Is robots.txt a must-have for 150 page well-structured site?
By looking in my logs I see dozens of 404 errors each day from different bots trying to load robots.txt. I have a small site (150 pages) with clean navigation that allows the bots to index the whole site (which they are doing). There are no secret areas I don't want the bots to find (the secret areas are behind a Login so the bots won't see them). I have used rel=nofollow for internal links that point to my Login page. Is there any reason to include a generic robots.txt file that contains "user-agent: *"? I have a minor reason: to stop getting 404 errors and clean up my error logs so I can find other issues that may exist. But I'm wondering if not having a robots.txt file is the same as some default blank file (or 1-line file giving all bots all access)?
Technical SEO | | scanlin0