Recovering from robots.txt error
-
Hello,
A client of mine is going through a bit of a crisis. A developer (at their end) added Disallow: / to the robots.txt file. Luckily the SEOmoz crawl ran a couple of days after this happened and alerted me to the error. The robots.txt file was quickly updated, but the client has found that the vast majority of their rankings have gone.
It took a further 5 days for GWMT to register that the robots.txt file had been updated, and since then we have "Fetched as Google" and "Submitted URL and linked pages" in GWMT.
In GWMT it is still showing that the vast majority of pages are blocked in the "Blocked URLs" section, although the robots.txt file shown below it is now OK.
I guess what I want to ask is:
- What else is there that we can do to recover these rankings quickly?
- What time scales can we expect for recovery?
- More importantly, has anyone had any experience with this sort of situation, and is full recovery normal?
Thanks in advance!
-
Great info, Rikki,
that's good news!
-
Hi Antonio,
I would take a look at your entire site using one of my very favorite tools. It will crawl your site and tell you if you have nofollows or other issues that would cause Googlebot to have trouble indexing your site.
Simply put your site's URL in the box presented in the tool, which you can find at the link here:
http://www.feedthebot.com/tools/spider/
Then use the second tool, which displays the number of links (internal, external, nofollow, image, etc.) found on a webpage:
http://www.feedthebot.com/tools/linkcount/
You can then see if there is a nofollow that might be creating a real problem inside a page; using those two tools together, you should be able to get on top of this. Check as much of your site as you possibly can this way, as it will show you a lot of information that is very relevant to whether your site can be crawled correctly or not (there is also a scripted version of this check in the sketch below).
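If you would rather script that link check yourself, here is a minimal sketch of the same idea, assuming Python with the requests and beautifulsoup4 packages installed; the URL is a hypothetical placeholder, not a specific site from this thread:

import requests
from bs4 import BeautifulSoup

# Fetch the page and parse its HTML
url = "https://www.example.com/"  # hypothetical placeholder - use your own page
html = requests.get(url, timeout=10).text
soup = BeautifulSoup(html, "html.parser")

# Count the links and pick out the ones marked rel="nofollow"
links = soup.find_all("a", href=True)
nofollow = [a["href"] for a in links if "nofollow" in (a.get("rel") or [])]
print(len(links), "links total,", len(nofollow), "nofollow")

Run it against a handful of key pages; a nofollow count that is unexpectedly high (or equal to the total) is the kind of thing the feedthebot tools above will flag too.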
This third tool will show you if your robots.txt file is still blocking all or part of your website. The nice thing about it is that, although it is built to generate robots.txt files, if you simply put your URL in the top and hit the upload button, it will pull in your current robots.txt file. This is very helpful when making comparisons between changes that have been made or that you wish to make:
http://www.internetmarketingninjas.com/seo-tools/robots-txt-generator/
To check your robots.txt file against whatever could be blocking you, I think these resources will help (and see the sketch after the links for a quick programmatic check):
http://moz.com/blog/interactive-guide-to-robots-txt
http://moz.com/learn/seo/robotstxt
http://tools.seobook.com/robots-txt/
http://yoast.com/x-robots-tag-play/
https://developers.google.com/webmasters/control-crawl-index/docs/robots_meta_tag?hl=de
http://www.searchenginejournal.com/x-robots-tag-simple-alternate-robots-txt-meta-tag/67138/
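As a quick programmatic cross-check, Python's standard library can parse a live robots.txt file and tell you whether a given URL is blocked for a given user agent. A minimal sketch, with example.com standing in as a hypothetical domain:

from urllib import robotparser

# Load the live robots.txt (example.com is a hypothetical placeholder)
rp = robotparser.RobotFileParser()
rp.set_url("https://www.example.com/robots.txt")
rp.read()

# Test a few key URLs against Googlebot's rules
for page in ["https://www.example.com/", "https://www.example.com/folder/page.html"]:
    print(page, "->", "allowed" if rp.can_fetch("Googlebot", page) else "BLOCKED")

This is handy for confirming that the fixed file really does allow your key URLs before you wait on Google to re-crawl it.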
One point that I hope will help you is the easy-to-miss difference between allowing everything and disallowing everything: simply having a / after Disallow: will tell Google that you do not want to show up in their search engine results.
Simply put, websites by default behave as if they were set up with:
Allow: /
Example Robots.txt Format
Allow indexing of everything
User-agent: *
Disallow:
or
User-agent: *
Allow: /
Disallow indexing of everything
User-agent: *
Disallow: /
Disallow indexing of a specific folder
User-agent: *
Disallow: /folder/
Please remember there are multiple ways to block a website. For instance, PHP-based websites are extremely popular, and if you're using WordPress or many other PHP platforms, a single line of PHP can block indexing through an HTTP header (it has to run before any output is sent):
header("X-Robots-Tag: noindex", true); // sends the X-Robots-Tag header; true replaces any earlier header of the same name
I want to remind you of what Tom Roberts said in the first response about using Twitter. I have quoted him here, but you can read it in full at the top of the page, below the first question:
"The most frequently crawled domain on the web is Twitter. If you could legitimately get your key URLs tweeted, either by yourselves or others, this may encourage the Google crawler to revisit the URLs and consequently reindex them. There won't be any harm SEO-wise in sending tweets with your URLs; it's a quick and free method, so it may be worth giving it a shot."
Hope This Helps,
Thomas
-
Hi Antonio,
Sorry to hear you have had the same problem. Given the nature of our client's business, this error by the developer cost them a load of revenue.
In answer to your questions:
- It took 19 days in total to recover.
- We took everyone's advice and implemented it all, but I am unsure what actually helped. I think working with GWMT is the best thing for it. Make sure you submit for a re-crawl as soon as possible and see what is still blocked.
I know how scary the situation is, but things will go back to normal. It's just a matter of playing the waiting game, really; sorry I couldn't be of more help.
Rikki
-
Hi Rikki,
I know it's been some time since your post, however I just found it because a couple of weeks ago my developer did exactly the same.
It's been 2 weeks now and our traffic is still a quarter of what it used to be. My questions are:
1/ How long did it finally take you to completely recover your previous traffic levels (if you ever did)?
2/ Did you apply any of the advice from the other posters? What would you recommend doing, based on your experience?
Thanks in advance. I am really worried at this moment, since we've got a peak campaign coming up very soon.
Regards,
Antonio (Citricamente)
-
Hi Rikki,
I really want to say great job with those numbers; it's always good to see somebody pulling positive ROI. Good work! If I may ask, what type of development do you specialize in, if you have a specialty?
My reason for asking is that there are some excellent hosts that will let you run a staging server that switches everything, robots.txt included, back to follow and index when you hit the production button. Other hosts have similar methods.
In fact, that might be an idea that's worth a little bit of money: a nice WordPress plug-in that gives you a constant reminder during the development phase, does the swap, and then deletes itself?
Or use a managed WordPress host if it's WordPress.
You can do so many cool things with Git these days.
I am extremely happy you have found there's nothing to worry about; if it is simply the tags, you will have your rankings back before you know it. You can also set the crawl rate in Webmaster Tools to manual and put it to the maximum; I have done it on test sites, and the site was indexed just as well. I would simply make sure I had a reminder telling me to return it to normal afterwards.
You should set the rel="canonical" as well.
Glad I was able to help,
Thomas
-
Hi guys,
Thanks very much for the responses. I guess my gut feeling was right that everything would come back to normal but just needed some reassurance.
I have made real progress with this client, going from £15k per month in online-driven revenue at the start of the year to £105k last month, but it is all phone-based, so at the moment his call centre is like a ghost town. It's a shame this can happen when a developer is trying to block his own dev subdomain and ends up blocking the whole thing (see the note below for a safer way to set that up). I just hope it doesn't take too long.
We will certainly try the social media route to see if that speeds things along.
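On the dev-subdomain point: robots.txt is applied per host, so the dev site can carry its own blocking file while the live site stays open. A sketch of that safer setup, using a hypothetical dev.example.com:

# dev.example.com/robots.txt - block crawling of the dev site only
User-agent: *
Disallow: /

# www.example.com/robots.txt - leave the live site fully crawlable
User-agent: *
Disallow:

Putting the dev subdomain behind HTTP authentication is safer still, since it keeps dev pages out of the index even if that robots.txt is ever overwritten.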
-
Please look and see that I updated my response; I had copied from a dictation software's writing pad and only pasted part of it when I meant to copy all of it.
Please read it and let me know if I can be of help.
Sincerely,
Thomas
-
Please forgive my first comment; I hit the button too early. I use dictation software, so I save the text to one page and then paste it to another, and I am sincerely sorry I posted this part without the entire thing.
Send me the domain, either privately if you can or through this chat, and I would be more than happy to look into it for you. I can tell you I have made the nofollow/noindex mistake myself, showing an intern something on our own site, and I talk about it below.
However, if you are still having problems, you may want to download Screaming Frog SEO Spider. The free version will only crawl 500 URLs, but it gives you invaluable insight. It is a desktop download and works on Mac, Windows, and Linux:
http://www.screamingfrog.co.uk/seo-spider/
If you want to try something web-based:
http://www.internetmarketingninjas.com/tools/
http://www.internetmarketingninjas.com/broken-links-tool/
http://www.internetmarketingninjas.com/seo-tools/robots-txt-generator/
http://www.internetmarketingninjas.com/seo-tools/google-sitemap-generator/
I would also not hesitate to use their DNS tool to check that everything there is okay.
Another tool I would strongly recommend, which you can also access for free, is the excellent Internet Marketing Ninjas On-Page Optimization Analysis tool:
The words used in the metadata tags, in body text, and in anchor text in external and internal links all play important roles in on-page search engine optimization (SEO). The On-Page Optimization Analysis Free SEO Tool lets you quickly see the important SEO content on your webpage URL the same way a search engine spider views your data. This free on-page optimization tool is multiple on-page SEO tools in one, helpful for reviewing the following on-page optimization information in the source code of the page:
- Metadata tool: Displays text in title tags and meta elements
- Keyword density tool: Reveals onpage SEO keyword statistics for linked and unlinked content
- Keyword optimization tool: Analyzes on page optimization by showing the number of words used in the content, including anchor text of internal and external links
- Link Accounting tool: Displays the number and types of links used
- Header check tool: Shows HTTP Status Response codes for links
- Source code tool: Provides quick access to on-page HTML source code
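As an aside, if you want to check a single page for a stray noindex, whether it arrives as an HTTP response header or as a robots meta tag, here is a minimal Python sketch, assuming the requests and beautifulsoup4 packages are installed and using a hypothetical URL:

import requests
from bs4 import BeautifulSoup

url = "https://www.example.com/"  # hypothetical placeholder
resp = requests.get(url, timeout=10)

# The directive may be sent as an HTTP response header...
print("X-Robots-Tag header:", resp.headers.get("X-Robots-Tag"))

# ...or embedded in the HTML as a robots meta tag
meta = BeautifulSoup(resp.text, "html.parser").find("meta", attrs={"name": "robots"})
print("meta robots tag:", meta.get("content") if meta else None)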
If you are talking about just the noindex and nofollow tags, I can happily say I have made this identical mistake myself. I was showing somebody how to use the WordPress SEO plug-in when I got distracted and simply did not change the settings back to follow and index. Approximately 2 to 3 days later I noticed a huge loss in rankings for the company brand name.
(Luckily this was mine, not a client's.)
It took approximately two days after I changed the settings back to the normal follow and index, then submitted my entire website through Google's Webmaster Tools, even clicking yes when it asked whether to index all linked pages after the change.
Before I knew it, all the rankings had returned to normal. Literally, the keywords I was tracking returned to within the normal fluctuation I see (in many cases sometimes better, sometimes a little bit worse), where I had feared they would never come back at all.
Believe me when I say I was extremely thankful for this, and I don't see why you would not get the same results with your site.
I hope this is a simple mistake of just that one problem, like mine; that's the only thing I can give you a testimony of. I would say you have nothing to worry about. But remember to tell Google Webmaster Tools; I also told Bing, but that's up to you.
Sincerely,
Thomas
-
It should be as quick as Google re-crawling the robots.txt file.
The best thing you can do is get a couple of links on sites that are crawled daily, to encourage Google to visit your client's site as soon as possible.
These could be:
- newspaper sites
- blog comments
- and the like
-
Hey there
I've seen this before and in almost all cases the rankings were returned to their previous state, give or take maybe 1 or 2 places (which would be normal SERP flux).
Unfortunately, I've found that this can often take weeks, and there's no real sure-fire way of getting Google to update it quicker. Theoretically, to speed things up you want to get the crawler revisiting the URLs more often. Fresh backlinks would do this, but obviously you can't game that sort of thing, for web-spam reasons. You could also try pinging services, such as GooglePing, but I'm not convinced of their effectiveness.
The most frequently crawled domain on the web is Twitter. If you could legitimately get your key URLs tweeted, either by yourselves or others, this may encourage the Google crawler to revisit the URLs and consequently reindex them. There won't be any harm SEO-wise in sending tweets with your URLs; it's a quick and free method, so it may be worth giving it a shot.
Hope this helps you - I've often found you can't control these things, but hopefully some of these theories might work. In the long run, however, the rankings will return, so for normal SEO purposes, create content and links as per usual.