Do robots.txt files permanently affect websites even after they have been removed?
-
A client has a WordPress blog that sits alongside their company website. They kept it hidden from search engines while they were developing its look and feel. It was still live, but WordPress put a robots.txt in place. When they were ready, they removed the robots.txt by clicking the "allow Search Engines to crawl this site" button.
It took a month and a half for their blog to show in search engines once the robots.txt was removed.
Google now recognises the site (as a "site:" test has shown); however, it doesn't rank well for anything. This is despite the fact that they are targeting keywords with very little organic competition.
My question is: could the fact that they developed the site behind a robots.txt (rather than offline) mean the site is permanently affected by the robots.txt in the eyes of the search engines, even after that robots.txt has been removed?
Thanks in advance for any light you can shed on the situation.
-
No problem! Good Luck!
-
That is a very fair point. It is a completely new site and I hadn't even thought about things like the domain age. It does show up under a "site:http://www.____.com" search; I was just wondering if this is one of those things Google keeps a memory of, if that makes sense.
Thanks for your response Mike.
-
That is a very good suggestion. I'll try it (a useful URL also so thanks for sharing).
Thanks for the response Matthew.
-
I think the much more likely culprit is that it is a new site. What do you get when you enter "site:http://www._____.com" in Google? If the pages are indexed, one can't blame the robots file for the lack of rank.
Good luck!
Mike
-
Have you submitted the updated robots.txt to Google? This is separate from updating the sitemap. Here is a Google page to help you do this:
https://support.google.com/webmasters/answer/6078399?hl=en
Best!
Matthew
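Before resubmitting, it may help to confirm that the live robots.txt really does allow crawling now. The sketch below uses Python's standard `urllib.robotparser` to test whether a given robots.txt body permits a crawler to fetch a URL. The two sample bodies and the `www.example.com` URL are assumptions for illustration (a blanket `Disallow: /` is assumed to be what the blog served while hidden); paste in the actual file contents to check a real site.

```python
from urllib.robotparser import RobotFileParser


def is_crawlable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Assumed example of a "hide from search engines" file: block everything.
BLOCKED = "User-agent: *\nDisallow: /\n"
# An empty Disallow value places no restriction on crawlers.
OPEN = "User-agent: *\nDisallow:\n"

# Hypothetical URL standing in for the client's blog.
BLOG_URL = "http://www.example.com/blog/"

print(is_crawlable(BLOCKED, BLOG_URL))  # False
print(is_crawlable(OPEN, BLOG_URL))     # True
```

If the current file comes back as crawlable, the robots.txt itself is no longer holding the site back, which fits Mike's point that the site's newness is the more likely cause.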