Multiple robots.txt files on server
-
Hi!
I have previously hired a developer to put up my site and noticed afterwards that he did not know much about SEO. This lead me to starting to learn myself and applying some changes step by step.
One of the things I am currently doing is inserting sitemap reference in robots.txt file (which was not there before). But just now when I wanted to upload the file via FTP to my server I found multiple ones - in different sizes - and I dont know what to do with them? Can I remove them? I have downloaded and opened them and they seem to be 2 textfiles and 2 dupplicates. Names:
robots.txt (original dupplicate)
robots.txt-Original (original)
robots.txt-NEW (other content)
robots.txt-Working (other content dupplicate)Would really appreciate help and expertise suggestions. Thanks!
-
So what's the best policy if a site uses an e-commerce platform like Magento, which has a robots file, but also has a Wordpress blog installed to another folder. eg: /blog and uses a plugin like YOAST which generated a robots file of the Wordpress installation.
Then you have 2 robots files, is this detrimental or no big deal?
-
Thanks very much for the help!
-
Thanks very much for the help!
-
Keep a backup and remove them.
Search engines are only going to look at the file which is exactly called robots.txt variations of file name will be ignored.
Do make sure the entries are correct in the main one though, you don't want Google crawling admin pages or other confidential areas of the site.
-
Hi, thanks for the answer and help!
Well, I only have one domain that has a webpage and no subdomains active (no blog-subdomain or similar) - so how can I configure that to the situation? Can I just remove all and upload the one I want, maybe?
-
That's a good question, EMS. The robots.txt protocol can get kind of
confusing when you think about it too long, and it sounds like you've
thought about this a bit. However, in this case, it might help to
look at robots.txt from the perspective of the spider.When a spider finds a URL, it takes the whole domain name (everything
between 'http://' and the next '/'), then sticks a '/robots.txt' on
the end of it and looks for that file. If that file exists, then the
spider should read it to see where it is allowed to crawl.In your case, Googlebot, or any other spider, should try to access
three URLs: domainA.com/robots.txt, domainB.domainA.com/robots.txt,
and domainB.com/robots.txt. The rules in each are treated as
separate, so disallowing robots from domainA.com/ should result in
domainA.com/ being removed from search results while
domainB.domainA.com/ remains unaffected, which does not sound like not
something you want.The problem you might have with the setup you have described is this--
in order to keep domainB.domainA.com out of the results, you would
need to have domainB.domainA.com/robots.txt exclude robots, while
domainB.com/robots.txt welcomes them. This means that you would need
to have a way to make domainB.domainA.com/ and domainB.com/ serve
different information, and judging from what you've described, you
have not set up your server to do so yet.Of course, it is always possible that I have assumed to much about
your situation, so it is a good idea to use Google's robots.txt
analysis tool (see http://www.google.com/support/webmasters/bin/topic.py?topic=8475
) to see if your robots.txt files already produce the results you
want.If using robots.txt files doesn't solve the problem, and assuming that
you want to continue hosting all of your content on domainA.com, one
strategy you really should look into would be setting up a 301
redirect from the pages on domainB.domainA.com/ to domainB.com/ . If
you need more advice on how to do this with your server software, your
hosting company's tech support would definitely be the best place to
start, but this group is here to help if more isues arise.Hope that helps!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Robots.txt vs. meta noindex, follow
Hi guys, I wander what your opinion is concerning exclution via the robots.txt file.
Technical SEO | | AdenaSEO
Do you advise to keep using this? For example: User-agent: *
Disallow: /sale/*
Disallow: /cart/*
Disallow: /search/
Disallow: /account/
Disallow: /wishlist/* Or do you prefer using the meta tag 'noindex, follow' instead?
I keep hearing different suggestions.
I'm just curious what your opinion / suggestion is. Regards,
Tom Vledder0 -
301 redirect file question
Hi Everyone, I am creating a list of 301 redirects to give to a developer to put into Magento. I used Screaming Frog to crawl the site, but I have noticed that all of their urls 302 to another page. I am wondering if I should 301 the first URL to the url on the new site, or the second. I am thinking the first, but would love some confirmation. Thank you!
Technical SEO | | mrbobland0 -
301 Multiple Sites to Main Site
Over the past couple years I had 3 sites that sold basically the same products and content. I later realized this had no value to my customers or Google so I 301 redirected Site 2 and Site 3 to my main site (Site 1). Of course this pushed a lot of page rank over to Site 1 and the site has been ranking great. About a week ago I moved my main site to a new eCommerce platform which required me to 301 redirect all the url's to the new platform url's which I did for all the main site links (Site 1). During this time I decided it was probably better off if I DID NOT 301 redirect all the links from the other 2 sites as well. I just didn't see the need as I figured Google realized at this point those sites were gone and I started fearing Google would get me for Page Rank munipulation for 301 redirecting 2 whole sites to my main site. Now I am getting over 1,000 404 crawl errors in GWT as Google can no longer find the URL's for Site 2 and Site 3. Plus my rankings have dropped substantially over the past week, part of which I know is from switching platforms. Question, did I make a mistake not 301 redirecting the url's from the old sites (Site 2 and Site 3) to my new ecommerce url's at Site 1?
Technical SEO | | SLINC0 -
Googlebot does not obey robots.txt disallow
Hi Mozzers! We are trying to get Googlebot to steer away from our internal search results pages by adding a parameter "nocrawl=1" to facet/filter links and then robots.txt disallow all URLs containing that parameter. We implemented this late august and since that, the GWMT message "Googlebot found an extremely high number of URLs on your site", stopped coming. But today we received yet another. The weird thing is that Google gives many of our nowadays robots.txt disallowed URLs as examples of URLs that may cause us problems. What could be the reason? Best regards, Martin
Technical SEO | | TalkInThePark0 -
Hosting sitemap on another server
I was looking into XML sitemap generators and one that seems to be recommended quite a bit on the forums is the xml-sitemaps.com They have a few versions though. I'll need more than 500 pages indexed, so it is just a case of whether I go for their paid for version and install on our server or go for their pro-sitemaps.com offering. For the pro-sitemaps.com they say: "We host your sitemap files on our server and ping search engines automatically" My question is will this be less effective than my installing it on our server from an SEO perspective because it is no longer on our root domain?
Technical SEO | | design_man0 -
Implications of multiple page descriptions?
Are there any implications of using two page description properties, i.e. meta name="<a class="attribute-value">description</a>" and meta property="<a class="attribute-value">og:description</a>"
Technical SEO | | Nathan.Smith0 -
Delete 301 redirected pages from server after redirect is in place?
Should I remove the redirected old pages from my site after the redirects are in place? Google is hating the redirects and we have tanked. I did over 50 redirects this week, consolidating content and making one great page our of 3-10 pages with very little content per page. But the old pages are still visible to google's bot. Also, I have not put a rel canonical to itself on the new pages. Is that necessary? Thanks! Jean
Technical SEO | | JeanYates0 -
Does server location impact SEO?
Hi, I am about to purchase hosting for my WordPress site which is primarily targeting the UK and wondered if server location still has an impact on SEO? Also can anyone recommend a reliable hosting provider with CPanel?
Technical SEO | | Wallander0