Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Does Bing ignore robots txt files?
-
Bonjour from "It's a miracle it's not raining" Wetherby, UK

OK, here goes... Why, despite a robots.txt file that excludes the site
http://lewispr.netconstruct-preview.co.uk/ from indexing, is the site URL being indexed in Bing but not Google?
Does Bing ignore robots.txt files, or is there something missing from http://lewispr.netconstruct-preview.co.uk/robots.txt that I need to add to stop Bing indexing a preview site, as illustrated below?
http://i216.photobucket.com/albums/cc53/zymurgy_bucket/preview-bing-indexed.jpg
Any insights welcome

-
Thanks Clever PHD - we are now adding your recommendations to our preview sites

-
I know this does not sound related, but Matt Cutts explains this same situation for Google. The reasoning is probably the same for Bing.
http://www.mattcutts.com/blog/robots-txt-remove-url/
Looking at your screenshot, it looks as if all that is being shown in Bing is the URL: no title tag, no description, no other information.
What Matt says is that they did not technically crawl the URL, but they are aware that it exists. For example, another page with related content may link to it, or the anchor text of that link may relate to the keyword search you are performing.
You are searching for the URL specifically, so it makes sense that they would show the URL as it relates to that search. They are not showing any information from the page because they do not have it: they did not spider it and are only aware of the URL. Kind of like talking to a lawyer, eh?
If you search for any other keywords, does this excluded site show up? Probably not. If it does, they are probably only showing the URL, like in the example above.
The video has more details. Here are the solutions he gives, which I will outline as well:
- Use the Bing URL removal tool - bing bang boom. Done.
- (my new favorite) Let the page / site be crawled, but then show a noindex, nofollow meta tag on the page / site. There is a subtle but important difference between the meta tag and the robots.txt file: the spiders have to be able to crawl the page to see what they are supposed to do with it.
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=93710
"When we see the noindex meta tag on a page, Google will completely drop the page from our search results, even if other pages link to it."
The thing is, if you have a robots.txt file that says don't crawl the site, then the spider never gets to the noindex meta tag to know to delete the page from the index. It sounds a little backwards, but when the page is already in the search index, you have to let the spider crawl it to then see the noindex tag so that the search engine will know to remove it from the index.
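To make the "backwards" part concrete, here is a minimal Python sketch of the order in which a well-behaved crawler makes these decisions. It is only a model built on the standard library, not how Bing or Google actually implement crawling. The function name `crawler_sees_noindex`, the class `RobotsMetaParser`, and the `preview.example.com` URL are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            for directive in content.split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def crawler_sees_noindex(robots_txt, html, url, user_agent="bingbot"):
    """Return True only if the crawler may fetch the page AND finds noindex."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    if not rules.can_fetch(user_agent, url):
        # Blocked by robots.txt: the page is never downloaded,
        # so the meta tag on it is never read.
        return False
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical preview page carrying the meta tag discussed above.
PAGE = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
BLOCK_ALL = "User-agent: *\nDisallow: /"
ALLOW_ALL = "User-agent: *\nDisallow:"

print(crawler_sees_noindex(BLOCK_ALL, PAGE, "http://preview.example.com/"))  # False
print(crawler_sees_noindex(ALLOW_ALL, PAGE, "http://preview.example.com/"))  # True
```

With the blanket `Disallow: /` the tag is invisible to the crawler, so a URL it already knows about can linger in the index. Once crawling is allowed, the tag can be read and acted on.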
Here is what you can do as this seems to only be an issue with Bing and just with the home page. Open up the robots.txt to allow Bing to crawl the site. Restrict the crawling to the home page only and exclude all the other pages from the crawl.
On the home page that you allow Bing to crawl, add the noindex, nofollow meta tag and you should be set.
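As a rough sketch of what such a robots.txt could look like, the Python snippet below holds a hypothetical file that lets only `bingbot` fetch the home page, plus a small hand-written matcher to sanity-check it. It assumes the crawler honors the `$` end-of-path anchor and longest-match precedence described in RFC 9309. The matcher is a simplification written for this example (exact user-agent names, no percent-encoding handling), so confirm the real behavior in Bing Webmaster Tools before relying on it.

```python
import re

# Hypothetical robots.txt: Bing's crawler may fetch only "/", everyone else nothing.
ROBOTS_TXT = """\
User-agent: bingbot
Allow: /$
Disallow: /

User-agent: *
Disallow: /
"""


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern: '*' is a wildcard, trailing '$' anchors the end."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def parse_groups(text):
    """Map each lowercased user-agent name to its list of (is_allow, pattern) rules."""
    groups = {}
    current = []
    reading_agents = False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if not reading_agents:
                current = []  # a new group starts here
                reading_agents = True
            groups[value.lower()] = current
        elif field in ("allow", "disallow"):
            reading_agents = False
            if value:  # an empty Disallow means "no restriction"
                current.append((field == "allow", value))
    return groups


def is_allowed(text, user_agent, path):
    """Longest matching pattern wins; on a tie, Allow beats Disallow."""
    groups = parse_groups(text)
    rules = groups.get(user_agent.lower(), groups.get("*", []))
    best = (-1, True)  # no matching rule means the path is allowed
    for allow, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            best = max(best, (len(pattern), allow))
    return best[1]


print(is_allowed(ROBOTS_TXT, "bingbot", "/"))        # True
print(is_allowed(ROBOTS_TXT, "bingbot", "/about"))   # False
print(is_allowed(ROBOTS_TXT, "googlebot", "/"))      # False
```

Note that Python's built-in `urllib.robotparser` does not interpret `$` or `*` in paths, which is why this sketch carries its own matcher.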
All of that said, if you have a single URL listed in Bing with no metadata, it may not be worth all the above effort, as you are not ranking for any valuable keywords. But that is your call.
It is always interesting to see how the spiders and engines think so I wanted to pass this along.
Cheers!
PS - If you have a ton of pages like this, you would just allow Bing to crawl them all and add the noindex, nofollow tag to all of them.