Baidu Spider appearing on robots.txt

IceIcebaby

Hi, I'm not too sure what to do about this or what to think of it.

This magically appeared in my company's robots.txt file (literally magically appeared; text is below):

User-agent: Baiduspider
User-agent: Baiduspider-video
User-agent: Baiduspider-image
Disallow: /

I know that Baidu is the Google of China, but I'm not sure why this would appear in our robots.txt all of a sudden. Should I be worried about a hack? Also, would I want to disallow Baidu from crawling my company's website?

Thanks for your help,
-Reed

IceIcebaby

Thanks for your help, Travis. That was a really solid answer.

Travis_Bailey

There's a possibility someone in your company saw suspicious traffic from an actor spoofing the Baidu user agent. That kind of traffic can get so aggressive that it eventually bogs down your response time through the sheer number of requests. The problem is that the same actor, or anyone else with malicious intent, can simply spoof another user agent or switch to another IP.
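If you want to check whether a request claiming to be Baiduspider is genuine, one common approach is a reverse DNS lookup followed by a forward lookup to confirm it. Here is a minimal Python sketch of that idea. It assumes, based on my understanding of Baidu's webmaster guidance, that real Baiduspider hosts resolve to names ending in .baidu.com or .baidu.jp; please confirm that against Baidu's current documentation. The function name and the injectable `reverse`/`forward` parameters are my own, added so the check can be tried without touching the network.

```python
import socket

# Assumption: genuine Baiduspider hosts reverse-resolve to these suffixes.
BAIDU_SUFFIXES = (".baidu.com", ".baidu.jp")


def is_real_baiduspider(ip, reverse=socket.gethostbyaddr,
                        forward=socket.gethostbyname_ex):
    """Return True if `ip` reverse-resolves to a Baidu hostname and that
    hostname resolves back to the same IP (forward confirmation)."""
    try:
        # gethostbyaddr returns (hostname, aliases, ip_addresses).
        host = reverse(ip)[0].lower().rstrip(".")
        if not host.endswith(BAIDU_SUFFIXES):
            return False
        # gethostbyname_ex also returns (hostname, aliases, ip_addresses).
        return ip in forward(host)[2]
    except OSError:
        # Covers socket.herror / socket.gaierror: no PTR record, lookup failure.
        return False
```

You would call `is_real_baiduspider(ip)` with an address taken from your access logs. Anything that sends a Baiduspider user agent but fails this check is probably an impostor, and an impostor is unlikely to obey robots.txt anyway, since the file is only advisory.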

But the main problem is that the site is straight e-commerce. It could get international business, so why take such a ham-fisted approach? Even if blocking Baidu gave the desired result, the dev/admin would still have to block individual IP ranges as they come in. It would make more sense to invest in server resources so the site can handle the load, or to look into DDoS mitigation.
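To see just how broad that rule is, you can feed the exact block from the question into Python's standard-library robots.txt parser. This is only a sketch: the example.com URL and the `allowed` helper are placeholders I made up, and the result reflects how `urllib.robotparser` reads the file, which may differ in small ways from how Baidu's own crawler interprets it.

```python
from urllib.robotparser import RobotFileParser

# The block exactly as it appeared in the robots.txt from the question.
ROBOTS_TXT = """\
User-agent: Baiduspider
User-agent: Baiduspider-video
User-agent: Baiduspider-image
Disallow: /
"""

# Hypothetical product URL, used only for illustration.
URL = "https://example.com/products/widget"


def allowed(agent, url, robots_txt=ROBOTS_TXT):
    """Return True if `agent` may fetch `url` under the given robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent in ("Baiduspider", "Baiduspider-image", "Googlebot"):
        print(agent, "allowed" if allowed(agent, URL) else "blocked")
```

The three stacked User-agent lines share the single Disallow: / rule, so every well-behaved Baidu crawler is shut out of the whole site while other crawlers are unaffected.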

So yeah, it's strange. Though it's more likely a lack of understanding than anything malicious.
