Bots Gone Bad: What We Can Learn From Microsoft’s Tay Malfunction

March 29, 2016

By now you’ve probably heard the story of Tay, the AI-powered chat bot Microsoft debuted last Wednesday. Things started off innocently enough before Tay went completely off the rails, going on racist rants and tweeting her support of Adolf Hitler before Microsoft pulled the plug.

But let’s start back at the beginning. Tay was created by “mining relevant public data and by using AI and editorial developed by a staff including improvisational comedians,” according to Microsoft’s Tay website. Tay was programmed to talk like a millennial and targeted at millennials (which apparently means lots of CAPS LOCK and poor grammar, but that’s a discussion for another time).

Unfortunately for Microsoft, people other than millennials found Tay easy enough to manipulate, goading her into spouting racist and sexist rhetoric through a “repeat after me” feature. However, as Fusion reports, “some of the tweets were clearly generated by the bot itself.”
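To make the “repeat after me” risk concrete, here is a minimal Python sketch of a guarded echo handler. It is not how Tay worked internally (Microsoft has not published that). The `BLOCKED_TERMS` set, the `handle_message` function and the refusal text are all hypothetical, and a keyword list alone would be far too crude for a production bot.

```python
import re
from typing import Optional

# Hypothetical placeholder list for illustration only; a real bot would rely
# on a maintained moderation service or classifier, not a hand-written set.
BLOCKED_TERMS = {"hitler"}

REPEAT_PREFIX = "repeat after me"
REFUSAL = "I'd rather not repeat that."


def contains_blocked_term(text: str) -> bool:
    """Return True if any word in the text appears in the blocked set."""
    words = re.findall(r"[a-z']+", text.lower())
    return any(word in BLOCKED_TERMS for word in words)


def handle_message(message: str) -> Optional[str]:
    """Echo the user's text only if it passes the screen.

    Returns None when the message is not a repeat-after-me request.
    """
    stripped = message.strip()
    if not stripped.lower().startswith(REPEAT_PREFIX):
        return None
    # Whatever follows the prefix (minus separators) is what would be echoed.
    payload = stripped[len(REPEAT_PREFIX):].lstrip(" :,-")
    if not payload or contains_blocked_term(payload):
        return REFUSAL
    return payload
```

The design choice worth noting is that the check runs before anything is echoed, so the bot never publishes text it has not screened.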

Twitter user @geraldmellor captured some of her incendiary tweets.

Tay’s racist makeover didn’t go unnoticed, and Microsoft turned her off before things got more out of hand. But in Microsoft’s blog post commenting on what happened with Tay, Corporate VP of Microsoft Research Peter Lee pointed out that this wasn’t the company’s first foray into AI: XiaoIce, a chat bot Microsoft rolled out in 2014, hasn’t had any major issues. Lee also explained that the team stress-tested Tay and conducted user studies before her public debut.

Still, the very public meltdown of Microsoft’s AI chat bot left some people, like @geraldmellor, wondering what exactly the future of AI might look like if bots go bad. But, looking at it from another perspective, what can bot designers do to help ensure bots stay good?

First, you need to remember that bots just might get exposed to negative comments, as Tay was. As Microsoft’s Lee explains, “AI systems feed off of both positive and negative interactions with people. In that sense, the challenges are just as much social as they are technical.”

Second, in line with Lee’s suggestion, you need to test bots with large groups of people, and do it often. Last, and perhaps most importantly, you need to make sure they actually understand language.

Caroline Sinders, an interaction designer at IBM Watson, put it this way in her Medium post about Tay:

“These AIs have to be trained. Tay didn’t understand “Holocaust” as the Holocaust, it understood it as a blank word or an event. And the things before it as negative, so thus if asked “Do you think the Holocaust is real?” and then being told “Yes, it’s not real” or repeat after me “BLANK PERSON is a f—ing jerk,” it teaches the corpus those phrases and reinforces them as appropriate responses.”
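The reinforcement Sinders describes can be illustrated with a toy Python sketch: every phrase a bot is taught gets counted, and the most-counted phrases become its go-to responses, so screening has to happen before anything is learned. This is an assumption-laden illustration, not Tay’s actual architecture. The `ResponseCorpus` class and the `is_acceptable` callback are hypothetical names, and the callback stands in for whatever language-understanding check a real system would use.

```python
from collections import Counter
from typing import Callable, List


class ResponseCorpus:
    """Toy store that counts how often each phrase has been taught."""

    def __init__(self, is_acceptable: Callable[[str], bool]) -> None:
        self._counts = Counter()
        # Hypothetical screening hook; a real bot would need genuine
        # language comprehension here, not a simple predicate.
        self._is_acceptable = is_acceptable

    def learn(self, phrase: str) -> bool:
        """Add a phrase to the corpus only if it passes the screen."""
        normalized = " ".join(phrase.lower().split())
        if not normalized or not self._is_acceptable(normalized):
            return False  # rejected phrases are never reinforced
        self._counts[normalized] += 1
        return True

    def most_reinforced(self, n: int = 3) -> List[str]:
        """Return the phrases that have been taught most often."""
        return [phrase for phrase, _ in self._counts.most_common(n)]
```

Without the screen in `learn`, repeated abusive input would climb to the top of `most_reinforced` simply because it was said often, which is the failure mode the quote describes.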

Whenever Microsoft does decide to rerelease Tay, they’ll need to account for both positive and negative interactions, extensive user testing, and language comprehension.

Image of Microsoft store courtesy of Mike Mozart / Flickr. CC2.0 License. Edited.
