Do not always use representative training data


On March 23, 2016, Microsoft released an AI chatbot named Tay on Twitter. Within a few hours, the chatbot began posting racist and inflammatory content, and Microsoft shut it down within 16 hours of its launch.

How did Tay learn to be racist? It did so from humans, because it was trained on data from Twitter, a social media platform that is very conducive to virtual public brawls and drunken bar fights.

A couple of years later, Pinar Yanardag, then a postdoctoral researcher in my lab at MIT, spearheaded an Internet art project, launched as an April Fools’ Day joke, to highlight the garbage-in-garbage-out problem. Pinar first took a regular image-captioning algorithm (a program that identifies what objects are in a given image) and subjected it to a Rorschach inkblot test: a psychological test in which people’s perceptions of ambiguous inkblots are recorded and then analyzed. Much like healthy people, the standard algorithm would see a bird, or an umbrella.

Pinar then trained a separate algorithm using text captions from a Reddit page in which people share photos of people dying in gruesome circumstances. Contrary to what some media outlets reported, we did not use actual images of gore, for ethical reasons.

We dubbed the AI Norman, after Norman Bates, the character in Alfred Hitchcock’s film Psycho and perhaps the most iconic psychopath in popular culture. Unsurprisingly, Norman saw things differently. Where the regular AI saw a bird, Norman saw a man getting pulled into a dough machine. And where the regular AI saw a person holding an umbrella, Norman saw a man getting shot in front of his screaming wife. If Norman were a human, he would probably be diagnosed as a psychopath!

In some ways, AI systems are like children, who innocently mimic what they learn from adults. Speaking from experience, having children can have a civilizing effect: you feel pressure to be more careful about your language, for example, lest they learn your bad habits. Perhaps we can learn to be better role models for the AIs we build, because if we raise them well, they may be kind to us in our old age.

References

  • Wakefield, J. Microsoft chatbot is taught to swear on Twitter. BBC News (2016).

  • Wakefield, J. Are you scared yet? Meet Norman, the psychopathic AI. BBC News (2018).
